or, an application of statistical process control methods and multiple regression analysis to the Central England Temperature monthly records, 1850 to 2017, with a postscript regarding the ‘Beast from the East’, February 2018.

{Note – text in **bold** is a hyperlink, either to data in this study or to other references. Text in *italics* is quoted from other sources, with reference links near its use in **bold**}.

Introduction

When two Brits who have never met before get together, the chances are they will start up a conversation beginning with the weather. That’s because the UK, being surrounded by the warm Atlantic to the south and west and the cold North Sea to the north and east, also lies at about the same latitude where the polar jet runs. This river of air can dominate the pattern of weather that we see. There is a lot of weather to live through if you are a Brit.

The annual variation of solar insolation due to the earth’s orbit drives the heat we receive through the seasons, and this is the monthly irradiance the **UK receives on average**.

This is the principal reason for the monthly temperature changes detailed below, which will vary according to location. Generally, the temperature lags the radiative forcing by about **four to six weeks**, so the peak temperature is mid-July to early August, after the summer solstice (maximum insolation), and the low temperature is mid-January to early February, after the winter solstice (minimum insolation).

For Britain, the further north you go, the less solar insolation you receive, an ineluctable consequence of the Earth’s orbit and tilt about the sun, since the maximum solar elevation angle at midday in the Northern Hemisphere is related to latitude by being equal to {90º – latitudeº + solar declinationº}. The solar declination follows the **analemma**, which tracks the declination changes with date between +23.5º (midsummer solstice) and -23.5º (midwinter solstice) through the year.

Thus with a latitude spanning from 50 to 60º the midsummer/midwinter maximum solar elevation angles in the UK are 64º/17º (at 50º) and 54º/7º (at 60º). This difference in maximum solar elevation alters the insolation, thus the annual irradiance in the UK changes with latitude as below.
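The elevation angles quoted above follow directly from the {90º – latitudeº + solar declinationº} relation. A minimal sketch of that arithmetic (the function name is illustrative, not taken from the study):

```python
def max_solar_elevation(latitude_deg: float, declination_deg: float) -> float:
    """Midday solar elevation in degrees: 90 - latitude + declination."""
    return 90.0 - latitude_deg + declination_deg


SOLSTICE_DECLINATION = 23.5  # degrees, as quoted in the text

for latitude in (50.0, 60.0):
    summer = max_solar_elevation(latitude, +SOLSTICE_DECLINATION)
    winter = max_solar_elevation(latitude, -SOLSTICE_DECLINATION)
    print(f"latitude {latitude:.0f}: midsummer {summer:.1f}, midwinter {winter:.1f}")
```

The unrounded values are 63.5º/16.5º at 50º and 53.5º/6.5º at 60º, which the text quotes rounded up to whole degrees.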

Another consequence of living further north in the UK is the greater likelihood the **jetstream** will give you colder northerly or easterly winds. This is unavoidable given the UK spans the latitudes which as the following **diagram** suggests is covered by the likely variation in the position of the polar jet.

The following reasons given by **O’Toole** suggest why this matters:

*Summary of the Jet Stream and the weather it creates:*

*The position of the jet stream over the UK determines the type of weather we experience.*

*If the polar front jet is situated significantly to the south of the UK we will experience colder than average weather.*

*If the polar front jet is situated to the north of the UK we will experience warmer than average weather.*

*If the polar front jet is situated over the UK we will experience wetter and windier than average weather.*

*If the polar front jet has a large amplification then cold air will travel further south than average and warm air will travel further north than average.*

*The direction and angle of the jet stream arriving at the UK will determine what source of air (i.e. cold, dry, warm, wet, from maritime or continental sources) the UK experiences.*

Thus the drivers of the polar jet are indirectly part of British life and the cause of much weather and conversation.

The Central England Temperature (CET) record, taken back as far as 1850, offers a monthly view of British weather at the latitudes measured. This study seeks to understand the trends in the monthly CET over this period using statistical process control methods. It also seeks to understand the data since 1950, the year from which teleconnection data became available on a monthly basis, using multiple regression analysis (best subsets regression) against the teleconnections, the annual CO₂ levels and the monthly Atlantic Multidecadal Oscillation (AMO).

The teleconnection data and the other two variables are described more fully in Appendix 1, together with the sources used for this analysis. Both the statistical methods employed are described in Appendix 2.

*The CET dataset is the longest instrumental record of temperature in the world.*

*‘These daily and monthly temperatures are representative of a roughly triangular area of the United Kingdom enclosed by Lancashire, London and Bristol. The present network of Rothamsted (51.80N, -0.35W), Pershore (52.15N, -2.04W) and Stonyhurst (53.85N, -2.45W) attempts to meet this need by covering the southeast, westsouthwest and north of the Central England area.*

*The monthly series, which begins in 1659, is the longest available instrumental record of temperature in the world. The daily mean-temperature series begins in 1772. Manley [1953, 1974] compiled most of the monthly series, covering 1659 to 1973. These data were updated to 1991 by Parker et al [1992], who also calculated the daily series. Both series are now kept up to date by the Climate Data Monitoring section of the Hadley Centre, Met Office.’*

The construction of the CET record by Manley [1953, 1974] was a task of some dedication, based on weather observation converted to temperature (with an accuracy of ± 1ºC), with the work continued by Parker [1992]. From 1659 to the present, changes in thermometry and location have resulted in improvements in accuracy, albeit with a caveat below. Nonetheless, in this study the monthly record is taken at face value.

To paraphrase **Svalgaard [2015]**, known variables which could affect the CET include:-

*1) Northern hemisphere teleconnections, influencing weather patterns.*

*2) Changes in ocean circulation, AMO and others.*

*3) Greenhouse gas emissions, notably CO₂.*

*4) Land use (cities, logging, crops, grazing…).*

*5) Volcanic aerosol emissions.*

*6) Earth orbital and orientation variations.*

*7) Solar Irradiance and activity.*

*8) Diverse unpredictable catastrophes.*

*9) Stochastic variations of a complex, non-linear system.*

This study seeks to understand causes 1 to 3 based on readily available data, and their effect on the monthly CET. These are explored more fully in Appendix 1.

It is noted that cause 4 is accepted by the **Met Office** in the (HadCET) dataset.

*Since 1974 the data have been adjusted to allow for urban warming: currently a correction of -0.2 °C is applied to mean temperatures.*

To this extent, cause 4 is included in the analysis, and we are using their adjusted data. Further discussion of this effect on the CET is given **here**, and the general effect is a matter of public debate.

Cause 5 is more interesting – no data is used in this study to directly model volcanic aerosol emissions, and yet during the time scale (1850-2017) several significant eruptions with **VEI** = 6 have occurred. These were notably Krakatoa (1883), Santa María (1902), Novarupta (1912), Cerro Azul (1932), and more recently Mount Pinatubo (1991). Such events have demonstrably reduced global temperatures for a few years (e.g. Alados-Arboledas [1997], Robock [2000]), and their effect will be in the CET record. Volcanism is in a sense a known but ‘hidden’ variable, as we would have to know the size, location and duration of all future eruptions to use this in any model. This is clearly an impossible demand, and poses a limit to being able to predict into the future with certainty.

Over the time scale considered (1850 to 2017), causes 6 & 7 can be **safely discounted** (Svalgaard [2015], Lean [2010]). Variations in **total solar irradiance change the annual CET by about ±0.13 °C** during typical solar cycles.

Cause 8, such as meteor strikes, can again be discounted in the time series, but is accepted as another unpredictable variable with implications for future behaviour. Cause 9 is inherent in the data, and may be equal to the residual error from any model presented.

The yearly values (indicated by points connected by a dotted line) are shown in the graphs below, together with a time series analysis of the data, using **exponentially weighted moving averages (EWMA)** for the 10 year (red line) and 30 year (blue line). The 10 year (red line) is the decadal behaviour; the 30 year (blue line) is the multidecadal behaviour. Details of this procedure and supplementary analysis of trends by cusum methods are found in Appendix 2.

For the trend to continue to rise, the annual data have to be above both the 10 year (red line) and 30 year (blue line) data. Likewise for the trends to continue to fall, the annual data have to be below both the 10 year (red line) and 30 year (blue line) data. If the 30 year (blue line) data doesn’t vary outside a limit of ± 0.1ºC for a period of time, then the pattern is described as being stable.
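The rule above can be sketched in a few lines. This is only an illustration: the study's own EWMA procedure is described in Appendix 2, and the smoothing constant used here, alpha = 2 / (span + 1), is a common convention that is assumed rather than taken from the source. The function names and the short temperature series are hypothetical.

```python
def ewma(values, span_years):
    """Exponentially weighted moving average, seeded with the first value.

    Assumption: alpha = 2 / (span + 1); the study may use a different constant.
    """
    alpha = 2.0 / (span_years + 1.0)
    smoothed = [values[0]]
    for x in values[1:]:
        smoothed.append(alpha * x + (1.0 - alpha) * smoothed[-1])
    return smoothed


def trend_signal(annual, ewma_10, ewma_30):
    """Apply the rule in the text to one year's values."""
    if annual > ewma_10 and annual > ewma_30:
        return "rising"
    if annual < ewma_10 and annual < ewma_30:
        return "falling"
    return "mixed"


# Hypothetical monthly-mean temperatures (ºC) for successive years.
series = [3.2, 4.1, 2.8, 5.0, 5.6, 4.9, 6.1]
red = ewma(series, 10)    # decadal line
blue = ewma(series, 30)   # multidecadal line
for year_index, value in enumerate(series):
    print(year_index, value, trend_signal(value, red[year_index], blue[year_index]))
```

The 30 year line reacts more slowly than the 10 year line because its alpha is smaller, which is why the text treats it as the multidecadal behaviour.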

The turning points in the data are identified by the methods described in Appendix 2 below. All statistical analyses were performed using **Minitab** v12, with the CET time series data replotted in Excel 2003.
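For readers unfamiliar with cusum charts, the idea can be sketched as follows. This is a plain cumulative sum of deviations from the series mean, which is an assumption here; the exact cusum variant used in the study is the one described in Appendix 2. The helper names and the six-value series are hypothetical.

```python
def cusum(values):
    """Cumulative sum of deviations from the overall mean of the series."""
    mean = sum(values) / len(values)
    running_total = 0.0
    result = []
    for x in values:
        running_total += x - mean
        result.append(running_total)
    return result


def turning_points(cusum_values):
    """Indices where the cusum slope changes sign (a local peak or trough)."""
    points = []
    for i in range(1, len(cusum_values) - 1):
        slope_before = cusum_values[i] - cusum_values[i - 1]
        slope_after = cusum_values[i + 1] - cusum_values[i]
        if slope_before * slope_after < 0:
            points.append(i)
    return points


# Hypothetical series with a step change after the third value.
temperatures = [3.0, 3.0, 3.0, 5.0, 5.0, 5.0]
print(cusum(temperatures))
print(turning_points(cusum(temperatures)))
```

A falling cusum marks a run of below-average values and a rising cusum a run of above-average values, so the trough marks the change point. On real, noisy data this naive slope test would flag many small wiggles, so some smoothing or threshold would be needed.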

The distribution of the data shows a **mean (standard deviation)** of 3.85 (1.78) ºC. The plot shows the likelihood of temperatures for this month.

The difference in temperature year to year shows a **mean (standard deviation)** of 0.00 (2.35) ºC. This sets the probability range for temperature changes year to year for this month. For example, 1881 to 1882 had a temperature increase of 6.7ºC, and 1962 to 1963 had a temperature decrease of 6.4ºC. Both of these lie outside the 1-in-100 year likelihood range (0.01, 0.99).
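The likelihood range can be reproduced approximately from the quoted mean and standard deviation. The sketch below assumes the year-to-year differences are normally distributed, which is an assumption made for illustration; the study's plots may be based on the empirical distribution instead.

```python
from statistics import NormalDist

# January year-to-year differences: mean 0.00 ºC, standard deviation 2.35 ºC (from the text).
january_differences = NormalDist(mu=0.00, sigma=2.35)

lower = january_differences.inv_cdf(0.01)  # 1% quantile
upper = january_differences.inv_cdf(0.99)  # 99% quantile


def outside_likelihood_range(change_degc):
    """True if a year-to-year change falls outside the (0.01, 0.99) range."""
    return change_degc < lower or change_degc > upper


print(f"range: {lower:.2f} to {upper:.2f} ºC")
print(outside_likelihood_range(6.7))    # 1881 to 1882
print(outside_likelihood_range(-6.4))   # 1962 to 1963
```

Under this assumption the range is roughly ±5.5ºC, so both quoted changes fall outside it.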

Analysis by **cusums and 10/30 yr trend lines** suggest the **30 year trends** in the January data since 1850 are:-

Notice that the temperature trends show rising, stable and falling patterns, even after 1950 when CO₂ levels started to climb. Overall in 167 years the 30 yr temperature trend has risen by 1.62ºC. The recent rise peaked in 2009.

Correlations to the teleconnections (**AO, NAO, EA, EA/WR, SCAN**), **CO₂** & **AMO** are found by clicking on the bold text highlighted. The least squares fits to each variable are as follows.

**Correlations and best subsets regression** to the data since 1950 suggests the following relationship.

Jan CET (ºC) = 4.54 + 0.236 x AO + 0.727 x NAO + 0.214 x EA

R-Sq = 47.7%, standard error = 1.25ºC.

The January data is influenced by teleconnections, including both the Arctic and North Atlantic oscillations and the East Atlantic pattern. Including the other variables in the empirical model results in a worse fit and more error, so these are discounted. About 48% of the variance in the data is explained by this model – the rest is attributed to stochastic behaviour.
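To show how such a fitted equation is used, the sketch below evaluates the January model for given index values. The coefficients are those quoted above; the function name is illustrative.

```python
JAN_INTERCEPT = 4.54
JAN_COEFFS = {"AO": 0.236, "NAO": 0.727, "EA": 0.214}  # ºC per unit of each index


def predict_jan_cet(ao, nao, ea):
    """Evaluate the fitted January model for one set of index values."""
    indices = {"AO": ao, "NAO": nao, "EA": ea}
    return JAN_INTERCEPT + sum(JAN_COEFFS[name] * indices[name] for name in JAN_COEFFS)


print(predict_jan_cet(0.0, 0.0, 0.0))     # all indices neutral
print(predict_jan_cet(1.0, 1.0, 1.0))     # all indices at +1
print(predict_jan_cet(-1.0, -1.0, -1.0))  # all indices at -1
```

With all three indices at +1 the model gives about 5.7ºC and at -1 about 3.4ºC, a swing of roughly 2.4ºC that is dominated by the NAO term. Individual years will still scatter around these values by about the quoted standard error of 1.25ºC.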

The distribution of the data shows a **mean (standard deviation)** of 4.15 (1.92)ºC. The plot shows the likelihood range of temperatures for this month.

The difference in temperature year to year shows a **mean (standard deviation)** of 0.00 (2.75)ºC. This sets the probability range for temperature changes year to year for this month. For example, 1946 to 1947 had a temperature decrease of 7.8ºC, followed by a temperature increase of 6.6ºC from 1947 to 1948; both of these lie outside the 1-in-100 year likelihood range (0.01, 0.99).

Analysis by **cusums and 10/30 yr trend lines** suggest the **30 year trends** in the February data since 1850 are:-

Again the temperature trends show rising, stable and falling patterns. Notice that the February temperature fell from 1929 until 1988, even after 1950 when CO₂ levels started to climb rapidly. Overall, in 167 years, the 30 yr temperature trend has risen by 0.52ºC. The recent peak in the ten year data was reached in 2003, and shows signs of steady decline back to the 30 year trend line. This may signify a pause is developing.

Correlations to the teleconnections (**AO, NAO, EA, EA/WR, SCAN**), **CO₂** & **AMO** are found by clicking on the bold text highlighted. The least squares fits to each variable are as follows.

**Correlations and best subsets regression** to the data since 1950 suggests the following relationship.

Feb CET (ºC) = 4.63 + 0.377 x AO + 0.351 x NAO + 0.577 x EA + 0.289 x EA/WR – 0.390 x SCA + 1.27 x AMO

R-Sq = 56.0%, standard error = 1.28ºC.

The February data is influenced by all the teleconnections and the AMO. Including CO₂ in the empirical model results in a worse fit and more error, so this is discounted. 56% of the variance in the data is explained by this model – the rest is attributed to stochastic behaviour. See Appendix 3 for further discussion of the February record, including 2018.
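The idea behind best subsets regression, fitting every combination of candidate variables and keeping the one that fits best after penalising extra terms, can be sketched as below. This is not the Minitab procedure used in the study: the selection criterion here is adjusted R², which is an assumption, and the data are a small synthetic set built so that one predictor matters and the other does not. All names are illustrative.

```python
from itertools import combinations


def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ols(columns, y):
    """Least squares fit with an intercept; returns (coefficients, adjusted R^2)."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * y[i] for i, r in enumerate(rows)) for a in range(k)]
    beta = solve_linear(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_mean = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return beta, 1.0 - (ss_res / (n - k)) / (ss_tot / (n - 1))


def best_subset(predictors, y):
    """Fit every non-empty subset of predictors; keep the highest adjusted R^2."""
    best = None
    for size in range(1, len(predictors) + 1):
        for subset in combinations(predictors, size):
            beta, adj_r2 = fit_ols([predictors[name] for name in subset], y)
            if best is None or adj_r2 > best[2]:
                best = (subset, beta, adj_r2)
    return best


# Synthetic demonstration: y depends on x1 only; x2 is unrelated by construction.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [1.0, -1.0, 1.0, -1.0, -1.0, 1.0, -1.0, 1.0]
noise = [0.1, -0.1, -0.1, 0.1, 0.1, -0.1, -0.1, 0.1]
demo_y = [2.0 + 3.0 * a + e for a, e in zip(x1, noise)]
chosen, coeffs, score = best_subset({"x1": x1, "x2": x2}, demo_y)
print(chosen, coeffs, score)
```

Adding x2 cannot reduce the residual error in this example, so the extra term only costs a degree of freedom and the adjusted R² falls. That is the sense in which including a variable can give a "worse fit and more error".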

The distribution of the data shows a **mean (standard deviation)** of 5.70 (1.49)ºC. The plot shows the likelihood range of temperatures for this month.

The difference in temperature year to year shows a **mean (standard deviation)** of 0.02 (2.10)ºC. This sets the probability range for temperature changes year to year for this month.

Analysis by **cusums and 10/30 yr trend lines** suggest the **30 year trends** in the March data since 1850 are:-

Temperature trends show rising, stable and falling patterns. Notice that the March temperature appears to have plateaued since 2006. Overall, in 167 years, the 30 yr temperature trend has risen by 0.96ºC.

Least squares fits to the teleconnections (**AO, NAO, EA, EA/WR, SCAN**), **CO₂** & **AMO** are found by clicking on the bold text highlighted. The least squares fits to each variable are:-

**Correlations and best subsets regression** to the data since 1950 suggests the following relationship.

Mar CET (ºC) = 6.47 + 0.717 x NAO + 0.521 x EA – 0.482 x SCA

R-Sq = 49.2%, standard error = 1.11ºC.

The March data is influenced by the North Atlantic Oscillation, and the East Atlantic and Scandinavian patterns. Including other variables in the empirical model results in a worse fit and more error, so these are discounted. 49% of the variance in the data is explained by this model – the rest is attributed to stochastic behaviour.

The distribution of the data shows a **mean (standard deviation)** of 8.16 (1.14)ºC. The plot shows the likelihood range of temperatures for this month.

The difference in temperature year to year shows a **mean (standard deviation)** of 0.00 (1.49)ºC. This sets the probability range for temperature changes year to year for this month.

Analysis by **cusums and 10/30 yr trend lines** suggest the **30 year trends** in the April data since 1850 are:-

Temperature trends show rising, stable and falling patterns. Notice that the April temperature appears to have plateaued since 2012. Overall, in 167 years, the 30 yr temperature trend has risen by 0.65ºC.

Least squares fits to the teleconnections (**AO, NAO, EA, EA/WR, SCAN**), **CO₂** & **AMO** are found by clicking on the bold text highlighted. The least squares fits to each variable are:-

**Correlations and best subsets regression** to the data since 1950 suggests the following relationship.

April CET (ºC) = 6.27 + 0.438 x AO + 0.549 x EA/WR + 0.00595 x CO₂ + 1.26 x AMO

R-Sq = 46.3%, standard error = 0.84ºC.

Notice this is the first month where an effect from CO₂ can be included in the best subsets regression. It implies that if a linear (rather than logarithmic) effect of CO₂ on warming exists, then increasing the annual level by 100 ppm will increase the April CET by 0.60ºC. Including other variables in the empirical model results in a worse fit and more error, so these are discounted. 46% of the variance in the data is explained by this model – the rest is attributed to stochastic behaviour.

The distribution of the data shows a **mean (standard deviation)** of 11.25 (1.17)ºC. The plot shows the likelihood range of temperatures for this month.

The difference in temperature year to year shows a **mean (standard deviation)** of 0.01 (1.49)ºC. This sets the probability range for temperature changes year to year for this month.

Analysis by **cusums and 10/30 yr trend lines** suggest the **30 year trends** in the May data since 1850 are:-

Temperature trends show rising, stable and falling patterns. Notice that the May temperature appears to have plateaued since 2009. Overall, in 167 years, the 30 yr temperature trend has risen slightly, by 0.13ºC.

Least squares fits to the teleconnections (**AO, NAO, EA, EA/WR, SCAN**), **CO₂** & **AMO** are found by clicking on the bold text highlighted. The least squares fits to each variable are:-

**Correlations and best subsets regression** to the data since 1950 suggests the following relationship.

May CET (ºC) = 8.38 + 0.305 x NAO + 0.355 x EA/WR + 0.00913 x CO₂ + 1.89 x AMO

R-Sq = 30.1%, standard error = 0.91ºC.

CO₂ can be included in the best subsets regression. It implies that if a linear (rather than logarithmic) effect of CO₂ on warming exists, then increasing the annual level by 100 ppm will increase the May CET by 0.91ºC. Including other variables in the empirical model results in a worse fit and more error, so these are discounted. 30% of the variance in the data is explained by this model – the rest is attributed to stochastic behaviour.

The distribution of the data shows a **mean (standard deviation)** of 14.28 (1.01)ºC. The plot shows the likelihood range of temperatures for this month.

The difference in temperature year to year shows a **mean (standard deviation)** of 0.00 (1.43)ºC. This sets the probability range for temperature changes year to year for this month.

Analysis by **cusums and 10/30 yr trend lines** suggest the **30 year trends** in the June data since 1850 are:-

Temperature trends show rising, stable and falling patterns. Notice that the June temperature appears to have plateaued since 2008. Overall, in 167 years, the 30 yr temperature trend has fallen by 0.09ºC.

Least squares fits to the teleconnections (**AO, NAO, EA, EA/WR, SCAN**), **CO₂** & **AMO** are found by clicking on the bold text highlighted. The least squares fits to each variable are:-

**Correlations and best subsets regression** to the data since 1950 suggests the following relationship.

Jun CET (ºC) = 11.5 + 0.762 x AO + 0.517 x EA/WR + 0.00790 x CO₂ + 1.74 x AMO

R-Sq = 26.8%, standard error = 0.92ºC.

CO₂ can be included in the best subsets regression. It implies that if a linear (rather than logarithmic) effect of CO₂ on warming exists, then increasing the annual level by 100 ppm will increase the June CET by 0.79ºC. Including other variables in the empirical model results in a worse fit and more error, so these are discounted. 27% of the variance in the data is explained by this model – the rest is attributed to stochastic behaviour.

The distribution of the data shows a **mean (standard deviation)** of 16.09 (1.22)ºC. The plot shows the likelihood range of temperatures for this month.

The difference in temperature year to year shows a **mean (standard deviation)** of 0.00 (1.74)ºC. This sets the likelihood range for temperature changes year to year for this month.

Analysis by **cusums and 10/30 yr trend lines** suggest the **30 year trends** in the July data since 1850 are:-

Temperature trends show rising, stable and falling patterns. Notice that the July temperature appears to have been slowly rising since the mid-1970s. Overall, in 167 years, the 30 yr temperature trend has risen by 0.71ºC.

Least squares fits to the teleconnections (**AO, NAO, EA, EA/WR, SCAN**), **CO₂** & **AMO** are found by clicking on the bold text highlighted. The least squares fits to each variable are:-

**Correlations and best subsets regression** to the data since 1950 suggests the following relationship.

Jul CET (ºC) = 9.32 + 0.953 x AO + 0.288 x NAO – 0.371 x EA + 0.434 x EA/WR + 0.139 x SCA + 0.0199 x CO₂ + 1.62 x AMO

R-Sq = 41.8%, standard error = 0.97ºC.

In this instance, all terms are included for this month, including CO₂. Assuming a linear (rather than logarithmic) effect of CO₂ on warming exists, then increasing the annual level by 100 ppm will increase the July CET by 1.99ºC. 42% of the variance in the data is explained by this model – the rest is attributed to stochastic behaviour.

The distribution of the data shows a **mean (standard deviation)** of 15.73 (1.15)ºC. The plot shows the likelihood range of temperatures for this month.

The difference in temperature year to year shows a **mean (standard deviation)** of 0.00 (1.55)ºC. This sets the likelihood range for temperature changes year to year for this month.

Analysis by **cusums and 10/30 yr trend lines** suggest the **30 year trends** in the August data since 1850 are:-

Temperature trends show rising, stable and falling patterns. Notice that the August temperature appears to have stabilised since 2005, with the 10 year red trend approaching the 30 year blue trend. Overall, in 167 years, the 30 yr temperature trend has risen by 0.78ºC.

Least squares fits to the teleconnections (**AO, NAO, EA, EA/WR, SCAN**), **CO₂** & **AMO** are found by clicking on the bold text highlighted. The least squares fits to each variable are:-

**Correlations and best subsets regression** to the data since 1950 suggests the following relationship.

August CET (ºC) = 10.6 + 0.732 x AO + 0.226 x NAO – 0.404 x EA + 0.146 x EA/WR + 0.253 x SCA + 0.0152 x CO₂ + 2.11 x AMO

R-Sq = 50.7%, standard error = 0.89ºC.

In this instance, all terms are included for this month, including CO₂. Assuming a linear (rather than logarithmic) effect of CO₂ on warming exists, then increasing the annual level by 100 ppm will increase the August CET by 1.52ºC. 51% of the variance in the data is explained by this model – the rest is attributed to stochastic behaviour.

The distribution of the data shows a **mean (standard deviation)** of 13.50 (1.10)ºC. The plot shows the likelihood range of temperatures for this month.

The difference in temperature year to year shows a **mean (standard deviation)** of 0.00 (1.55)ºC. This sets the likelihood range for temperature changes year to year for this month.

Analysis by **cusums and 10/30 yr trend lines** suggest the **30 year trends** in the September data since 1850 are:-

Temperature trends show rising, stable and falling patterns. Notice that the September temperature has been rising since 1997, although the rate changed since 2007. The 10 year red trend is slowly approaching the 30 year blue trend. Overall, in 167 years, the 30 yr temperature trend has risen by 0.89ºC.

Least squares fits to the teleconnections (**AO, NAO, EA, EA/WR, SCAN**), **CO₂** & **AMO** are found by clicking on the bold text highlighted. The least squares fits to each variable are:-

**Correlations and best subsets regression** to the data since 1950 suggests the following relationship.

September CET (ºC) = 11.6 + 0.245 x NAO + 0.441 x EA + 0.260 x EA/WR + 0.00606 x CO₂ + 0.774 x AMO

R-Sq = 44.8%, standard error = 0.82ºC.

In this instance, all terms are included for this month, except the Arctic Oscillation and the Scandinavian pattern. Assuming a linear (rather than logarithmic) effect of CO₂ on warming exists, then increasing the annual level by 100 ppm will increase the September CET by 0.61ºC. 45% of the variance in the data is explained by this model – the rest is attributed to stochastic behaviour.
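The per-100 ppm figures quoted for April to September are simply each month's CO₂ coefficient multiplied by 100. A short sketch of that arithmetic, using the coefficients from the regression equations above (the dictionary and function names are illustrative):

```python
# CO2 coefficients (ºC per ppm) from the monthly regression equations above.
CO2_COEFFICIENTS = {
    "April": 0.00595,
    "May": 0.00913,
    "June": 0.00790,
    "July": 0.0199,
    "August": 0.0152,
    "September": 0.00606,
}


def warming_per_100ppm(coefficient_per_ppm):
    """Temperature change implied by a 100 ppm rise, assuming a linear response."""
    return coefficient_per_ppm * 100.0


for month, coefficient in CO2_COEFFICIENTS.items():
    print(f"{month}: {warming_per_100ppm(coefficient):.2f} ºC per 100 ppm")
```

The linear form matters: under a logarithmic response the same 100 ppm step would give a smaller change at higher background concentrations, which is why the text flags the assumption each time.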

The distribution of the data shows a **mean (standard deviation)** of 10.05 (1.36)ºC. The plot shows the likelihood range of temperatures for this month.

The difference in temperature year to year shows a **mean (standard deviation)** of 0.03 (1.64)ºC. This sets the likelihood range for temperature changes year to year for this month.

Analysis by **cusums and 10/30 yr trend lines** suggest the **30 year trends** in the October data since 1850 are:-

Temperature trends show rising, stable and falling patterns. Notice that the October temperature rose from 1920 until 1992, and started rising again in 2001. It currently shows no sign of slowing down. Overall, in 167 years, the 30 yr temperature trend has risen by 1.28ºC.

Least squares fits to the teleconnections (**AO, NAO, EA, EA/WR, SCAN**), **CO₂** & **AMO** are found by clicking on the bold text highlighted. The least squares fits to each variable are:-

**Correlations and best subsets regression** to the data since 1950 suggests the following relationship.

Oct CET (ºC) = 7.49 + 0.269 x NAO + 0.605 x EA + 0.561 x EA/WR + 0.00900 x CO₂ + 1.31 x AMO

R-Sq = 51.5%, standard error = 0.94ºC.

In this instance, 3 of the teleconnection terms are included for this month, as well as CO₂ and the AMO. Assuming a linear (rather than logarithmic) effect of CO₂ on warming exists, then increasing the annual level by 100 ppm will increase the October CET by 0.90ºC. 52% of the variance in the data is explained by this model – the rest is attributed to stochastic behaviour.

The distribution of the data shows a **mean (standard deviation)** of 6.38 (1.43)ºC. The plot shows the likelihood range of temperatures for this month.

The difference in temperature year to year shows a **mean (standard deviation)** of 0.00 (1.97)ºC. This sets the likelihood range for temperature changes year to year for this month.

Analysis by **cusums and 10/30 yr trend lines** suggest the **30 year trends** in the November data since 1850 are:-

Temperature trends show rising, stable and falling patterns. Notice that the November temperature has risen from 1994 until 2017. It currently shows no sign of slowing down. Overall, in 167 years, the 30 yr temperature trend has risen by 1.07ºC.

Least squares fits to the teleconnections (**AO, NAO, EA, EA/WR, SCAN**), **CO₂** & **AMO** are found by clicking on the bold text highlighted. The least squares fits to each variable are:-

**Correlations and best subsets regression** to the data since 1950 suggests the following relationship.

November CET (ºC) = 7.17 + 0.424 x AO + 0.414 x EA + 0.536 x EA/WR + 1.25 x AMO

R-Sq = 55.1%, standard error = 0.88ºC.

The November data is influenced by teleconnections, and the Atlantic Multidecadal Oscillation. Including other variables in the empirical model results in a worse fit and more error, so these are discounted. About 55% of the variance in the data is explained by this model – the rest is attributed to stochastic behaviour.

The distribution of the data shows a **mean (standard deviation)** of 4.47 (1.80)ºC. The plot shows the likelihood range of temperatures for this month.

The difference in temperature year to year shows a **mean (standard deviation)** of 0.00 (2.48)ºC. This sets the likelihood range for temperature changes year to year for this month.

Analysis by **cusums and 10/30 yr trend lines** suggest the **30 year trends** in the December data since 1850 are:-

Temperature trends show rising, stable and falling patterns. Notice that the December temperature has recently shown temperatures at the extremes of both sides of its expected range, resulting in a stable 30 yr temperature trend since 2008, despite the rapid cooling and warming trends indicated. Overall, in 167 years, the 30 yr temperature trend has risen by 0.90ºC.

Least squares fits to the teleconnections (**AO, NAO, EA, EA/WR, SCAN**), **CO₂** & **AMO** are found by clicking on the bold text highlighted. The least squares fits to each variable are:-

**Correlations and best subsets regression** to the data since 1950 suggests the following relationship.

December CET (ºC) = 4.89 + 0.967 x NAO + 0.194 x EA + 0.756 x EA/WR + 0.192 x SCA

R-Sq = 62.0%, standard error = 1.11ºC.

The December data is influenced by teleconnections, principally the North Atlantic Oscillation. Including other variables in the empirical model results in a worse fit and more error, so these are discounted. About 62% of the variance in the data is explained by this model – the rest is attributed to stochastic behaviour.

Notice that, since this uses the monthly data since 1950 shown above, the behaviour appears oscillatory. The red line shows the annual (12 month) trend, the blue line the decadal (120 month) trend.

The distribution of the data shows a **mean (standard deviation)** of 9.76 (4.64)ºC. The plot shows the likelihood range of temperatures.

The difference in temperature month to month across the years shows a **mean (standard deviation)** of 0.00 (2.82)ºC. This sets the likelihood range for temperature changes month to month.

Analysis by **cusums and 10/30 yr trend lines** suggest the **30 year trends** in the monthly data since January 1950 are:-

Temperature trends show stable and rising patterns. Overall, in 68 years, the 10 yr temperature trend has risen by 0.51ºC.

There is a strong correlation between the monthly temperature and the previous month’s insolation, as determined from the first graph shown **above**.

As discussed earlier, this is the principal driver of temperature through the year, and demonstrates the monthly lag between insolation and temperature. We use the label INSOL m-1 for this variable.

Least squares fits to the teleconnections (**AO, NAO, EA, EA/WR, SCAN**), **CO₂**, **AMO** and **INSOL m-1** are found by clicking on the bold text highlighted. The least squares fits to each variable are:-

**Correlations and best subsets regression** to the data since 1950 suggests the following relationship.

Monthly CET (ºC) = 1.14 + 0.0652 x INSOL m-1 + 0.354 x AO + 0.292 x NAO + 0.182 x EA + 0.321 x EA/WR + 0.00601 x CO₂ + 1.23 x AMO

R-Sq = 93.2%, standard error = 1.22ºC.

The monthly data is influenced principally by the **lagged insolation**, and also by teleconnections, principally the Arctic Oscillation. It also includes other variables, such as CO₂ and the AMO. Assuming a linear (rather than logarithmic) effect of CO₂ on warming exists, then increasing the annual level by 100 ppm will increase the monthly CET by 0.60ºC. 93% of the variance in the data is explained by this model – the rest is attributed to stochastic behaviour.

The following graph shows the **current prediction for the monthly CET values**, assuming a CO₂ background level of 400 ppm and the **lagged insolation** taken at mid month from the **first graph above**.

The dot assumes a zero level for variables such as AO, NAO, EA, EA/WR & AMO, with the bars above and below the central dot at a level of ±1 for these variables. The blue cross shows the value of the current 30 year trend for each month, as determined above by EWMA analysis. The dotted lines show the **±95% daily data values from the Met Office**. Given the assumptions made, the model appears reasonable. In practice, at any time the variables (AO, NAO, EA, EA/WR & AMO) for each month will have their own unique behaviour seen in their time series, and lie on the distribution range shown below (see Appendix 1 for more details on each of these). Some are correlated to an extent (e.g. AO & NAO), others less so.
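The dot and bars described above can be reproduced from the monthly model. In the sketch below the coefficients are those of the monthly regression equation; the function name is illustrative, the insolation value of 100 is a hypothetical placeholder (the real values, and their units, come from the insolation graph), and the bars are assumed to mean all five indices set to +1 or all to -1 together.

```python
MONTHLY_INTERCEPT = 1.14
INSOL_COEFF = 0.0652   # ºC per unit of lagged insolation
CO2_COEFF = 0.00601    # ºC per ppm
INDEX_COEFFS = {"AO": 0.354, "NAO": 0.292, "EA": 0.182, "EA/WR": 0.321, "AMO": 1.23}


def predicted_range(lagged_insolation, co2_ppm=400.0):
    """Return (lower bar, central dot, upper bar) for one month.

    Central dot: all indices at zero. Bars: all indices at -1 or +1 together
    (an assumption about how the bars in the graph were drawn).
    """
    central = MONTHLY_INTERCEPT + INSOL_COEFF * lagged_insolation + CO2_COEFF * co2_ppm
    half_width = sum(abs(c) for c in INDEX_COEFFS.values())
    return central - half_width, central, central + half_width


low, central, high = predicted_range(lagged_insolation=100.0)  # hypothetical insolation
print(f"{low:.2f} / {central:.2f} / {high:.2f} ºC")
```

The half-width of the bars is the same for every month, about 2.4ºC, because the monthly model uses one coefficient per index across the whole year. About half of it comes from the AMO term.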

Note this model could account for volcanism effects if we know the reduction in lagged insolation each eruption causes and the length of time it applies for. Robock [2000] suggests a complex response for the Northern Hemisphere as a result of major volcanic eruption. The drop in solar insolation results in cooler summers. There are also rises in stratospheric temperatures, and changes to the polar vortex giving warmer winters.

This is expected from the annual model above. The level of solar insolation changes by approximately a factor of 8 winter to summer, so reducing the monthly insolation by 5% across the year as a result of a volcanic eruption would have a more significant effect on the summer month temperatures, since this term makes up a larger component of the monthly temperature compared to winter. Also as the monthly correlations to AO & NAO show, their contribution to temperature is less in summer compared to winter due to their reduced range for each variable. Any shift to more positive values for these variables would raise the winter temperatures proportionately compared to summer.
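The summer versus winter asymmetry follows from the insolation term of the monthly model alone. A rough sketch, in which the winter insolation value is a hypothetical placeholder chosen only so that summer is 8 times larger, as the text describes:

```python
INSOL_COEFF = 0.0652  # ºC per unit of lagged insolation, from the monthly model


def cooling_from_dimming(lagged_insolation, fractional_reduction=0.05):
    """Temperature drop implied by reducing the lagged insolation by a given fraction."""
    return INSOL_COEFF * lagged_insolation * fractional_reduction


winter_insolation = 25.0                      # hypothetical value
summer_insolation = 8.0 * winter_insolation   # factor of 8 quoted in the text

winter_cooling = cooling_from_dimming(winter_insolation)
summer_cooling = cooling_from_dimming(summer_insolation)
print(f"winter: {winter_cooling:.2f} ºC, summer: {summer_cooling:.2f} ºC")
```

Whatever the actual insolation values, the same fractional reduction gives a summer cooling 8 times the winter cooling in this linear model, since the term scales directly with insolation.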

Discussion

The models for each monthly CET apply strictly to the Central England region. The annual monthly model suggests that insolation, lagged by a month, is the principal cause of the temperature cycle we see, as expected. The monthly regression models necessarily neglect this, as the insolation is fixed for that month and they deal with only one month’s data at a time; they ascribe behaviour to the variables described above. The coefficients from the monthly analysis, selected by best subsets, are tabled below, together with the temperature change calculated from the 30 year EWMA using the difference in the values from 1850 and 2017. For each month the coefficients in the table show either positive correlation (+), negative correlation (-) or unmarked (denoting not included) for that month’s CET model. The ‘strength’ of these correlations is given in the individual regression models for each month.

Generally a positive correlation implies that an increase in that variable would cause an increase in the monthly CET, and likewise a decrease in that variable would cause an decrease in the monthly CET. This is reversed if the correlation is negative.

There appear to be patterns here:-

The winter months (November through to March) all have strong positive correlations to the following teleconnections:- AO, NAO or both, the EA and EA/WR patterns, also some to the SCAN (+ and – correlations). Some also show a positive correlation to the AMO. None had CO_{2} selected by best subsets regression as a significant variable (see Appendix 2 & 3 for more comments on this). The temperature gain since 1850 is of the order of 1ºC. They have all shown a recent rapid increase in CET which has stabilised since the mid 2000’s, with the exception of November, which continues to climb. Please refer to the EWMA time series for each month above for more details.

The summer months, May and June, have weaker positive correlations to either the AO or NAO, the EA/WR, and the AMO. Both show correlation to CO_{2}, selected by best subsets regression as a variable. The temperature gain for these months since 1850 is around 0ºC. Both have shown a recent rapid increase in CET which has stabilised since the late 2000’s. Again, please refer to the analysis for each month above for full details.

April and July to October also show weaker positive correlations to either the AO or NAO or both, and to the EA. Some months show positive correlation to the EA/WR, and some to the AMO. CO_{2} was selected by best subsets regression as a variable. The temperature gain for these months since 1850 is around 0.8ºC, with the exception of October, which is 1.28ºC. Most have shown a recent rapid increase in CET which has stabilised since the late 2000’s. The exception to this is October, which continues to climb in temperature, and has done so with pauses since 1897, but with no period of falls. To this extent, October is exceptional compared to all other months in the year, which have all shown some period of cooling since 1900. The strongest correlations for this month are to EA and EA/WR, so perhaps changes in the trends for these two variables may explain this (see Appendix 1). Please refer to the EWMA time series for each month above for full details.

There is a positive correlation to CO_{2} seen for most months by best subsets regression, with the exception of those in winter. Assuming there is a linear relationship between CO_{2} levels and monthly temperature, a 100 ppm rise in the annualised CO_{2} level would produce a predicted temperature rise in the range of 0.60 to 1.99ºC. Refer to each month above for the exact details. The annual monthly model presented suggests 0.60ºC per 100 ppm rise is reasonable.

CO_{2} is currently climbing at the rate of 2 to 3 ppm per year, so at 0.60ºC per 100 ppm the annual temperature increase would be in the range of 0.012ºC to 0.018ºC. This amount is ‘lost’ or barely perceptible in the year to year temperature change shown for each month, which is normally distributed, with a mean value about 0ºC and a standard deviation in the order of 1.5 to 2.5ºC (summer to winter). We shouldn’t be surprised if a month’s temperature changes from one year to the next by ±3ºC to ±5ºC (summer to winter) at the 95% confidence level (roughly 2 x the standard deviation). Annual temperature changes of this order neither prove a CO_{2} effect if there is an increase, nor disprove a CO_{2} effect if there is a decrease; the CO_{2} component is difficult to spot over this short time period, as the apocryphal **boiling frog** found out.
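The arithmetic behind this comparison can be sketched in a few lines of Python. The 0.60ºC per 100 ppm sensitivity, the 2 to 3 ppm per year growth rate and the ±3ºC to ±5ºC year to year spread are the figures quoted in the text; the function name is hypothetical and the linear relationship is the assumption stated above.

```python
def annual_co2_warming(ppm_per_year, degc_per_100ppm=0.60):
    """Temperature rise per year (ºC), assuming a linear CO2 response."""
    return degc_per_100ppm / 100.0 * ppm_per_year

# Roughly 2 x the standard deviation of the year to year change, from the text
TWO_SIGMA = {"summer": 3.0, "winter": 5.0}

for rate in (2, 3):  # ppm per year
    rise = annual_co2_warming(rate)
    for season, spread in TWO_SIGMA.items():
        # How many times larger the year to year spread is than the CO2 increment
        print(f"{rate} ppm/yr: {rise:.3f} ºC/yr, {season} spread is "
              f"{spread / rise:.0f} times larger")
```

Whatever combination is chosen, the yearly increment is a few hundredths of a degree at most, against a spread of several degrees.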

Given that the time series modelled from 1950 is 68 years long, the models can be used as a basis for predicting behaviour for, say, +10% of the length of the current time series, i.e. another 7 years, with some certainty, unless events around causes (5 & 8) occur. This is unlikely, so some 7 years’ time will take us to about 2025. The study will be repeated then, and the models re-established by the same methods, to see what changes have occurred to both the monthly CET time series and their underlying causes. It will be interesting to see if CO_{2} will then appear as a variable selected by best subsets regression for the winter months, and if the recent slowdowns in temperature change still persist. As the next seven years pass, time will tell.

Appendix 1 – Data used in models including CET, teleconnections, CO_{2} and Atlantic Multidecadal Oscillation (AMO) since 1950.

This is described fully in the references listed below.

The monthly CET record is available **here**.

The monthly time series in this study uses the full data back to 1659 to establish the trends, by using an initial 10 pt average at the beginning, and letting the EWMA follow the data thereafter (see Appendix 2 below for details). By 1850, the time series is fully established by this process, and only the data from 1850 thereafter are shown.

*The term “teleconnection pattern” refers to a recurring and persistent, large-scale pattern of pressure and circulation anomalies that spans vast geographical areas. Teleconnection patterns are also referred to as preferred modes of low-frequency (or long time scale) variability. Although these patterns typically last for several weeks to several months, they can sometimes be prominent for several consecutive years, thus reflecting an important part of both the interannual and interdecadal variability of the atmospheric circulation. Many of the teleconnection patterns are also planetary-scale in nature, and span entire ocean basins and continents. For example, some patterns span the entire North Pacific basin, while others extend from eastern North America to central Europe. Still others cover nearly all of Eurasia.*

*Teleconnection patterns reflect large-scale changes in the atmospheric wave and jet stream patterns, and influence temperature, rainfall, storm tracks, and jet stream location/ intensity over vast areas. Thus, they are often the culprit responsible for abnormal weather patterns occurring simultaneously over seemingly vast distances. *

Based on this, the teleconnections below have been used, including **both the Arctic and North Atlantic Oscillations**.

*‘The Arctic Oscillation (AO) is a large scale mode of climate variability, also referred to as the Northern Hemisphere annular mode.*

*These maps show air pressure patterns on November 7, 2010 (left), when the Arctic Oscillation was strongly positive, and on December 18 (right), when it was strongly negative. These phases are the result of the whole atmosphere periodically shifting its weight back and forth between the Arctic and the mid-latitudes of the Atlantic and Pacific Ocean, like water sloshing back and forth in a bowl. (maps by Ned Gardiner and Hunter Allen, based on Global Forecast System data from the National Centers for Environmental Prediction.) *

*The AO is a climate pattern characterized by winds circulating counterclockwise around the Arctic at around 55°N latitude. When the AO is in its positive phase, a ring of strong winds circulating around the North Pole acts to confine colder air across polar regions. This belt of winds becomes weaker and more distorted in the negative phase of the AO, which allows an easier southward penetration of colder, arctic airmasses and increased storminess into the mid-latitudes.’*

The historical trends from the NOAA for the AO are shown below.

Here are the 3 month (red) and 2 year trends by **EWMA**.

The **cusum** highlights the sharp rise in the AO around August 1988 for a two year period and the sharp fall in November 2009.

In common with most teleconnections, the behaviour of the Arctic Oscillation is oscillatory with time, with a degree of variation month to month. The trend shows the positive phase as being red, and the negative phase being blue. The AO is positively correlated to the monthly CET models, so a positive phase will act to increase the temperature, and a negative phase reduce the temperature. It is one of the stronger variables in the winter months.

The behaviour was more negative in the 1950’s through to early 1960’s, and more positive from the mid 1980’s to mid 1990’s. Notice the strong negative phase in 2010, coincident with the coldest December in the UK for 100 years or so. Note the comments below on the worked example of February 2018.

Since 1950, each month’s AO data in the time series is normally distributed. For example here is the data for **June** and **December**. The mean has dropped to a lower value and the standard deviation has increased in December relative to June, but both are still normally distributed.

*‘Many meteorologists consider the North Atlantic Oscillation to be a “regional subset” of the Arctic Oscillation, which operates across the whole Northern Hemisphere.’*

*‘The North Atlantic Oscillation (NAO) consists of a north-south dipole of pressures, with one centre located over Greenland and the other centre of opposite sign spanning the central latitudes of the North Atlantic between 35°N and 40°N (Azores).*

*The positive phase of the NAO reflects below-normal heights and pressure across the high latitudes of the North Atlantic and above-normal heights and pressure over the central North Atlantic, the eastern United States and western Europe. The negative phase reflects an opposite pattern of height and pressure anomalies over these regions. Both phases of the NAO are associated with changes in the intensity and location of the North Atlantic jet stream and storm track, and in large-scale modulations of the normal patterns resulting in changes in temperature and precipitation patterns often extending from eastern North America to western and central Europe.’*

Again oscillatory behaviour happens, with red being positive and blue being negative. See Moore [2013] for example for a deeper discussion of this behaviour.

Here are the 3 month (red) and 2 year trends by **EWMA**.

The **cusum** highlights the positions of the changes.

The NAO is positively **correlated** to the monthly CET models, so a positive phase will act to increase the temperature, and a negative phase reduce the temperature. It is one of the stronger variables in the winter months.

Like the AO, broadly the behaviour for the NAO was more negative in the 1950’s and more positive from the mid 1980’s to mid 1990’s. Notice the strong negative phase in 2010. Also see the comments on the worked example of February 2018 below.

Since 1950, each month’s NAO data in the time series is normally distributed. For example here is the data for **June** and **December**. The mean has dropped to a lower value but the standard deviation has stayed the same in December relative to June. Both are still normally distributed.

*‘The East Atlantic (EA) pattern is the second prominent mode of low-frequency variability over the North Atlantic, and appears as a leading mode in all months. The EA pattern is structurally similar to the NAO, and consists of a north-south dipole of anomaly centers spanning the North Atlantic from east to west. The anomaly centers of the EA pattern are displaced southeastward to the approximate nodal lines of the NAO pattern. For this reason, the EA pattern is often interpreted as a southward shifted NAO pattern.*

*The positive phase of the EA pattern is associated with above-average surface temperatures in Europe in all months, and with below-average temperatures over the southern U.S. during January-May and in the north-central U.S. during July-October. It is also associated with above-average precipitation over northern Europe and Scandinavia, and with below-average precipitation across southern Europe.’*

*The EA pattern exhibits very strong multi-decadal variability in the 1950-2004 record, with the negative phase prevailing during much of 1950-1976, and the positive phase occurring during much of 1977-2004. The positive phase of the EA pattern was particularly strong and persistent during 1997-2004.*

Some months show CET temperatures that are positively correlated to the EA pattern, some negatively. A switch from a more balanced to a positive phase for this teleconnection has happened since 2013.

Here are the 3 month (red) and 2 year trends by **EWMA**.

The **cusum** highlights the positions of the changes.

Since 1950, each month’s EA data in the time series is normally distributed. For example here is the data for **June** and **December**. The mean has dropped to a lower value and the standard deviation has increased in December relative to June. Both are still normally distributed.

There is a less strong **correlation** to the AO, which is positive.

**East Atlantic / West Russia pattern**

*‘The East Atlantic/ West Russia (EA/WR) pattern is one of three prominent teleconnection patterns that affects Eurasia throughout year. The East Atlantic/ West Russia pattern consists of four main anomaly centers. The positive phase is associated with positive height anomalies located over Europe and northern China, and negative height anomalies located over the central North Atlantic and north of the Caspian Sea.’*

Here are the 3 month (red) and 2 year trends by **EWMA**.

The **cusum** highlights the positions of the changes.

There is a less strong **correlation** to the AO, which is positive.

The EA/WR term is positively correlated to the CET monthly regressions, so negative behaviour will act to suppress the CET. The behaviour has been more negative in recent years.

Since 1950, each month’s EA/WR data in the time series is normally distributed. For example here is the data for **June** and **December**. The mean has dropped to a lower value and the standard deviation has decreased in December relative to June. Both are still normally distributed.

*‘The Scandinavia pattern (SCA) consists of a primary circulation center over Scandinavia, with weaker centers of opposite sign over western Europe and eastern Russia/ western Mongolia.*

*The positive phase of the Scandinavia pattern is associated with below-average temperatures across central Russia and also over western Europe. It is also associated with above-average precipitation across central and southern Europe, and below-average precipitation across Scandinavia.’*

Here are the 3 month (red) and 2 year trends by **EWMA**.

The **cusum** highlights the positions of the changes.

There is a less strong **correlation** to the AO, which is negative.

Depending on the month, the SCA term is either positively or negatively correlated to the CET monthly regressions, so a negative phase will act to suppress the CET in months where the correlation is positive, and to raise it in months where the correlation is negative. The behaviour has been more negative in recent years.

Since 1950, each month’s SCA data in the time series is normally distributed. For example here is the data for **June** and **December**. The mean has increased to a higher value and the standard deviation has increased in December relative to June. Both are still normally distributed.

*For the data 1850-1953, ice core data were adjusted to account for the geographical distribution of CO _{2} as a function of time. The means of data from 1958 until 2017 at sites (Mauna Loa and South Pole) were adjusted for the geographical inhomogeniety.*

The time series used joins the two data sets together. From 1950 to 1957 ice core data were used, and from 1958 onwards data were collected directly in the atmosphere. The trend is steadily upwards, driven by use of technology based on combustion of fossil fuels.

There is an annual variation in the CO_{2} level due to assimilation by plants. Since this data isn’t available for the ice core data 1950 to 1958, only the annual level has been used in this study for each month. We are seeking to understand whether a trend exists with annual CO_{2} to the monthly CET, and what its strength is, so the annual CO_{2} is acceptable.

*The Atlantic Multi-decadal Oscillation (AMO) has been identified as a coherent mode of natural variability occurring in the North Atlantic Ocean with an estimated period of 60-80 years. It is based upon the average anomalies of sea surface temperatures (SST) in the North Atlantic basin, typically over 0-80N.*

The unsmoothed monthly data used in this study is available **here**.

Here are the 3 month (red) and 2 year trends by **EWMA**.

The **cusum** highlights the positions of the changes.

Since 1950, each month’s AMO data in the time series is normally distributed. For example here is the data for **June** and **December**. The mean is about the same and the standard deviation has decreased in December relative to June.

Appendix 2 – Statistical methods used, including SPC and multiple regression analysis for empirical models.

**Statistical process control (SPC)** is a methodology to determine whether a set of data are random in nature with a mean and standard deviation that are unchanging (stable process), or are subject to alterations in the underlying mean (drift) or error. Typically, detecting drift in the underlying mean is the priority, followed by understanding causally why it happened. There are a number of methods which can be employed, and a couple are used in this study. These are:-

Using cusum charting [6-9] from a dataset of n measurements, the mean (or target) value for the data, $\bar{x}$, is measured. The cusum is produced by calculating the following.

First we derive the difference from the first value to this mean

$S_1 = (x_1 - \bar{x})$

The difference from the second term to the mean is then added to the first difference

$S_2 = (x_2 - \bar{x}) + S_1$

and so on, up to the last datapoint xn. This cumulative sum of the differences when charted is revealing. If there is no change in underlying mean during a time period under consideration, then the cumulative sum of the error will move about zero. If there is a change in underlying mean, the cusum will drift away from zero. Caulcutt (1995) offers the following [9].

•* Zero slope on the chart indicates the current process mean is equal to the target or long term mean.*

*• A change in slope indicated a change in the process mean from the target or long term mean.*

*• A positive slope indicates the current process mean is higher than the target or long term mean.*

*• A negative slope indicates the current process mean is lower than the target or long term mean.*

• The value of the cusum can be used to determine whether the change in slope is significant, by using Goldsmith’s Test. [6-8].

In practice the shape of a cusum looks similar to a long term exponentially weighted moving average, as we shall see.
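The cusum calculation described above can be sketched in Python as follows. This is a minimal illustration rather than the code used for the study: the function name and the short series are hypothetical, and the series is constructed so that its mean shifts upwards halfway through.

```python
def cusum(values, target=None):
    """Running sum of deviations from the target (default: the sample mean)."""
    if target is None:
        target = sum(values) / len(values)
    total = 0.0
    sums = []
    for x in values:
        total += x - target   # S_k = (x_k - mean) + S_(k-1)
        sums.append(total)
    return sums

# Hypothetical data: mean of about 4 for five points, then about 5
series = [4.1, 3.8, 4.2, 3.9, 4.0, 5.1, 4.9, 5.2, 4.8, 5.0]
s = cusum(series)

# The slope changes sign where the process mean moves from below to above
# the overall mean, so the turning point marks the change.
turning_point = s.index(min(s))
print(turning_point)  # index of the last point before the upward shift
```

When the target is the sample mean, the cusum always returns to zero at the final point, so it is the slope and the turning points that carry the information, not the end value.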

**Exponentially weighted Moving Averages (EWMA)**

Originally due to Roberts (1959), and described in detail by Wetherill & Brown (1991) this method uses a pair of EWMA, with one shorter term moving average and one longer term moving average.

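For each EWMA, at each time t in the data sequence the following is calculated

$\mathrm{EWMA}_{t+1} = \alpha X_{t+1} + (1-\alpha)\,\mathrm{EWMA}_t$

where $X_{t+1}$ is the new observation and $\mathrm{EWMA}_t$ is the previous value of the moving average.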

α has a value set between 0 and 1. Typically if we wish for a moving average of N points, then α = 1 / (1+N). Thus for a ten point moving average, α = 1 / 11, and for a thirty point moving average, α = 1 / 31.

The EWMA’s are charted for each datapoint in a time series, and the pattern of the two is used to identify turning points in the data. If the underlying process mean is stable then the two EWMA’s will have similar values. If for example the process mean starts to drift up, so will both EWMA’s, but the shorter term one will rise above the longer term one, indicating when the process started to drift. If the process mean stabilises at a new value, the shorter term value will also stabilise and the longer term one will eventually meet it. Likewise, if the process mean starts to drift down, so will both EWMA’s, but the shorter term one will fall below the longer term one, indicating when the process started to drift. In practice, the points where the two EWMA’s cross each other are signals for a change, which can then be extrapolated back to the start point. This is often done in conjunction with the cusum method, and the time series analysis can be performed on the same plot.

The EWMA method is superior to simple moving averages for a number of reasons.

• If an initial short term average or target is used to initialise the chart (X1), then all the data can be used subsequently, rather than having to wait until enough points are available for the moving average.

• The EWMA is more sensitive to variance, resulting in a reduced time to detect changes.
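A paired EWMA of this kind can be sketched in Python as below. It assumes the usual recursive form, in which each new observation is blended with the previous EWMA value, and uses the α = 1 / (1+N) convention given above. The function name, the `start` parameter and the step-change series are hypothetical, for illustration only.

```python
def ewma(values, n_points, start=None):
    """EWMA with alpha = 1 / (1 + N); 'start' initialises the chart."""
    alpha = 1.0 / (1.0 + n_points)
    # Without an explicit start value, begin from the first observation
    level = values[0] if start is None else start
    out = []
    for x in values:
        level = alpha * x + (1.0 - alpha) * level
        out.append(level)
    return out

# Hypothetical process: stable at 4.0, then a step up to 5.0
series = [4.0] * 20 + [5.0] * 20
short_term = ewma(series, 10)   # alpha = 1/11
long_term = ewma(series, 30)    # alpha = 1/31

# After the step both averages rise, the shorter term one faster
for year in (19, 25, 39):
    print(year, round(short_term[year], 3), round(long_term[year], 3))
```

Passing an initial short term average as `start` mirrors the initialisation described in the first bullet, so every data point can be used from the outset.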

Scientific relationships are examples of a model ‘pushed’ from first principles to explain a phenomenon based on underlying variables.

In the example of the CET, we could say that for each month’s average there is an underlying relationship such that

Monthly CET = f(x1, x2, … xn) + ε

where the individual variables are x1 to xn and ε is a stochastic error. The exact relationship, if all is understood could be written down from first principles.

Empirical model building based on robust statistical principles offers an example of a model ‘pulled’ from the available data to explain a phenomenon based on correlating variables. There is an acknowledgement that although correlation does not imply causality, the reverse must be true; causality will imply correlation. Thus variables should be selected which are suspected of having a mechanistic effect on the output.

Monthly CET = g(x1, x2, … xn) + ε

By this method it is not necessary to understand the complete mechanisms involved.

The scientific and empirical models should explain the available data and give some power of prediction regarding future behaviour if the values of each variable are known. Scientific models are more reliable the fewer the variables and the more controlled the experimental conditions are. A chemical analogy might be what happens in a test tube with a simple reaction taking place. The behaviour on a chemical plant, where there is often more than one variable in a system subject to complex heating and mixing and possibly competing side reactions, becomes much more difficult to predict from first principles. The ‘local’ conditions in a ‘global’ system need to be known, and they cannot readily be measured or extrapolated. To that extent, empirical models become more useful the bigger and more complex the system becomes, and the more stochastic behaviour results in error. In the realm of chemical engineering, these empirical models often perform well compared to models ‘pushed’ from first principles, due to the complexity of the system.

This paraphrases Box and Draper’s [1987] observations:

*‘Results from fitting mechanistic models have sometimes been disappointing because not enough attention has been given to discovering what is an appropriate model form. It is easy to collect data that never ‘place the postulated model in jeopardy’ and so it is common (e.g. in chemical engineering) to find different research groups each advocating a different model for the same phenomenon and each proffering data that ‘prove’ their claim.’*

In terms of empirical models, instead of identifying a single model based on statistical significance, such as correlation, best subsets regression shows a number of different models, as well as some statistics to help us compare those models to guide the selection process.

It assesses all variables and selects them, either singly or in combination, where they give significant improvements in R^{2}, adjusted R^{2}, Mallows’ Cp and S (standard error of regression). A good model should have a high R^{2} and adjusted R^{2}, a small S (standard error of regression), and a **Mallows’ Cp** close to the number of predictors in the model plus the constant. Using the adjusted R^{2} is recommended over R^{2} for comparing models with different numbers of terms, since the adjusted value accounts for the reduced degrees of freedom when terms are added to the regression model.

To this extent, careful selection of likely variables which affect the output is required when constructing an empirical model of this type, although the exact details of the mechanistic relationship are unnecessary.

Best subsets regression should result in an empirical model that conforms to **Occam’s razor;** i.e. when presented with competing hypothetical answers to a problem, we should select the one that makes the fewest assumptions, with the least overall error.
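To make the idea concrete, here is a minimal pure Python sketch of best subsets regression. It is not the statistical package used for this study: it fits every subset by ordinary least squares and ranks by adjusted R^{2} alone, which is a simplification of the fuller comparison (Mallows’ Cp and S as well) described above. All names and the synthetic data are hypothetical; `x3` is deliberately unrelated to the response.

```python
import random
from itertools import combinations

def solve(a, b):
    """Solve a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit(columns, y):
    """Least squares with a constant; returns (R2, adjusted R2, S)."""
    n, p = len(y), len(columns)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = p + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * rj for bj, rj in zip(beta, r)) for r in rows]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    mean_y = sum(y) / n
    sst = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - sse / sst
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    s = (sse / (n - p - 1)) ** 0.5
    return r2, adj_r2, s

def best_subsets(predictors, y):
    """Fit every non-empty subset; best adjusted R2 first."""
    results = []
    for size in range(1, len(predictors) + 1):
        for subset in combinations(predictors, size):
            r2, adj_r2, s = fit([predictors[v] for v in subset], y)
            results.append((subset, r2, adj_r2, s))
    return sorted(results, key=lambda item: item[2], reverse=True)

# Synthetic data only: y depends on x1 and x2, not on x3
rng = random.Random(0)
n = 68
data = {name: [rng.gauss(0, 1) for _ in range(n)] for name in ("x1", "x2", "x3")}
y = [4.0 + 0.8 * a + 0.5 * b + rng.gauss(0, 0.5)
     for a, b in zip(data["x1"], data["x2"])]

ranked = best_subsets(data, y)
for subset, r2, adj_r2, s in ranked[:3]:
    print(subset, round(r2, 3), round(adj_r2, 3), round(s, 3))
```

With three candidate variables there are only seven subsets to fit; the count doubles with each extra variable, which is one reason to choose candidates on mechanistic grounds first.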

There is an important qualification in that both the scientific and empirical models explicitly perform poorly if there are ‘hidden’ variables not included. These can cause problems with prediction, so an open mind regarding other causes not currently included is always helpful. The analysis of residuals is an important method to detect if there are hidden variables present. The residuals (observed value – predicted value), plotted in time sequence or against the variables used in the model, can be used to determine if a pattern emerges other than random. The residuals should also be homoscedastic (their spread is equal across the range) rather than heteroscedastic (their spread is unequal across the range).

Appendix 3 – worked example – **February CET since 1850**

Clear patterns can be seen by using a pair of EWMA, with one shorter term moving average (10 year = red) and one longer term moving average (30 year = blue), with the annual data marked by the black dotted lines.

The average temperature since 1850 is 4.15ºC (1.92) to 2dp’s.

Note that the distribution is more prevalent in the low and high tail regions than suggested by a normal distribution (skewness = -0.81, kurtosis = 0.69). The cusum is produced by calculating the following. First we derive the difference from the first value to this mean

$S_1 = (x_1 - \bar{x})$

The difference from the second term to the mean is then added to the first difference

$S_2 = (x_2 - \bar{x}) + S_1$

and so on. So for the first ten years of data we get the following, assuming a mean value of 4.15ºC:-

and so on…

**Looking at this cusum plotted as a time series, together with the 10/30 yr trend lines** suggests the following **30 year trends** in the February data.

Correlations and best subsets regression to the data since 1950 gives:-

Feb CET (ºC) = 4.63 + 0.377 x AO + 0.351 x NAO + 0.577 x EA + 0.289 x EA/WR – 0.390 x SCA + 1.27 x AMO

R-Sq = 56.0%, standard error = 1.28ºC.
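As a rough illustration of how the fitted equation is applied, the Python sketch below evaluates it for a set of index values and adds an approximate ±2 standard error band. The coefficients, constant and standard error are those quoted above; the function name and the example index values are hypothetical, chosen only to show a strongly negative AO and NAO pulling the prediction down.

```python
# Constant, coefficients and standard error from the February regression above
FEB_CONSTANT = 4.63
FEB_COEFFS = {
    "AO": 0.377,
    "NAO": 0.351,
    "EA": 0.577,
    "EA/WR": 0.289,
    "SCA": -0.390,
    "AMO": 1.27,
}
FEB_STD_ERROR = 1.28  # ºC

def predict_feb_cet(indices):
    """Return (prediction, low, high) in ºC; missing indices count as 0."""
    prediction = FEB_CONSTANT + sum(
        coeff * indices.get(name, 0.0) for name, coeff in FEB_COEFFS.items()
    )
    band = 2.0 * FEB_STD_ERROR  # roughly 95% of residuals if they are normal
    return prediction, prediction - band, prediction + band

# Hypothetical index values, not observed data
prediction, low, high = predict_feb_cet({"AO": -2.0, "NAO": -1.0})
print(f"{prediction:.2f} ºC (about {low:.2f} to {high:.2f} ºC)")
```

The width of the band is a reminder that the model explains only a little over half of the variance in the February CET.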

The February data is influenced by all the teleconnections. Inclusion of the other variable (CO_{2}) in the empirical model results in worse fit and more error. An example of this is if we choose to force CO_{2} into the best subsets regression.

The ‘best’ model includes some of the terms we found above minus the NAO & AMO. Notice that the degree of fit and standard error are worse than found earlier by excluding CO_{2}, i.e. R-Sq = 54.4%, standard error = 1.29ºC. On this basis, for this month, CO_{2} is best excluded.

An example of why both the AO and NAO are important can be found as this post is being written. Britain is currently experiencing rapid cooling and prolonged snow and ice from easterlies (called the ‘Beast from the East’) as a result of a **sudden stratospheric warming**.

The resulting diversion of the jetstream

is reflected by plunging values for both the Arctic Oscillation.

and North Atlantic Oscillation.

These two parameters make up the strongest components in the prediction for the February CET above, and given their positive correlation to the February CET, the temperatures are plunging, and the view outside my house is

This isn’t quite Minnesotan winter levels, but **nithering** enough for a Brit at -4ºC. **T’beast has bitten.**

Obviously this cold snap drags the average February CET downwards.

Inspection of the diagram above shows that in February 2018 the daily temps have hit the 90th percentile (high side) once, and the 10th percentile or lower (low side) twice. Thus the value for the average in any month contains ‘a lot of weather’. The regression models offer only a view based on the monthly average and don’t pick up the detail therein.

Sudden stratospheric warmings and subsequent northern hemisphere cooling are a regular phenomenon, recognised since the late 1950’s. 13 SSW events have happened in the month of February, and notably 5 have caused cold of the type we are experiencing now. These were in 1963 (CET= -0.7ºC), 1979 (CET= 1.2ºC), 1984 (CET= 3.3ºC), and 2018 (CET= 2.8ºC).

With the **February 2018 data** now available (2.9ºC), the EWMA trends are:-

The value for 2018 acts to bring down both the 10 and 30 year averages. The comment above written before the current cold snap ‘*The recent peak in the ten year data was reached in 2003, and shows signs of steady decline back to the 30 year trend line. This may signify a pause is developing’ *still stands, and we will have to wait to see how the trends continue.

**Apophenia** (the human tendency to see pattern in randomness) is best avoided with a statistical approach.

**References**

[1] ‘The mean temperature of Central England, 1698 to 1952’.

G Manley

Q.J.R. Meteorol. Soc., Vol. 79, pp 242-261 (1953).

[2] Central England Temperatures: monthly means 1659 to 1973.

G Manley

Q.J.R. Meteorol. Soc., Vol. 100, pp 389-405 (1974)

[3] ‘A new daily Central England Temperature Series, 1772-1991.’

DE Parker, TP Legg, and CK Folland.

Int. J. Clim., Vol. 12, pp 317-342 (1992)

[**4**] http://www.leif.org/research/Climate-Change-My-View.pdf

[5] ‘Multidecadal Mobility of the North Atlantic Oscillation’

GWK Moore, IA Renfrew & RS Pickart

J Climate, Vol 26, pp 2453-2466 (2013)

[6] Mathematical and Statistical Techniques for Industry:- No. 3

Cumulative Sum Techniques

RH Woodward & PL Goldsmith

Oliver & Boyd 1964

[7] BS5703:Parts 1-4

Data analysis and quality control using cusum techniques

BSI 1980 -1982

[8] Statistics in Research and Development

Roland Caulcutt

Chapman and Hall, 1991

[9] Achieving Quality Improvement: A Practical Guide

Roland Caulcutt

Chapman and Hall, 1995

[10] Control Chart Tests Based on Geometric Moving Averages

S. W. Roberts

Technometrics, Vol. 1, No. 3, pp. 239-250 (1959); reprinted in Vol. 42, No. 1, pp. 97-101 (2000)

[11] Statistical Process Control – Theory and Practice

G Barrie Wetherill & Don W Brown

Chapman and Hall, 1991

[12] Empirical model building and response surfaces

George EP Box and Norman R Draper

Wiley series in probability and mathematical sciences

J Wiley & Sons 1987

[13] ‘A dynamical model of the stratospheric sudden warming’

T Matsuno

J. Atm. Sci ,Vol. 28, pp. 1479-1494 (1971)

[14] ‘A sudden stratospheric warming compendium’

AH Butler et al

Earth Syst. Sci. Data, Vol. 9, pp.63-76 (2017)

[15] ‘Cycles and trends in solar irradiance and climate’

JL Lean

Climate Change, Vol. 1, pp. 111–122 (2010)

[16] ‘Evolution of solar radiative effects of Mount Pinatubo at ground level’

L. Alados-Arboledas et al.

Tellus B: Chemical and Physical Meteorology, Vol. 49, pp. 190-198 (1997)

[17] ‘Volcanic Eruptions and Climate’

A Robock

Reviews of Geophysics, Vol. 38, pp 191-219 (2000)

It is nice to see someone using SPC to analyse the UK Weather.

Funnily enough they talk about the Climate being “Chaotic”, but looking at your data, it actually looks quite well controlled in the long term.

The only thing really missing from your analysis is Major Wind Direction Changes.

You have talked about the “Beast from the East”, but as you know we also suffer from South Easterlies and Southerlies bringing very warm weather, particularly in the South East where I used to live. Whether they reach far enough to influence the CET I am not sure.

I now live in Swansea in Wales and the weather is dominated by Sou Westerlies & Westerles, with occasionally Northerlies as at the moment and very occasionally Easterlies.

Maybe talk to you again over at WUWT or do you also visit Paul Homewood’s Not a lot of people know that.

By:

A C Osborn on March 5, 2018 at 2:40 pm

Thank you – I think SPC is entirely the right approach to use. Given that the annual yearly differences are essentially stationary, it would appear that detecting drift by EWMA/Cusums is correct. It’s what’s used in the Chemical industry routinely.

The links between each month’s weather and the teleconnections suggest positive AO/NAO for warmer weather and, conversely, negative AO/NAO for colder weather. The same pattern holds for the EA, with positive EA for warmer weather and so on. There has been a slow, persistent shift in the EA to more positive values over decades, and if this reverses, we could be in for colder times. I will shortly add to this post the EWMA data for the teleconnections, in addition to the colour plots from NOAA I referenced. If the relationships to teleconnections are true, then the CET does look a lot like a process with cyclical inputs going at different rates, driving the weather with a large degree of randomness. But then, as Brits, we know that?
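The kind of relationship described here can be sketched as an ordinary least-squares fit of a monthly temperature on teleconnection indices. This is only a sketch under stated assumptions: the index values, the coefficients used to generate the temperatures, and the helper `fit_linear` are all invented for illustration and are not the models or data from the post.

```python
def fit_linear(X, y):
    """Ordinary least squares via the normal equations.

    X is a list of rows of predictors (no intercept column);
    returns [intercept, b1, b2, ...].
    """
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

# Invented index values for one calendar month across six years (NOT NOAA data)
ao = [-1.0, 0.5, 1.2, -0.3, 0.8, -1.5]
ea = [0.2, -0.4, 0.9, 1.1, -0.7, 0.0]

# Invented temperatures built from assumed coefficients, so the fit can be checked
temps = [4.0 + 0.8 * a + 0.5 * e for a, e in zip(ao, ea)]

intercept, b_ao, b_ea = fit_linear(list(zip(ao, ea)), temps)
print(round(intercept, 3), round(b_ao, 3), round(b_ea, 3))
```

Positive fitted coefficients correspond to the pattern described above, where a positive index goes with a warmer month. Real data would add scatter, so the residual error would need examining as well.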

By:

mynaturaldiary on March 5, 2018 at 4:45 pm

I forgot to ask, what do the Control Limits look like for this “process” and how many out of control points are there?

It also makes you wonder what it is that brings them back into control.

The long term trend looks a bit like unadjusted Tool Wear on a machine.

By:

A C Osborn on March 5, 2018 at 2:44 pm

SPC can only strictly be applied to a time series that is both normally distributed and has a stable mean and standard deviation. In the case of the CET data, neither of these assumptions is true. Consequently we can look for ‘control deviation’ using Cusums, which, coupled with EWMA, detect where changes in trend occur. Looking at how far data points are from the 30 year trend helps identify ‘unusual’ behaviour. In the case of the models established, these would be points that lie outside the 95% confidence limits of the model, determined from the model’s residual error. You might expect a ‘few’ of these per 100 data points at this confidence level. If over half of them are this far from the predictions, then it’s time to assess either the model or the data.
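The check on ‘unusual’ points can be sketched as follows. This is a minimal sketch, assuming the residuals are roughly normal so that the 95% limits sit at about ±1.96 residual standard deviations. The function name `flag_outliers` and the data are invented for illustration.

```python
from statistics import NormalDist, stdev

def flag_outliers(observed, predicted, confidence=0.95):
    """Return (indices outside the confidence limits, half-width of the limits).

    The limit is z * s, where s is the sample standard deviation of the
    residuals and z is the two-sided normal quantile (about 1.96 for 95%).
    """
    residuals = [o - p for o, p in zip(observed, predicted)]
    s = stdev(residuals)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    limit = z * s
    flagged = [i for i, r in enumerate(residuals) if abs(r) > limit]
    return flagged, limit

# Invented example: 19 small residuals and one large one (NOT CET data)
predicted = [10.0] * 20
errors = [0.1, -0.1] * 9 + [0.1, 3.0]
observed = [p + e for p, e in zip(predicted, errors)]

flagged, limit = flag_outliers(observed, predicted)
expected = (1 - 0.95) * len(observed)  # points expected outside by chance alone
print(flagged, round(limit, 2), expected)
```

At the 95% level roughly 5 points in 100 fall outside the limits by chance, which is the ‘few’ referred to above. A count far above that points to a problem with the model or the data.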

I added btw an annual model which captures the cycle in temperatures (due to insolation, lagged by one month).
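A one-month lag of this kind can be sketched by regressing each month's temperature on the previous month's insolation. The curves below are idealised cosines, not the irradiance or CET figures from the post, and the helper names are invented for illustration.

```python
import math

def lagged(cycle, lag):
    """Shift a 12-month cycle so that month m holds the value from month m - lag."""
    return [cycle[(m - lag) % 12] for m in range(12)]

def simple_fit(x, y):
    """Least-squares intercept and slope for y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b

def rss(x, y, a, b):
    """Residual sum of squares for the fitted line."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# Idealised stand-ins (assumptions): insolation peaks around the June solstice,
# temperature follows the same shape one month later
insolation = [math.cos(2 * math.pi * (m - 5.5) / 12) for m in range(12)]
temperature = [10 + 6 * math.cos(2 * math.pi * (m - 6.5) / 12) for m in range(12)]

a0, b0 = simple_fit(insolation, temperature)   # no lag
x1 = lagged(insolation, 1)                     # previous month's insolation
a1, b1 = simple_fit(x1, temperature)

rss0 = rss(insolation, temperature, a0, b0)
rss1 = rss(x1, temperature, a1, b1)
print(round(rss0, 2), round(rss1, 2))
```

In this idealised case the lagged fit is exact and the unlagged one is not. With real monthly data the comparison of residuals at different lags is what would indicate the best lag.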

By:

mynaturaldiary on March 19, 2018 at 11:01 pm

I have used cusums for assessment of climate-related data since 1992 and continue to do so on a daily basis (eg today). This is a most interesting article. I would /really/ like to be able to get in touch with you directly, especially because I am involved in development of a paper which makes extensive use of cusums and uses them to demonstrate some important points. In the past my main interest has been in CET data, about which I have considerable experience. I use my own stats software, running under RISC OS for all analyses and graphics.

Robin

By:

robinedwards36 on March 25, 2018 at 7:03 pm

That sounds good – send me your email and I will get in contact.

I’m an ICI statistician, as well as a physical property scientist, still working in the UK chemicals industry, where I work with both SPC problems and empirical models for chemical engineering studies!

By:

mynaturaldiary on March 26, 2018 at 9:05 am

Hello, Insubstantial Pageant,

Thanks for your reply.

I read your long article with increasing interest as I progressed, and am now pleased to have a little background about you.

I am a very long retired ex chemist turned amateur statistician – worked for Shell Research all my “career” in various locations. I noticed the elegance of ANOVA in 1956 when working in Amsterdam on epoxy resins (Brownlee’s book) and began to use some stats with nothing other than a hand mechanical calculator to help with the arithmetic. Couldn’t tackle regression except in the simplest (small) cases. I saved myself much “practical” work though by careful design, which I continued to use in the 1960s after returning to the UK to work on detergent formulation. This took me to Emeryville (California) for a couple of years where my interest in stats design put me in close contact with world class people who were anxious to help. Returning to the UK, my job changed to looking after our computer terminals in a Tech Service lab, and also helping design many experiments followed by their analysis. So the advent of programmable calculators was a magic moment, and I wrote stuff for them for assorted stats techniques, enjoying the rapid improvement of these calculators and learning about compact coding. Then the first desktop computers arrived, with BASIC, and I got a Commodore PET with 8K of RAM, for which I wrote much more stuff – multiple regression with lots of unusual extras, such as diagnostics, and nonlinear regression amongst other things. For this program, which I called FIRST – Fully Interactive Regression STatistics – I needed access to things like values from the t table, which I did not know how to approach from the analytical side (my maths is pretty hopeless!) but I was able to use my nonlinear analysis to devise very good approximations to the t table. These were a lot better than the published ones, and similarly I improved the Abramowitz and Stegun normal approximation. I got a bigger computer in due course and was able to put together a really viable analysis program.
Retiring early, in 1984, I rewrote the stuff for the BBC Model B, and a couple of years later was persuaded to try to sell it. This developed into selling a hundred or so a year to clients ranging from private people and students to site licences for departments at Oxford, Cambridge, other universities, and the NHS, who were my main customer for a few years. All this was helped by the Acorn range of machines – I was a registered developer – and by meeting a brilliant young man who understood real programming, and who re-wrote my spaghetti code (which was to his surprise 100% robust) into a linear system and added greatly improved graphics and of course mouse control instead of two-letter instructions. Of course, Acorn left the field and concentrated on the ARM chip, and we were effectively abandoned, though I am pleased to say reviving. All my stuff works under RISC OS, which I use as a program on my Windows 7 laptop, on which it runs brilliantly. My code is very compact indeed. The complete software is around 300k of code. It has a wide range of stats that I knew nothing of when I was working, eg non-parametrics, and quite a lot of stuff useful in the medical field. It now can handle fairly large data sets, by which I mean for example that I plotted yesterday a scatter diagram for 24000 rows of data (from climate of course), which it does in a very few seconds.

I noticed cusums in about 1990, when I was in touch with the person who wrote the BS standards. Trying them on data from Greenwich/Kew, from about 1840 onwards, I was surprised to find a remarkable discontinuity in the cusum plot. I followed this up with people at Kew who checked that there had been no change in method or location that they could find for the date in question, some time in 1932 if I remember correctly, so I suspected that a real change might have occurred. I had to type all the data in, about 1800 values! No internet of course at that time, in my case anyway. I found a few more data sets, and they too seemed to have possible discontinuities. Interesting! Then I got in touch with Phil Jones at UEA, who was then just a PhD person, and he kindly sent me a floppy disc with about 15 data sets for various places round the world, extracted from the CD that they were selling for £700. These too showed that climate cusums contained obvious discontinuities. This was around 1992.
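For readers unfamiliar with the technique, a discontinuity of this kind shows up as a sharp turning point in the cumulative sum of deviations from the overall mean. The sketch below is a generic illustration with invented numbers, not the Greenwich/Kew data or the software described in this comment.

```python
def cusum_from_mean(values):
    """Cumulative sum of deviations from the overall mean of the series."""
    m = sum(values) / len(values)
    total = 0.0
    out = []
    for x in values:
        total += x - m
        out.append(total)
    return out

def change_point(values):
    """Index of the largest absolute cusum value.

    For a single step change this is the last point of the old regime.
    """
    c = cusum_from_mean(values)
    return max(range(len(c)), key=lambda i: abs(c[i]))

# Invented annual means: ten years near 9.0, then ten years near 10.0
values = [9.0, 9.2, 8.8, 9.1, 8.9] * 2 + [10.0, 10.2, 9.8, 10.1, 9.9] * 2

cusum = cusum_from_mean(values)
idx = change_point(values)
print("step occurs after index", idx)
```

Plotted against time, the cusum falls steadily while the values sit below the overall mean and rises once they sit above it, so a step change appears as a ‘V’. A gradual trend gives a smooth curve instead.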

So I gradually set off to look at climate data in general, and spoke about some of my findings at an Acorn computer show in Olympia. I have not discussed one of these findings with anyone since, but still think it could be of considerable interest to some climatologists.

With the advent of the WWW data became very widely available, and I’ve worked on myriads of sets, but have not tried to publish except in WUWT, and even there, there seems to be no interest in step changes of climate. More recently I’ve been in touch with a couple of academics, one in Siberia and recently someone much more local – a professor of oceanography at Plymouth – who is really excited by what I have been able to demonstrate to him with his data, and he is now busily preparing a paper for publication. This is based in large part on a hitherto unremarked step change in European temperatures in 1987. This is so obvious to me (and has been for many years) that I cannot understand why climatologists have not spotted it. What is lacking is any sort of explanation for this change, and maybe that’s why.

If you like I’ll send you a few GIFs illustrating what I do. Just let me know.

I would advise retiring as soon as you can! I’ve been retired for as long as I worked (about 34 years) and have really enjoyed it!

Cheers, Robin (Bromsgrove)

By:

robinedwards36 on March 26, 2018 at 1:39 pm

Yes – please do send some gifs through. The step change you mentioned is in the data sets above (August is a good example). The regression models suggest changes to AO, EA may be responsible for some of this (see the EWMA trends).

What I hadn’t appreciated until I did this study was that the teleconnections ‘drifted’ – they weren’t in a state of SPC, which would cause slow trends in ‘weather’. The cause for this is another story…

The EWMA with two time signals is another method complementary to Cusums, and I can send an example Excel dataset if you wish.

What a great personal history you have!

By:

mynaturaldiary on March 29, 2018 at 7:48 am

[…] In an earlier post I introduced the idea of doing time series analysis. The basic techniques are explained here. […]

By:

Trends for birds in Teesmouth (technical!) | This Insubstantial Pageant on October 1, 2019 at 1:55 pm