Statistical Methods for Long-Range Forecast
By Syunji Takahashi, Climate Prediction Division, JMA

Let's think about the chaos of our atmosphere using the Lorenz system

Lorenz equations

$$
\begin{aligned}
\frac{dX}{dt} &= -10X + 10Y \\
\frac{dY}{dt} &= 28X - Y - XZ \\
\frac{dZ}{dt} &= -\frac{8}{3}Z + XY
\end{aligned}
$$

Three nonlinear equations:
- approximate equations of convection
- represent the essential nature of our atmosphere (chaotic feature)
- easy to solve on a PC

X: a component of the stream function
Y, Z: two components of temperature

Lorenz system making chaos!
[Figure: trajectory of the solution on the X-Z plane]
[Figure: time series of solution X; two solutions with slightly different initial values]

Features of the solution
- Circles alternately around two centers (the Lorenz attractor)
- No fixed period
- A small difference soon becomes large (chaos)

Predictability Problem
Predicting the average value within the period
[Figure: time series of the solution]
[Figure: initial disturbance and predicted value; horizontal axis: initial disturbance; vertical axis: predicted value]
[Figure: probability density of predicted value; horizontal axis: predicted value; vertical axis: probability density]
Statistics of 3000 simulations with small, stochastically generated disturbances

Predictability of the Second Kind

$$
X = X_0 + \frac{dX}{dt}\,\Delta T, \qquad
Y = Y_0 + \frac{dY}{dt}\,\Delta T, \qquad
Z = Z_0 + \frac{dZ}{dt}\,\Delta T
$$

[Figure: solution of the Lorenz system with a forcing]
[Figure: initial difference and predicted value; horizontal axis: initial disturbance; vertical axis: predicted value]
[Figure: probability density of predicted value; label: "Signal"]
Forcing generates a bias in the solutions.

Why does the probability function become normal?
Central limit theorem: the mean of any stochastic variables tends to be normally distributed (not strictly speaking).
[Figure: probability density of predicted value]
Central dogma of statistics

$$
\bar{x}_N = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad
P(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
$$
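The experiment with two slightly different initial values can be sketched in a few lines. This is only an illustration under stated assumptions: the fourth-order Runge-Kutta scheme, the step of 0.01, the starting point (1, 1, 20) and the perturbation of 1e-6 are choices made here, not values taken from the slides.

```python
def lorenz(state):
    """Right-hand side of the Lorenz equations with the constants 10, 28, 8/3."""
    x, y, z = state
    return (-10.0 * x + 10.0 * y,
            28.0 * x - y - x * z,
            -(8.0 / 3.0) * z + x * y)


def rk4_step(state, dt):
    """Advance the state by one fourth-order Runge-Kutta step (assumed scheme)."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = lorenz(state)
    k2 = lorenz(shifted(state, k1, dt / 2))
    k3 = lorenz(shifted(state, k2, dt / 2))
    k4 = lorenz(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def integrate(state, n_steps, dt=0.01):
    """Return the list of visited states, including the initial one."""
    path = [state]
    for _ in range(n_steps):
        state = rk4_step(state, dt)
        path.append(state)
    return path


# Two runs whose initial X differs by 1e-6 (an arbitrary illustrative value).
run_a = integrate((1.0, 1.0, 20.0), 5000)
run_b = integrate((1.0 + 1e-6, 1.0, 20.0), 5000)

# Absolute difference in X between the two runs at every step.
gap = [abs(a[0] - b[0]) for a, b in zip(run_a, run_b)]
early_gap = max(gap[:50])     # first 0.5 time units
late_gap = max(gap[4000:])    # last 10 time units
print("early gap:", early_gap, "late gap:", late_gap)
```

The early gap stays tiny while the late gap becomes comparable to the size of the attractor itself, which is the "small difference becoming greater soon" behavior listed under the features of the solution.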

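The central limit theorem statement can be checked numerically. The sketch below is an assumption-laden toy, not the Lorenz ensemble itself: it averages 30 uniform random numbers (a deliberately non-normal variable chosen here for illustration) and repeats this 3000 times, echoing the 3000 simulations mentioned in the slides.

```python
import math
import random


def mean_of_draws(rng, n_terms):
    """Average n_terms independent uniform(0, 1) draws."""
    return sum(rng.random() for _ in range(n_terms)) / n_terms


rng = random.Random(0)           # fixed seed so the run is repeatable
n_terms, n_trials = 30, 3000     # illustrative sizes
means = [mean_of_draws(rng, n_terms) for _ in range(n_trials)]

# Theory: uniform(0, 1) has mean 1/2 and variance 1/12,
# so the mean of n_terms draws has variance 1 / (12 * n_terms).
mu = 0.5
sigma = math.sqrt(1.0 / 12.0 / n_terms)

sample_mean = sum(means) / n_trials
sample_sd = math.sqrt(sum((m - sample_mean) ** 2 for m in means) / (n_trials - 1))

# For a normal distribution about 68% of values lie within one sigma of mu.
within_one_sigma = sum(abs(m - mu) <= sigma for m in means) / n_trials
print(sample_mean, sample_sd, sigma, within_one_sigma)
```

The fraction within one standard deviation should come out close to 0.68, the value expected for the normal density P(x).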
Chaotic Feature of the Atmosphere and Long-Range Forecast
In both numerical and statistical prediction:
1. We cannot predict the long-term future precisely.
2. A possible target of long-range forecasting is the biased state caused by boundary forcing.
3. A possible target of long-range forecasting is the averaged state.
4. Probabilistic forecasting is essential.
5. Noise can be assumed to be normally distributed.

History of Long-Range Forecast at JMA and Statistical Methods
Events:
- 1942: long-range forecasting starts
- 1943: formal issuance of 1-month, 3-month, and seasonal forecasts
- 1949: division closed
- 1953: long-range forecasting restarts
- 1974: long-range forecast division established
- 1987: monthly report on the climate system published
Background, in order of appearance: frequent cool summers; criticism of the accuracy; frequent unusual weather; increasing upper-air sounding data; increasing demand for climate information
Methods, in order of appearance: harmonic analysis; simple regression method; analog method; multiple regression method; spectral analysis

Statistical Methods at JMA
1. Analog method, cluster analysis: not being used. How are they similar?
2. Spectral or harmonic methods: not being used. Sometimes good, but not always.
3. Optimal Climate Normal (OCN): now being used. Simple!
4. Multiple regression analysis, simple regression: not being used. Available technique.
5. Canonical Correlation Analysis (CCA): now being used. Fashionable technique!

Concept of the Analog Method
Basis of the analog method: similar states will evolve similarly and remain similar in the future.
Analog method using 500 hPa patterns:
- Search for past years that are similar to the target year in the 500 hPa pattern, and predict the future of the target year using the "past futures" of these similar years.
- Select 10 similar years; the frequency distribution of the temperatures in their past futures is considered as a probability forecast.

Definitions of Similarity or Distance
[Figure: forecasting year and similar years]
Euclidean distance:

$$
d = \sqrt{\sum_i (x_i - y_i)^2} = \sqrt{(x - y)^t (x - y)}
$$

Correlation coefficient:

$$
r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \sum_i (y_i - \bar{y})^2}}
$$

Different definition, different results!

Cluster Analysis
- Grouping method: repeatedly gather the nearest pair of members or groups.
- There are various definitions of the distance between groups.
[Figure: members grouped in distance space]

Example of Cluster Analysis
Data: 1971-2000 sequences of monthly temperature (February)
Distance: unity minus correlation coefficient
Group distance: Ward method (it is my favorite)
[Figure: dendrogram; vertical axis from 0 to 2.5; leaves: AUSTRA, INDIA, NEW CAL, PERU, VENEZ, PHILIP, THAI(2), KAZAK, THAI(1), JAPN]
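The analog search and the remark "different definition, different results" can be illustrated with a toy example. Everything below is hypothetical: the four "years", their three-point patterns, the outcome categories and the function names are invented for illustration, and only 2 analogs are selected instead of 10.

```python
import math
from collections import Counter


def euclid_distance(x, y):
    """Euclidean distance between two patterns."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def correlation(x, y):
    """Correlation coefficient between two patterns."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def find_analogs(target, past_patterns, k, use_correlation=False):
    """Return the k past years most similar to the target pattern."""
    if use_correlation:
        def score(year):
            return 1.0 - correlation(target, past_patterns[year])
    else:
        def score(year):
            return euclid_distance(target, past_patterns[year])
    return sorted(past_patterns, key=score)[:k]


def probability_forecast(analog_years, outcomes):
    """Relative frequency of each outcome among the analog years."""
    counts = Counter(outcomes[year] for year in analog_years)
    return {category: n / len(analog_years) for category, n in counts.items()}


# Hypothetical toy patterns (three grid-point values per year), not real data.
past_patterns = {
    1991: [1.0, 2.0, 3.0],
    1992: [3.0, 2.0, 1.0],
    1993: [11.0, 12.0, 14.0],
    1994: [1.5, 1.5, 2.5],
}
# Hypothetical "past futures": what followed each of those years.
past_outcome = {1991: "warm", 1992: "cool", 1993: "normal", 1994: "warm"}
target = [1.0, 2.0, 3.2]

by_distance = find_analogs(target, past_patterns, k=2)
by_correlation = find_analogs(target, past_patterns, k=2, use_correlation=True)
print(by_distance, probability_forecast(by_distance, past_outcome))
print(by_correlation, probability_forecast(by_correlation, past_outcome))
```

In this toy case the Euclidean distance picks 1991 and 1994, while the correlation coefficient picks 1991 and 1993, because 1993 has the same shape as the target at a very different level. The two definitions therefore lead to different probability forecasts.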

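The grouping procedure (gather the nearest pair, then update the distances between groups) can be sketched as follows. This is a minimal illustration, assuming the Lance-Williams update for Ward's method applied to squared distances; the function name and the five test points are hypothetical. Any symmetric distance matrix can be passed in, for example unity minus the correlation coefficient, although Ward's method is strictly defined for Euclidean distances.

```python
def ward_clusters(dist, n_clusters):
    """Agglomerative clustering with Ward's update (Lance-Williams form).

    dist: symmetric matrix (list of lists) of pairwise distances.
    Returns a sorted list of clusters, each a sorted list of member indices.
    """
    members = {i: [i] for i in range(len(dist))}
    # Ward's update is applied to squared distances.
    d2 = {(i, j): dist[i][j] ** 2
          for i in members for j in members if i < j}
    next_id = len(dist)
    while len(members) > n_clusters:
        # Gather the nearest pair of members or groups.
        (i, j), d_ij = min(d2.items(), key=lambda item: item[1])
        ni, nj = len(members[i]), len(members[j])
        merged = members.pop(i) + members.pop(j)
        # Update the distance from every remaining group to the merged group.
        for k in members:
            nk = len(members[k])
            d_ki = d2.pop((min(i, k), max(i, k)))
            d_kj = d2.pop((min(j, k), max(j, k)))
            d2[(k, next_id)] = ((ni + nk) * d_ki + (nj + nk) * d_kj
                                - nk * d_ij) / (ni + nj + nk)
        del d2[(i, j)]
        members[next_id] = merged
        next_id += 1
    return sorted(sorted(group) for group in members.values())


# Hypothetical points on a line; the distance is the absolute difference.
positions = [0.0, 0.1, 5.0, 5.2, 10.0]
dist = [[abs(a - b) for b in positions] for a in positions]
print(ward_clusters(dist, 3))
print(ward_clusters(dist, 2))
```

Recording the merge distance at each step, instead of stopping at a fixed number of groups, would give the heights needed to draw a dendrogram.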
Analysis of Time Series
Sometimes an obvious cycle appears in the sequence of a meteorological element. In that case, prediction using the periodicity is very efficient.
[Figure: 3-month mean temperature of Eastern Japan; horizontal axis: accumulated month; vertical axis: 3-month mean temperature (0.1 °C)]
[Figure: power spectrum of 3-month mean temperature; horizontal axis: frequency (1/month); vertical axis: power]
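A power spectrum like the one in the figure can be computed with a direct discrete Fourier transform. The sketch below uses a synthetic series with a 12-month and a weaker 4-month cycle; the series, its length of 96 months and the normalization of the power are assumptions made for illustration, not the Eastern Japan data.

```python
import cmath
import math


def periodogram(series):
    """Return (frequency, power) pairs for frequencies k/n, k = 1 .. n/2."""
    n = len(series)
    mean = sum(series) / n
    anomalies = [x - mean for x in series]   # remove the mean first
    result = []
    for k in range(1, n // 2 + 1):
        coef = sum(a * cmath.exp(-2j * math.pi * k * t / n)
                   for t, a in enumerate(anomalies))
        result.append((k / n, abs(coef) ** 2 / n))
    return result


# Synthetic monthly series: a 12-month cycle plus a weaker 4-month cycle.
series = [math.sin(2 * math.pi * t / 12) + 0.3 * math.cos(2 * math.pi * t / 4)
          for t in range(96)]
spectrum = periodogram(series)
peak_frequency, peak_power = max(spectrum, key=lambda pair: pair[1])
print("peak at frequency", peak_frequency, "per month")
```

The largest peak sits at 1/12 per month, the frequency of the dominant cycle, which is the kind of periodicity a prediction could exploit.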

Prediction by Auto-Regression Model
Assume an auto-regression model such as

$$
x_i = a_1 x_{i-1} + a_2 x_{i-2} + \cdots + a_m x_{i-m} + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2)
$$

Determine the coefficients and the variance of the noise from past data. We can then predict the future as

$$
x_{t+1} = a_1 x_t + a_2 x_{t-1} + \cdots + a_m x_{t-m+1} + \varepsilon_{t+1}
$$

and so on.

Successful case
[Figure: 3-month mean temperature and prediction; horizontal axis: accumulated month; vertical axis: 3-month mean temperature (0.1 °C)]

Failure case
[Figure: 3-month mean temperature and prediction; horizontal axis: accumulated month; vertical axis: 3-month mean temperature (0.1 °C)]
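Fitting the auto-regression coefficients and the noise variance from past data can be sketched with ordinary least squares. This is one possible estimator, assumed here for illustration (the slides do not say which one was used); the synthetic AR(2) series, its coefficients 1.6 and -0.8, and all function names are hypothetical.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_ar(series, order):
    """Least-squares estimate of the AR coefficients and the noise variance."""
    # Each row holds the previous `order` values, most recent first.
    rows = [[series[i - j] for j in range(1, order + 1)]
            for i in range(order, len(series))]
    targets = series[order:]
    gram = [[sum(r[p] * r[q] for r in rows) for q in range(order)]
            for p in range(order)]
    rhs = [sum(r[p] * t for r, t in zip(rows, targets)) for p in range(order)]
    coefs = solve(gram, rhs)
    residuals = [t - sum(c * v for c, v in zip(coefs, r))
                 for r, t in zip(rows, targets)]
    variance = sum(e * e for e in residuals) / (len(rows) - order)
    return coefs, variance


def predict_next(series, coefs):
    """One-step forecast, with the noise term replaced by its mean (zero)."""
    return sum(c * series[-j] for j, c in enumerate(coefs, start=1))


# Synthetic AR(2) series with hypothetical coefficients and unit noise variance.
rng = random.Random(1)
series = [0.0, 0.0]
for _ in range(500):
    series.append(1.6 * series[-1] - 0.8 * series[-2] + rng.gauss(0.0, 1.0))

coefs, variance = fit_ar(series, 2)
print("coefficients:", coefs, "noise variance:", variance)
print("next value forecast:", predict_next(series, coefs))
```

Repeating `predict_next` on the series extended by its own forecasts gives the "and so on" multi-step prediction; the forecast error grows with lead time because the noise terms accumulate.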

Optimal Climate Normal (OCN)
[Figure, two panels: sequence of monthly mean temperature in Eastern Japan (January), 1960 to 2000; vertical axis: temperature deviation (0.1 °C)]
The normal, the mean of the past 30 years, is not always the optimal first estimate when there is an obvious increasing or decreasing trend or a climatic jump.
Investigation of past data shows that the 10-year mean is the optimal first estimate for both temperature and precipitation in Japan.

Break Time for 10 minutes
Excel files for chaos, cluster analysis, and spectral analysis are prepared on this PC.

Situation of Multiple Regression Model
Predictand vector: Temp. Predictor matrix: NHPV to WPAH. The slide also carries the label "Independent Data".

Year   Temp    NHPV    FEPV    NHZI    FEZI  OKHOTK    MIDH  OKINAW   OGASH    WPAH
1980    -13    -4.0    -9.2     6.1   -24.4    27.9     0.4    13.8     9.6     0.6
1981      8    -4.9     8.5   -13.6    -4.8    -1.2     0.3    -3.3    -5.0    -2.1
1982    -21     1.9    13.7     2.1    11.1    -8.2   -11.5   -21.4    -3.8     6.5
1983    -11   -25.4    -4.8   -12.7   -23.9    -6.1   -11.6    -3.9     2.9    14.9
1984      7     0.7    14.4     1.0     7.6   -19.9    -5.6    -6.6   -21.9     1.9
1985      5    16.4    16.6    10.2     5.5   -20.5    -6.8   -12.8   -11.6    -1.7
1986    -12   -14.8   -12.2     5.6   -28.3    11.1   -10.3    -8.7   -15.2    -2.4
1987      9    20.2    21.1    -0.4     3.6   -12.2    -2.7     5.9     8.5     0.0
1988    -19     8.7    14.8     3.2    11.2    33.8     2.6    -2.6     1.3     7.4
1989     -9   -18.4   -58.7   -17.3     0.8    26.1     2.2    -7.1    -1.7    -0.6
1990      6    25.1    19.1    10.6     3.5    -4.9     4.0     9.5     1.5    -2.3
1991      5     5.6    31.4    -7.2   -28.5    -0.8    -0.8     9.6     6.1    -0.9
1992      0   -20.8   -42.9     3.6    -0.8    -3.9    -4.1    -5.0     3.7     1.9
1993    -21    30.5   -12.0   -13.9   -24.8     7.1   -12.7    -3.0     2.8     9.3
1994     24   -33.2   -39.8    14.1    29.4   -14.2    13.8     8.0    -0.5     2.8
1995      5    -6.4     3.9   -12.8    -3.5   -15.5    -1.0    11.2    11.6     3.9
1996      6   -32.5   -27.9     4.0     6.2     1.4     5.2    10.6    -3.4     2.8
1997      4    -3.3    -8.7   -17.4    12.1    -5.3     1.5    -4.5    -2.3     0.4
1998      5    43.5    33.6     2.1   -27.1    50.0    14.2    14.5    16.0    -5.1
1999      4     3.2    19.5    14.9    43.6    -1.6     9.2    -6.6    -0.6   -14.3
2000     15    -3.1    -9.2    -0.1    -2.4     4.1     4.6    -7.4    -4.4   -15.1

Multiple Regression Equation
The multiple regression model assumes that the predictand vector is the sum of a linear combination of the predictors and a noise term.

$$
y(t) = \beta_0 + \beta_1 x_1(t) + \cdots + \beta_{m-1} x_{m-1}(t) + \varepsilon(t)
$$

This can be rewritten as

$$
y = X\beta + \varepsilon, \qquad
X = \begin{pmatrix}
1 & x_{11} & \cdots & x_{1,m-1} \\
\vdots & \vdots & & \vdots \\
1 & x_{n1} & \cdots & x_{n,m-1}
\end{pmatrix}, \qquad
\beta = (\beta_0, \beta_1, \ldots, \beta_{m-1})^t
$$

Determine Regression Coefficients
The coefficient vector is usually estimated so as to minimize the sum of squared errors of the predicted vector.

$$
S = \lVert y - \hat{y} \rVert^2 = \lVert y - Xa \rVert^2 \rightarrow \text{minimize}
$$

$$
(X^t X)\,a = X^t y, \qquad a = (X^t X)^{-1} X^t y
$$

Visual Image of Multiple Regression (1)
- Predictand vector: $y = (y_1, \ldots, y_n)^t$
- Predicted vector: $\hat{y} = (\hat{y}_1, \ldots, \hat{y}_n)^t = Xa$
- Residual vector: $y - \hat{y}$
- $\hat{y} = Xa$ is the orthogonal projection of $y$ onto the subspace $S(X)$
- Subspace: $S(X) = \{ z : z = X\beta,\ \beta \in V^m \}$

Visual Image of Multiple Regression (2)
- True regression vector: $X\beta$
- Error vector: $\varepsilon \sim N(0, \sigma^2 I)$
- Projection of the error vector: $P\varepsilon$, where $P$ is the orthogonal projection matrix onto $S(X)$
- Residual vector: $(I - P)\varepsilon$
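The least-squares estimate and its geometric picture can be checked on a small example: solving the normal equations gives the coefficients, and the residual vector comes out orthogonal to every column of X, as the projection image suggests. The data, the noise values and the function names below are hypothetical, and the plain Gaussian elimination is just one simple way to solve the normal equations.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def design_matrix(predictors):
    """Prepend the column of ones to the predictor rows."""
    return [[1.0] + list(row) for row in predictors]


def fit_regression(predictors, y):
    """Solve the normal equations (X^t X) a = X^t y; the intercept comes first."""
    x = design_matrix(predictors)
    m = len(x[0])
    xtx = [[sum(r[p] * r[q] for r in x) for q in range(m)] for p in range(m)]
    xty = [sum(r[p] * t for r, t in zip(x, y)) for p in range(m)]
    return solve(xtx, xty)


def predict(predictors, coefs):
    """Predicted vector X a."""
    return [sum(c * v for c, v in zip(coefs, row))
            for row in design_matrix(predictors)]


# Hypothetical data: y = 2 + 3*x1 - x2 plus small made-up noise.
predictors = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
noise = [0.2, -0.1, 0.0, 0.1, -0.3, 0.1]
y = [2 + 3 * x1 - x2 + e for (x1, x2), e in zip(predictors, noise)]

coefs = fit_regression(predictors, y)
residual = [yi - fi for yi, fi in zip(y, predict(predictors, coefs))]

# Orthogonality check: X^t (y - Xa) should vanish up to rounding error.
x_matrix = design_matrix(predictors)
xt_residual = [sum(row[p] * e for row, e in zip(x_matrix, residual))
               for p in range(len(coefs))]
print("coefficients:", coefs)
print("X^t residual:", xt_residual)
```

Solving the normal equations directly fails when X^t X is singular, for example when two predictors are exactly proportional to each other.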

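$$
P = X (X^t X)^{-1} X^t
$$

Property of Regression Residual
Here $S$ is the residual sum of squares, $n$ the number of data, and $m$ the number of regression coefficients.

$$
\frac{S}{\sigma^2} \sim \chi^2_{n-m}, \qquad E[S] = (n-m)\,\sigma^2, \qquad \hat{\sigma}^2 = \frac{S}{n-m}
$$

$$
\mathrm{FPE} = (n+m)\,\sigma^2, \qquad \widehat{\mathrm{FPE}} = \frac{n+m}{n-m}\,S
$$

Detecting Trend Using Simple Regression
[Figure: sequence of winter mean temperature in Tokyo, Japan, 1940 to 2000, with the regression line; vertical axis: mean temperature (°C)]

$$
y = a_0 + a_1 x, \qquad a_1: \text{trend (°C/year)}
$$

Property of Calculated Trend
Confidence interval of the estimated trend, with $S_{xx} = \sum_i (x_i - \bar{x})^2$:

$$
a_1 - t_{n-2}\sqrt{\frac{S}{(n-2)\,S_{xx}}} \le \beta_1 \le a_1 + t_{n-2}\sqrt{\frac{S}{(n-2)\,S_{xx}}}
$$

Confidence interval of the regression line:

$$
a_0 + a_1 x \pm t_{n-2}\sqrt{\frac{S}{n-2}\left(\frac{1}{n} + \frac{(x - \bar{x})^2}{S_{xx}}\right)}
$$

Property of Calculated Trend
[Figure: sequence of winter mean temperature in Tokyo, Japan, 1940 to 2000, with the confidence interval of the regression line; vertical axis: mean temperature (°C)]
trend = 4.90 (°C/100 years)
Confidence interval of the estimated trend: $3.6 \le \text{trend} \le 6.10$ (°C/100 years)
The warming trend in Tokyo is significant.

Urbanization and Warming Trend
[Figure: temperature trend and urbanization index; vertical axis: temperature trend (°C/100 years); horizontal axis: housing-land ratio (%)]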
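The trend estimate and its confidence interval can be computed in a few lines. The sketch below assumes the usual simple-regression formulas with the t value supplied by the caller from a t table; the five data points are hypothetical and are not the Tokyo series.

```python
import math


def trend_with_interval(x, y, t_value):
    """Return intercept a0, slope a1, and the half-width of the slope interval.

    t_value is the two-sided t quantile with n - 2 degrees of freedom,
    looked up by the caller (an assumption of this sketch).
    """
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    a1 = s_xy / s_xx
    a0 = y_mean - a1 * x_mean
    # Residual sum of squares S of the fitted line.
    rss = sum((yi - (a0 + a1 * xi)) ** 2 for xi, yi in zip(x, y))
    half_width = t_value * math.sqrt(rss / ((n - 2) * s_xx))
    return a0, a1, half_width


# Hypothetical data; 3.182 is the 95% two-sided t value for 3 degrees of freedom.
years = [0, 1, 2, 3, 4]
temps = [1.0, 3.0, 2.0, 5.0, 4.0]
a0, a1, half_width = trend_with_interval(years, temps, t_value=3.182)
print("trend:", a1, "interval:", (a1 - half_width, a1 + half_width))
```

With only five noisy points the interval contains zero, so this toy trend would not be judged significant. The Tokyo interval quoted in the slides lies entirely above zero, which is why that warming trend is called significant.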

Estimation of Global Warming Trend
Intercept of the regression of temperature trend on housing-land ratio (%):
intercept = 0.65 (°C/100 years)
Confidence interval of the intercept (constant):

$$
a_0 - t_{n-2}\sqrt{\frac{S}{n-2}\left(\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right)} \le \beta_0 \le a_0 + t_{n-2}\sqrt{\frac{S}{n-2}\left(\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right)}
$$

$0.47 \le \text{intercept} \le 0.82$ (°C/100 years)
The global warming trend and its confidence interval can be estimated even from data affected by urbanization.

Why Is CCA Currently Used?
Disadvantages of the multiple regression method:
1. The covariance matrix $X^t X$ may be singular; when $X^t X$ is singular, $(X^t X)^{-1}$ cannot be calculated.
2. Correlations among the predictands are not taken into account; the regressions of many predictands are determined independently.

CCA Flow Chart
Predictor side: X (predictor matrix, real space) -> A (predictor matrix, EOF space) -> U (canonical variables, CCA space)
Predictand side: Y (predictand matrix, real space) -> B (predictand matrix, EOF space) -> V (canonical variables, CCA space)
Multiple regression: $\hat{Y} = X (X^t X)^{-1} X^t Y$
CCA prediction: $\hat{Y} = HU$

Transform Real Space to EOF Space

$$
S(X) = \mathrm{Kernel}(X^t)^{\perp}
$$

$$
X^t X E = E D, \qquad D = \mathrm{diag}(\lambda_1, \ldots, \lambda_p), \qquad X X^t A = A D
$$

$$
A = X E D^{-1/2}, \qquad S(X) = S(X X^t) = S(A)
$$

$$
P = X (X^t X)^{-1} X^t = A A^t
$$

Determining Canonical Component

$$
u \in S(X) = S(A), \qquad v \in S(Y) = S(B), \qquad u = Ar, \qquad v = Bs
$$

$$
u^t u = 1, \qquad v^t v = 1
$$

$$
A A^t v = \rho\, u, \qquad B B^t u = \rho\, v
$$

Summary of Multiple Regression and CCA
1. Multiple regression and CCA are fashionable tools that are available for treating bulk data.
2. Selection of variables is very important for successful prediction using independent data. Stepwise and all-subsets methods are available.
3. The rank deficiency problem can be avoided by transforming the real data into EOF space, in both the multiple regression and CCA cases.
4. Correlation between predictands can be considered in CCA. CCA is the most fashionable tool.

Thank you