Nonlinear Generation of Power Spectrum : ENSO

Something I learned early on in my research career is that complicated frequency spectra can be generated from simple repeating structures. Consider the spatial frequency spectra produced as a diffraction pattern produced from a crystal lattice. Below is a reflected electron diffraction pattern of a reconstructed hexagonally reconstructed surface of a silicon (Si) single crystal with a lead (Pb) adlayer ( (a) and (b) are different alignments of the beam direction with respect to the lattice). Suffice to say, there is enough information in the patterns to be able to reverse engineer the structure of the surface as (c).

from link

Now consider the ENSO pattern. At first glance, neither the time-series signal nor the Fourier series power spectra appear to be produced by anything periodically regular. Even so, let’s assume that the underlying pattern is tidally regular, being comprised of the expected fortnightly 13.66 day tropical/synodic cycle and the monthly 27.55 day anomalistic cycle synchronized by an annual impulse. Then the forcing power spectrum of f(t) looks like the RED trace on the left-side of the figure below, F(ω). Clearly that is not enough of a frequency spectra (a few delta spikes) necessary to make up the empirically calculated Fourier series for the ENSO data comprising ~40 intricately placed peaks between 0 and 1 cycles/year in BLUE.

click to expand

Yet, if we modulate that with an Laplace’s Tidal Equation solution functional g(f(t)) that has a G(ω) as in the yellow inset above — a cyclic modulation of amplitudes where g(x) is described by two distinct sine-waves — then the complete ENSO spectra is fleshed out in BLACK in the figure above. The effective g(x) is shown in the figure below, where a slower modulation is superimposed over a faster modulation.

So essentially what this is suggesting is that a few tidal factors modulated by two sinusoids produces enough spectral detail to easily account for the ~40 peaks in the ENSO power spectra. It can do this because a modulating sinusoid is an efficient harmonics and cross-harmonics generator, as the Taylor’s series of a sinusoid contains an effectively infinite number of power terms.

To see this process in action, consider the following three figures, which features a slider that allows one to get an intuitive feel for how the LTE modulation adds richness via harmonics in the power spectra.

  1. Start with a mild LTE modulation and start to increase it as in the figure below. A few harmonics begin to emerge as satellites surrounding the forcing harmonics in RED.
drag slider right for less modulation and to the left for more modulation

2. Next, increase the LTE modulation so that it models the slower sinusoid — more harmonics emerge

3. Then add the faster sinusoid, to fully populate the empirically observed ENSO spectral peaks (and matching the time series).

It appears as if by magic, but this is the power of non-linear harmonic generation. Note that the peak labeled AB amongst others is derived from the original A and B as complicated satellite-cross terms, which can be accounted for by expanding all of the terms in the Taylor’s series of the sinusoids. This can be done with some difficulty, or left as is when doing the fit via solver software.

To complete the circle, it’s likely that being exposed to mind-blowing Fourier series early on makes Fourier analysis of climate data less intimidating, as one can apply all the tricks-of-the-trade, which, alas, are considered routine in other disciplines.


Individual charts

https://imagizer.imageshack.com/img922/7013/VRro0m.png




 


Overfitting+Cross-Validation: ENSO→AMO

I presented at the 2018 AGU Fall meeting on the topic of cross-validation. From those early results, I updated a fitted model comparison between the Pacific ocean’s ENSO time-series and the Atlantic Ocean’s AMO time-series. The premise is that the tidal forcing is essentially the same in the two oceans, but that the standing-wave configuration differs. So the approach is to maintain a common-mode forcing in the two basins while only adjusting the Laplace’s tidal equation (LTE) modulation.

If you don’t know about these completely orthogonal time series, the thought that one can avoid overfitting the data — let alone two sets simultaneously — is unheard of (Michael Mann doesn’t even think that the AMO is a real oscillation based on reading his latest research article called “Absence of internal multidecadal and interdecadal oscillations in climate model simulations“).

This is the latest product (click to expand)

Read this backwards from H to A.

H = The two tidal forcing inputs for ENSO and AMO — differs really only by scale and a slight offset

G = The constituent tidal forcing spectrum comparison of the two — primarily the expected main constituents of the Mf fortnightly tide and the Mm monthly tide (and the Mt composite of Mf × Mm), amplified by an annual impulse train which creates a repeating Brillouin zone in frequency space.

E&F = The LTE modulation for AMO, essentially comprised of one strong high-wavenumber modulation as shown in F

C&D = The LTE modulation for ENSO, a strong low-wavenumber that follows the El Nino La Nina cycles and then a faster modulation

B = The AMO fitted model modulating H with E

A = The ENSO fitted model modulating the other H with C

Ordinarily, this would take eons worth of machine learning compute time to determine this non-linear mapping, but with knowledge of how to solve Navier-Stokes, it becomes a tractable problem.

Now, with that said, what does this have to do with cross-validation? By fitting only to the ENSO time-series, the model produced does indeed have many degrees of freedom (DOF), based on the number of tidal constituents shown in G. Yet, by constraining the AMO fit to require essentially the same constituent tidal forcing as for ENSO, the number of additional DOF introduced is minimal — note the strong spike value in F.

Since parsimony of a model fit is based on information criteria such as number of DOF, as that is exactly what is used as a metric characterizing order in the previous post, then it would be reasonable to assume that fitting a waveform as complex as B with only the additional information of F cross-validates the underlying common-mode model according to any information criteria metric.

For further guidance, this is an informative article on model selection in regards to complexity — “A Primer for Model Selection: The Decisive Role of Model Complexity

excerpt:

ESD Ideas article for review

Get a Copernicus login and comment for peer-review

The simple idea is that tidal forces play a bigger role in geophysical behaviors than previously thought, and thus helping to explain phenomena that have frustrated scientists for decades.

The idea is simple but the non-linear math (see figure above for ENSO) requires cracking to discover the underlying patterns.

The rationale for the ESD Ideas section in the EGU Earth System Dynamics journal is to get discussion going on innovative and novel ideas. So even though this model is worked out comprehensively in Mathematical Geoenergy, it hasn’t gotten much publicity.

Gravitational Pull

In Chapter 12 of the book, we provide an empirical gravitational forcing term that can be applied to the Laplace’s Tidal Equation (LTE) solution for modeling ENSO. The inverse squared law is modified to a cubic law to take into account the differential pull from opposite sides of the earth.

excerpt from Mathematical Geoenergy (Wiley/2018)

The two main terms are the monthly anomalistic (Mm) cycle and the fortnightly tropical/draconic pair (Mf, Mf’ w/ a 18.6 year nodal modulation). Due to the inverse cube gravitational pull found in the denominator of F(t), faster harmonic periods are also created — with the 9-day (Mt) created from the monthly/fortnightly cross-term and the weekly (Mq) from the fortnightly crossed against itself. It’s amazing how few terms are needed to create a canonical fit to a tidally-forced ENSO model.

The recipe for the model is shown in the chart below (click to magnify), following sequentially steps (A) through (G) :

(A) Long-period fortnightly and anomalistic tidal terms as F(t) forcing
(B) The Fourier spectrum of F(t) revealing higher frequency cross terms
(C) An annual impulse modulates the forcing, reinforcing the amplitude
(D) The impulse is integrated producing a lagged quasi-periodic input
(E) Resulting Fourier spectrum is complex due to annual cycle aliasing
(F) Oceanic response is a Laplace’s Tidal Equation (LTE) modulation
(G) Final step is fit the LTE modulation to match the ENSO time-series

The tidal forcing is constrained by the known effects of the lunisolar gravitational torque on the earth’s length-of-day (LOD) variations. An essentially identical set of monthly, fortnightly, 9-day, and weekly terms are required for both a solid-body LOD model fit and a fluid-volume ENSO model fit.

Fitting tidal terms to the dLOD/dt data is only complicated by the aliasing of the annual cycle, making factors such as the weekly 7.095 and 6.83-day cycles difficult to distinguish.

If we apply the same tidal terms as forcing for matching dLOD data, we can use the fit below as a perturbed ENSO tidal forcing. Not a lot of difference here — the weekly harmonics are higher in magnitude.

Modified initial calibration of lunar terms for fitting ENSO

So the only real unknown in this process is guessing the LTE modulation of steps (F) and (G). That’s what differentiates the inertial response of a spinning solid such as the earth’s core and mantle from the response of a rotating liquid volume such as the equatorial Pacific ocean. The former is essentially linear, but the latter is non-linear, making it an infinitely harder problem to solve — as there are infinitely many non-linear transformations one can choose to apply. The only reason that I stumbled across this particular LTE modulation is that it comes directly from a clever solution of Laplace’s tidal equations.

for full derivation see Mathematical Geoenergy (Wiley/2018)

Machine Learning of ENSO

This topic will gain steam in the coming years. The following paper generates quite a good cross-validation for SOI, shown in the figure below.

  1. Xiaoqun, C. et al. ENSO prediction based on Long Short-Term Memory (LSTM). IOP Conference Series: Materials Science and Engineering, 799, 012035 (2020).

The x-axis appears to be in months and likely starts in 1979, so it captures the 2016 El Nino (El Nino is negative for SOI). Still have no idea how the neural net arrived at the fit other than it being able to discern the cyclic behavior from the historical waveform between 1979 and 2010. From the article itself, it appears that neither do the authors.

Continue reading

Combinatorial Tidal Constituents

For the tidal forcing that contributes to length-of-day (LOD) variations [1], only a few factors contribute to a plurality of the variation. These are indicated below by the highlighted circles, where the V0/g amplitude is greatest. The first is the nodal 18.6 year cycle, indicated by the N’ = 1 Doodson argument. The second is the 27.55 day “Mm” anomalistic cycle which is a combination of the perigean 8.85 year cycle (p = -1 Doodson argument) mixed with the 27.32 day tropical cycle (s=1 Doodson argument). The third and strongest is twice the tropical cycle (therefore s=2) nicknamed “Mf”.

Tidal Constituents contributing to dLOD from R.D. Ray [1]

These three factors also combine as the primary input forcing to the ENSO model. Yet, even though they are strongest, the combinatorial factors derived from multiplying these main harmonics are vital for generating a quality fit (both for dLOD and even more so for ENSO). What I have done in the past was apply the recommended mix of first- and second-order factors that appear in the dLOD spectra for the ENSO forcing.

Yet there is another approach that makes no assumption of the strongest 2nd-order factors. In this case, one simply expands the primary factors as a combinatorial expansion of cross-terms to the 4th level — this then generates a broad mix of monthly, fortnightly, 9-day, and weekly harmonic cycles. A nested algorithm to generate the 35 constituent terms is :

(1+s+p+N')^4

Counter := 1;
for J in Constituents'Range loop
 for K in Constituents'First .. J loop 
  for L in Constituents'First .. K loop 
   for M in Constituents'First .. L loop 
      Tf := Tf + Coefficients (Counter) * Fundamental(J) * 
            Fundamental(K) * Fundamental(L) * Fundamental (M); 
      Counter := Counter + 1; 
   end loop; 
  end loop; 
 end loop; 
end loop;

This algorithm requires the three fundamental terms plus one unity term to capture most of the cross-terms shown in Table 3 above (The annual cross-terms are automatic as those are generated by the model’s annual impulse). This transforms into a coefficients array that can be included in the LTE search software.

What is missing from the list are the evection terms corresponding to 31.812 (Msm) and 27.093 day cycles. They are retrograde to the prograde 27.55 day anomalistic cycle, so would need an additional 8.848 year perigee cycle bring the count from 3 fundamental terms to 4.

The difference between adding an extra level of harmonics, bringing the combinatorial total from 35 to 126, is not very apparent when looking at the time series (below), as it simply adds shape to the main fortnightly tropical cycle.

Yet it has a significant effect on the ENSO fit, approaching a CC of 0.95 (see inset at right for the scatter correlation). Note that the forcing frequency spectra in the middle right inset still shows a predominately tropical fortnightly peak at 0.26/yr and 0.74/yr.

These extra harmonics also helps in matching to the much more busy SOI time-series. Click on the chart below to inspect how the higher-K wavenumbers may be the origin of what is thought to be noise in the SOI measurements.

Is this a case of overfitting? Try the following cross-validation on orthogonal intervals, and note how tight the model matches the data to the training intervals, without degrading too much on the outer validation region.

I will likely add this combinatorial expansion approach to the LTE fitting software on GitHub soon, but thought to checkpoint the interim progress on the blog. In the end the likely modeling mix will be a combination of the geophysical calibration to the known dLOD response together with a refined collection of these 2nd-order combinatorial tidal constituents. The rationale for why certain terms are important will eventually become more clear as well.

References

  1. Ray, R.D. and Erofeeva, S.Y., 2014. Long‐period tidal variations in the length of day. Journal of Geophysical Research: Solid Earth119(2), pp.1498-1509.

El Nino Modoki

El Nino Modoki Possible To Increase Winter Snow Chances – Just In ...
Tri-lobes of Modoki

In Chapter 12 of the book we model — via LTE — the canonical El Nino Southern Oscillation (ENSO) behavior, fitting to closely-correlated indices such as NINO3.4 and SOI. Another El Nino index was identified circa 2007 that is not closely correlated to the well-known ENSO indices. This index, referred to as El Nino Modoki, appears to have more of a Pacific Ocean centered dipole shape with a bulge flanked by two wing lobes, cycling again as an erratic standing-wave.

If in fact Modoki differs from the conventional ENSO only by a different standing-wave wavenumber configuration, then it should be straightforward to model as an LTE variation of ENSO. The figure below is the model fitted to the El Nino Modoki Index (EMI) (data from JAMSTEC). The cross-validation is included as values post-1940 were used in the training with values prior to this used as a validation test.

The LTE modulation has a higher fundamental wavenumber component than ENSO (plus a weaker factor closer to a zero wavenumber, i.e. some limited LTE modulation as is found with the QBO model).

The input tidal forcing is close to that used for ENSO but appears to lead it by one year. The same strength ordering of tidal factors occurs, but with the next higher harmonic (7-day) of the tropical fortnightly 13.66 day tide slightly higher for EMI than ENSO.

The model fit is essentially a perturbation of ENSO so did not take long to optimize based on the Laplace’s Tidal Equation modeling software. I was provoked to run the optimization after finding a paper yesterday on using machine learning to model El Nino Modoki [1].

It’s clear that what needs to be done is a complete spatio-temporal model fit across the equatorial Pacific, which will be amazing as it will account for the complete mix of spatial standing-wave modes. Maybe in a couple of years the climate science establishment will catch up.

References

[1] Pal, Manali, et al. “Long-Lead Prediction of ENSO Modoki Index Using Machine Learning Algorithms.” Scientific Reports, vol. 10, 2020, doi:10.1038/s41598-019-57183-3.

MJO

The Madden-Julian Oscillation (MJO) is a climate index that captures tropical variability at a finer resolution (i.e. intra-annual) than the (inter-annual) ENSO index over approximately the same geographic region.  Since much of the MJO variability is observed as 30 to 60 day cycles (and these are traveling waves, not standing waves), providing MJO data as a monthly time-series will filter out the fast cycles. Still, it is interesting to analyze the monthly MJO data and compare/contrast that to ENSO. As a disclaimer, it is known that inter-annual variability of the MJO is partly linked to ENSO, but the following will clearly show that connection.

This is the fit of MJO (longitude index #1) using the ENSO model as a starting point (either the NINO34 or SOI works equally well).

The constituent temporal forcing factors for MJO and ENSO align precisely

This is not surprising because the monthly filtered MJO does show the same El Nino peaks at 1983, 1998, and 2016 as the ENSO time-series. The only difference is in the LTE spatial modulation applied during the fitting process, whereby the MJO has a stronger high-wavenumber factor than the ENSO time series.

This is the SOI fit over the same 1980+ interval as MJO, with an almost 0.6 correlation.

 

AO

The Arctic Oscillation (AO) dipole has behavior that is correlated to the North Atlantic Oscillation (NAO) dipole.   We can see this in two ways. First, and most straight-forwardly, the correlation coefficient between the AO and NAO time-series is above 0.6.

Secondly, we can use the model of the NAO from the last post and refit the parameters to the AO data (data also here), but spanning an orthogonal interval. Then we can compare the constituent lunisolar factors for NAO and AO for correlation, and further discover that this also doubles as an effective cross-validation for the underlying LTE model (as the intervals are orthogonal).

Top panel is a model fit for AO between 1900-1950, and below that is a model fit for NAO between 1950-present. The lower pane is the correlation for a common interval (left) and for the constituent lunisolar factors for the orthogonal interval (right)

Only the anomalistic factor shows an imperfect correlation, and that remains quite high.

NAO

The challenge of validating the models of climate oscillations such as ENSO and QBO, rests primarily in our inability to perform controlled experiments. Because of this shortcoming, we can either do (1) predictions of future behavior and validate via the wait-and-see process, or (2) creatively apply techniques such as cross-validation on currently available data. The first is a non-starter because it’s obviously pointless to wait decades for validation results to confirm a model, when it’s entirely possible to do something today via the second approach.

There are a variety of ways to perform model cross-validation on measured data.

In its original and conventional formulation, cross-validation works by checking one interval of time-series against another, typically by training on one interval and then validating on an orthogonal interval.

Another way to cross-validate is to compare two sets of time-series data collected on behaviors that are potentially related. For example, in the case of ocean tidal data that can be collected and compared across spatially separated geographic regions, the sea-level-height (SLH) time-series data will not necessarily be correlated, but the underlying lunar and solar forcing factors will be closely aligned give or take a phase factor. This is intuitively understandable since the two locations share a common-mode signal forcing due to the gravitational pull of the moon and sun, with the differences in response due to the geographic location and local spatial topology and boundary conditions. For tides, this is a consensus understanding and tidal prediction algorithms have stood the test of time.

In the previous post, cross-validation on distinct data sets was evaluated assuming common-mode lunisolar forcing. One cross-validation was done between the ENSO time-series and the AMO time-series. Another cross-validation was performed for ENSO against PDO. The underlying common-mode lunisolar forcings were highly correlated as shown in the featured figure.  The LTE spatial wave-number weightings were the primary discriminator for the model fit. This model is described in detail in the book Mathematical GeoEnergy to be published at the end of the year by Wiley.

Another common-mode cross-validation possible is between ENSO and QBO, but in this case it is primarily in the Draconic nodal lunar factor — the cyclic forcing that appears to govern the regular oscillations of QBO.  Below is the Draconic constituent comparison for QBO and the ENSO.

The QBO and ENSO models only show a common-mode correlated response with respect to the Draconic forcing. The Draconic forcing drives the quasi-periodicity of the QBO cycles, as can be seen in the lower right panel, with a small training window.

This cross-correlation technique can be extended to what appears to be an extremely erratic measure, the North Atlantic Oscillation (NAO).

Like the SOI measure for ENSO, the NAO is originally derived from a pressure dipole measured at two separate locations — but in this case north of the equator.  From the high-frequency of the oscillations, a good assumption is that the spatial wavenumber factors are much higher than is required to fit ENSO. And that was the case as evidenced by the figure below.

ENSO vs NAO cross-validation

Both SOI and NAO are noisy time-series with the NAO appearing very noisy, yet the lunisolar constituent forcings are highly synchronized as shown by correlations in the lower pane. In particular, summing the Anomalistic and Solar constituent factors together improves the correlation markedly, which is because each of those has influence on the other via the lunar-solar mutual gravitational attraction. The iterative fitting process adjusts each of the factors independently, yet the net result compensates the counteracting amplitudes so the net common-mode factor is essentially the same for ENSO and NAO (see lower-right correlation labelled Anomalistic+Solar).

Since the NAO has high-frequency components, we can also perform a conventional cross-validation across orthogonal intervals. The validation interval below is for the years between 1960 and 1990, and even though the training intervals were aggressively over-fit, the correlation between the model and data is still visible in those 30 years.

NAO model fit with validation spanning 1960 to 1990

Over the course of time spent modeling ENSO, the effort that went into fitting to NAO was a fraction of the original time. This is largely due to the fact that the temporal lunisolar forcing only needed to be tweaked to match other climate indices, and the iteration over the topological spatial factors quickly converges.

Many more cross-validation techniques are available for NAO, since there are different flavors of NAO indices available corresponding to different Atlantic locations, and spanning back to the 1800’s.