Cross-Validation of ENSO

Experimenting with linking to slide presentations instead of a trad blog post. The PDF linked below is an eye-opener as the NINO34 fit is the most parsimonious ever, at the expense of a higher LTE modulation (explained here). The cross-validation involves far fewer tidal factors than dealt with earlier, the two factors used (Mf and Mm tidal factors) rivaling the one factor used in QBO (described here).

Continue reading

The harmonics generator of the ocean

The research category is topological considerations of Laplace’s Tidal Equations (LTE = a shallow-water simplification of Navier-Stokes) applied to the equatorial thermocline — the following citations provides an evolutionary understanding that I have developed via presentations and publications over the last 6 years (working backwards)

“Nonlinear long-period tidal forcing with application to ENSO, QBO, and Chandler wobble”, EGU General Assembly Conference Abstracts, 2021, EGU21-10515

“Nonlinear Differential Equations with External Forcing”, ICLR 2020 Workshop DeepDiffEq

“Mathematical Geoenergy: Discovery, Depletion, and Renewal”, John Wiley & Sons, 2019, chapter 12: “Wave Energy”

“Ephemeris calibration of Laplace’s tidal equation model for ENSO”, AGU Fall Meeting 2018,

“Biennial-Aligned Lunisolar-Forcing of ENSO: Implications for Simplified Climate Models”,
AGU Fall Meeting 2017,

“Analytical Formulation of Equatorial Standing Wave Phenomena: Application to QBO and ENSO”, AGU Fall Meeting Abstracts 2016, OS11B-04,

Given that I have worked on this topic persistently over this span of time, I have gained considerable insight into how straightforward it has become to generate relatively decent fits to climate dipoles such as ENSO. Paradoxically, this is both good and bad. It’s good because the model’s recipe is algorithmically simply described in terms of plausibility and parsimony. That’s largely because it’s a straightforward non-linear extension of a conventional tidal analysis model. However that non-linearity opens up the possibility for many similar model fits that are equally good, yet difficult to discriminate between. So it’s bad in the sense that I can come to an impasse in selecting the “best” model.

This is oversimplifying a bit but the framing issue is if you knew the answer was 72, but have a hard time determining whether the question being posed was one of 2×36, 3×24, 4×18, 6×12, 8×9, 9×8, 12×6, 18×4, 24×3, or 36×2. Each gives the right answer, but potentially not the right mechanism. This is a fundamental problem with non-linear analysis.

A conventional tidal analysis by itself is just a few fundamental tidal factors (exactly 4) but made devastatingly accurate by the introduction of 2nd-order harmonics and cross-harmonics. All these harmonics are generated by non-linear effects but the frequency spectrum is so clean and distinct for a sea-level-height (SLH) time-series that the equivalent solution to k × F = 72 is essentially a scaling identification problem where the k is the scale factor for the corresponding cyclic tidal factors F.

Yet, by applying the non-linear LTE solution to the problem of modeling ENSO, we quickly discover that the algorithm is a wickedly effective harmonics and cross-harmonics generator. Any number of combinations of harmonics can develop an adequate fit depending on the variable LTE modulation applied. So it could be a small LTE modulation mixed with a wide assortment of tidal factors (the 2×36 case) or it could be a large LTE modulation mixed with a minimum of tidal factors (the 18×4 case). Or it could be something in between (e.g. the 8×9 case). This is all a result of the sine-of-a-sine non-linearity of the LTE formulation, related to the Mach-Zehnder modulation used in optical cryptography applications. The latter bit is the hint that things may not be unambiguously decoded given the fact that M-Z has been discovered to be nature’s own built-in encryption device.

However, there remains lots of light at the end of this tunnel, as I have also discovered that the tidal factor spread is likely largely governed by a single lunar tidal constituent, the 27.55 day anomalistic Mm cycle interfering with an annual impulse. That’s essentially 2 of the 4 tidal factors, with the other 2 lunar factors providing a 2nd-order correction. For the longest time I had been focused on the 13.66 day tropical Mf cycle as that also lead to a decent fit over the years, specifically since the first beat harmonic of the Mf cycle with the annual impulse is 3.8 years while the Mm cycle is 3.9 years. These two terms are close enough that they only go out-of-phase after ~130 years, which is the extent of the ENSO time-series. Only when you try to simplify a model fit by iterating over the space of factor combinations will you discover the difference between 3.8 and 3.9.

In terms of geophysics, the Mf factor is a tractional tidal forcing operating parallel to the ocean’s surface influenced by the moon’s latitudinal declination, while the Mm factor is a largely perpendicular gravitational forcing influenced by the perigean cycle of the Moon-to-Earth distance. The latter may be the “right mechanism” as each can give close to the “right answer”.

So the gist of the fitting observations is that far fewer harmonic factors are required for a decent Mm-based model than for a Mf-based model. This is slightly at the expense of a stronger LTE modulation, but the parsimony of an Mm-based model can’t be beat, as I will show via the following analysis charts…

This is a good model fit based on a slightly modified Mm-based factorization, with a sample-and-hold placed on a strong annual impulse

The comparison of the modified Mm tidal factorization to the pure Mm is below (the reason the 27.55 day periodicity doesn’t appear is because of the monthly aliasing used in plotting).

The slight pattern on top of pure signal is due to a 6-year beat of the Mm period with the 27.212 day lunar draconic pattern indicating the time between equatorial nodal crossings of the moon. This is the strongest factor of the ascension cycle described in the solar and lunar ephemeris published recently by Sung-Ho Na. As highlighted above by numbered cycles, ~20 occur in the span of 120 years.

Below, an expanded look showing how slight a correction is applied

The integrated forcing after the annual impulse is shown below. The sample-and-hold integration exaggerates low-frequency differences so the distinction between the pure Mm forcing and Mm+harmonics is more apparent. The 6-year periodicity is obscured by longer term variations.

The log-scaled power spectra of the integrated tidal forcing is shown below. Note the overwhelmingly strong peak near the 0.25/year frequency (3.9 year cycle). The rest of the peaks are readily matched to periodicities in the Na ephemerides [1] according to their strength.

The LTE modulation is quite strong for this factorization. As shown below, the forcing levels need to sinusoidally fold several times over to match the observed ENSO behavior. See the recent post Inverting non-autonomous functions for a recipe to aid in iterating for the underlying LTE modulation.

The parsimony of this model can’t be emphasized enough. It’s equivalent to the agreement of a conventional tidal forcing analysis to a SLH time-series in that only a single lunar tidal factor accounts for a majority of the modulation. Only the challenge of finding the correct LTE modulation stands in the way of producing an unambiguously correct model for the underlying ENSO behavioral dynamics.


[1] Sung-Ho Na, Chapter 19 – Prediction of Earth tide, Editor(s): Pijush Samui, Barnali Dixon, Dieu Tien Bui, Basics of Computational Geophysics, Elsevier, 2021, Pages 351-372, ISBN 9780128205136, (note: the ephemerides for the Earth-Moon-Sun system matches closely the online NASA JPL ephemerides generator available at, but this paper is more useful in that it algorithmically states the contributions of the various tidal factors in the source code supplied. Source code also available at

Combinatorial Tidal Constituents

For the tidal forcing that contributes to length-of-day (LOD) variations [1], only a few factors contribute to a plurality of the variation. These are indicated below by the highlighted circles, where the V0/g amplitude is greatest. The first is the nodal 18.6 year cycle, indicated by the N’ = 1 Doodson argument. The second is the 27.55 day “Mm” anomalistic cycle which is a combination of the perigean 8.85 year cycle (p = -1 Doodson argument) mixed with the 27.32 day tropical cycle (s=1 Doodson argument). The third and strongest is twice the tropical cycle (therefore s=2) nicknamed “Mf”.

Tidal Constituents contributing to dLOD from R.D. Ray [1]

These three factors also combine as the primary input forcing to the ENSO model. Yet, even though they are strongest, the combinatorial factors derived from multiplying these main harmonics are vital for generating a quality fit (both for dLOD and even more so for ENSO). What I have done in the past was apply the recommended mix of first- and second-order factors that appear in the dLOD spectra for the ENSO forcing.

Yet there is another approach that makes no assumption of the strongest 2nd-order factors. In this case, one simply expands the primary factors as a combinatorial expansion of cross-terms to the 4th level — this then generates a broad mix of monthly, fortnightly, 9-day, and weekly harmonic cycles. A nested algorithm to generate the 35 constituent terms is :


Counter := 1;
for J in Constituents'Range loop
 for K in Constituents'First .. J loop 
  for L in Constituents'First .. K loop 
   for M in Constituents'First .. L loop 
      Tf := Tf + Coefficients (Counter) * Fundamental(J) * 
            Fundamental(K) * Fundamental(L) * Fundamental (M); 
      Counter := Counter + 1; 
   end loop; 
  end loop; 
 end loop; 
end loop;

This algorithm requires the three fundamental terms plus one unity term to capture most of the cross-terms shown in Table 3 above (The annual cross-terms are automatic as those are generated by the model’s annual impulse). This transforms into a coefficients array that can be included in the LTE search software.

What is missing from the list are the evection terms corresponding to 31.812 (Msm) and 27.093 day cycles. They are retrograde to the prograde 27.55 day anomalistic cycle, so would need an additional 8.848 year perigee cycle bring the count from 3 fundamental terms to 4.

The difference between adding an extra level of harmonics, bringing the combinatorial total from 35 to 126, is not very apparent when looking at the time series (below), as it simply adds shape to the main fortnightly tropical cycle.

Yet it has a significant effect on the ENSO fit, approaching a CC of 0.95 (see inset at right for the scatter correlation). Note that the forcing frequency spectra in the middle right inset still shows a predominately tropical fortnightly peak at 0.26/yr and 0.74/yr.

These extra harmonics also helps in matching to the much more busy SOI time-series. Click on the chart below to inspect how the higher-K wavenumbers may be the origin of what is thought to be noise in the SOI measurements.

Is this a case of overfitting? Try the following cross-validation on orthogonal intervals, and note how tight the model matches the data to the training intervals, without degrading too much on the outer validation region.

I will likely add this combinatorial expansion approach to the LTE fitting software on GitHub soon, but thought to checkpoint the interim progress on the blog. In the end the likely modeling mix will be a combination of the geophysical calibration to the known dLOD response together with a refined collection of these 2nd-order combinatorial tidal constituents. The rationale for why certain terms are important will eventually become more clear as well.


  1. Ray, R.D. and Erofeeva, S.Y., 2014. Long‐period tidal variations in the length of day. Journal of Geophysical Research: Solid Earth119(2), pp.1498-1509.

Length of Day II

This is a continuation from the previous Length of Day post showing how closely the ENSO forcing aligns to the dLOD forcing.

Ding & Chao apply an AR-z technique as a supplement to Fourier and Max Entropy spectral techniques to isolate the tidal factors in dLOD

The red data points are the spectral values used in the ENSO model fit.

The top panel below is the LTE modulated tidal forcing fitted against the ENSO time series. The lower panel below is the tidal forcing model over a short interval overlaid on the dLOD/dt data.

That’s all there is to it — it’s all geophysical fluid dynamics. Essentially the same tidal forcing impacts both the rotating solid earth and the equatorial ocean, but the ocean shows a lagged nonlinear response as described in Chapter 12 of the book. In contrast, the solid earth shows an apparently direct linear inertial response. Bottom line is that if one doesn’t know how to do the proper GFD, one will never be able to fit ENSO to a known forcing.


In Chapter 12 of the book, we focused on modeling the standing-wave behavior of the Pacific ocean dipole referred to as ENSO (El Nino /Southern Oscillation).  Because it has been in climate news recently, it makes sense to give equal time to the Atlantic ocean equivalent to ENSO referred to as the Atlantic Multidecadal Oscillation (AMO). The original rationale for modeling AMO was to determine if it would help cross-validate the LTE theory for equatorial climate dipoles such as ENSO; this was reported at the 2018 Fall Meeting of the AGU (poster). The approach was similar to that applied for other dipoles such as the IOD (which is also in the news recently with respect to Australia bush fires and in how multiple dipoles can amplify climate extremes [1]) — and so if we can apply an identical forcing for AMO as for ENSO then we can further cross-validate the LTE model. So by reusing that same forcing for an independent climate index such as AMO, we essentially remove a large number of degrees of freedom from the model and thus defend against claims of over-fitting.

Continue reading