Low #DOF ENSO Model

Given two models of a physical behavior, the “better” model has the higher correlation (or lower error) to the data and the fewer degrees of freedom (#DOF) in terms of tunable parameters. The ratio CC/#DOF of correlation coefficient over DOF is routinely used in automated symbolic regression algorithms and for scoring of online programming contests. The set of models that best balance a good error metric against a low complexity score is often referred to as a Pareto frontier.
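
To make the scoring concrete, here is a minimal sketch of the CC/#DOF ratio and a Pareto-frontier filter. The candidate names and their CC and #DOF values are made-up illustrations, not results from any actual fit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    cc: float  # correlation coefficient against the data
    dof: int   # number of tunable parameters


def score(c):
    """CC/#DOF ratio: higher is better."""
    return c.cc / c.dof


def pareto_front(cands):
    """Keep candidates that no other candidate beats on both CC and #DOF."""
    front = []
    for c in cands:
        dominated = any(
            o.cc >= c.cc and o.dof <= c.dof and (o.cc > c.cc or o.dof < c.dof)
            for o in cands
        )
        if not dominated:
            front.append(c)
    return sorted(front, key=lambda c: c.dof)


# Illustrative numbers only (assumed, not measured).
candidates = [
    Candidate("fourier_100_terms", 0.95, 300),
    Candidate("fourier_50_terms", 0.90, 150),
    Candidate("tidal_low_dof", 0.85, 10),
    Candidate("poor_fit", 0.40, 20),
]

best = max(candidates, key=score)
front = pareto_front(candidates)
print(best.name, [c.name for c in front])
```

With these assumed numbers the low-DOF candidate wins on the ratio, while both Fourier fits remain on the frontier because nothing beats them on correlation at equal or lower complexity.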

So for modeling ENSO, the challenge is to fit the quasi-periodic NINO34 time-series with a minimal number of tunable parameters. For a 140-year fitting interval (1880-2020), a naive Fourier series fit could easily take 50-100 sine waves of varying frequency, amplitude, and phase (i.e. 150-300 tunable parameters) to match a low-pass filtered version of the data (any high-frequency components may take many more). However, that is a horribly complex model and prone to over-fitting. Clearly we need to apply some physics to reduce the #DOF.

Since we know that ENSO can essentially be modeled as equatorial fluid dynamics responding to a tidal forcing, all that is needed is the gravitational potential along the equator. The paper by Na [1] has software for computing the orbital dynamics of the moon (i.e. lunar ephemerides) and a 1st-order approximation for tidal potential:

The software contains well over 100 sinusoidal terms (each consisting of amplitude, frequency, and phase) to internally model the lunar orbit precisely. Because those terms are fixed by the known orbital dynamics rather than fitted, that many DOF are removed, with a corresponding huge reduction in complexity score for any reasonable fit. So instead of a huge set of factors to manipulate (as with many detailed harmonic tidal analyses), what one is given is a range (r = R) and a declination (ψ = delta) time-series. These are combined in a manner following the figure from Na shown above, essentially adjusting the amplitudes of R and delta while introducing an additional tangential or tractional projection of delta (sin instead of cos). The latter is important as described in NOAA’s tide producing forces page.
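
As a rough illustration of how a range and a declination series might be combined, here is a minimal sketch. The functional form (inverse-cube range scaling times a weighted cos and sin projection of the declination) is an assumption for illustration, not the exact expression from Na; `tidal_forcing`, `a_cos`, `a_sin` and the synthetic R and delta series are all hypothetical stand-ins for the real ephemeris output.

```python
import numpy as np


def tidal_forcing(r, delta, a_cos, a_sin, r_mean=None):
    """Hypothetical forcing built from range r and declination delta (radians).

    Assumed form: (r_mean / r)**3 * (a_cos*cos(delta) + a_sin*sin(delta)),
    where a_cos and a_sin are the only tunable amplitudes.
    """
    r = np.asarray(r, dtype=float)
    delta = np.asarray(delta, dtype=float)
    if r_mean is None:
        r_mean = r.mean()
    inv_cube = (r_mean / r) ** 3  # tidal force falls off as 1/R^3
    return inv_cube * (a_cos * np.cos(delta) + a_sin * np.sin(delta))


# Crude synthetic stand-ins for the ephemeris series (illustration only):
# range varies over the anomalistic month, declination over the tropical month.
t_days = np.arange(0.0, 365.0, 1.0)
r_km = 384400.0 + 21000.0 * np.cos(2 * np.pi * t_days / 27.5545)
delta_rad = 0.4 * np.sin(2 * np.pi * t_days / 27.3216)

forcing = tidal_forcing(r_km, delta_rad, a_cos=1.0, a_sin=0.5)
print(forcing[:5])
```

The point of the sketch is the parameter count: the orbital detail lives inside the two input series, so only a couple of amplitudes are left to tune.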

Although I roughly calibrated this earlier [2] via NASA’s HORIZONS ephemerides page (input parameters shown on the right), the Na software allows better flexibility in use. The two calculations essentially give identical outputs and independent verification that the numbers are as expected.

As this post is already getting too long, I will just show the result of doing a Laplace’s Tidal Equation fit (adding a few more DOF). It demonstrates that the limited #DOF prevents over-fitting on a short training interval while still cross-validating outside of that interval.

or this
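
The train-then-validate procedure can be sketched as follows. This is not the Laplace’s Tidal Equation fit itself: it is a stand-in that fits a 4-DOF sinusoidal model by least squares on a short training interval and then checks the correlation outside of it. The synthetic data, the two periods, and the 1940-1980 training interval are assumptions chosen for illustration.

```python
import numpy as np


def design_matrix(t, periods):
    """One sin and one cos column per period, so #DOF = 2 * len(periods)."""
    cols = []
    for p in periods:
        w = 2 * np.pi / p
        cols += [np.sin(w * t), np.cos(w * t)]
    return np.column_stack(cols)


rng = np.random.default_rng(0)
t = np.arange(1880, 2020, 1 / 12)  # monthly samples, in years
periods = [3.8, 2.3]               # hypothetical periods, in years

# Synthetic "observations": a known signal plus noise (not real NINO34 data).
truth = 1.0 * np.sin(2 * np.pi * t / 3.8) + 0.5 * np.cos(2 * np.pi * t / 2.3)
data = truth + rng.normal(0.0, 0.2, t.size)

# Fit only on the training interval.
train = (t >= 1940) & (t < 1980)
X = design_matrix(t, periods)
coef, *_ = np.linalg.lstsq(X[train], data[train], rcond=None)
model = X @ coef

# Compare correlation inside and outside the training interval.
cc_train = np.corrcoef(model[train], data[train])[0, 1]
cc_test = np.corrcoef(model[~train], data[~train])[0, 1]
print(f"DOF={coef.size}  CC(train)={cc_train:.2f}  CC(outside)={cc_test:.2f}")
```

With so few free parameters there is little room to fit noise, so the out-of-interval correlation stays close to the training value. A fit with hundreds of free sinusoids would typically not show that behavior.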

This low complexity and high accuracy solution would win ANY competition, including the competition for best seasonal prediction with a measly prize of 15,000 Swiss francs [3]. A good ENSO model is worth billions of $$ given the amount it will save in agricultural planning and its potential for mitigation of human suffering in predicting the timing of climate extremes.


[1] Na, S.-H. Chapter 19 – Prediction of Earth tide. in Basics of Computational Geophysics (eds. Samui, P., Dixon, B. & Tien Bui, D.) 351–372 (Elsevier, 2021). doi:10.1016/B978-0-12-820513-6.00022-9.

[2] Pukite, P.R. et al. “Ephemeris calibration of Laplace’s tidal equation model for ENSO”. AGU Fall Meeting, 2018. doi:10.1002/essoar.10500568.1

[3] 1 CHF ~ $1 so 15K = chump change.

Added: High resolution power spectra of ENSO forcing
see link

4 thoughts on “Low #DOF ENSO Model”

  1. For the power spectrum of input tidal forcing shown at the end:

    Strengths ranked following Ray’s LOD estimates (from https://agupubs.onlinelibrary.wiley.com/doi/full/10.1002/2013JB010830).
    The numbered factors are ranked fundamental/primary and the ranked ABCD are secondary compound factors.

    Interesting that the #6 & #7 nodal (18.6y period) satellites around Mm are not nearly as strong as predicted. So if we add that modulation, the fit is much improved. I rather believe this is an oversight in the Na software and in the NASA JPL Horizons page.


    • Note what Ray says:

      “This is extremely relevant for any work that aims to develop models for other long-period constituents, since only Mf (and perhaps Mm and Mt) is large enough to be measured with much fidelity in ocean data. For all other constituents we must rely on numerical ocean models that do not assimilate data and hope that our experience with Mf holds when extended to these other frequencies. LOD measurements themselves may well provide the best test of this approach.”

      One of the Mm satellites shows up in the LOD data, the 27.67d line next to Mm:

      Which indicates it should be added.

      “At the longest periods of Table 3, and most notably at 18.6 years, there appears little that observed LOD data can reveal about the reliability of our model. The amplitude of ΔΛ at 18.6 years is 160 μs, with a small out-of-phase component of 1.6 μs. The amplitude in the 2010 IERS model [Petit and Luzum, 2010, Table 8.1] is 149.5 μs. As Figure 5 makes clear, the 18.6 year tidal signal, and certainly the 10 μs difference between the two models, is buried in the observed LOD data by decadal, nontidal variability, presumably from processes in the Earth’s core. As emphasized in section 3 our model at 18.6 years is based on extrapolation from much higher frequencies and is surely in some error. Unfortunately, observations of LOD are unlikely to reveal the magnitude of that error until the time series is considerably longer.”

      Note that the ENSO time series is considerably longer so these long-period terms are the most uncertain and so available for modification.

      Also this paper on improved lunar ephemeris

      “improves the LLR post-fit residuals by few mm (3-4 mm) visible more prominently over years 2010 to 2019, due to the higher accuracy of LLR data acquired during this period. The annual term (l’) has the most prominence among the three fitted amplitudes. Their values are tabulated in Table 7. We adapt this method for providing users with high accuracy requirements, until an equivalent dynamical model representation is introduced. Amplitudes of terms with longer periods (≥ 3yr) could in principle be fit at the current accuracy and long baseline of the data, but does not contribute to a significant improvement in the post-fit residuals.”


  2. https://www.sciencedirect.com/science/article/pii/S1674984718301009
    “Earth rotation deceleration/acceleration due to semidiurnal oceanic/atmospheric tides: Revisited with new calculation” by Na et al 2019

    Minor amplitude factor:

    “The factor 1+k-l should be multiplied to correctly represent the tangential component of tidal force [11]. We are convinced that this factor has been neglected or misinterpreted by all of the previous scholars.”
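
As a quick arithmetic check on the nodal satellites around Mm discussed in the first comment: assuming they arise from an 18.6-year amplitude modulation of the Mm line, the identity cos(a)cos(b) = [cos(a+b) + cos(a-b)]/2 puts the sidebands at the sum and difference frequencies. The sketch below takes the Mm period as the anomalistic month (27.5545 days) and uses 365.25 days per year; the function name `sideband_periods` is just for illustration.

```python
DAYS_PER_YEAR = 365.25
MM_PERIOD_DAYS = 27.5545                   # anomalistic month (Mm)
NODAL_PERIOD_DAYS = 18.6 * DAYS_PER_YEAR   # 18.6-year nodal cycle


def sideband_periods(carrier_days, modulation_days):
    """Periods of the two sidebands produced by amplitude modulation."""
    f_c = 1.0 / carrier_days
    f_m = 1.0 / modulation_days
    return 1.0 / (f_c + f_m), 1.0 / (f_c - f_m)


upper, lower = sideband_periods(MM_PERIOD_DAYS, NODAL_PERIOD_DAYS)
print(f"satellites at {upper:.2f} d and {lower:.2f} d")
```

The lower-frequency satellite lands at about 27.67 days, which matches the line next to Mm noted in the LOD data.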

