The objective is to apply negative entropy to find an optimal solution to a deterministically ordered pattern. To start, let us contrast the behavior of autonomous vs non-autonomous differential equations. One way to think about the distinction is that the transfer function for non-autonomous only depends on the presenting input. Thus, it acts like an op-amp with infinite bandwidth. Or below saturation it gives perfectly linear amplification, so that as shown on the graph to the right, the x-axis input produces an amplified y-axis output as long as the input is within reasonable limits.Continue reading
Given two models of a physical behavior, the “better” model has the highest correlation (or lowest error) to the data and the lowest number of degrees of freedom (#DOF) in terms of tunable parameters. This ratio CC/#DOF of correlation coefficient over DOF is routinely used in automated symbolic regression algorithms and for scoring of online programming contests. A balance between a good error metric and a low complexity score is often referred to as a Pareto frontier.
So for modeling ENSO, the challenge is to fit the quasi-periodic NINO34 time-series with a minimal number of tunable parameters. For a 140 year fitting interval (1880-1920), a naive Fourier series fit could easily take 50-100 sine waves of varying frequencies, amplitudes, and phase to match a low-pass filtered version of the data (any high-frequency components may take many more). However that is horribly complex model and obviously prone to over-fitting. Obviously we need to apply some physics to reduce the #DOF.
Since we know that ENSO is essentially a model of equatorial fluid dynamics in response to a tidal forcing, all that is needed is the gravitational potential along the equator. The paper by Na  has software for computing the orbital dynamics of the moon (i.e. lunar ephemerides) and a 1st-order approximation for tidal potential:
The software contains well over 100 sinusoidal terms (each consisting of amplitude, frequency, and phase) to internally model the lunar orbit precisely. Thus, that many DOF are removed, with a corresponding huge reduction in complexity score for any reasonable fit. So instead of a huge set of factors to manipulate (as with many detailed harmonic tidal analyses), what one is given is a range (r = R) and a declination ( ψ=delta) time-series. These are combined in a manner following the figure from Na shown above, essentially adjusting the amplitudes of R and delta while introducing an additional tangential or tractional projection of delta (sin instead of cos). The latter is important as described in NOAA’s tide producing forces page.
Although I roughly calibrated this earlier  via NASA’s HORIZONS ephemerides page (input parameters shown on the right), the Na software allows better flexibility in use. The two calculations essentially give identical outputs and independent verification that the numbers are as expected.
As this post is already getting too long, this is the result of doing a Laplace’s Tidal Equation fit (adding a few more DOF), demonstrating that the limited #DOF prevents over-fitting on a short training interval while cross-validating outside of this band.
This low complexity and high accuracy solution would win ANY competition, including the competition for best seasonal prediction with a measly prize of 15,000 Swiss francs . A good ENSO model is worth billions of $$ given the amount it will save in agricultural planning and its potential for mitigation of human suffering in predicting the timing of climate extremes.
 Na, S.-H. Chapter 19 – Prediction of Earth tide. in Basics of Computational Geophysics (eds. Samui, P., Dixon, B. & Tien Bui, D.) 351–372 (Elsevier, 2021). doi:10.1016/B978-0-12-820513-6.00022-9.
 Pukite, P.R. et al “Ephemeris calibration of Laplace’s tidal equation model for ENSO” AGU Fall Meeting, 2018. doi:10.1002/essoar.10500568.1
 1 CHF ~ $1 so 15K = chump change.
The equatorial zone acts as a waveguide. As highlights they list the following bullet-points, taking advantage that the Coriolis effect at the equator vanishes or cancels.
This is a critical assertion, since — as shown in Mathematical Geoenergy –the Chandler wobble (a nutational oscillation) is forced by tides, then transitively so is the El Nino. So when the authors state the consequence is of both nutation and a gravity influence, it is actually the gravity influence of the moon and sun (and slightly Jupiter) that is the root cause.
The article has several equations that claim analytical solutions, but the generated PDF format has apparently not rendered the markup correctly. Many “+” signs are missing from equations. I have seen this issue before when I have tried to generate PDF pages from a markup doc, and assume that is what is happening. Assume the hard-copy version is OK so may have to go to the library to retrieve it, or perhaps ask the authors for a hard-copy.
Sergey А. Arsen’yev
Dept. of Earth and Planetary Physics of Schmidt’s Institute of the Earth’s Physics, Russian Academy of Sciences, 10 Bolshaya Gruzinskaya, Moscow, 123995, Russia
Something I learned early on in my research career is that complicated frequency spectra can be generated from simple repeating structures. Consider the spatial frequency spectra produced as a diffraction pattern produced from a crystal lattice. Below is a reflected electron diffraction pattern of a reconstructed hexagonally reconstructed surface of a silicon (Si) single crystal with a lead (Pb) adlayer ( (a) and (b) are different alignments of the beam direction with respect to the lattice). Suffice to say, there is enough information in the patterns to be able to reverse engineer the structure of the surface as (c).
Now consider the ENSO pattern. At first glance, neither the time-series signal nor the Fourier series power spectra appear to be produced by anything periodically regular. Even so, let’s assume that the underlying pattern is tidally regular, being comprised of the expected fortnightly 13.66 day tropical/synodic cycle and the monthly 27.55 day anomalistic cycle synchronized by an annual impulse. Then the forcing power spectrum of f(t) looks like the RED trace on the left-side of the figure below, F(ω). Clearly that is not enough of a frequency spectra (a few delta spikes) necessary to make up the empirically calculated Fourier series for the ENSO data comprising ~40 intricately placed peaks between 0 and 1 cycles/year in BLUE.
Yet, if we modulate that with an Laplace’s Tidal Equation solution functional g(f(t)) that has a G(ω) as in the yellow inset above — a cyclic modulation of amplitudes where g(x) is described by two distinct sine-waves — then the complete ENSO spectra is fleshed out in BLACK in the figure above. The effective g(x) is shown in the figure below, where a slower modulation is superimposed over a faster modulation.
So essentially what this is suggesting is that a few tidal factors modulated by two sinusoids produces enough spectral detail to easily account for the ~40 peaks in the ENSO power spectra. It can do this because a modulating sinusoid is an efficient harmonics and cross-harmonics generator, as the Taylor’s series of a sinusoid contains an effectively infinite number of power terms.
To see this process in action, consider the following three figures, which features a slider that allows one to get an intuitive feel for how the LTE modulation adds richness via harmonics in the power spectra.
- Start with a mild LTE modulation and start to increase it as in the figure below. A few harmonics begin to emerge as satellites surrounding the forcing harmonics in RED.
2. Next, increase the LTE modulation so that it models the slower sinusoid — more harmonics emerge
3. Then add the faster sinusoid, to fully populate the empirically observed ENSO spectral peaks (and matching the time series).
It appears as if by magic, but this is the power of non-linear harmonic generation. Note that the peak labeled AB amongst others is derived from the original A and B as complicated satellite-cross terms, which can be accounted for by expanding all of the terms in the Taylor’s series of the sinusoids. This can be done with some difficulty, or left as is when doing the fit via solver software.
To complete the circle, it’s likely that being exposed to mind-blowing Fourier series early on makes Fourier analysis of climate data less intimidating, as one can apply all the tricks-of-the-trade, which, alas, are considered routine in other disciplines.
I presented at the 2018 AGU Fall meeting on the topic of cross-validation. From those early results, I updated a fitted model comparison between the Pacific ocean’s ENSO time-series and the Atlantic Ocean’s AMO time-series. The premise is that the tidal forcing is essentially the same in the two oceans, but that the standing-wave configuration differs. So the approach is to maintain a common-mode forcing in the two basins while only adjusting the Laplace’s tidal equation (LTE) modulation.
If you don’t know about these completely orthogonal time series, the thought that one can avoid overfitting the data — let alone two sets simultaneously — is unheard of (Michael Mann doesn’t even think that the AMO is a real oscillation based on reading his latest research article called “Absence of internal multidecadal and interdecadal oscillations in climate model simulations“).
This is the latest product (click to expand)
Read this backwards from H to A.
H = The two tidal forcing inputs for ENSO and AMO — differs really only by scale and a slight offset
G = The constituent tidal forcing spectrum comparison of the two — primarily the expected main constituents of the Mf fortnightly tide and the Mm monthly tide (and the Mt composite of Mf × Mm), amplified by an annual impulse train which creates a repeating Brillouin zone in frequency space.
E&F = The LTE modulation for AMO, essentially comprised of one strong high-wavenumber modulation as shown in F
C&D = The LTE modulation for ENSO, a strong low-wavenumber that follows the El Nino La Nina cycles and then a faster modulation
B = The AMO fitted model modulating H with E
A = The ENSO fitted model modulating the other H with C
Ordinarily, this would take eons worth of machine learning compute time to determine this non-linear mapping, but with knowledge of how to solve Navier-Stokes, it becomes a tractable problem.
Now, with that said, what does this have to do with cross-validation? By fitting only to the ENSO time-series, the model produced does indeed have many degrees of freedom (DOF), based on the number of tidal constituents shown in G. Yet, by constraining the AMO fit to require essentially the same constituent tidal forcing as for ENSO, the number of additional DOF introduced is minimal — note the strong spike value in F.
Since parsimony of a model fit is based on information criteria such as number of DOF, as that is exactly what is used as a metric characterizing order in the previous post, then it would be reasonable to assume that fitting a waveform as complex as B with only the additional information of F cross-validates the underlying common-mode model according to any information criteria metric.
For further guidance, this is an informative article on model selection in regards to complexity — “A Primer for Model Selection: The Decisive Role of Model Complexity“
For the LTE formulation along the equator, the analytical solution reduces to g(f(t)), where g(x) is a periodic function. Without knowing what g(x) is, we can use the frequency-domain entropy or spectral entropy of the Fourier series mapping an estimated x=f(t) forcing amplitude to a measured climate index time series such as ENSO. The frequency-domain entropy is the sum or integral of this mapping of x to g(x) in reciprocal space applying the Shannon entropy –I(f).ln(I(f)) normalized over the I(f) frequency range, which is the power spectral (frequency) density of the mapping from the modeled forcing to the time-series waveform sample.
This measures the entropy or degree of disorder of the mapping. So to maximize the degree of order, we minimize this entropy value.
This calculated entropy is a single scalar metric that eliminates the need for evaluating various cyclic g(x) patterns to achieve the best fit. Instead, what it does is point to a highly-ordered spectrum (top panel in the above figure), of which the delta spikes can then be reverse engineered to deduce the primary frequency components arising from the the LTE modulation factor g(x).
The approach works particularly well once the spectral spikes begin to emerge from the background. In terms of a physical picture, what is actually emerging are the principle standing wave solutions for particular wavenumbers. One can see this in the LTE modulation spectrum below where there is a spike at a wavenumber at 1.5 and one at around 10 in panel A (isolating the sin spectrum and cosine spectrum separately instead of the quadrature of the two giving the spectral intensity). This is then reverse engineered as a fit to the actual LTE modulation g(x) in panel B. Panel D is the tidal forcing x=f(t) that minimized the Shannon entropy, thus creating the final fit g(f(t)) in panel C when the LTE modulation is applied to the forcing.
The approach does work, which is quite a boon to the efficiency of iterative fitting towards a solution, reducing the number of DOF involved in the calculation. Prior to this, a guess for the LTE modulation was required and the iterative fit would need to evolve towards the optimal modulation periods. In other words, either approach works, but the entropy approach may provide a quicker and more efficient path to discovering the underlying standing-wave order.
I will eventually add this to the LTE fitting software distro available on GitHub. This may also be applicable to other measures of entropy such as Tallis, Renyi, multi-scale, and perhaps Bispectral entropy, and will add those to the conventional Shannon entropy measure as needed.
In Chapter 12 of the book, we provide an empirical gravitational forcing term that can be applied to the Laplace’s Tidal Equation (LTE) solution for modeling ENSO. The inverse squared law is modified to a cubic law to take into account the differential pull from opposite sides of the earth.
The two main terms are the monthly anomalistic (Mm) cycle and the fortnightly tropical/draconic pair (Mf, Mf’ w/ a 18.6 year nodal modulation). Due to the inverse cube gravitational pull found in the denominator of F(t), faster harmonic periods are also created — with the 9-day (Mt) created from the monthly/fortnightly cross-term and the weekly (Mq) from the fortnightly crossed against itself. It’s amazing how few terms are needed to create a canonical fit to a tidally-forced ENSO model.
The recipe for the model is shown in the chart below (click to magnify), following sequentially steps (A) through (G) :
The tidal forcing is constrained by the known effects of the lunisolar gravitational torque on the earth’s length-of-day (LOD) variations. An essentially identical set of monthly, fortnightly, 9-day, and weekly terms are required for both a solid-body LOD model fit and a fluid-volume ENSO model fit.
If we apply the same tidal terms as forcing for matching dLOD data, we can use the fit below as a perturbed ENSO tidal forcing. Not a lot of difference here — the weekly harmonics are higher in magnitude.
So the only real unknown in this process is guessing the LTE modulation of steps (F) and (G). That’s what differentiates the inertial response of a spinning solid such as the earth’s core and mantle from the response of a rotating liquid volume such as the equatorial Pacific ocean. The former is essentially linear, but the latter is non-linear, making it an infinitely harder problem to solve — as there are infinitely many non-linear transformations one can choose to apply. The only reason that I stumbled across this particular LTE modulation is that it comes directly from a clever solution of Laplace’s tidal equations.
For the solution to Laplace’s Tidal Equation described in Chapter 12, the spatial and temporal results are separable, leading to a non-linear standing-wave time-series formulation:
sin(kx) sin(A sin(wt) )
By analogy to a linear standing-wave formulation, a solution such as
with the following traveling wave solution (propagating in the +x direction):
becomes the following in the non-linear LTE solution mode:
sin(kx – A sin(wt) )
This is also a traveling wave, but with the characteristic property of being able to periodically reverse direction from +x to –x depending on the value of A and w. As an intuitive aid, a standing wave can be considered as the superposition of two traveling waves traveling in opposite directions:
sin(kx – A sin(wt) ) + sin(kx + A sin(wt) )
Here the cross terms cancel after applying the trig identity on sums, and the separable standing-wave result similar to the first equation results. But, whenever there is an imbalance of +x and -x travelling waves, a periodic reversing traveling-wave/standing-wave mix results. This is shown in the following animation, where a mix of nonlinear traveling-waves and standing-waves show the periodic reversal in direction quite clearly.
This reversal is actually observed in ocean measurements, as exemplified in this recent research article:
- Delpech, A., Cravatte, S., Marin, F., Ménesguen, C. & Morel, Y. Deep Eddy Kinetic Energy in the Tropical Pacific From Lagrangian Floats. Journal of Geophysical Research: Oceans 125, e2020JC016313 (2020).
From their Figure 3, one can see this reversing process as the trajectory of a measured Argo float drift:
If that is not clear enough, the red arrows in the following annotated figure show the direction of the float motion. The drifting floats may not always exactly follow a trajectory as dictated by the velocity of a traveling wave, as this is partly a phase velocity with limited lateral volume displacement, but clearly a large wave-train such as a Tropical Instability Wave will certainly move a float. At least some of this is due to eddy behavior as the reversal is a natural consequence of a circular vortex motion of a large eddy.
Applying the LTE model to complete spatio-temporal data sets such as what Figure 3 is derived from would likely show an interesting match, adding value to the latest ENSO results, but this will require some digging into the data availability.
This topic will gain steam in the coming years. The following paper generates quite a good cross-validation for SOI, shown in the figure below.
- Xiaoqun, C. et al. ENSO prediction based on Long Short-Term Memory (LSTM). IOP Conference Series: Materials Science and Engineering, 799, 012035 (2020).
The x-axis appears to be in months and likely starts in 1979, so it captures the 2016 El Nino (El Nino is negative for SOI). Still have no idea how the neural net arrived at the fit other than it being able to discern the cyclic behavior from the historical waveform between 1979 and 2010. From the article itself, it appears that neither do the authors.Continue reading
For the tidal forcing that contributes to length-of-day (LOD) variations , only a few factors contribute to a plurality of the variation. These are indicated below by the highlighted circles, where the V0/g amplitude is greatest. The first is the nodal 18.6 year cycle, indicated by the N’ = 1 Doodson argument. The second is the 27.55 day “Mm” anomalistic cycle which is a combination of the perigean 8.85 year cycle (p = -1 Doodson argument) mixed with the 27.32 day tropical cycle (s=1 Doodson argument). The third and strongest is twice the tropical cycle (therefore s=2) nicknamed “Mf”.
These three factors also combine as the primary input forcing to the ENSO model. Yet, even though they are strongest, the combinatorial factors derived from multiplying these main harmonics are vital for generating a quality fit (both for dLOD and even more so for ENSO). What I have done in the past was apply the recommended mix of first- and second-order factors that appear in the dLOD spectra for the ENSO forcing.
Yet there is another approach that makes no assumption of the strongest 2nd-order factors. In this case, one simply expands the primary factors as a combinatorial expansion of cross-terms to the 4th level — this then generates a broad mix of monthly, fortnightly, 9-day, and weekly harmonic cycles. A nested algorithm to generate the 35 constituent terms is :
Counter := 1; for J in Constituents'Range loop for K in Constituents'First .. J loop for L in Constituents'First .. K loop for M in Constituents'First .. L loop Tf := Tf + Coefficients (Counter) * Fundamental(J) * Fundamental(K) * Fundamental(L) * Fundamental (M); Counter := Counter + 1; end loop; end loop; end loop; end loop;
This algorithm requires the three fundamental terms plus one unity term to capture most of the cross-terms shown in Table 3 above (The annual cross-terms are automatic as those are generated by the model’s annual impulse). This transforms into a coefficients array that can be included in the LTE search software.
What is missing from the list are the evection terms corresponding to 31.812 (Msm) and 27.093 day cycles. They are retrograde to the prograde 27.55 day anomalistic cycle, so would need an additional 8.848 year perigee cycle bring the count from 3 fundamental terms to 4.
The difference between adding an extra level of harmonics, bringing the combinatorial total from 35 to 126, is not very apparent when looking at the time series (below), as it simply adds shape to the main fortnightly tropical cycle.
Yet it has a significant effect on the ENSO fit, approaching a CC of 0.95 (see inset at right for the scatter correlation). Note that the forcing frequency spectra in the middle right inset still shows a predominately tropical fortnightly peak at 0.26/yr and 0.74/yr.
These extra harmonics also helps in matching to the much more busy SOI time-series. Click on the chart below to inspect how the higher-K wavenumbers may be the origin of what is thought to be noise in the SOI measurements.
Is this a case of overfitting? Try the following cross-validation on orthogonal intervals, and note how tight the model matches the data to the training intervals, without degrading too much on the outer validation region.
I will likely add this combinatorial expansion approach to the LTE fitting software on GitHub soon, but thought to checkpoint the interim progress on the blog. In the end the likely modeling mix will be a combination of the geophysical calibration to the known dLOD response together with a refined collection of these 2nd-order combinatorial tidal constituents. The rationale for why certain terms are important will eventually become more clear as well.
- Ray, R.D. and Erofeeva, S.Y., 2014. Long‐period tidal variations in the length of day. Journal of Geophysical Research: Solid Earth, 119(2), pp.1498-1509.