In Chapter 12, we described the model of QBO generated by modulating the draconic (or nodal) lunar forcing with a hemispherical annual impulse that reinforces that effect. This generates the following predicted frequency response peaks:

The 2nd, 3rd, and 4th peaks listed (at 2.423, 1.423, and 0.423) are readily observed in the power spectra of the QBO time-series. When the spectra are averaged over each of the time series, the precisely matched peaks emerge more cleanly above the red noise envelope — see the bottom panel in the figure below (click to expand).

The inset shows what these harmonics provide — essentially the jagged stairstep structure of the semi-annual impulse lag integrated against the draconic modulation.

It is important to note that these harmonics are not the traditional harmonics of a high-Q resonance behavior, where the higher orders are integral multiples of the fundamental frequency — in this case at 0.423 cycles/year. Instead, these are clear substantiation of a forcing response that maintains the frequency spectrum of an input stimulus, thus excluding the possibility that the QBO behavior is a natural resonance phenomena. At best, there may be a 2nd-order response that may selectively amplify parts of the frequency spectrum.

I presented at the 2018 AGU Fall meeting on the topic of cross-validation. From those early results, I updated a fitted model comparison between the Pacific ocean’s ENSO time-series and the Atlantic Ocean’s AMO time-series. The premise is that the tidal forcing is essentially the same in the two oceans, but that the standing-wave configuration differs. So the approach is to maintain a common-mode forcing in the two basins while only adjusting the Laplace’s tidal equation (LTE) modulation.

If you don’t know about these completely orthogonal time series, the thought that one can avoid overfitting the data — let alone two sets simultaneously — is unheard of (Michael Mann doesn’t even think that the AMO is a real oscillation based on reading his latest research article called “Absence of internal multidecadal and interdecadal oscillations in climate model simulations“).

H = The two tidal forcing inputs for ENSO and AMO — differs really only by scale and a slight offset

G = The constituent tidal forcing spectrum comparison of the two — primarily the expected main constituents of the Mf fortnightly tide and the Mm monthly tide (and the Mt composite of Mf × Mm), amplified by an annual impulse train which creates a repeating Brillouin zone in frequency space.

E&F = The LTE modulation for AMO, essentially comprised of one strong high-wavenumber modulation as shown in F

C&D = The LTE modulation for ENSO, a strong low-wavenumber that follows the El Nino La Nina cycles and then a faster modulation

B = The AMO fitted model modulating H with E

A = The ENSO fitted model modulating the other H with C

Ordinarily, this would take eons worth of machine learning compute time to determine this non-linear mapping, but with knowledge of how to solve Navier-Stokes, it becomes a tractable problem.

Now, with that said, what does this have to do with cross-validation? By fitting only to the ENSO time-series, the model produced does indeed have many degrees of freedom (DOF), based on the number of tidal constituents shown in G. Yet, by constraining the AMO fit to require essentially the same constituent tidal forcing as for ENSO, the number of additional DOF introduced is minimal — note the strong spike value in F.

Since parsimony of a model fit is based on information criteria such as number of DOF, as that is exactly what is used as a metric characterizing order in the previous post, then it would be reasonable to assume that fitting a waveform as complex as B with only the additional information of F cross-validates the underlying common-mode model according to any information criteria metric.

For the LTE formulation along the equator, the analytical solution reduces to g(f(t)), where g(x) is a periodic function. Without knowing what g(x) is, we can use the frequency-domain entropy or spectral entropy of the Fourier series mapping an estimated x=f(t) forcing amplitude to a measured climate index time series such as ENSO. The frequency-domain entropy is the sum or integral of this mapping of x to g(x) in reciprocal space applying the Shannon entropy –I(f)^{.}ln(I(f)) normalized over the I(f) frequency range, which is the power spectral (frequency) density of the mapping from the modeled forcing to the time-series waveform sample.

This measures the entropy or degree of disorder of the mapping. So to maximize the degree of order, we minimize this entropy value.

This calculated entropy is a single scalar metric that eliminates the need for evaluating various cyclic g(x) patterns to achieve the best fit. Instead, what it does is point to a highly-ordered spectrum (top panel in the above figure), of which the delta spikes can then be reverse engineered to deduce the primary frequency components arising from the the LTE modulation factor g(x).

The approach works particularly well once the spectral spikes begin to emerge from the background. In terms of a physical picture, what is actually emerging are the principle standing wave solutions for particular wavenumbers. One can see this in the LTE modulation spectrum below where there is a spike at a wavenumber at 1.5 and one at around 10 in panel A (isolating the sin spectrum and cosine spectrum separately instead of the quadrature of the two giving the spectral intensity). This is then reverse engineered as a fit to the actual LTE modulation g(x) in panel B. Panel D is the tidal forcing x=f(t) that minimized the Shannon entropy, thus creating the final fit g(f(t)) in panel C when the LTE modulation is applied to the forcing.

The approach does work, which is quite a boon to the efficiency of iterative fitting towards a solution, reducing the number of DOF involved in the calculation. Prior to this, a guess for the LTE modulation was required and the iterative fit would need to evolve towards the optimal modulation periods. In other words, either approach works, but the entropy approach may provide a quicker and more efficient path to discovering the underlying standing-wave order.

I will eventually add this to the LTE fitting software distro available on GitHub. This may also be applicable to other measures of entropy such as Tallis, Renyi, multi-scale, and perhaps Bispectral entropy, and will add those to the conventional Shannon entropy measure as needed.

Introduction From predictions of individual thunderstorms to projections of long-term global change, knowing the degree to which Earth system phenomena across a range of spatial and temporal scales are practicably predictable is vitally important to society. Past research in Earth System Predictability (ESP) led to profound insights that have benefited society by facilitating improved predictions and projections. However, as there is an increasing effort to accelerate progress (e.g., to improve prediction skill over a wider range of temporal and spatial scales and for a broader set of phenomena), it is increasingly important to understand and characterize predictability opportunities and limits. Improved predictions better inform societal resilience to extreme events (e.g., droughts and floods, heat waves wildfires and coastal inundation) resulting in greater safety and socioeconomic benefits. Such prediction needs are currently only partially met and are likely to grow in the future. Yet, given the complexity of the Earth system, in some cases we still do not have a clear understanding of whether or under which conditions underpinning processes and phenomena are predictable and why. A better understanding of ESP opportunities and limits is important to identify what Federal investments can be made and what policies are most effective to harness inherent Earth system predictability for improved predictions.

They outline these primary goals:

Goal 1: Advance foundational understanding and theory for an improved knowledge of Earth system predictability of practical utility.

Goal 2: Reduce gaps in the observations-based characterization of conditions, processes, and phenomena crucial for understanding and using Earth system predictability.

Goal 3: Accelerate the exploration and effective use of inherent Earth system predictability through advanced modeling.

Cross-Cutting Goal 1: Leverage emerging new hardware and software technologies for Earth system predictability R&D.

Cross-Cutting Goal 2: Optimize coordination of resources and collaboration among agencies and departments to accelerate progress.

Cross-Cutting Goal 3: Expand partnerships across disciplines and with entities external to the Federal Government to accelerate progress.

Cross-Cutting Goal 4: Include, inspire, and train the next generation of interdisciplinary scientists who can advance knowledge and use of Earth system predictability.

Essentially the idea is to get it done with whatever means are available, including applying machine learning/artificial intelligence. The problem is that they wish to “train the next generation of interdisciplinary scientists who can advance knowledge and use of Earth system predictability”. Yet, interdisciplinary scientists are not normally employed in climate science and earth science research. How many of these scientists have done materials science, condensed-matter physics, electrical, optics, controlled laboratory experimentation, mechanical, fluid, software engineering, statistics, signal processing, virtual simulations, applied math, AI, quantum and statistical mechanics as prerequisites to beginning study? It can be argued that all the tricks of these trades are required to make headway and to produce the next breakthrough.

The simple idea is that tidal forces play a bigger role in geophysical behaviors than previously thought, and thus helping to explain phenomena that have frustrated scientists for decades.

The idea is simple but the non-linear math (see figure above for ENSO) requires cracking to discover the underlying patterns.

The rationale for the ESD Ideas section in the EGU Earth System Dynamics journal is to get discussion going on innovative and novel ideas. So even though this model is worked out comprehensively in Mathematical Geoenergy, it hasn’t gotten much publicity.

In our book Mathematical GeoEnergy, several geophysical processes are modeled — from conventional tides to ENSO. Each model fits the data applying a concise physics-derived algorithm — the key being the algorithm’s conciseness but not necessarily subjective intuitiveness.

I’ve followed Gell-Mann’s work on complexity over the years and so will try applying his qualitative effective complexity approach to characterize the simplicity of the geophysics models described in the book and on this blog.

Here’s a breakdown from least complex to most complex

In Chapter 12 of the book, we provide an empirical gravitational forcing term that can be applied to the Laplace’s Tidal Equation (LTE) solution for modeling ENSO. The inverse squared law is modified to a cubic law to take into account the differential pull from opposite sides of the earth.

The two main terms are the monthly anomalistic (Mm) cycle and the fortnightly tropical/draconic pair (Mf, Mf’ w/ a 18.6 year nodal modulation). Due to the inverse cube gravitational pull found in the denominator of F(t), faster harmonic periods are also created — with the 9-day (Mt) created from the monthly/fortnightly cross-term and the weekly (Mq) from the fortnightly crossed against itself. It’s amazing how few terms are needed to create a canonical fit to a tidally-forced ENSO model.

The recipe for the model is shown in the chart below (click to magnify), following sequentially steps (A) through (G) :

The tidal forcing is constrained by the known effects of the lunisolar gravitational torque on the earth’s length-of-day (LOD) variations. An essentially identical set of monthly, fortnightly, 9-day, and weekly terms are required for both a solid-body LOD model fit and a fluid-volume ENSO model fit.

If we apply the same tidal terms as forcing for matching dLOD data, we can use the fit below as a perturbed ENSO tidal forcing. Not a lot of difference here — the weekly harmonics are higher in magnitude.

So the only real unknown in this process is guessing the LTE modulation of steps (F) and (G). That’s what differentiates the inertial response of a spinning solid such as the earth’s core and mantle from the response of a rotating liquid volume such as the equatorial Pacific ocean. The former is essentially linear, but the latter is non-linear, making it an infinitely harder problem to solve — as there are infinitely many non-linear transformations one can choose to apply. The only reason that I stumbled across this particular LTE modulation is that it comes directly from a clever solution of Laplace’s tidal equations.

This topic will gain steam in the coming years. The following paper generates quite a good cross-validation for SOI, shown in the figure below.

Xiaoqun, C. et al. ENSO prediction based on Long Short-Term Memory (LSTM). IOP Conference Series: Materials Science and Engineering, 799, 012035 (2020).

The x-axis appears to be in months and likely starts in 1979, so it captures the 2016 El Nino (El Nino is negative for SOI). Still have no idea how the neural net arrived at the fit other than it being able to discern the cyclic behavior from the historical waveform between 1979 and 2010. From the article itself, it appears that neither do the authors.

For the tidal forcing that contributes to length-of-day (LOD) variations [1], only a few factors contribute to a plurality of the variation. These are indicated below by the highlighted circles, where the V_{0}/g amplitude is greatest. The first is the nodal 18.6 year cycle, indicated by the N’ = 1 Doodson argument. The second is the 27.55 day “Mm” anomalistic cycle which is a combination of the perigean 8.85 year cycle (p = -1 Doodson argument) mixed with the 27.32 day tropical cycle (s=1 Doodson argument). The third and strongest is twice the tropical cycle (therefore s=2) nicknamed “Mf”.

These three factors also combine as the primary input forcing to the ENSO model. Yet, even though they are strongest, the combinatorial factors derived from multiplying these main harmonics are vital for generating a quality fit (both for dLOD and even more so for ENSO). What I have done in the past was apply the recommended mix of first- and second-order factors that appear in the dLOD spectra for the ENSO forcing.

Yet there is another approach that makes no assumption of the strongest 2nd-order factors. In this case, one simply expands the primary factors as a combinatorial expansion of cross-terms to the 4th level — this then generates a broad mix of monthly, fortnightly, 9-day, and weekly harmonic cycles. A nested algorithm to generate the 35 constituent terms is :

Counter := 1;
for J in Constituents'Range loop
for K in Constituents'First .. J loop
for L in Constituents'First .. K loop
for M in Constituents'First .. L loop
Tf := Tf + Coefficients (Counter) * Fundamental(J) *
Fundamental(K) * Fundamental(L) * Fundamental (M);
Counter := Counter + 1;
end loop;
end loop;
end loop;
end loop;

This algorithm requires the three fundamental terms plus one unity term to capture most of the cross-terms shown in Table 3 above (The annual cross-terms are automatic as those are generated by the model’s annual impulse). This transforms into a coefficients array that can be included in the LTE search software.

What is missing from the list are the evection terms corresponding to 31.812 (Msm) and 27.093 day cycles. They are retrograde to the prograde 27.55 day anomalistic cycle, so would need an additional 8.848 year perigee cycle bring the count from 3 fundamental terms to 4.

The difference between adding an extra level of harmonics, bringing the combinatorial total from 35 to 126, is not very apparent when looking at the time series (below), as it simply adds shape to the main fortnightly tropical cycle.

Yet it has a significant effect on the ENSO fit, approaching a CC of 0.95 (see inset at right for the scatter correlation). Note that the forcing frequency spectra in the middle right inset still shows a predominately tropical fortnightly peak at 0.26/yr and 0.74/yr.

These extra harmonics also helps in matching to the much more busy SOI time-series. Click on the chart below to inspect how the higher-K wavenumbers may be the origin of what is thought to be noise in the SOI measurements.

Is this a case of overfitting? Try the following cross-validation on orthogonal intervals, and note how tight the model matches the data to the training intervals, without degrading too much on the outer validation region.

I will likely add this combinatorial expansion approach to the LTE fitting software on GitHub soon, but thought to checkpoint the interim progress on the blog. In the end the likely modeling mix will be a combination of the geophysical calibration to the known dLOD response together with a refined collection of these 2nd-order combinatorial tidal constituents. The rationale for why certain terms are important will eventually become more clear as well.

References

Ray, R.D. and Erofeeva, S.Y., 2014. Long‐period tidal variations in the length of day. Journal of Geophysical Research: Solid Earth, 119(2), pp.1498-1509.

In Chapter 12 of the book we model — via LTE — the canonical El Nino Southern Oscillation (ENSO) behavior, fitting to closely-correlated indices such as NINO3.4 and SOI. Another El Nino index was identified circa 2007 that is not closely correlated to the well-known ENSO indices. This index, referred to as El Nino Modoki, appears to have more of a Pacific Ocean centered dipole shape with a bulge flanked by two wing lobes, cycling again as an erratic standing-wave.

If in fact Modoki differs from the conventional ENSO only by a different standing-wave wavenumber configuration, then it should be straightforward to model as an LTE variation of ENSO. The figure below is the model fitted to the El Nino Modoki Index (EMI) (data from JAMSTEC). The cross-validation is included as values post-1940 were used in the training with values prior to this used as a validation test.

The LTE modulation has a higher fundamental wavenumber component than ENSO (plus a weaker factor closer to a zero wavenumber, i.e. some limited LTE modulation as is found with the QBO model).

The input tidal forcing is close to that used for ENSO but appears to lead it by one year. The same strength ordering of tidal factors occurs, but with the next higher harmonic (7-day) of the tropical fortnightly 13.66 day tide slightly higher for EMI than ENSO.

The model fit is essentially a perturbation of ENSO so did not take long to optimize based on the Laplace’s Tidal Equation modeling software. I was provoked to run the optimization after finding a paper yesterday on using machine learning to model El Nino Modoki [1].

It’s clear that what needs to be done is a complete spatio-temporal model fit across the equatorial Pacific, which will be amazing as it will account for the complete mix of spatial standing-wave modes. Maybe in a couple of years the climate science establishment will catch up.