Cross-validation is essentially the ability to predict the characteristics of an unexplored region based on a model of an explored region. The explored region is often used as a training interval to test or validate model applicability on the unexplored interval. If some fraction of the expected characteristics appears in the unexplored region when the model is extrapolated to that interval, some degree of validation is granted to the model.
This is a powerful technique on its own as it is used frequently (and depended on) in machine learning models to eliminate poorly performing trials. But it gains even more importance when new data for validation will take years to collect. In particular, consider the arduous process of collecting fresh data for El Nino Southern Oscillation, which will take decades to generate sufficient statistical significance for validation.
So, what’s necessary in the short term is substantiation of a model’s potential validity. Nothing else will work as a substitute, as controlled experiments are not possible for domains as large as the Earth’s climate. Cross-validation remains the best bet.
Using as few independent parameters as possible, the difference in the temporal behavior of ENSO and AMO may amount to no more than a standing-wave phase change. I noted earlier that ENSO and AMO can be derived from a common lunisolar forcing, and have now found that the LTE modulation is not fundamentally different between the two.
The (nearly) common forcing
with the applied LTE of a 180° phase difference
leads to adequately fitted models to the respective time series
The fact that the fundamental (and 7th harmonic) are aligned between ENSO and AMO strongly suggests that the standing-wave wavenumbers are not governed by the basin geometry but are more of a global characteristic that remains coherent across the land masses. The Atlantic basin is narrower than the Pacific, so intuitively one might have predicted unique wavenumbers that fit within the bounding coastlines, but this is perhaps not the case.
Instead, the LTE modulation wraps around the earth and produces an anti-phase relationship in keeping with the approximately 180° longitudinal difference between the Atlantic and Pacific.
ENSO ~ sin(k F(t))
AMO ~ sin(k F(t) + π + ϕ)
Any additional phase shift ϕ can also easily produce the anomalously large multidecadal variations in the AMO due to the biasing properties of the sinusoidal LTE modulation.
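The biasing effect can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical zero-mean two-tone stand-in for F(t) (not the fitted lunisolar forcing) and illustrative values of k and ϕ. It relies on the identity sin(kF + π + ϕ) = −[sin(kF) cos ϕ + cos(kF) sin ϕ]: the cosine term is even in F, so it does not average out over a symmetric forcing and leaves a net offset.

```python
import math

def forcing(t_years):
    # Hypothetical zero-mean stand-in for F(t): a 3.9-year tone plus a slow
    # 60-year tone. This is NOT the fitted lunisolar forcing.
    return (0.6 * math.sin(2 * math.pi * t_years / 3.9)
            + 0.4 * math.sin(2 * math.pi * t_years / 60.0))

def lte_response(t_years, k, phase=0.0):
    # Sinusoidal LTE modulation of the forcing: sin(k F(t) + phase).
    return math.sin(k * forcing(t_years) + phase)

def mean_response(k, phase, years=600, steps_per_year=12):
    # Long-run average of the modulated signal on a monthly grid.
    n = years * steps_per_year
    return sum(lte_response(i / steps_per_year, k, phase) for i in range(n)) / n

if __name__ == "__main__":
    k = 1.0    # illustrative wavenumber (assumption)
    phi = 0.5  # illustrative extra phase shift in radians (assumption)
    print("ENSO-like mean:", round(mean_response(k, 0.0), 3))
    print("AMO-like mean: ", round(mean_response(k, math.pi + phi), 3))
```

With ϕ = 0 the long-run mean stays near zero, while a nonzero ϕ shifts it away from zero, which is one way a modest phase change can alter the apparent low-frequency character of the output.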
It is just a matter of time until machine-learning algorithms start discovering these patterns. But, alas, they may not know how to deal with the findings.
The premise of the paper is that the ocean will show modulation of mixing with a cycle of ~18 years corresponding to the 18.6-year lunar declination cycle. That may indeed be the case, but it likely pales in comparison to the other so-called long-period tidal cycles. In particular, every ~2 weeks the moon makes a complete north-south-north declination cycle that likely has a huge impact on the climate as it sloshes the subsurface thermocline (see the paper by Lin & Qian [1]). Unfortunately, this much shorter cycle is not directly visible in the observational data, making it a challenge to determine how the pattern manifests itself. In the following, I will describe how this is accomplished, referring to the complete derivation found in Chapter 12 of Mathematical Geoenergy [2].
Consider that the 2-week lunar declination cycle is observed very clearly in the Earth’s rotational speed, measured in terms of small transient changes in the length of day (LOD). From the IERS site, we can plot the differential LOD (dLOD) and fit to the known tidal factors, leaving a clean closed-form signal that one can use as a forcing function to evaluate the ocean response, in this case comparing it to the well-defined ENSO climate index.
The 18.6-year nodal cycle can be seen in the modulation of the cyclic dLOD data. At a higher resolution, the comparison is as follows:
To evaluate the ocean response, we first assume that the tidal cycle is modulated on an annual cycle, corresponding to the well-known “spring predictability barrier”. So, by integrating a sequence of May impulses, each weighted by the value of the tidal forcing at that point, the following time series is generated.
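The impulse-integration step can be sketched as follows. This is a minimal illustration, not the fitted model: the tidal periods are the ones quoted in these posts, but the amplitudes, the mid-May impulse day, and the use of a plain running sum (no damping or lag) are all assumptions made for the example.

```python
import math

DAYS_PER_YEAR = 365.242

# (period in days, amplitude). Periods are quoted in the text;
# the amplitudes are hypothetical placeholders.
TIDAL_FACTORS = [(13.66, 1.0), (27.55, 0.5), (9.133, 0.2)]

def tidal_forcing(day):
    # Sum of sinusoidal tidal factors evaluated at a given day count.
    return sum(a * math.sin(2 * math.pi * day / p) for p, a in TIDAL_FACTORS)

def integrate_may_impulses(n_years, impulse_day=135.0):
    """Running sum of the tidal forcing sampled once per year in mid-May.

    impulse_day (day of year) is an assumed value for illustration.
    """
    series, total = [], 0.0
    for year in range(n_years):
        day = year * DAYS_PER_YEAR + impulse_day
        total += tidal_forcing(day)  # impulse weighted by the forcing value
        series.append(total)
    return series

if __name__ == "__main__":
    for year, value in enumerate(integrate_may_impulses(10)):
        print(year, round(value, 3))
```

Because the forcing is only sampled once per year, the fast tidal cycles show up in the output as slow, aliased variations.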
Obviously, this does not match the ENSO NINO34 signal, but assuming that the subsurface response is non-linear (derivation in reference [2]) and creates standing-wave modes based on the geometry of the ocean basin, one can use a suitable transformation to potentially extract the pattern. The best approach, based on the solution to the shallow-water wave model (i.e. Laplace’s Tidal Equations), is to map the input forcing (graph above) to the output corresponding to the NINO34 index, using a Fourier series expansion.
The result is the Laplace’s Tidal Equation (LTE) modulation spectra, shown below in a particular cross-validation configuration. Here, the NINO34 data is split into 2 halves, one time-series taken from 1870-1945 and the second from 1945-2020. The spectra were calculated individually and then multiplied point-by-point to identify long-lived stationary standing-wave nodes in the modulation. Thus, it isolates modulations that are common to each interval.
This is a log-plot, so the peak excursions shown are statistically significant and can be modeled by a handful of quantifiable standing-wave modulations. The lowest wavenumber modulations are associated with the ENSO dipole modes and the higher wavenumber modulations are potentially associated with tropical instability waves (TIW) [2].
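The split-interval product can be sketched in a few lines. This is a simplified stand-in for the actual analysis, assuming the modulation spectrum is estimated by projecting the data onto sin(kF) and cos(kF) for each trial wavenumber k; the function names and the synthetic test signal are hypothetical.

```python
import math

def modulation_spectrum(forcing, data, wavenumbers):
    """Amplitude of the sin/cos projection of data onto k*forcing for each k."""
    n = len(data)
    spectrum = []
    for k in wavenumbers:
        s = sum(d * math.sin(k * f) for f, d in zip(forcing, data)) / n
        c = sum(d * math.cos(k * f) for f, d in zip(forcing, data)) / n
        spectrum.append(math.hypot(s, c))
    return spectrum

def split_half_product(forcing, data, wavenumbers):
    """Spectra of the two halves multiplied point-by-point."""
    mid = len(data) // 2
    first = modulation_spectrum(forcing[:mid], data[:mid], wavenumbers)
    second = modulation_spectrum(forcing[mid:], data[mid:], wavenumbers)
    return [a * b for a, b in zip(first, second)]

if __name__ == "__main__":
    # Synthetic check: a made-up forcing modulated at a single wavenumber.
    forcing = [3.0 * math.sin(0.37 * t) + 2.0 * math.sin(0.11 * t) for t in range(1800)]
    data = [math.sin(2.0 * f) for f in forcing]
    ks = [0.5 * i for i in range(1, 9)]
    product = split_half_product(forcing, data, ks)
    print("strongest common wavenumber:", ks[product.index(max(product))])
```

Multiplying the two spectra suppresses peaks that appear in only one interval, so only modulations that persist across both halves stand out.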
As a final step, by applying this set of modulations to the lunisolar forcing (the blue chart above), a fit to the NINO34 time-series results. The chart shown below is a very good fit and can be cross-validated via several approaches [10].
The mix of incommensurate tidal factors, the annual impulse, and a nonlinear response function is what causes the highly erratic nature of the ENSO waveform. It is neither chaotic nor random, as some researchers claim, but instead is deterministically tied to the tidal and annual cycles, much as conventional ocean tides have proven to be over the course of time.
To further quantify the decomposition of the tidal factors that force both the dLOD and the sloshing ENSO response, the paper by Ray and Erofeeva is vital [8]. When trying to understand the assignment of frequencies, note that after the annual impulse is applied, the known tidal factors (labelled Mf, Mm, etc.) get shifted from their normal positions due to signal aliasing (see chart below in gray). This can be confusing to those who have not encountered aliasing before. As an example, the long-term modulation (>100 years) displayed in the blue chart above is due to the aliased 9.133 day Mt tidal factor, which almost synchronizes with the annual cycle; the small amount by which it is off leads to a gradual modulation in the forcing. It is counterintuitive, but a 9 day cycle can cause multidecadal changes.
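The aliasing arithmetic is easy to check. The sketch below assumes ideal once-per-year sampling and a year length of 365.242 days; the function name is hypothetical. Only the fractional number of tidal cycles per year survives the sampling, and its reciprocal gives the apparent period.

```python
import math

DAYS_PER_YEAR = 365.242  # assumed mean year length in days

def aliased_period_years(tidal_period_days):
    """Apparent period, in years, of a tidal cycle sampled once per year."""
    cycles_per_year = DAYS_PER_YEAR / tidal_period_days
    # Only the distance to the nearest whole number of cycles survives
    # annual sampling.
    alias_frequency = abs(cycles_per_year - round(cycles_per_year))
    if alias_frequency == 0.0:
        return math.inf  # exactly synchronized: no apparent variation
    return 1.0 / alias_frequency

if __name__ == "__main__":
    for name, period in (("Mf", 13.66), ("Mm", 27.55), ("Mt", 9.133)):
        print(name, round(aliased_period_years(period), 1), "years")
```

For the periods quoted in these posts, Mf and Mm come out near 3.8 and 3.9 years, and the 9.133 day Mt factor aliases to a period longer than a century.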
Ding & Chao [9] provide an independent analysis of LOD that serves as a good cross-check of the non-aliased factors. It may be possible to use lunar ephemeris data to calibrate the forcing, but that adds degrees of freedom that could lead to over-fitting [10].
The reason that Lin & Qian were not able to further substantiate their claim of tidal forcing is that they did not apply the seasonal aliasing and the nonlinear mapping to their observations. They were only able to demonstrate the cause and effect of tidal forcing on the thermocline, thereby ruling out wind forcing. Other sources to cite are “Topological origin of equatorial waves” [4] and “Solar System Dynamics and Multiyear Droughts of the Western USA” [5], the latter discussing the impact of axial torques on the climate. Researchers at NASA JPL including J.H. Shirley, C. Perigaud [6], and S.L. Marcus [7] have touched on the LOD, lunar, ENSO connection over the years.
Bottom-line takeaways:
Tidal factors are numerous so a measure such as dLOD is critical for calibrating the forcing.
Use the knowledge of a seasonal impulse, a la the spring predictability barrier, to advantage, while considering the temporal aliasing that it will cause.
The solution to the geophysical fluid dynamics produces a non-linear response, so clever transform techniques such as Fourier series are useful to isolate the pattern.
Ding, H., & Chao, B. F. (2018). Application of stabilized AR-z spectrum in harmonic analysis for geophysics. Journal of Geophysical Research: Solid Earth, 123, 8249–8259. https://doi.org/10.1029/2018JB015890
In an earlier post, the observation was that ENSO models may not be unique due to the numerous possibilities provided by nonlinear math. This was supported by the fact that a tidal forcing model based on the Mf (13.66 day) tidal factor worked as well as one based on the Mm (27.55 day) factor. This was not surprising considering that the aliasing against an annual impulse gave a similar repeat cycle: 3.8 years versus 3.9 years. But I have also observed that mixing the two in a linear fashion did not improve the fit much at all, as the difference created a long interference cycle which isn’t observed in the ENSO time series data. Thinking in terms of the nonlinear modulation required, however, it may be that the two factors can be combined after the LTE solution is applied.
As the quality of the tidally-forced ENSO model improves, it’s instructive to evaluate its common-mode mechanism against other oceanic indices. So this is a re-evaluation of the Pacific Decadal Oscillation (PDO), in the context of non-autonomous solutions such as generated via LTE modulation. In particular, in this note we will clearly delineate the subtle distinction that arises when comparing ENSO and PDO. As background, it’s been frequently observed and reported that the PDO shows a resemblance to ENSO (a correlation coefficient between 0.5 and 0.6), but also demonstrates a longer multiyear behavior than the 3-7 year fluctuating period of ENSO, hence the decadal modifier.
A hypothesis based on LTE modulation is that decadal behavior arises from the shallowest modulation mode, and one that corresponds to even symmetry (i.e. cos not sin). So for a model that was originally fit to an ENSO time-series, it is anticipated that the modulation trending to a more even symmetry will reveal less rapid fluctuations — or in other words for an even f(x) = f(-x) symmetry there will be less difference between positive and negative excursions for a well-balanced symmetric input time-series. This should then exaggerate longer term fluctuations, such as in PDO. And for odd f(x) = -f(-x) symmetry it will exaggerate shorter term fluctuations leading to more spikiness, such as in ENSO.
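The symmetry argument can be made concrete with a small sketch. The wavenumber and the sample forcing values are arbitrary illustrations, and the function names are hypothetical. The even (cos) mode returns the same output for a positive or negative excursion of the input, while the odd (sin) mode flips sign with it.

```python
import math

def odd_mode(forcing_value, k):
    # Odd symmetry: f(-x) = -f(x), as hypothesized for ENSO.
    return math.sin(k * forcing_value)

def even_mode(forcing_value, k):
    # Even symmetry: f(-x) = f(x), as hypothesized for PDO.
    return math.cos(k * forcing_value)

if __name__ == "__main__":
    k = 0.8  # illustrative wavenumber (assumption)
    for x in (0.5, 1.0, 1.5):
        print(
            f"x=±{x}: odd {odd_mode(x, k):+.3f}/{odd_mode(-x, k):+.3f}, "
            f"even {even_mode(x, k):+.3f}/{even_mode(-x, k):+.3f}"
        )
```

In the even case the sign of the input excursion is discarded, so only its magnitude carries through to the output.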
Experimenting with linking to slide presentations instead of a trad blog post. The PDF linked below is an eye-opener as the NINO34 fit is the most parsimonious ever, at the expense of a higher LTE modulation (explained here). The cross-validation involves far fewer tidal factors than dealt with earlier, the two factors used (Mf and Mm tidal factors) rivaling the one factor used in QBO (described here).
The underlying structure of the solution shouldn’t be surprising, since as with Mach-Zehnder, it’s fundamentally related to a path integral formulation known from mathematical physics. As derived via quantum mechanics (originally by Feynman), one temporally integrates an energy Hamiltonian over a path allowing the wave function to interfere with itself over all possible wavenumber (k) and spatial states (x).
Because of the imaginary value i in the exponential, the result is a sinusoidal modulation of some (potentially complicated) function. Of course, the collective behavior of the ocean is not a quantum mechanical result applied to fluid dynamics, yet the topology of the equatorial waveguide can drive it to appear as one; see the breakthrough paper “Topological Origin of Equatorial Waves” for a rationale. (The curvature of the spherical earth can also provide a sinusoidal basis due to a trigonometric projection of tidal forces, but this is rather weak, not extending far beyond the first-order term of the Taylor series.)
Moreover, the rather strong interference may have a physical interpretation beyond the derived mathematical interpretation. In the past, I have described the modulation as wave breaking, in that the maximum excursions of the inner function f(t) are folded non-linearly into itself via the limiting sinusoidal wrapper. This is shown in the figure below for progressively increasing modulation.
In the figure above, I added an extra dimension (roughly implying a toroidal waveguide) which allows one to visualize the wave breaking, which otherwise would show as a progressively more rapid up-and-down oscillation in one dimension.
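The one-dimensional view of this folding can be sketched by counting turning points. The example assumes a hypothetical inner function f(t) = sin(2πt) and wraps it as sin(A f(t)) for increasing modulation strength A. Once A exceeds π/2, the peaks of f(t) overshoot the crest of the outer sinusoid and fold back, adding extra extrema per cycle.

```python
import math

def modulated_cycle(amplitude, n=2000):
    # One period of the hypothetical inner function f(t) = sin(2*pi*t),
    # wrapped by the limiting sinusoid: sin(amplitude * f(t)).
    return [math.sin(amplitude * math.sin(2 * math.pi * i / n)) for i in range(n)]

def count_extrema(amplitude, n=2000):
    # Count local maxima and minima over one periodic cycle by looking for
    # sign changes between successive differences.
    y = modulated_cycle(amplitude, n)
    diffs = [y[(i + 1) % n] - y[i] for i in range(n)]
    diffs = [d for d in diffs if d != 0.0]  # ignore exactly flat steps
    m = len(diffs)
    return sum(1 for i in range(m) if diffs[i] * diffs[(i + 1) % m] < 0)

if __name__ == "__main__":
    for amplitude in (1.0, 3.0, 6.0):
        print("A =", amplitude, "extrema per cycle:", count_extrema(amplitude))
```

The count grows with A, which is the progressively more rapid up-and-down oscillation described above.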
Perhaps coincidentally (or perhaps not) this kind of sinusoidal modulation also occurs in heuristic models of the double-gyre structure that often appears in fluid mechanics. In the excerpt below, note the sin(f(t)) formulation.
The interesting characteristic of the structure lies in the more rapid cyclic variations near the edge of the gyre, which can be seen in an animation (Jupyter notebook code here).
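For readers who want to experiment, here is a minimal sketch of the double-gyre model in its commonly used textbook form (stream function ψ = A sin(π f(x,t)) sin(π y) on the domain 0 ≤ x ≤ 2, 0 ≤ y ≤ 1). It may differ in detail from the excerpt and the notebook referenced above, and the parameter values are conventional illustrative choices rather than anything fitted to ocean data.

```python
import math

A = 0.1                     # flow amplitude (illustrative)
EPSILON = 0.25              # strength of the time-periodic perturbation
OMEGA = 2 * math.pi / 10.0  # perturbation frequency

def f(x, t):
    # Time-dependent stretching of the x coordinate; f(0) = 0 and f(2) = 2.
    a = EPSILON * math.sin(OMEGA * t)
    b = 1.0 - 2.0 * a
    return a * x * x + b * x

def dfdx(x, t):
    a = EPSILON * math.sin(OMEGA * t)
    b = 1.0 - 2.0 * a
    return 2.0 * a * x + b

def stream_function(x, y, t):
    # Note the sin(f(x, t)) form: a sinusoid wrapped around an inner function.
    return A * math.sin(math.pi * f(x, t)) * math.sin(math.pi * y)

def velocity(x, y, t):
    # u = -d(psi)/dy, v = d(psi)/dx
    u = -math.pi * A * math.sin(math.pi * f(x, t)) * math.cos(math.pi * y)
    v = math.pi * A * math.cos(math.pi * f(x, t)) * math.sin(math.pi * y) * dfdx(x, t)
    return u, v

if __name__ == "__main__":
    for x in (0.25, 0.5, 1.0, 1.5, 1.75):
        print(x, velocity(x, 0.5, 2.5))
```

The velocity components vanish normal to the domain walls, so the two counter-rotating gyres stay confined while the boundary between them oscillates in time.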
Whether the equivalent of a double-gyre is occurring via the model of the LTE 1-D equatorial waveguide is not clear, but the evidence of double-gyre wavetrains (Lagrangian coherent structures, Kelvin–Helmholtz instabilities) occurring along the equatorial Pacific is abundantly clear through the appearance of tropical instability wave (TIW) trains.
These so-called coherent structures may be difficult to isolate for the time being, especially if they involve subtle interfaces such as thermocline boundaries:
Mercator analysis does show higher levels of waveguide modulation, so perhaps this will be better discriminated over time (see figure below with the long wavelength ENSO dipole superimposed along with the faster TIW wavenumbers in dashed line, with the double-gyre pairing in green + dark purple), and something akin to a 1-D gyre structure will become a valid description of what’s happening along the thermocline. In other words, the wave-breaking modulation due to the LTE modulation is essentially the same as the vortex gyre mapped into a 1-D waveguide.
Sea-level height (SLH) has several scales. At the daily scale it represents the well-known lunisolar tidal cycle. At a multi-decadal, long-term scale it represents behaviors such as global warming. In between these two scales are what often appear to be noisy fluctuations to the untrained eye. Yet it’s fairly well-accepted that much of this fluctuation is due to the side-effects of alternating La Nina and El Nino cycles (aka ENSO, the El Nino Southern Oscillation), as represented by measures such as NINO34 and SOI.
To see how startlingly aligned this mapping is, consider the SLH readings from Ft. Denison in Sydney Harbor. The interval from 1980 to 2012 is shown below, along with a fit used recently to model ENSO.
As cross-validation, this fit is extrapolated backwards to show how it matches the historic SOI cycles:
Much of the fine structure aligns well, indicating that intrinsically the dynamics behind sea-level-height at this scale are due to ENSO changes, associated with the inverted barometer effect. The SOI is essentially the pressure differential between Darwin and Tahiti, so the prevailing atmospheric pressure occurring during varying ENSO conditions follows the rising or lowering Sydney Harbor sea-level in a synchronized fashion. The change is 1 cm for a 1 mBar change in pressure, so that with the SOI extremes showing 14 mBar variation at the Darwin location, this accounts for a 14 cm change in sea-level, roughly matching that shown in the first chart. Note that being a differential measurement, SOI does not suffer from long-term secular changes in trend.
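The 1 cm per mbar figure follows from hydrostatic balance, Δh = −ΔP / (ρ g). The sketch below checks the arithmetic, assuming typical values for seawater density and gravity; the function name is hypothetical.

```python
SEAWATER_DENSITY = 1025.0  # kg/m^3, typical value (assumed)
GRAVITY = 9.81             # m/s^2

def inverted_barometer_cm(pressure_change_mbar):
    """Static sea-level change in cm for a given surface pressure change."""
    pressure_change_pa = pressure_change_mbar * 100.0  # 1 mbar = 100 Pa
    height_change_m = -pressure_change_pa / (SEAWATER_DENSITY * GRAVITY)
    return height_change_m * 100.0

if __name__ == "__main__":
    print(round(inverted_barometer_cm(1.0), 2), "cm per +1 mbar")
    print(round(inverted_barometer_cm(-14.0), 1), "cm for a 14 mbar drop")
```

A pressure rise lowers the sea surface and a drop raises it, each by just under 1 cm per mbar, so a 14 mbar swing corresponds to roughly 14 cm.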
Yet, the unsaid implication in all this is that not only are the daily variations in SLH due to lunar and solar cyclic tidal forces, but so are these monthly to decadal variations. The longstanding impediment is that oceanographers have not been able to solve Laplace’s Tidal Equations in a way that reflects the non-linear character of the ocean’s response to the long-period lunisolar forcing. Once that has been analytically demonstrated, we can observe that both SLH and ENSO share essentially identical lunisolar forcing (see chart below), arising from that same common-mode linked mechanism.
F. Zou, R. Tenzer, H. S. Fok, G. Meng and Q. Zhao, “The Sea-Level Changes in Hong Kong From Tide-Gauge Records and Remote Sensing Observations Over the Last Seven Decades,” in IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 14, pp. 6777-6791, 2021, doi: 10.1109/JSTARS.2021.3087263.
This post revisits earlier modeling of the North Atlantic Oscillation (NAO) and Arctic Oscillation (AO) indices with the benefit of updated analysis approaches such as negative entropy. These two indices in particular are intimidating because to the untrained eye they appear to be more noise than anything deterministically periodic. Whereas ENSO has periods that range from 3 to 7 years, both NAO and AO show rapid cycling, often on a faster-than-annual pace. The trial ansatz in this case is to adopt a semi-annual forcing pattern and synchronize that to long-period lunar factors, fitted to a Laplace’s Tidal Equation (LTE) model.
Start with candidate forcing time-series as shown below, with a mix of semi-annual and annual impulses modulating the primarily synodic/tropical lunar factor. The two diverge slightly at earlier dates (starting at 1880) but the NAO and AO instrumental data only begins at the year 1950, so the two are tightly correlated over the range of interest.
The intensity spectrum is shown below for the semi-annual zone, noting the aliased tropical factors at 27.32 and 13.66 days standing out.
The NAO and AO patterns are not really that different, and once a strong LTE modulation is found for one index, it also works for the other. As shown below, the lowest modulation is sharply delineated, yet more rapid than that for ENSO, indicating a high-wavenumber standing wave mode in the upper latitudes.
The model fit for NAO (data source) is excellent as shown below. The training interval only extended to 2016, so the dotted lines provide an extrapolated fit to the most recent NAO data.
Same for the AO (data source), the fit is also excellent as shown below. There is virtually no difference in the lowest LTE modulation frequency between NAO and AO, but the higher/more rapid LTE modulations need to be tuned for each unique index. In both cases, the extrapolations beyond the year 2016 are very encouraging (though not perfect) cross-validating predictions. The LTE modulation is so strong that it is also structurally sensitive to the exact forcing.
Both NAO and AO time-series appear very busy and noisy, yet there is very likely a strong underlying order due to the fundamental 27.32/13.66 day tropical forcing modulating the semi-annual impulse, with the 18.6/9.3 year and 8.85/4.42 year cycles providing the expected longer-range lunar variability. This is also consistent with the critical semi-annual impulses that impact the QBO and Chandler wobble periodicity, with the caveat that the group symmetry of the global QBO and Chandler wobble forcings requires those to be draconic/nodal factors and not the geographically isolated sidereal/tropical factor required of the North Atlantic.
It really is a highly-resolved model potentially useful at a finer resolution than monthly and that will only improve over time.
(As a sidenote, this is a much better attempt at matching a lunar forcing to AO and jet-stream dynamics than the approach Clive Best tried a few years ago. He gave it a shot, but without knowledge of the non-linear character of the LTE modulation required he wasn’t able to achieve a high correlation, achieving at best a 2.4% Spearman correlation coefficient for AO in his Figure 4, whereas the models in this GeoenergyMath post extend beyond 80% for the interval 1950 to 2016!)
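For reference, the Spearman coefficient mentioned here is the Pearson correlation computed on the ranks of the two series. The following is a minimal stdlib sketch of that standard definition (with average ranks for ties); it is not the code used for either analysis.

```python
import math

def ranks(values):
    # 1-based ranks, with tied values sharing their average rank.
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for m in range(i, j + 1):
            result[order[m]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    # Rank correlation: Pearson correlation of the rank-transformed series.
    return pearson(ranks(x), ranks(y))

if __name__ == "__main__":
    model = [0.1, 0.4, 0.3, 0.9, 0.7]  # made-up illustrative values
    data = [1.0, 2.5, 2.0, 8.0, 3.0]   # made-up illustrative values
    print(round(spearman(model, data), 3))
```

Because it works on ranks, the coefficient rewards getting the ordering of peaks and valleys right rather than their exact amplitudes.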
Climate scientists as a general rule don’t understand crystallography deeply (I do). They also don’t understand cryptography (that, I don’t understand deeply either). Yet, as the last post indicated, knowledge of these two scientific domains is essential to decoding dipoles such as the El Nino Southern Oscillation (ENSO). Crystallography is basically an exercise in signal processing where one analyzes electron & x-ray diffraction patterns to be able to decode structure at the atomic level. It’s mathematical and not for people unaccustomed to thinking outside of real space, as diffraction acts to transform the world of 3-D into a reciprocal space where the dimensions are inverted and common intuition fails.
Cryptography in its common use applies a key to enable a user to decode a scrambled data stream according to the instruction pattern embedded within the key. If diffraction-based crystallography required a complex unknown key to decode from reciprocal space, it would seem hopeless, but that’s exactly what we are dealing with when trying to decipher climate dipole time-series: we don’t know what the decoding key is. If that’s the case, no wonder climate science has never made any progress in modeling ENSO, as it’s an existentially difficult problem.
The breakthrough is in identifying that an analytical solution to Laplace’s tidal equations (LTE) provides a crystallography+cryptography analog in which we can make some headway. The challenge is in identifying the decoding key (an unknown forcing) that would make the reciprocal-space inversion process (required for LTE demodulation) straightforward.
According to the LTE model, the forcing has to be a combination of tidal factors mixed with a seasonal cycle (stages 1 & 2 in the figure above) that would enable the last stage (Fourier series a la diffraction inversion) to be matched to empirical observations of a climate dipole such as ENSO.
The forcing key used in an ENSO model was described in the last post as a predominately Mm-based lunar tidal factorization as shown below, leading to an excellent match to the NINO34 time series after a minimally-complex LTE modulation is applied.
Critics might say, and justifiably so, that this is potentially an over-fit to achieve that good a model-to-data correlation. There are too many degrees of freedom (DOF) in a tidal factorization, which would allow a spuriously good fit depending on the computational effort applied (see Reference 1 at the end of this post).
Yet, if the forcing key used in the ENSO model was reused as is in fitting an independent climate dipole, such as the AMO, and this same key required little effort in modeling AMO, then the over-fitting criticism is invalidated. What’s left to perform is finding a distinct low-DOF LTE modulation to match the AMO time-series as shown below.
This is an example of a common-mode cross-validation of an LTE model that I originally suggested in an AGU paper from 2018. Invalidating this kind of analysis is exceedingly difficult as it requires one to show that the erratic cycling of AMO can be randomly created by a few DOF. In fact, reproducing the dozens of AMO peaks and valleys shown by chance with only a few DOF of sinusoidal factors is virtually impossible. I leave it to others to debunk via an independent analysis.
addendum: LTE modulation comparisons, essentially the wavenumber of the diffraction signal:
This is the forcing power spectrum showing the principal Mm tidal factor term at period 3.9 years, with nearly identical spectral profiles for both ENSO and AMO.
According to the precepts of cryptography, decoding becomes straightforward once one knows the key. Similarly, nature often closely guards its secrets, and until the key is known, for example as with DNA, climate scientists will continue to flounder.
Chao, B. F., & Chung, C. H. (2019). On Estimating the Cross Correlation and Least Squares Fit of One Data Set to Another With Time Shift. Earth and Space Science, 6, 1409–1415. https://doi.org/10.1029/2018EA000548 “For example, two time series with predominant linear trends (very low DOF) can have a very high ρ (positive or negative), which can hardly be construed as an evidence for meaningful physical relationship. Similarly, two smooth time series with merely a few undulations of similar timescale (hence low DOF) can easily have a high apparent ρ just by fortuity especially if a time shift is allowed. On the other hand, two very “erratic” or, say, white time series (hence high DOF) can prove to be significantly correlated even though their apparent ρ value is only moderate. The key parameter of relevance here is the DOF: A relatively high ρ for low DOF may be less significant than a relatively low ρ at high DOF and vice versa.”