The Gist site on GitHub allows you to comment on posts very easily. For example, images of charts can be pasted in the discussion area. Also snippets of code can be added and updated, which is useful for neural net evaluation. The following is a link to an initial Gist area for evaluating LTE models.

# AMO

# Sub(Surface)Stack

I signed up for a SubStack account a while ago and published two articles on it (SubSurface) in the last week.

- https://pukite.substack.com/p/machine-learning-validates-the-enso
- https://pukite.substack.com/p/machine-learning-validates-the-amo

The SubStack authoring interface has good math equation mark-up, convenient graphics embedding, and an excellent footnoting system. On first pass, it only lacks control over font color.

The articles are focused on applying neural network cross-validation to ENSO and AMO modeling, as suggested previously. I haven’t completely explored the configuration space, but one aspect that may be becoming clear is the value of wavelet neural networks (WNN) for time-series analysis. The WNN approach seems much more amenable to extracting sinusoidal modulation of the input-to-output mapping, trained on a rather short interval and then cross-validated out-of-band. The Mexican hat wavelet (2nd derivative of a Gaussian) as an activation function in particular locks in quickly to an LTE modulation that took longer to find with the custom search software I have developed at GitHub. I think the reason for the efficiency is that it’s optimizing to a Taylor series expansion of the input terms, a classic nonlinear expansion that NNs excel at.
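As a rough illustration of the activation function mentioned above, here is a minimal sketch of a Mexican hat (Ricker) wavelet and its analytic derivative. The unit width and the function names are assumptions for illustration; this is not the code used for the training runs.

```python
import numpy as np

def mexican_hat(x):
    """Mexican hat wavelet: the negated 2nd derivative of exp(-x^2/2)."""
    x = np.asarray(x, dtype=float)
    return (1.0 - x**2) * np.exp(-0.5 * x**2)

def mexican_hat_grad(x):
    """Analytic derivative of mexican_hat, as backpropagation would need."""
    x = np.asarray(x, dtype=float)
    return (x**3 - 3.0 * x) * np.exp(-0.5 * x**2)

# The activation peaks at 1 for x = 0 and crosses zero at x = +/-1.
activations = mexican_hat(np.linspace(-4.0, 4.0, 9))
```

Unlike a monotonic activation such as tanh, this one is localized and oscillatory, which may be part of why it maps readily onto a sinusoidal modulation.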

The following training run using the Mexican hat activation and ADAM optimizer is an eye-opener, as it achieved an admirable fit within a minute of computation.

The **GREEN** on **BLUE** is **training** on NINO4 **data** over two end-point intervals, with the **RED** cross-validation over the out-of-band region. The correlation coefficient is 0.34, which is impressive considering the nature of the waveform. Clearly there is similarity.
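The split described above (train on the two end intervals, score on the middle) can be sketched as follows. The index bounds, function names, and synthetic series are hypothetical; only the Pearson correlation over the held-out span reflects what is reported here.

```python
import numpy as np

def training_mask(n, holdout_start, holdout_stop):
    """Boolean mask that is True on the two end-point training intervals."""
    mask = np.ones(n, dtype=bool)
    mask[holdout_start:holdout_stop] = False
    return mask

def out_of_band_correlation(data, model, holdout_start, holdout_stop):
    """Pearson correlation between data and model over the held-out span only."""
    data = np.asarray(data, dtype=float)
    model = np.asarray(model, dtype=float)
    held_out = slice(holdout_start, holdout_stop)
    return float(np.corrcoef(data[held_out], model[held_out])[0, 1])

# Synthetic stand-ins for an index and a model output (not real NINO4 values).
rng = np.random.default_rng(0)
series = rng.standard_normal(120)
model_output = series + rng.standard_normal(120)
score = out_of_band_correlation(series, model_output, 40, 80)
```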

Moreover, if we compare the model fit to data via the WNN against the LTE harmonics approach, we can also see where the two fare equally poorly. Below in the outer frame is the NINO4 LTE fit with the **YELLOW** arrow pointing downward at a discrepancy (a peak in the data not resolved in the fit). In comparison, the yellow-bordered inset shows the same discrepancy on the WNN training run. So the fingerprints essentially match with no coaching.

The neural net chain is somewhat deep with 6 layers, but I think this is needed to expand to the higher-order terms in the Taylor series. In the directed graph below, L01 is the input tidal forcing and L02 is the time axis (with an initial very low weighting).

It also appears temporally stationary across the entire time-span, so that the WNN temporal contribution appears minimal.

In a previous fit the horizontal striations (indicating modulation factor at a forcing level) matched with the LTE model, providing further evidence that the WNN was mapping to an optimal modulation.

The other Sub(Surface)Stack article is on the AMO, which also reveals promising results. This is a video of the training in action.

# Limits of Predictability?

A decade-old research article on modeling equatorial waves includes this introductory passage:

“Nonlinear aspects plays a major role in the understanding of fluid flows. The distinctive fact that in nonlinear problems cause and effect are not proportional opens up the possibility that a small variation in an input quantity causes a considerable change in the response of the system. Often this type of complication causes nonlinear problems to elude exact treatment.”

https://doi.org/10.1029/2012JC007879

From my experience, if it is relatively easy to generate a fit to data via a nonlinear model, then it may also be easy to diverge from the fit with a small structural perturbation, or to come up with an alternative fit with a different set of parameters. This makes it difficult to establish an iron-clad cross-validation.

This doesn’t mean we don’t keep trying. Applying the dLOD calibration approach to an applied forcing, we can model ENSO via the NINO34 climate index across the available data range (in YELLOW) in the figure below (parameters here)

The lower right box is a **modulo-2π** reduction of the tidal forcing as an input to the sinusoidal LTE modulation, using the decline rate (per month) as the divisor. Why this works so well *per month* in contrast to *per year* (where an annual cycle would make sense) is not clear. It is also fascinating in that this is a form of *amplitude* aliasing analogous to the *frequency* aliasing that also applies a *modulo*-2*π* folding reduction to the tidal periods less than the Nyquist monthly sampling criteria. There may be a time-amplitude duality or Lagrangian particle-relabeling in operation that has at its central core the trivial solutions of Navier-Stokes or Euler differential equations when all segments of forcing are flat or have a linear slope. Trivial in the sense that when a forcing is flat or has a 1st-order slope, the 2nd derivatives due to divergence in the differential equations vanish (quasi-static). This means that only the discontinuities, which occur concurrently with the annual ENSO predictability barrier, need to be treated carefully (the *modulo*-2*π* folding could be a topological Berry phase jump?). Yet, if these transitions are enhanced by metastable interface instabilities as during thermocline turn-over then the differential equation conditions could be transiently relaxed via a vanishing density difference. Much happens during a turn-over, but it doesn’t last long, perhaps indicating a geometric phase. MV Berry also discusses phase changes in the context of amphidromic tidal singularities here.

Suffice to say that the topological properties of reduced dimension volumes and at interfaces remain mysterious. The main takeaway is that a working NINO34-fitted ENSO model is produced, and if not here then somewhere else a machine-learning algorithm will discover it.
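The modulo-2π folding discussed above rests on a simple identity: a sinusoidal modulation of the forcing is unchanged if the forcing is first reduced modulo 2π divided by the modulation wavenumber. A minimal sketch, where the wavenumber `k`, the phase, and the function names are hypothetical and `k` is assumed positive:

```python
import numpy as np

def lte_modulation(forcing, k, phase=0.0):
    """Sinusoidal modulation sin(k * forcing + phase) of a forcing series."""
    return np.sin(k * np.asarray(forcing, dtype=float) + phase)

def folded_forcing(forcing, k):
    """Reduce the forcing modulo 2*pi/k, i.e. the amplitude folding (k > 0)."""
    return np.mod(np.asarray(forcing, dtype=float), 2.0 * np.pi / k)

# The modulated output is identical whether or not the forcing is folded first.
forcing = np.linspace(-10.0, 10.0, 201)
direct = lte_modulation(forcing, k=3.0, phase=0.4)
folded = lte_modulation(folded_forcing(forcing, k=3.0), k=3.0, phase=0.4)
agree = np.allclose(direct, folded)
```

This is the amplitude counterpart of frequency aliasing: forcing levels that differ by a multiple of 2π/k are indistinguishable at the output.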

The key next step is to apply the same tidal forcing to an AMO model, taking care not to change the tidal factors enough to produce a highly sensitive nonlinear response in the LTE model. So we retain an excluded interval from training (in YELLOW below) and only adjust the LTE parameters for the region surrounding this zone during the fitting process (parameters here).

The cross-validation agreement is breathtakingly good in the interval excluded from training (out-of-band). There is zero cross-correlation between the NINO34 and AMO time-series to begin with, so this is likely revealing the true emergent characteristics of a tidally forced mechanism.

As usual, all the introductory work is covered in Mathematical Geoenergy:

- https://agupubs.onlinelibrary.wiley.com/doi/10.1002/9781119434351.ch11 (wind)
- https://agupubs.onlinelibrary.wiley.com/doi/10.1002/9781119434351.ch12 (wave)

A community peer-review contributed to a recent QBO article is here and PDF here. The same question applies to QBO as ENSO or AMO: is it possible to predict future behavior? Is the QBO model less sensitive to input since the nonlinear aspect is weaker?

*Added several weeks later*: The monograph “Introduction to Geophysical Fluid Dynamics: Physical and Numerical Aspects” is available as a PDF. Ignoring higher-order time derivatives is key to solving LTE.

Note the cite to Billy Kessler

# Climate Dipoles as crystal-crypto

Climate scientists as a general rule don’t understand crystallography deeply (I do). They also don’t understand cryptography (that, I don’t understand deeply either). Yet, as the last post indicated, knowledge of these two scientific domains is essential to decoding dipoles such as the El Nino Southern Oscillation (ENSO). Crystallography is basically an exercise in signal processing where one analyzes electron & x-ray diffraction patterns to be able to decode structure at the atomic level. It’s mathematical and not for people unaccustomed to working outside of *real space*, as diffraction acts to transform the world of 3-D into a *reciprocal space* where the dimensions are inverted and common intuition fails.

Cryptography in its common use applies a *key* to enable a user to decode a scrambled data stream according to the instruction pattern embedded within the key. If diffraction-based crystallography required a complex unknown key to decode from reciprocal space, it would seem hopeless, but that’s exactly what we are dealing with when trying to decipher climate dipole time-series: we don’t know what the decoding key is. If that’s the case, no wonder climate science has never made any progress in modeling ENSO, as it’s an existentially difficult problem.

The breakthrough is in identifying that an analytical solution to Laplace’s tidal equations (LTE) provides a crystallography+cryptography analog in which we can make some headway. The challenge is in identifying the decoding key (an unknown forcing) that would make the reciprocal-space inversion process (required for LTE demodulation) straightforward.

According to the LTE model, the forcing has to be a combination of tidal factors mixed with a seasonal cycle (stages 1 & 2 in the figure above) that would enable the last stage (Fourier series a la diffraction inversion) to be matched to empirical observations of a climate dipole such as ENSO.

The forcing key used in an ENSO model was described in the last post as a predominately *Mm*-based lunar tidal factorization as shown below, leading to an excellent match to the NINO34 time series after a minimally-complex LTE modulation is applied.

Critics might say, and justifiably so, that this is potentially an over-fit to achieve that good a model-to-data correlation. There are too many degrees of freedom (DOF) in a tidal factorization, which would allow a spuriously good fit depending on the computational effort applied (see **Reference 1** at the end of this post).

Yet, if the forcing key used in the ENSO model was **reused as is** in fitting an independent climate dipole, such as the AMO, and this same key required little effort in modeling AMO, then the over-fitting criticism is invalidated. What’s left to perform is finding a distinct low-DOF LTE modulation to match the AMO time-series as shown below.

This is an example of a *common-mode cross-validation* of an LTE model that I originally suggested in an AGU paper from 2018. Invalidating this kind of analysis is exceedingly difficult as it requires one to show that the erratic cycling of AMO can be randomly created by a few DOF. In fact, reproducing the dozens of AMO peaks and valleys shown with only a few DOF of sinusoidal factors is virtually impossible to achieve by chance. I leave it to others to debunk via an independent analysis.

addendum: LTE modulation comparisons, essentially the wavenumber of the diffraction signal:

This is the *forcing* power spectrum showing the principal *Mm* tidal factor term at period 3.9 years, with nearly identical spectral profiles for both ENSO and AMO.
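One plausible reading of the 3.9-year period is that it is the monthly *Mm* cycle folded against an annual sampling or impulse. The sketch below works through that arithmetic. The anomalistic month (27.55455 days) and tropical year (365.2422 days) are standard astronomical values, while the function name and the assumption that the fold is against exactly one cycle per year are mine.

```python
def aliased_period_years(period_days, year_days=365.2422):
    """Period (in years) of a cycle after folding against an annual sampling."""
    cycles_per_year = year_days / period_days
    # Distance to the nearest whole number of cycles per year is the alias frequency.
    folded = abs(cycles_per_year - round(cycles_per_year))
    if folded == 0.0:
        return float("inf")  # exactly commensurate with the year: no beat
    return 1.0 / folded

# Mm, the anomalistic month, folds to a period of roughly 3.9 years.
mm_alias = aliased_period_years(27.55455)
```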

According to the precepts of cryptography, decoding becomes straightforward once one knows the key. Similarly, nature often closely guards its secrets, and until the key is known, for example as with DNA, climate scientists will continue to flounder.

**References**

- Chao, B. F., & Chung, C. H. (2019). On Estimating the Cross Correlation and Least Squares Fit of One Data Set to Another With Time Shift. *Earth and Space Science*, 6, 1409–1415. https://doi.org/10.1029/2018EA000548

“*For example, two time series with predominant linear trends (very low DOF) can have a very high ρ (positive or negative), which can hardly be construed as an evidence for meaningful physical relationship. Similarly, two smooth time series with merely a few undulations of similar timescale (hence low DOF) can easily have a high apparent ρ just by fortuity especially if a time shift is allowed. On the other hand, two very “erratic” or, say, white time series (hence high DOF) can prove to be significantly correlated even though their apparent ρ value is only moderate. The key parameter of relevance here is the DOF: A relatively high ρ for low DOF may be less significant than a relatively low ρ at high DOF and vice versa.*”
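To make the quoted point concrete, here is a minimal sketch using the textbook t-statistic for a Pearson correlation, t = ρ·sqrt(DOF / (1 − ρ²)). This is a standard significance measure, not the estimator developed by Chao & Chung, and the ρ and DOF values are illustrative assumptions only.

```python
import math

def correlation_t_statistic(rho, dof):
    """Textbook t-statistic for a correlation rho with dof degrees of freedom."""
    return rho * math.sqrt(dof / (1.0 - rho**2))

# A high rho from a smooth, low-DOF pair of series ...
t_smooth = correlation_t_statistic(0.9, 3)
# ... versus a moderate rho from an erratic, high-DOF pair.
t_erratic = correlation_t_statistic(0.34, 200)
```

With these numbers the moderate correlation has both the larger t-statistic and the larger DOF, so it is the more significant of the two, in line with the passage.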


# Overfitting+Cross-Validation: ENSO→AMO

I presented at the 2018 AGU Fall meeting on the topic of cross-validation. From those early results, I updated a fitted model comparison between the Pacific Ocean’s ENSO time-series and the Atlantic Ocean’s AMO time-series. The premise is that the tidal forcing is essentially the same in the two oceans, but that the standing-wave configuration differs. So the approach is to maintain a common-mode forcing in the two basins while only adjusting the Laplace’s tidal equation (LTE) modulation.

In case you don’t know these time series, they are completely orthogonal, so the thought that one can fit either without overfitting, let alone two sets simultaneously, is unheard of (Michael Mann doesn’t even think that the AMO is a real oscillation, based on his latest research article, “Absence of internal multidecadal and interdecadal oscillations in climate model simulations”).

This is the latest product:

Read this backwards from **H** to **A**.

**H** = The two tidal forcing inputs for ENSO and AMO, which differ really only by scale and a slight offset

**G** = The constituent tidal forcing spectrum comparison of the two, primarily the expected main constituents of the **Mf** fortnightly tide and the **Mm** monthly tide (and the **Mt** composite of **Mf** × **Mm**), amplified by an annual impulse train which creates a repeating Brillouin zone in frequency space.

**E&F** = The LTE modulation for AMO, essentially comprised of one strong high-wavenumber modulation as shown in **F**

**C&D** = The LTE modulation for ENSO, a strong low-wavenumber modulation that follows the El Nino/La Nina cycles and then a faster modulation

**B** = The AMO fitted model modulating **H** with **E**

**A** = The ENSO fitted model modulating the other **H** with **C**
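The **Mt** composite in **G** follows from the product-to-sum identity: multiplying two sinusoids produces components at the sum and difference of their frequencies. A small sketch of that arithmetic, assuming the standard periods of 13.66079 days for **Mf** (half the tropical month) and 27.55455 days for **Mm** (the anomalistic month); the function name is hypothetical.

```python
MF_DAYS = 13.66079  # Mf: half the tropical month of 27.32158 days
MM_DAYS = 27.55455  # Mm: anomalistic month

def product_periods(period_a, period_b):
    """Periods of the sum- and difference-frequency terms of a product of two sinusoids.

    Assumes period_a != period_b, otherwise the difference term has no finite period.
    """
    freq_a, freq_b = 1.0 / period_a, 1.0 / period_b
    sum_period = 1.0 / (freq_a + freq_b)
    difference_period = 1.0 / abs(freq_a - freq_b)
    return sum_period, difference_period

# The sum-frequency term of Mf x Mm lands near 9.13 days, the Mt period.
mt_days, _ = product_periods(MF_DAYS, MM_DAYS)
```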

Ordinarily, it would take eons’ worth of machine learning compute time to determine this non-linear mapping, but with knowledge of how to solve Navier-Stokes, it becomes a tractable problem.

Now, with that said, what does this have to do with cross-validation? By fitting only to the ENSO time-series, the model produced does indeed have many degrees of freedom (DOF), based on the number of tidal constituents shown in **G**. Yet, by constraining the AMO fit to require essentially the same constituent tidal forcing as for ENSO, the number of additional DOF introduced is minimal — note the strong spike value in **F**.

Since the parsimony of a model fit is judged by information criteria such as the number of DOF, which is exactly the metric used to characterize order in the previous post, it would be reasonable to assume that fitting a waveform as complex as **B** with only the additional information of **F** cross-validates the underlying common-mode model according to any information criteria metric.
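As one concrete example of such a metric, the Akaike information criterion for a least-squares fit with Gaussian residuals is AIC = n·ln(RSS/n) + 2k, where k counts the free parameters. The sketch below is illustrative only: the residual sum of squares, sample count, and parameter counts are made-up numbers, not values from the ENSO or AMO fits.

```python
import math

def aic_least_squares(rss, n_samples, n_params):
    """AIC for a least-squares fit, assuming Gaussian residuals."""
    return n_samples * math.log(rss / n_samples) + 2 * n_params

# Same fit quality, but one model reuses a shared forcing and adds few parameters.
aic_reused_forcing = aic_least_squares(rss=10.0, n_samples=100, n_params=3)
aic_refit_forcing = aic_least_squares(rss=10.0, n_samples=100, n_params=30)
penalty_saved = aic_refit_forcing - aic_reused_forcing
```

At equal residual error, every parameter that does not need to be refit lowers the AIC by 2, which is the sense in which a reused forcing makes the second fit more parsimonious.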

For further guidance, this is an informative article on model selection with regard to complexity: “A Primer for Model Selection: The Decisive Role of Model Complexity”


# The PDO

In Chapter 12 of the book, the math model behind the equatorial Pacific Ocean dipole known as the ENSO (El Nino/Southern Oscillation) was presented. Largely distinct from that, the climate index referred to as the Pacific Decadal Oscillation (PDO) occurs in the northern Pacific. As with modeling the AMO, understanding the dynamics of the PDO helps cross-validate the LTE theory for dipoles such as ENSO, as reported at the 2018 Fall Meeting of the AGU (poster). Again, if we can apply an identical forcing for PDO as for AMO and ENSO, then we can further cross-validate the LTE model. So by reusing that same forcing for an independent climate index such as PDO, we essentially remove a large number of degrees of freedom from the model and thus defend against claims of over-fitting.

# Tropical Instability Waves

In Chapter 12 of the book, we present the hypothesis that tropical instability waves (TIW) of the equatorial Pacific are the higher wavenumber (and higher frequency) companion to the lower wavenumber ENSO (El Nino/Southern Oscillation) behavior. See **Fig 1** below.

TIW wavetrains are also observed in the equatorial Atlantic, so they would be considered alongside the AMO there as the high wavenumber and low wavenumber pairing.

# The AMO

In Chapter 12 of the book, we focused on modeling the standing-wave behavior of the Pacific Ocean dipole referred to as ENSO (El Nino/Southern Oscillation). Because it has been in climate news recently, it makes sense to give equal time to the Atlantic Ocean equivalent to ENSO referred to as the Atlantic Multidecadal Oscillation (AMO). The original rationale for modeling AMO was to determine if it would help cross-validate the LTE theory for equatorial climate dipoles such as ENSO; this was reported at the 2018 Fall Meeting of the AGU (poster). The approach was similar to that applied for other dipoles such as the IOD *(which is also in the news recently with respect to Australia bush fires and in how multiple dipoles can amplify climate extremes [1])*, and so if we can apply an identical forcing for AMO as for ENSO then we can further cross-validate the LTE model. So by reusing that same forcing for an independent climate index such as AMO, we essentially remove a large number of degrees of freedom from the model and thus defend against claims of over-fitting.

# AO, PNA, & SAM Models

In Chapter 11, we developed a general formulation based on Laplace’s Tidal Equations (LTE) to aid in the analysis of standing wave climate models, focusing on the ENSO and QBO behaviors in the book. As a means of cross-validating this formulation, it makes sense to test the LTE model against other climate indices. So far we have extended this to PDO, AMO, NAO, and IOD, and to complete the set, in this post we will evaluate the northern latitude indices comprising the Arctic Oscillation/Northern Annular Mode (AO/NAM) and the Pacific North America (PNA) pattern, and the southern latitude index referred to as the Southern Annular Mode (SAM). We will first evaluate AO and PNA in comparison to their close relative NAO and then SAM …

# North Atlantic Oscillation

In Chapter 12 of the book, we derived an ENSO standing wave model based on an analytical Laplace’s Tidal Equation formulation. The results of this were so promising that they were also applied successfully to two other similar oceanic dipoles, the Atlantic Multidecadal Oscillation (AMO) and the Pacific Decadal Oscillation (PDO), which were reported at last year’s American Geophysical Union (AGU) conference. For that presentation, an initial attempt was made to model the North Atlantic Oscillation (NAO), which is a more rapid cycle, consisting of up to two periods per year, in contrast to the El Nino peaks of the ENSO time-series which occur every 2 to 7 years. Those results were somewhat inconclusive, so are revisited in the following post:
