# Nonlinear Generation of Power Spectrum : ENSO

Something I learned early on in my research career is that complicated frequency spectra can be generated from simple repeating structures. Consider the spatial frequency spectrum that appears as the diffraction pattern of a crystal lattice. Below is a reflected electron diffraction pattern of a hexagonally reconstructed surface of a silicon (Si) single crystal with a lead (Pb) adlayer ((a) and (b) are different alignments of the beam direction with respect to the lattice). Suffice it to say, there is enough information in the patterns to reverse engineer the structure of the surface, shown in (c).

Now consider the ENSO pattern. At first glance, neither the time-series signal nor the Fourier series power spectrum appears to be produced by anything periodically regular. Even so, let’s assume that the underlying pattern is tidally regular, composed of the expected fortnightly 13.66 day tropical/synodic cycle and the monthly 27.55 day anomalistic cycle synchronized by an annual impulse. Then the forcing power spectrum of f(t) looks like the RED trace on the left side of the figure below, F(ω). Clearly those few delta spikes are not enough of a frequency spectrum to make up the empirically calculated Fourier series for the ENSO data, which comprises ~40 intricately placed peaks between 0 and 1 cycles/year, shown in BLUE.

Yet, if we modulate that with a Laplace’s Tidal Equation (LTE) solution functional g(f(t)) that has a G(ω) as in the yellow inset above (a cyclic modulation of amplitudes where g(x) is described by two distinct sine waves), then the complete ENSO spectrum is fleshed out in BLACK in the figure above. The effective g(x) is shown in the figure below, where a slower modulation is superimposed over a faster modulation.

So essentially what this suggests is that a few tidal factors modulated by two sinusoids produce enough spectral detail to easily account for the ~40 peaks in the ENSO power spectrum. It can do this because a modulating sinusoid is an efficient generator of harmonics and cross-harmonics, as the Taylor series of a sinusoid contains an infinite number of power terms.
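The harmonic-generation claim can be checked numerically with a toy example. The sketch below is not the ENSO model: the two forcing frequencies, the amplitudes, and the modulation strength of 4.0 are arbitrary assumptions chosen only so that each tone falls exactly on an FFT bin. It counts spectral peaks before and after passing a two-tone forcing through a sinusoidal modulation.

```python
import numpy as np

def peak_count(signal, rel_threshold=0.01):
    """Count local maxima in the power spectrum above rel_threshold of the top peak."""
    power = np.abs(np.fft.rfft(signal - signal.mean())) ** 2
    power /= power.max()
    interior = power[1:-1]
    is_peak = (
        (interior > power[:-2])
        & (interior > power[2:])
        & (interior > rel_threshold)
    )
    return int(is_peak.sum())

# Hypothetical two-tone forcing; 37 and 59 cycles per window are arbitrary choices
n = 4096
t = np.arange(n) / n
forcing = np.cos(2 * np.pi * 37 * t) + 0.7 * np.cos(2 * np.pi * 59 * t)

# A sinusoidal modulation g(x) = sin(k x) with an assumed strength k = 4.0
modulated = np.sin(4.0 * forcing)

n_linear = peak_count(forcing)       # only the two forcing tones
n_modulated = peak_count(modulated)  # harmonics and cross-harmonics appear
print(n_linear, n_modulated)
```

The modulated signal contains peaks at sums and differences of multiples of the two forcing frequencies, which is the same mechanism that populates the satellites around the forcing harmonics.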

To see this process in action, consider the following three figures, which feature a slider that allows one to get an intuitive feel for how the LTE modulation adds richness via harmonics in the power spectrum.

1. Start with a mild LTE modulation and start to increase it as in the figure below. A few harmonics begin to emerge as satellites surrounding the forcing harmonics in RED.

2. Next, increase the LTE modulation so that it models the slower sinusoid; more harmonics emerge.

3. Then add the faster sinusoid, to fully populate the empirically observed ENSO spectral peaks (and matching the time series).

It appears as if by magic, but this is the power of non-linear harmonic generation. Note that the peak labeled AB amongst others is derived from the original A and B as complicated satellite-cross terms, which can be accounted for by expanding all of the terms in the Taylor’s series of the sinusoids. This can be done with some difficulty, or left as is when doing the fit via solver software.
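As a worked illustration of where a cross term such as AB comes from, assume (purely for illustration) a forcing with just two components of amplitudes $A$ and $B$ and a single modulation $\sin(k f)$, where $k$ is a hypothetical modulation strength:

$f(t) = A\cos(\omega_A t) + B\cos(\omega_B t)$

$\sin(k f) = k f - \frac{(k f)^3}{3!} + \frac{(k f)^5}{5!} - \cdots$

The cubic term already mixes the two components. One of the four pieces of $f^3$ expands as

$3A^2 B \cos^2(\omega_A t)\cos(\omega_B t) = \frac{3A^2 B}{2}\cos(\omega_B t) + \frac{3A^2 B}{4}\left[\cos((2\omega_A+\omega_B)t) + \cos((2\omega_A-\omega_B)t)\right]$

so satellites at $2\omega_A \pm \omega_B$ appear even though neither frequency is present in the forcing. Higher-order terms add further combinations in the same way.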

To complete the circle, it’s likely that being exposed to mind-blowing Fourier series early on makes Fourier analysis of climate data less intimidating, as one can apply all the tricks-of-the-trade, which, alas, are considered routine in other disciplines.

Individual charts

https://imagizer.imageshack.com/img922/7013/VRro0m.png



# The Search for Order

For the LTE formulation along the equator, the analytical solution reduces to g(f(t)), where g(x) is a periodic function. Without knowing what g(x) is, we can use the frequency-domain entropy (spectral entropy) of the Fourier series that maps an estimated forcing amplitude x=f(t) to a measured climate index time series such as ENSO. The frequency-domain entropy is the sum or integral, in reciprocal space, of the Shannon entropy term –I(f)·ln(I(f)) over the frequency range, where I(f) is the normalized power spectral (frequency) density of the mapping from the modeled forcing to the time-series waveform sample.

This measures the entropy or degree of disorder of the mapping. So to maximize the degree of order, we minimize this entropy value.
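A minimal sketch of such an entropy metric follows. It is not the code from the LTE fitting software. It makes one assumption beyond the text, namely that the (x, y) pairs are resampled onto a uniform x grid before the Fourier transform. The forcing, the test modulation sin(6x), and all names are hypothetical.

```python
import numpy as np

def spectral_entropy(x, y, n_bins=256):
    """Shannon entropy of the normalized power spectrum of y viewed as a function of x."""
    order = np.argsort(x)
    grid = np.linspace(x.min(), x.max(), n_bins)
    # Assumption: linear interpolation onto a uniform grid in x
    y_on_grid = np.interp(grid, x[order], y[order])
    power = np.abs(np.fft.rfft(y_on_grid - y_on_grid.mean())) ** 2
    p = power / power.sum()
    p = p[p > 0]  # skip empty bins so the logarithm stays finite
    return float(-(p * np.log(p)).sum())

# Hypothetical forcing and a known periodic modulation g(x) = sin(6 x)
t = np.linspace(0.0, 100.0, 4000)
x_true = 1.5 * np.sin(0.9 * t) + 1.0 * np.sin(2.3 * t)
y = np.sin(6.0 * x_true)

# A wrong forcing guess: the same values in scrambled order
rng = np.random.default_rng(0)
x_wrong = rng.permutation(x_true)

ordered = spectral_entropy(x_true, y)
disordered = spectral_entropy(x_wrong, y)
print(ordered, disordered)
```

With the correct forcing the mapping is a clean sinusoid in x and the entropy is low. With the scrambled forcing the mapping looks like noise and the entropy is high, so a solver can minimize this single number while adjusting the forcing.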

This calculated entropy is a single scalar metric that eliminates the need for evaluating various cyclic g(x) patterns to achieve the best fit. Instead, it points to a highly ordered spectrum (top panel in the above figure), whose delta spikes can then be reverse engineered to deduce the primary frequency components arising from the LTE modulation factor g(x).

The approach works particularly well once the spectral spikes begin to emerge from the background. In terms of a physical picture, what is actually emerging are the principal standing-wave solutions for particular wavenumbers. One can see this in the LTE modulation spectrum below, where there is a spike at a wavenumber of 1.5 and one at around 10 in panel A (the sine and cosine spectra are shown separately rather than combined in quadrature into the spectral intensity). This is then reverse engineered as a fit to the actual LTE modulation g(x) in panel B. Panel D is the tidal forcing x=f(t) that minimized the Shannon entropy, thus creating the final fit g(f(t)) in panel C when the LTE modulation is applied to the forcing.

The approach does work, which is quite a boon to the efficiency of iterative fitting towards a solution, as it reduces the number of degrees of freedom (DOF) involved in the calculation. Prior to this, a guess for the LTE modulation was required and the iterative fit would need to evolve towards the optimal modulation periods. In other words, either approach works, but the entropy approach may provide a quicker and more efficient path to discovering the underlying standing-wave order.

I will eventually add this to the LTE fitting software distro available on GitHub. This may also be applicable to other measures of entropy such as Tsallis, Rényi, multi-scale, and perhaps bispectral entropy, and I will add those to the conventional Shannon entropy measure as needed.

# Complexity vs Simplicity in Geophysics

In our book Mathematical GeoEnergy, several geophysical processes are modeled — from conventional tides to ENSO. Each model fits the data applying a concise physics-derived algorithm — the key being the algorithm’s conciseness but not necessarily subjective intuitiveness.

I’ve followed Gell-Mann’s work on complexity over the years and so will try applying his qualitative effective complexity approach to characterize the simplicity of the geophysics models described in the book and on this blog.

Here’s a breakdown from least complex to most complex

# Asymptotic QBO Period

The modeled QBO cycle is directly related to the nodal (draconic) lunar cycle physically aliased against the annual cycle. The empirical cycle period is best estimated by tracking the peak acceleration of the QBO velocity time-series, as this acceleration (the first derivative of the velocity) shows a sharp peak. This value should asymptotically approach a 2.368 year period over the long term. Since the recent data from the main QBO repository provides an additional acceleration peak from the past month, now is as good a time as any to analyze the cumulative data.
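The aliasing arithmetic behind the asymptotic period can be sketched in a few lines. The function name is hypothetical, and the draconic month (27.2122 days) and year length (365.242 days) are standard approximate values that I am assuming rather than the exact constants used in the model, so the last digits may differ slightly from the 2.368 year value quoted above.

```python
def aliased_period_years(cycle_days, year_days=365.242):
    """Period (in years) seen when a short cycle is sampled once per year."""
    cycles_per_year = year_days / cycle_days
    frac = cycles_per_year % 1.0
    # Fold onto the nearest whole number of cycles per year
    folded = min(frac, 1.0 - frac)
    if folded == 0.0:
        raise ValueError("cycle is an exact sub-multiple of the year; no alias")
    return 1.0 / folded

# Draconic month aliased against an annual impulse
qbo_period = aliased_period_years(27.2122)
print(round(qbo_period, 2))
```

Only the fractional part of the number of draconic cycles per year survives the annual sampling, which is why a 27 day cycle can show up as a period of well over two years.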

The new data-point provides a longer period which compensated for some recent shorter periods, such that the cumulative mean lies right on the asymptotic line. The jitter observed is explainable in terms of the model, as acceleration peaks are more prone to align close to an annual impulse. But the accumulated mean period is still aligned to the draconic aliasing with this annual impulse. As more data points come in over the coming decades, the mean should vary less and less from the asymptotic value.

The fit to QBO using all the data save for the last available data point is shown below.  Extrapolating beyond the green arrow, we should see an uptick according to the red waveform.

Adding the recent data-point and the blue waveform does follow the model.

There was a flurry of recent discussion on the QBO anomaly of 2016 (shown as a split peak above), which implied that perhaps the QBO would be permanently disrupted from its long-standing pattern. A more plausible explanation may be that the QBO pattern was not simply wandering from its assumed perfectly cyclic path but is instead following a predictable but jittery track that is a combination of the (physically-aliased) annual impulse-synchronized Draconic cycle together with a sensitivity to variations in the draconic cycle itself. The latter calibration is shown below, based on NASA ephemeris.

This is the QBO spectral decomposition, showing signal strength centered on the fundamental aliased Draconic value, both for the data and for the model.

The main scientist, Prof. Richard Lindzen, behind the consensus QBO model has been recently introduced here as being “considered the most distinguished living climate scientist on the planet”.  In his presentation criticizing AGW science [1], Lindzen claimed that the climate oscillates due to a steady uniform force, much like a violin oscillates when the steady force of a bow is drawn across its strings.  An analogy perhaps better suited to reality is that the violin is being played like a drum. Resonance is more of a decoration to the beat itself.

[1] Professor Richard Lindzen slammed conventional global warming thinking as ‘nonsense’ in a lecture for the Global Warming Policy Foundation on Monday. ‘An implausible conjecture backed by false evidence and repeated incessantly … is used to promote the overturn of industrial civilization,’ he said in London. (GWPF)

# NAO

The challenge of validating models of climate oscillations such as ENSO and QBO rests primarily in our inability to perform controlled experiments. Because of this shortcoming, we can either (1) make predictions of future behavior and validate via the wait-and-see process, or (2) creatively apply techniques such as cross-validation on currently available data. The first is a non-starter because it’s obviously pointless to wait decades for validation results to confirm a model, when it’s entirely possible to do something today via the second approach.

There are a variety of ways to perform model cross-validation on measured data.

In its original and conventional formulation, cross-validation works by checking one interval of time-series against another, typically by training on one interval and then validating on an orthogonal interval.
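As a generic illustration of interval-based cross-validation (not the LTE fitting procedure itself), the sketch below fits sinusoids of assumed periods to a training interval by least squares and reports the correlation coefficient on the held-out interval. The synthetic series, the three periods, the noise level, and the 1950 split are all hypothetical.

```python
import numpy as np

def design_matrix(t, periods):
    """Columns: constant, then a sine and cosine for each period."""
    cols = [np.ones_like(t)]
    for p in periods:
        cols.append(np.sin(2 * np.pi * t / p))
        cols.append(np.cos(2 * np.pi * t / p))
    return np.column_stack(cols)

def cross_validate(t, y, periods, split):
    """Train on t < split, return correlation on t >= split."""
    train = t < split
    coeffs, *_ = np.linalg.lstsq(
        design_matrix(t[train], periods), y[train], rcond=None
    )
    prediction = design_matrix(t[~train], periods) @ coeffs
    return float(np.corrcoef(prediction, y[~train])[0, 1])

# Synthetic monthly series with hypothetical periods (in years) plus noise
rng = np.random.default_rng(1)
t = np.arange(1880, 2020, 1 / 12)
periods = [2.37, 3.8, 6.5]
y = sum(np.sin(2 * np.pi * t / p + i) for i, p in enumerate(periods))
y = y + 0.5 * rng.standard_normal(t.size)

cc = cross_validate(t, y, periods, split=1950.0)
print(round(cc, 2))
```

Because the coefficients never see the validation interval, a high correlation there indicates that the fitted structure persists outside the training window rather than being an artifact of over-fitting.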

Another way to cross-validate is to compare two sets of time-series data collected on behaviors that are potentially related. For example, in the case of ocean tidal data that can be collected and compared across spatially separated geographic regions, the sea-level-height (SLH) time-series data will not necessarily be correlated, but the underlying lunar and solar forcing factors will be closely aligned give or take a phase factor. This is intuitively understandable since the two locations share a common-mode signal forcing due to the gravitational pull of the moon and sun, with the differences in response due to the geographic location and local spatial topology and boundary conditions. For tides, this is a consensus understanding and tidal prediction algorithms have stood the test of time.

In the previous post, cross-validation on distinct data sets was evaluated assuming common-mode lunisolar forcing. One cross-validation was done between the ENSO time-series and the AMO time-series. Another cross-validation was performed for ENSO against PDO. The underlying common-mode lunisolar forcings were highly correlated as shown in the featured figure.  The LTE spatial wave-number weightings were the primary discriminator for the model fit. This model is described in detail in the book Mathematical GeoEnergy to be published at the end of the year by Wiley.

Another common-mode cross-validation possible is between ENSO and QBO, but in this case it is primarily in the Draconic nodal lunar factor — the cyclic forcing that appears to govern the regular oscillations of QBO.  Below is the Draconic constituent comparison for QBO and the ENSO.

The QBO and ENSO models only show a common-mode correlated response with respect to the Draconic forcing. The Draconic forcing drives the quasi-periodicity of the QBO cycles, as can be seen in the lower right panel, with a small training window.

This cross-correlation technique can be extended to what appears to be an extremely erratic measure, the North Atlantic Oscillation (NAO).

Like the SOI measure for ENSO, the NAO is originally derived from a pressure dipole measured at two separate locations, but in this case north of the equator. Given the high frequency of the oscillations, a good assumption is that the spatial wavenumber factors are much higher than is required to fit ENSO. That turned out to be the case, as evidenced by the figure below.

ENSO vs NAO cross-validation

Both SOI and NAO are noisy time-series, with the NAO appearing very noisy, yet the lunisolar constituent forcings are highly synchronized, as shown by the correlations in the lower panel. In particular, summing the Anomalistic and Solar constituent factors together improves the correlation markedly, because each of those influences the other via the lunar-solar mutual gravitational attraction. The iterative fitting process adjusts each of the factors independently, yet the net result compensates the counteracting amplitudes so that the net common-mode factor is essentially the same for ENSO and NAO (see lower-right correlation labelled Anomalistic+Solar).

Since the NAO has high-frequency components, we can also perform a conventional cross-validation across orthogonal intervals. The validation interval below is for the years between 1960 and 1990, and even though the training intervals were aggressively over-fit, the correlation between the model and data is still visible in those 30 years.

NAO model fit with validation spanning 1960 to 1990

Over the course of time spent modeling ENSO, the effort that went into fitting to NAO was a fraction of the original time. This is largely due to the fact that the temporal lunisolar forcing only needed to be tweaked to match other climate indices, and the iteration over the topological spatial factors quickly converges.

Many more cross-validation techniques are available for NAO, since there are different flavors of NAO indices available corresponding to different Atlantic locations, and spanning back to the 1800s.

# Last post on ENSO

The last of the ENSO charts.

This is how conventional tidal prediction is done:

Note how well it does in extrapolating a projection from a training interval.

This is an ENSO model fit to SOI data using an analytical solution to Navier-Stokes. The same algorithm is used to solve for the optimal forcing as in the tidal analysis solution above, but applying the annual solar cycle and monthly/fortnightly lunar cycles instead of the diurnal and semi-diurnal cycle.

The time scale transitions from a daily modulation to a much longer modulation due to the long-period tidal factors being invoked.

Next is an expanded view, with the correlation coefficient of 0.73:

This is a fit trained on the 1880-1950 interval (CC=0.76) and cross-validated on the post-1950 data.

This is a fit trained on the post-1950 interval (CC=0.77) and cross-validated on the 1880-1950 data.

Like conventional tidal prediction, very little over-fitting is observed. Most of what is considered noise in the SOI data is actually the tidal forcing signal. Not much more to say, except for others to refine.

Thanks to Kevin and Keith for all their help, which will be remembered.


# GC41B-1022: Biennial-Aligned Lunisolar-Forcing of ENSO: Implications for Simplified Climate Models

In the last month, two of the great citizen scientists that I will be forever personally grateful for have passed away. If anyone has followed climate science discussions on blogs and social media, you probably have seen their contributions.

Keith Pickering was an expert on computer science, astrophysics, energy, and history from my neck of the woods in Minnesota. He helped me so much in working out orbital calculations when I was first looking at lunar correlations. He provided source code that he developed and it was a great help to get up to speed. He was always there to tweet any progress made. Thanks Keith.

Kevin O’Neill was a metrologist and an analysis whiz from Wisconsin. In the weeks before he passed, he told me that he had extra free time to help out with ENSO analysis. He wanted to use his remaining time to help out with the solver computations. I could not believe the effort he put into his spreadsheet, and it really motivated me to spend more time validating the model. He was up all the time working on it because he was unable to lie down. Kevin was also there to promote the research on other blogs, right to the end. Thanks Kevin.

There really aren’t too many people willing to spend time working analysis on a scientific forum, and these two exemplified what it takes to really contribute to the advancement of ideas. Like us, they were not climate science insiders and so will only get credit if we remember them.

# Machine Learning and the Climate Sciences

I’ve been applying equal doses of machine learning (and knowledge-based artificial intelligence in general) and physics in my climate research since day one. Next month on December 12, I will be presenting Knowledge-Based Environmental Context Modeling at the AGU meeting, which will cover these topics within the earth sciences realm:

Table 1: Technical approach to knowledge-based model building for the earth sciences

In my opinion, machine learning likely will eventually find all the patterns that appear in climate time-series but with various degrees of human assistance.

“Vipin Kumar, a computer scientist at the University of Minnesota in Minneapolis, has used machine learning to create algorithms for monitoring forest fires and assessing deforestation. When his team tasked a computer with learning to identify air-pressure patterns called teleconnections, such as the El Niño weather pattern, the algorithm found a previously unrecognized example over the Tasman Sea.”

In terms of the ENSO pattern, I believe that machine learning through tools such as Eureqa could have found the underlying lunisolar forcing pattern, but would have struggled mightily to break through the complexity barrier. In this case, the complexity barrier is in (1) discovering a biennial modulation which splits all the spectral components and (2) discovering the modifications to the lunar cycles from a strictly sinusoidal pattern.

The way that Eureqa would have found this pattern would be through its symbolic regression algorithm (which falls under the first row in Table 1 shown above). It essentially would start its machine learning search by testing various combinations of sines and cosines and capturing the most highly correlated combinations for further expansion. As it expands the combinations, the algorithm would try to reduce complexity by applying trigonometric identities such as this:

$\sin(\alpha \pm \beta) = \sin\alpha \cos\beta \pm \cos\alpha \sin\beta$

After a while, the algorithm will slow down under the weight of the combinatorial complexity of the search, and then the analyst would need to choose promising candidates from the complexity versus best-fit Pareto front. At this point one would need to apply knowledge of physical laws or mathematical heuristics which would lead to a potentially valid model.

So, in the case of the ENSO model, Eureqa could have discovered (1) the biennial modulation by reducing sets of trigonometric identities, and perhaps (2) the second-order modifications to the sinusoidal functions by applying a sin(A sin()) frequency modulation (which it is capable of), or (3) it could have been fed a differential equation structure to provide a hint to a solution. But a human got there first by applying prior knowledge of signal processing and of the details in the orbital lunar cycles.

Yet as the Scientific American article suggests, that will likely not be the case in the future as the algorithms continue to improve and update their knowledge base with laws of physics.

This more sophisticated kind of reasoning involves the refined use of the other elements of Table 1.  For example, a more elaborate algorithm could have lifted an entire abstraction level out of a symbolic grouping and thus reduced its complexity. Or it could try to determine whether a behavior was stochastic or deterministic.  The next generation of these tools will be linked to knowledge-bases filled with physics patterns that are organized for searching and reasoning tasks. These will relate the problem under study to potential solutions automatically.

# The ENSO Forcing Potential – Cheaper, Faster, and Better

Following up on the last post on the ENSO forcing, this note elaborates on the math.  The tidal gravitational forcing function used follows an inverse power-law dependence, where a(t) is the anomalistic lunar distance and d(t) is the draconic or nodal perturbation to the distance.

$F(t) \propto \left(\frac{1}{(R_0 + a(t) + d(t))^2}\right)'$

Note the prime indicating that the forcing applied is the derivative of the conventional inverse squared Newtonian attraction. This generates an inverse cubic formulation corresponding to the consensus analysis describing a differential tidal force:

$F(t) \propto -\frac{a'(t)+d'(t)}{(R_0 + a(t) + d(t))^3}$

For a combination of monthly and fortnightly sinusoidal terms for a(t) and d(t) (suitably modified for nonlinear nodal and perigean corrections due to the synodic/tropical cycle), the search routine rapidly converges to an optimal ENSO fit. It does this more quickly than the harmonic analysis, which requires at least double the unknowns for the additional higher-order factors needed to capture the tidally forced response waveform. One of the keys is to collect the chain rule terms a'(t) and d'(t) in the numerator; without these, the necessary mixed terms which multiply the anomalistic and draconic signals do not emerge strongly.
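The inverse-cubic form can be checked numerically against a finite-difference derivative of the inverse-square term. This is a sketch under simplifying assumptions: a(t) and d(t) are single cosines at the anomalistic (27.55 day) and draconic (27.2122 day) periods, the amplitudes and the normalized R_0 = 1 are arbitrary, and the nonlinear nodal and perigean corrections mentioned above are omitted. The function and variable names are hypothetical.

```python
import numpy as np

def tidal_forcing(t, r0, a, da, d, dd):
    """Differential forcing -(a' + d') / (R0 + a + d)^3, up to a constant factor."""
    return -(da(t) + dd(t)) / (r0 + a(t) + d(t)) ** 3

# Assumed single-sinusoid perturbations (time in days, arbitrary amplitudes)
w_a = 2 * np.pi / 27.55    # anomalistic month
w_d = 2 * np.pi / 27.2122  # draconic month
a = lambda t: 0.05 * np.cos(w_a * t)
da = lambda t: -0.05 * w_a * np.sin(w_a * t)
d = lambda t: 0.02 * np.cos(w_d * t)
dd = lambda t: -0.02 * w_d * np.sin(w_d * t)

r0 = 1.0
t = np.arange(0.0, 200.0, 0.01)
forcing = tidal_forcing(t, r0, a, da, d, dd)

# d/dt of 1/R^2 is -2 R'/R^3, so half of it should match the forcing above
inverse_square = (r0 + a(t) + d(t)) ** -2.0
numeric = 0.5 * np.gradient(inverse_square, t)
max_error = float(np.max(np.abs(forcing[1:-1] - numeric[1:-1])))
print(max_error)
```

Expanding the denominator as a series in a(t) + d(t) and multiplying by the numerator gives products such as a'(t)·d(t), which is where the mixed anomalistic-draconic terms come from.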

As before, a strictly biennial modulation needs to be applied to this forcing to capture the measured ENSO dynamics — this is a period-doubling pattern observed in hydrodynamic systems with a strong fundamental (in this case annual) and is climatologically explained by a persistent year-to-year regenerative feedback in the SLP and SST anomalies.

Here is the model fit for training from 1880-1980, with the extrapolated test region post-1980 showing a good correlation.

The geophysics is now canonically formulated, providing (1) a simpler and more concise expression, leading to (2) a more efficient computational solution, (3) less possibility of over-fitting, and (4) ultimately generating a much better correlation. Alternatively, stated in modeling terms, the resultant information metric is improved by reducing the complexity and improving the correlation — the vaunted  cheaper, faster, and better solution. Or, in other words: get the physics right, and all else follows.

# Reverse Engineering the Moon’s Orbit from ENSO Behavior

With an ideal tidal analysis, one should be able to apply the gravitational forcing of the lunar orbit¹ and use that as input to solve Laplace’s tidal equations. This would generate tidal heights directly. But due to aleatory uncertainty with respect to other factors, it becomes much more practical to perform a harmonic analysis on the constituent tidal frequencies. This essentially allows an empirical fit to measured tidal heights over a training interval, which is then used to extrapolate the behavior over other intervals. This works very well for conventional tidal analysis.

For ENSO, we need to make the same decision: Do we attempt to work the detailed lunar forcing into the formulation or do we resort to an empirical bottom-up harmonic analysis? What we have been doing so far is a variation of a harmonic analysis that we verified here. This is an expansion of the lunar long-period tidal periods into their harmonic factors. So that works well. But could a geophysical model work too?