The challenge in validating models of climate oscillations such as ENSO and QBO rests primarily in our inability to perform controlled experiments. Because of this shortcoming, we can either (1) make predictions of future behavior and validate via the wait-and-see process, or (2) creatively apply techniques such as cross-validation to currently available data. The first is a non-starter because it’s pointless to wait decades for validation results to confirm a model when it’s entirely possible to do something today via the second approach.

There are a variety of ways to perform model cross-validation on measured data.

In its original and conventional formulation, cross-validation works by checking one interval of time-series against another, typically by training on one interval and then validating on an orthogonal interval.
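As a minimal sketch of this conventional form, the snippet below fits a small sinusoidal model by least squares on one interval and scores it on the held-out interval. The synthetic series, the two periods, and the function names are assumptions made for illustration; they are not the ENSO model or its fitted parameters.

```python
import numpy as np

def design_matrix(t, periods):
    """Constant offset plus a sin/cos pair for each period."""
    cols = [np.ones_like(t)]
    for p in periods:
        w = 2.0 * np.pi / p
        cols.append(np.sin(w * t))
        cols.append(np.cos(w * t))
    return np.column_stack(cols)

def cross_validate(t, y, periods, train_mask):
    """Fit on the training interval only, then score on the held-out interval."""
    X = design_matrix(t, periods)
    coef, *_ = np.linalg.lstsq(X[train_mask], y[train_mask], rcond=None)
    model = X @ coef
    cc_train = np.corrcoef(model[train_mask], y[train_mask])[0, 1]
    cc_test = np.corrcoef(model[~train_mask], y[~train_mask])[0, 1]
    return cc_train, cc_test

# Synthetic stand-in for a climate index: monthly samples, time in years.
rng = np.random.default_rng(0)
t = np.arange(1880, 2016, 1.0 / 12.0)
periods = [3.8, 6.5]  # hypothetical periods, for illustration only
y = (np.sin(2 * np.pi * t / 3.8)
     + 0.5 * np.cos(2 * np.pi * t / 6.5)
     + 0.3 * rng.standard_normal(t.size))

train = t < 1950  # train on 1880-1950, validate on 1950-2016
cc_train, cc_test = cross_validate(t, y, periods, train)
print(f"training CC = {cc_train:.2f}, validation CC = {cc_test:.2f}")
```

The key design point is that the coefficients never see the validation samples, so the validation correlation is an out-of-sample score.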

Another way to cross-validate is to compare two sets of time-series data collected on behaviors that are potentially related. For example, ocean tidal data can be collected and compared across spatially separated geographic regions. The sea-level-height (SLH) time-series will not necessarily be correlated, but the underlying lunar and solar forcing factors will be closely aligned, give or take a phase factor. This is intuitively understandable since the two locations share a common-mode forcing due to the gravitational pull of the moon and sun, with the differences in response due to the geographic location and the local spatial topology and boundary conditions. For tides, this is a consensus understanding, and tidal prediction algorithms have stood the test of time.

In the previous post, cross-validation on distinct data sets was evaluated assuming common-mode lunisolar forcing. One cross-validation was done between the ENSO time-series and the AMO time-series. Another cross-validation was performed for ENSO against PDO. The underlying common-mode lunisolar forcings were highly correlated as shown in the featured figure.  The LTE spatial wave-number weightings were the primary discriminator for the model fit. This model is described in detail in the book Mathematical GeoEnergy to be published at the end of the year by Wiley.

Another possible common-mode cross-validation is between ENSO and QBO, but in this case it rests primarily on the Draconic nodal lunar factor, the cyclic forcing that appears to govern the regular oscillations of QBO. Below is the Draconic constituent comparison for QBO and ENSO.

The QBO and ENSO models only show a common-mode correlated response with respect to the Draconic forcing. The Draconic forcing drives the quasi-periodicity of the QBO cycles, as can be seen in the lower right panel, with a small training window.

This cross-correlation technique can be extended to what appears to be an extremely erratic measure, the North Atlantic Oscillation (NAO).

Like the SOI measure for ENSO, the NAO is derived from a pressure dipole measured at two separate locations, but in this case north of the equator. Given the high frequency of the oscillations, a good assumption is that the spatial wavenumber factors are much higher than those required to fit ENSO. That was the case, as evidenced by the figure below.

ENSO vs NAO cross-validation

Both SOI and NAO are noisy time-series, with the NAO appearing very noisy, yet the lunisolar constituent forcings are highly synchronized, as shown by the correlations in the lower panel. In particular, summing the Anomalistic and Solar constituent factors together improves the correlation markedly, because each of those influences the other via the lunar-solar mutual gravitational attraction. The iterative fitting process adjusts each of the factors independently, yet the net result compensates the counteracting amplitudes so that the net common-mode factor is essentially the same for ENSO and NAO (see the lower-right correlation labelled Anomalistic+Solar).

Since the NAO has high-frequency components, we can also perform a conventional cross-validation across orthogonal intervals. The validation interval below is for the years between 1960 and 1990, and even though the training intervals were aggressively over-fit, the correlation between the model and data is still visible in those 30 years.

NAO model fit with validation spanning 1960 to 1990

Compared with the time spent modeling ENSO, the effort that went into fitting NAO was a fraction. This is largely because the temporal lunisolar forcing only needed to be tweaked to match other climate indices, and the iteration over the topological spatial factors converges quickly.

Many more cross-validation techniques are available for NAO, since there are different flavors of NAO indices available, corresponding to different Atlantic locations and spanning back to the 1800s.

ENSO, AMO, PDO and common-mode mechanisms

The basis of the ENSO model is the forcing derived from the long-period cyclic lunisolar gravitational pull of the moon and sun. There is some thought that ENSO shows teleconnections to other oceanic behaviors. The primary oceanic dipoles are ENSO and AMO for the Pacific and Atlantic. There is also the PDO for the mid-northern-latitude of the Pacific, which has a pattern distinct from ENSO. So the question is: Are these connected through interactions or do they possibly share a common-mode mechanism through the same lunisolar forcing mechanism?

Based on tidal behaviors, it is known that the gravitational pull varies geographically, so it would be understandable that ENSO, AMO, and PDO would demonstrate distinct time-series signatures. In checking this, you will find that the correlation coefficient between any two of these series is essentially zero, regardless of applied leads or lags. Yet the underlying component factors (the lunar Draconic, lunar Anomalistic, and solar modified terms) may potentially emerge with only slight variations in shape, with differences only in relative amplitude. This is straightforward to test by fitting the basic ENSO model to AMO and PDO by allowing the parameters to vary.
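The lead/lag check mentioned above can be sketched as a scan of the Pearson correlation over a range of sample offsets. The helper name and the synthetic demonstration series are assumptions for illustration; with real indices one would pass the two monthly series instead.

```python
import numpy as np

def lagged_correlation(x, y, max_lag):
    """Pearson correlation of x against y for lags -max_lag..max_lag (in samples).

    A positive lag compares x[i] with y[i + lag].
    """
    lags = np.arange(-max_lag, max_lag + 1)
    cc = np.empty(lags.size)
    for k, lag in enumerate(lags):
        if lag >= 0:
            a, b = x[:x.size - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:y.size + lag]
        cc[k] = np.corrcoef(a, b)[0, 1]
    return lags, cc

# Demonstration on synthetic data: y is x delayed by 5 samples.
rng = np.random.default_rng(0)
s = rng.standard_normal(505)
x, y = s[5:], s[:500]

lags, cc = lagged_correlation(x, y, max_lag=24)
best_lag = lags[np.argmax(cc)]
print(best_lag, cc.max())
```

If the scan stays near zero at every lag for a pair of indices, a simple delayed coupling between them is unlikely, which is what motivates comparing the fitted constituent factors instead of the raw series.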

The following figure is the result of fitting the model to ENSO, AMO, and PDO and then comparing the constituent factors.

First, note that the same parametric model fits each of the time series arguably well. The Draconic factor underlying both the ENSO and AMO models is almost perfectly aligned, as indicated by the red starred graph, with excursions showing a CC above 0.99. All of the remaining CCs are in fact above 0.6.

The upshot of this analysis is two-fold. First, consider how difficult it is to fit any one of these time series to a minimal set of periodically forced signals. Second, the underlying signals are not that different in character; the combination in terms of a Laplace’s tidal equation weighting is what couples them together via a common-mode mechanism. Thus, the teleconnection between these oceanic indices is likely an underlying common lunisolar tidal forcing, just as one would suspect from conventional tidal analysis.

An obvious clue from tidal data

One of the interesting traits of climate science is the way it gives away obvious clues. A recent paper by Iz shows such a breathtakingly obvious characteristic that it’s a wonder everyone isn’t all over it:

Iz, H Bâki. “The Effect of Regional Sea Level Atmospheric Pressure on Sea Level Variations at Globally Distributed Tide Gauge Stations with Long Records.” Journal of Geodetic Science 8, no. 1 (n.d.): 55–71.

The author seems to understate the feature, which essentially shows that for certain tidal records, the atmospheric pressure (recorded at the tidal measurement location) is pseudo-quantized to a set of specific values. In other words, for a New York City tide gauge station, there are 12 values of atmospheric pressure between 1000 and 1035 mb that are heavily favored over all other values.
One can see it in the raw data here where clear horizontal lines are apparent in the data points:

Raw data for NYC station  (Iz, H Bâki. “The Effect of Regional Sea Level Atmospheric Pressure on Sea Level Variations at Globally Distributed Tide Gauge Stations with Long Records.” Journal of Geodetic Science 8, no. 1 (n.d.): 55–71.)

and for the transformed data shown in the histogram below, where I believe the waviness in the lines is compensated by fitting to long-period tidal signal factors (such as 18.6 year, 9.3 year periods, etc).

Histogram for transformed data for NYC station  Iz, H Bâki. “The Effect of Regional Sea Level Atmospheric Pressure on Sea Level Variations at Globally Distributed Tide Gauge Stations with Long Records.” Journal of Geodetic Science 8, no. 1 (n.d.): 55–71.

The author isn’t calling it a quantization, and doesn’t really call attention to it with a specific name other than clustering, yet it is obvious from the raw data and even more from the histograms of the transformed data.

The first temptation is to attribute the pattern to a measurement artifact. These are monthly readings and 12 separate discrete values are identified, so that connection seems causal. The author says:

“It was shown that random component of regional atmospheric pressure tends to cluster at monthly intervals. The clusters are likely to be caused by the intraannual seasonal atmospheric temperature changes, which may also act as random beats in generating sub-harmonics observed in sea level changes as another mechanism.”
Nearer the equator, the pattern is not readily evident. The fundamental connection between tidal value and atmospheric pressure is due to the inverse barometric effect:
“At any fixed location, the sea level record is a function of time, involving periodic components as well as continuous random fluctuations. The periodic motion is mostly due to the gravitational effects of the sun-earth-moon system as well as because of solar radiation upon the atmosphere and the ocean as discussed before. Sometimes the random fluctuations are of meteorological origin and reflect the effect of ’weather’ upon the sea surface but reflect also the inverse barometric effect of atmospheric pressure at sea level.”
So the bottom-line impact is that the underlying tidal signal is viably measured even though it is at a monthly resolution and not the diurnal or semi-diurnal resolution typically associated with tides.
Why this effect is not as evident closer to the equator is rationalized by smaller annual amplification:
“Stations closer to the equator are also exposed to yearly periodic variations but with smaller amplitudes. Large adjusted R² values show that the models explain most of the variations in atmospheric pressure observed at the sea level at the corresponding stations. For those stations closer to the equator, the amplitudes of the annual and semiannual changes are considerably smaller and overwhelmed by random excursions. Stations in Europe experience similar regional variations because of their proximities to each other”
So, for the Sydney Harbor tidal data the pattern is not observed:

Sydney histogram does not show a clear delineated quantization

In contrast, I previously showed the clear impact of the ENSO signal on the Sydney tidal data after a specific transform in this post. The erratic ENSO signal (with a huge inverse barometric effect as measured via the SOI readings of atmospheric pressure) competes with the annual signal so that the monthly quantization is obscured. Yet, if the ENSO behavior is also connected to the tidal forcing at these long-period levels, there may be a tidal unification yet to be drawn from these clues.


Experiment to compare training runs from 1880 to 1980 of the ENSO model against both the NINO34 time-series data and the SOI data. The solid red curves are the extrapolated cross-validation interval.



Many interesting inferences can potentially be drawn from these comparisons. The SOI signal appears more noisy, but that could actually be signal. For example, the NINO34 extrapolation pulls out a split peak near 2013-2014, which does show up in the SOI data. A discrepancy in the NINO34 data near 1934-1935, which predicts a minor peak, is essentially noise in the SOI data. The 1984-1986 flat valley region is much lower in NINO34 than in SOI, where it hovers around 0. The model splits the difference in that interval, doing a bit of both. And the 1991-1992 valley predicted in the model is not clear in the NINO34 data, but does show up in the SOI data.

Of course these are subjectively picked samples, yet there may be some better combination of SOI and NINO34 that one can conceive of to get a better handle on the true ENSO signal.


The ENSO Forcing Potential – Cheaper, Faster, and Better

Following up on the last post on the ENSO forcing, this note elaborates on the math.  The tidal gravitational forcing function used follows an inverse power-law dependence, where a(t) is the anomalistic lunar distance and d(t) is the draconic or nodal perturbation to the distance.

F(t) \propto \frac{1}{(R_0 + a(t) + d(t))^2}'

Note the prime indicating that the forcing applied is the derivative of the conventional inverse squared Newtonian attraction. This generates an inverse cubic formulation corresponding to the consensus analysis describing a differential tidal force:

F(t) \propto -\frac{a'(t)+d'(t)}{(R_0 + a(t) + d(t))^3}

For a combination of monthly and fortnightly sinusoidal terms for a(t) and d(t) (suitably modified for nonlinear nodal and perigean corrections due to the synodic/tropical cycle), the search routine rapidly converges to an optimal ENSO fit. It does this more quickly than the harmonic analysis, which requires at least double the unknowns for the additional higher-order factors needed to capture the tidally forced response waveform. One of the keys is to collect the chain rule terms a'(t) and d'(t) in the numerator; without these, the necessary mixed terms which multiply the anomalistic and draconic signals do not emerge strongly.
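The relation between the two expressions above can be checked numerically. The sketch below uses single sinusoids for a(t) and d(t) with made-up amplitudes and a normalized R_0 (all assumptions for illustration; the nodal and perigean corrections mentioned above are omitted) and compares a finite-difference derivative of the inverse-square term with the inverse-cubic form.

```python
import numpy as np

R0 = 1.0            # normalized mean distance (assumed units)
T_ANOM = 27.55455   # anomalistic month, days
T_DRAC = 27.21222   # draconic month, days
A_AMP = 0.05        # hypothetical amplitude of a(t)
D_AMP = 0.02        # hypothetical amplitude of d(t)

def a(t):
    return A_AMP * np.sin(2 * np.pi * t / T_ANOM)

def d(t):
    return D_AMP * np.sin(2 * np.pi * t / T_DRAC)

def a_prime(t):
    return A_AMP * (2 * np.pi / T_ANOM) * np.cos(2 * np.pi * t / T_ANOM)

def d_prime(t):
    return D_AMP * (2 * np.pi / T_DRAC) * np.cos(2 * np.pi * t / T_DRAC)

def inverse_square(t):
    return 1.0 / (R0 + a(t) + d(t)) ** 2

def forcing(t):
    """Inverse-cubic form with the chain-rule terms in the numerator."""
    return -(a_prime(t) + d_prime(t)) / (R0 + a(t) + d(t)) ** 3

t = np.linspace(0.0, 365.0, 36501)  # one year in days, 0.01-day step
numeric = np.gradient(inverse_square(t), t)

# d/dt [1/R^2] = -2 R'/R^3, i.e. twice forcing(t); the factor of 2 is
# absorbed by the proportionality sign in the text.
max_err = np.max(np.abs(numeric[1:-1] - 2.0 * forcing(t)[1:-1]))
print(max_err)
```

Expanding the cubic denominator is what produces the cross terms between the anomalistic and draconic signals, which is why the numerator terms matter.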

As before, a strictly biennial modulation needs to be applied to this forcing to capture the measured ENSO dynamics — this is a period-doubling pattern observed in hydrodynamic systems with a strong fundamental (in this case annual) and is climatologically explained by a persistent year-to-year regenerative feedback in the SLP and SST anomalies.
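The post does not give the functional form of the biennial modulation, so the following is only a sketch under the assumption that it acts as a multiplicative carrier with an exact 2-year period. The slow forcing frequency is hypothetical. The point shown is generic: multiplying by a biennial carrier moves a forcing tone at frequency f to sidebands at 0.5 - f and 0.5 + f cycles per year.

```python
import numpy as np

N_YEARS = 152                         # span chosen so all tones land on exact FFT bins
t = np.arange(N_YEARS * 12) / 12.0    # monthly samples, time in years

F_SLOW = 1.0 / 3.8                    # hypothetical forcing frequency, cycles/year
forcing = np.cos(2.0 * np.pi * F_SLOW * t)

# Assumed modulation form: a carrier with an exact 2-year period.
biennial = np.cos(np.pi * t)
modulated = forcing * biennial

spectrum = np.abs(np.fft.rfft(modulated))
freqs = np.fft.rfftfreq(t.size, d=1.0 / 12.0)   # cycles/year

# Frequencies of the two strongest spectral lines: the two sidebands.
top_two = np.sort(freqs[np.argsort(spectrum)[-2:]])
print(top_two)
```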

Here is the model fit for training from 1880-1980, with the extrapolated test region post-1980 showing a good correlation.

The geophysics is now canonically formulated, providing (1) a simpler and more concise expression, leading to (2) a more efficient computational solution, (3) less possibility of over-fitting, and (4) ultimately generating a much better correlation. Alternatively, stated in modeling terms, the resultant information metric is improved by reducing the complexity and improving the correlation — the vaunted  cheaper, faster, and better solution. Or, in other words: get the physics right, and all else follows.














ENSO model for predicting El Nino and La Nina events

Applying the ENSO model to predict El Nino and La Nina events is automatic. There are no adjustable parameters apart from the calibrated tidal forcing amplitudes and phases used in the process of fitting over the training interval. Therefore the cross-validated interval from 1950 to present is untainted during the fitting process and so can be used as a completely independent and unbiased test.


Millennium Prize Problem: Navier-Stokes

Watched the hokey movie Gifted on a plane ride. Turns out that the Millennium Prize for mathematically solving the Navier-Stokes problem plays into the plot.

I am interested in variations of the Navier–Stokes equations that describe hydrodynamical flow on the surface of a sphere.  The premise is that such a formulation can be used to perhaps model ENSO and QBO.

The so-called primitive equations are the starting point, as these create constraints for the volume geometry (i.e. vertical motion much smaller than horizontal motion and fluid layer depth small compared to Earth’s radius). From that, we go to Laplace’s tidal equations, which are a linearization of the primitive equations.

I give a solution here, which was originally motivated by QBO.

Of course the equations are under-determined, so the only hope I had of solving them is to provide this simplifying assumption:

\frac{\partial\zeta}{\partial\varphi} = \frac{\partial\zeta}{\partial t}\frac{\partial t}{\partial\varphi}

If you don’t believe that this partial differential coupling of a latitudinal forcing to a tidal response occurs, then don’t go further. But if you do, then:





Solar Eclipse 2017 : What else?

The reason we can so accurately predict the solar eclipse of 2017 is that we have accurate knowledge of the moon’s orbit around the earth and the earth’s orbit around the sun.

Likewise, the reason that we could potentially understand the behavior of the El Nino Southern Oscillation (ENSO) is that we have knowledge of these same orbits. As we have shown and will report at this year’s American Geophysical Union (AGU) meeting, the cyclic gravitational pull of the moon (lower panel in Figure 1 below) interacting seasonally precisely controls the ENSO cycles (upper panel Figure 1).

Fig 1: Training interval 1880-1950 leads to extrapolated fit post-1950

Figure 2 shows how sensitive the fit is to the precise value of the lunar cycle periods. Compare the best-fit values to the known lunar values here. This is an example of the science of metrology.

Fig 2: Sensitivity to selection of lunar periods.
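A sensitivity scan of this kind can be sketched by refitting a sinusoid at a range of trial periods and recording the correlation for each. The example below is an assumption-laden illustration: it uses a synthetic daily series containing a single tone at the draconic month (27.21222 days) plus noise, not the ENSO data or the full model.

```python
import numpy as np

T_DRAC = 27.21222  # draconic month, days

def fit_cc(t, y, period):
    """Least-squares fit of one sin/cos pair at `period`; returns the fit correlation."""
    w = 2.0 * np.pi / period
    X = np.column_stack([np.sin(w * t), np.cos(w * t)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.corrcoef(X @ coef, y)[0, 1]

# Synthetic 70-year daily record with a draconic-period tone buried in noise.
rng = np.random.default_rng(1)
t = np.arange(0.0, 70 * 365.25, 1.0)
y = np.sin(2 * np.pi * t / T_DRAC + 0.7) + 0.5 * rng.standard_normal(t.size)

# Scan trial periods within +/- 0.05 days of the known value.
trial = T_DRAC + np.linspace(-0.05, 0.05, 201)
cc = np.array([fit_cc(t, y, p) for p in trial])
best = trial[np.argmax(cc)]
print(f"best-fit period = {best:.4f} days")
```

Over a long record, a small error in the period accumulates into a large phase error, so the correlation peak is narrow. That narrowness is what makes the comparison with the known lunar values a meaningful test.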

The implications of this research are far-ranging. Just as knowing when a solar eclipse will occur helps engineers and scientists prepare power utilities and controlled climate experiments for the event, the same considerations apply to ENSO. Every future El Nino-induced heat-wave or monsoon could conceivably be predicted in advance, giving nations and organizations time to prepare for accompanying droughts, flooding, and temperature extremes.



ENSO Split Training for Cross-Validation

If we split the modern ENSO data into two training intervals, one from 1880 to 1950 and one from 1950 to 2016, we get roughly equal-length time series for model evaluation.

As Figure 1 shows, a forcing stimulus due to monthly-range LOD variations calibrated to the interval from 2000 to 2003 (lower panel) is used to train the ENSO model in the interval from 1880 to 1950. The extrapolated model fit in red does a good job of capturing the ENSO data in the period beyond 1950.

Fig. 1: Training 1880 to 1950

Next, we reverse the training and verification intervals, using the period from 1950 to 2016 as the training interval and then back-extrapolating. Figure 2 shows this works about as well.

Fig. 2: Training interval 1950 to 2016


Deterministic and Stochastic Applied Physics

Pierre-Simon Laplace was one of the first mathematicians who took an interest in problems of probability and determinism.  It’s surprising how much of the math and applied physics that Laplace developed gets used in day-to-day analysis. For example, while working on the ENSO and QBO analysis, I have invoked the following topics at some point:

  1. Laplace’s tidal equations
  2. Laplace’s equation
  3. Laplacian differential operator
  4. Laplace transform
  5. Difference equation
  6. Planetary and lunar orbital perturbations
  7. Probability methods and problems
    1. Inductive probability
    2. Bayesian analysis, e.g. the Sunrise problem
  8. Statistical methods and applications
    1. Central limit theorem
    2. Least squares
  9. Filling in holes of Newton’s differential calculus
  10. Others here

Apparently he did so much and was so comprehensive that in some of his longer treatises he often didn’t cite the work of others, making it difficult to pin down everything he was responsible for (evidently he did have character flaws).

In any case, I recall applying each of the above in working out some aspect of a problem. Missing from the list is Fourier analysis, which Laplace didn’t invent, but the Laplace transform is close in approach and utility.

When Laplace did all this research, he must have possessed insight into what constituted deterministic processes:

We may regard the present state of the universe as the effect of its past and the cause of its future. An intellect which at a certain moment would know all forces that set nature in motion, and all positions of all items of which nature is composed, if this intellect were also vast enough to submit these data to analysis, it would embrace in a single formula the movements of the greatest bodies of the universe and those of the tiniest atom; for such an intellect nothing would be uncertain and the future just like the past would be present before its eyes.

— Pierre Simon Laplace,
A Philosophical Essay on Probabilities
This is summed up as:

He also seemed to be a very applied mathematician, as per a quote I have used before: “Probability theory is nothing but common sense reduced to calculation.” There is really nothing the least bit esoteric about any of Laplace’s math, as it always seemed motivated by solving some physics problem or scientific observation. It appears that he wanted to explain all these astronomic and tidal problems in as simple a form as possible. Back then it may have been esoteric, but not today, as his techniques have become part of the essential engineering toolbox. I have to wonder whether Laplace, were he alive now, would agree that geophysical processes such as ENSO and QBO are as deterministic as the sun rising every morning or the steady cyclic nature of the planetary and lunar orbits. And it wasn’t as if Laplace possessed a confirmation bias that behaviors were immediately deterministic; otherwise he wouldn’t have spent so much effort in devising the rules of probability and statistics that are still in use today, such as the central limit theorem and least squares.

Perhaps he would have glanced at the ENSO problem for a few moments, noticed that it was in no way random, and then casually remarked with one of his frequent idiomatic phrases:

“Il est aisé à voir que…” (“It is easy to see that…”).

It may have been so obvious that it wasn’t important to give the details at the moment, only to fill in the chain of reasoning later.  Much like the contextEarth model for QBO, deriving from Laplace’s tidal equations.

Where are the Laplaces of today who are willing to push the basic math and physics of climate variability as far as it will take them? It has seemingly jumped from Laplace to Lorenz and then to chaotic uncertainty à la Tsonis or mystifying complexity à la Lindzen. We can probably do much better than to punt like that, and on first down even!