Cross-validation

Cross-validation is essentially the ability to predict the characteristics of an unexplored region based on a model of an explored region. The explored region is often used as a training interval to test or validate model applicability on the unexplored interval. If some fraction of the expected characteristics appears in the unexplored region when the model is extrapolated to that interval, some degree of validation is granted to the model.

This is a powerful technique on its own, as it is used frequently (and depended on) in machine learning to eliminate poorly performing trials. But it gains even more importance when new data for validation will take years to collect. In particular, consider the arduous process of collecting fresh data for the El Niño Southern Oscillation (ENSO), which will take decades to reach sufficient statistical significance for validation.

So, what’s necessary in the short term is substantiation of a model’s potential validity. Nothing else will work as a substitute, as controlled experiments are not possible for domains as large as the Earth’s climate. Cross-validation remains the best bet.

Mf vs Mm

In an earlier post, I observed that ENSO models may not be unique because of the numerous possibilities provided by nonlinear math. This was supported by the fact that a tidal forcing model based on the Mf (13.66 day) tidal factor worked as well as one based on the Mm (27.55 day) factor. That was not surprising, considering that aliasing each against an annual impulse gives a similar repeat cycle: 3.8 years versus 3.9 years. But I have also observed that mixing the two in a linear fashion did not improve the fit much at all, as the difference created a long interference cycle that isn’t observed in the ENSO time series data. Thinking in terms of the nonlinear modulation required, however, it may be that the two factors can be combined after the LTE solution is applied.
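The similar repeat cycles can be checked with a few lines of arithmetic. The sketch below assumes that "aliasing against an annual impulse" means sampling each tidal cycle once per year, so that the apparent frequency is the distance from the cycles-per-year count to the nearest whole number. The function name and the 365.242-day year length are illustrative choices, not taken from the model code.

```python
def aliased_period_years(period_days, year_days=365.242):
    """Apparent period, in years, of a cycle that is sampled once per year."""
    cycles_per_year = year_days / period_days
    # Only the offset from a whole number of cycles per year survives annual sampling.
    residual = abs(cycles_per_year - round(cycles_per_year))
    return 1.0 / residual


mf_alias = aliased_period_years(13.66)  # Mf, fortnightly tidal factor
mm_alias = aliased_period_years(27.55)  # Mm, monthly anomalistic tidal factor
print(f"Mf aliased period: {mf_alias:.1f} years")
print(f"Mm aliased period: {mm_alias:.1f} years")
```

With the rounded periods quoted above, the two results come out near 3.8 and 3.9 years, which is why either factor alone can produce a comparable fit.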

Understanding is Lacking

Regarding the gravity waves concentrically emanating from the Tonga explosion:

“It’s really unique. We have never seen anything like this in the data before,” says Lars Hoffmann, an atmospheric scientist at the Jülich Supercomputing Centre in Germany.

https://www.nature.com/articles/d41586-022-00127-1

and

“That’s what’s really puzzling us,” says Corwin Wright, an atmospheric physicist at the University of Bath, UK. “It must have something to do with the physics of what’s going on, but we don’t know what yet.”

https://www.nature.com/articles/d41586-022-00127-1

Hunga-Tonga-Hunga Ha’apai Eruption as seen by AIRS.

The discovery was prompted by a tweet sent to Wright on 15 January from Scott Osprey, a climate scientist at the University of Oxford, UK, who asked: “Wow, I wonder how big the atmospheric gravity waves are from this eruption?!” Osprey says that the eruption might have been unique in causing these waves because it happened very quickly relative to other eruptions. “This event seems to have been over in minutes, but it was explosive and it’s that impulse that is likely to kick off some strong gravity waves,” he says. The eruption might have lasted moments, but the impacts could be long-lasting. Gravity waves can interfere with a cyclical reversal of wind direction in the tropics, Osprey says, and this could affect weather patterns as far away as Europe. “We’ll be looking very carefully at how that evolves,” he says.

https://www.nature.com/articles/d41586-022-00127-1

This (“cyclical reversal of wind direction in the tropics”) is referring to the QBO, and we will see if it has an impact in the coming months. Hint: the QBO from the last post is essentially modeling gravity waves arising from the tidal forcing as driving the cycle. Also, watch the LOD.

Perhaps what is lacking is the application of this simple scientific law: for every action there is a reaction. Always start from that, and also consider that an object in motion tends to stay in motion. Is the lack of observed first-order Coriolis effects part of why the scientists are mystified? Given the variation of this force with latitude, the concentric rings were perhaps expected to be distorted according to spherical harmonics.

Cross-Validation of Geophysics Behaviors

The fit to the ENSO model looks like this:

The forcing spectrum looks like this, with the aliased draconic (27.212d) factor circled:

For QBO, we remove all the lunar factors except for the draconic, as this is the only declination factor with the same spherical group symmetry as the semi-annual solar declination.

And after modifying the annual (ENSO spring-barrier) impulse into a semi-annual impulse with equal and opposite excursions, the resultant model matches well (to first order) the QBO time series.

Although the alignment isn’t perfect, there are indications in the structure that the fit has a deeper significance. For example, note how many of the shoulders in the structure align, as highlighted below in yellow:

The peaks and valleys do wander about a bit, which might be a result of the sensitivity to the semi-annual impulse and the fact that this is only a monthly resolution. The chart below is a detailed fit of the QBO using data with a much finer daily resolution. As you can see, slight changes in the seasonal timing of the semi-annual pulse are needed to individually align the 70 and 30 hPa QBO time-series data.

This will require further work, especially in considering recently reported perturbations in the QBO periodicity, but it is telling that the ENSO and QBO models share a draconic forcing, which suggests an important cross-validation of the underlying causal mechanism.

Detailed analysis also shows LTE modulation

Another potential geophysical cross-validation …

The underlying forcing of the ENSO model shows both an 18-year Saros cycle (which is an eclipse alignment cycle of all the tidal periods), along with a 6-year anomalistic/draconic interference cycle. This modulation of the main anomalistic cycle appears in both the underlying daily and monthly profile, shown below before applying an annual impulse. The 6-year is clearly evident as it aligns with the x-axis grid 1880, 1886, 1892, 1898, etc.
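Both periods can be reproduced from the standard lengths of the lunar months. The sketch below assumes that the 6-year cycle is the simple beat between the anomalistic and draconic months, and it uses the conventional Saros definition of 223 synodic months. The month lengths are standard astronomical values rather than fitted model parameters, and the variable names are illustrative.

```python
YEAR = 365.242          # tropical year, days
DRACONIC = 27.21222     # draconic (nodal) month, days
ANOMALISTIC = 27.55455  # anomalistic month, days
SYNODIC = 29.530589     # synodic month, days

# Beat (interference) period between two nearby cycles: 1 / |f1 - f2|.
beat_days = 1.0 / abs(1.0 / DRACONIC - 1.0 / ANOMALISTIC)
beat_years = beat_days / YEAR

# The Saros is where whole numbers of all three months nearly coincide.
saros_days = {
    "223 synodic": 223 * SYNODIC,
    "242 draconic": 242 * DRACONIC,
    "239 anomalistic": 239 * ANOMALISTIC,
}
saros_years = saros_days["223 synodic"] / YEAR

print(f"anomalistic/draconic beat: {beat_years:.2f} years")
for name, days in saros_days.items():
    print(f"{name:>16} months = {days:.2f} days")
print(f"Saros: {saros_years:.2f} years")
```

The beat lands very close to 6 years, and the three month counts agree to within a fraction of a day over roughly 18 years, consistent with the two cycles described above.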

Daily profile above, monthly next, both reveal Saros cycle

The bottom inset shows that a similar 6-year cycle consistently appears in length-of-day (LOD) analyses, this particular trace from a recent paper: [ Leonid Zotov et al 2020 J. Phys.: Conf. Ser. 1705 012002 ].

The 6-year cycle in the LOD is not aligned as strictly as the tidal model and it tends to wander, but it seems a more plausible and parsimonious explanation of the modulation than for example in this paper (where the 6-year LOD cycle is “similarly detected in the variations of C22 and S22, the degree-2 order-2 Stokes coefficients of the Earth’s gravitational field”).

Cross-validation confidence improves as the number of mutually agreeing alignments increases. Given that controlled experiments are impossible to perform, this category of analyses is the best way to validate the geophysical models.


“Wobbling” Moon trending on Twitter

Twitter trending topic

This NASA press release has received mainstream news attention.

The 18.6 year nodal cycle will generate higher tides that will exaggerate sea-level rise due to climate change.

Yahoo news item:

https://news.yahoo.com/lunar-orbit-apos-wobble-apos-173042717.html

So this is more-or-less a known behavior, but hopefully it raises awareness of the other work relating lunar forcing to ENSO, QBO, and the Chandler wobble.

Cited paper

Thompson, P.R., Widlansky, M.J., Hamlington, B.D. et al. Rapid increases and extreme months in projections of United States high-tide flooding. Nat. Clim. Chang. 11, 584–590 (2021). https://doi.org/10.1038/s41558-021-01077-8

AO

The Arctic Oscillation (AO) dipole has behavior that is correlated to the North Atlantic Oscillation (NAO) dipole. We can see this in two ways. First, and most straightforwardly, the correlation coefficient between the AO and NAO time-series is above 0.6.

Secondly, we can use the model of the NAO from the last post and refit the parameters to the AO data (data also here), but spanning an orthogonal interval. Then we can compare the constituent lunisolar factors for NAO and AO for correlation, and further discover that this also doubles as an effective cross-validation for the underlying LTE model (as the intervals are orthogonal).

The top panel is a model fit for AO between 1900 and 1950, and below that is a model fit for NAO from 1950 to the present. The lower pane shows the correlation for a common interval (left) and for the constituent lunisolar factors over the orthogonal intervals (right).

Only the anomalistic factor shows an imperfect correlation, and that remains quite high.

NAO

The challenge of validating models of climate oscillations such as ENSO and QBO rests primarily in our inability to perform controlled experiments. Because of this shortcoming, we can either (1) make predictions of future behavior and validate via the wait-and-see process, or (2) creatively apply techniques such as cross-validation to currently available data. The first is a non-starter because it’s obviously pointless to wait decades for validation results to confirm a model when it’s entirely possible to do something today via the second approach.

There are a variety of ways to perform model cross-validation on measured data.

In its original and conventional formulation, cross-validation works by checking one interval of time-series against another, typically by training on one interval and then validating on an orthogonal interval.
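As a minimal sketch of this conventional form, the example below trains on one interval and validates on a non-overlapping one. Everything in it is a stand-in: the data are synthetic (a single 3.8-year cycle plus noise), the model is a plain least-squares sinusoid with a known period rather than the LTE model, and the function names are hypothetical.

```python
import math
import random


def fit_sinusoid(t, y, period):
    """Least-squares amplitudes (a, b) for y ~ a*sin(wt) + b*cos(wt)."""
    w = 2.0 * math.pi / period
    s = [math.sin(w * ti) for ti in t]
    c = [math.cos(w * ti) for ti in t]
    ss = sum(v * v for v in s)
    cc = sum(v * v for v in c)
    sc = sum(u * v for u, v in zip(s, c))
    sy = sum(u * v for u, v in zip(s, y))
    cy = sum(u * v for u, v in zip(c, y))
    det = ss * cc - sc * sc
    # Solve the 2x2 normal equations directly.
    return (sy * cc - cy * sc) / det, (cy * ss - sy * sc) / det


def predict(t, period, a, b):
    """Evaluate the fitted sinusoid at the times in t."""
    w = 2.0 * math.pi / period
    return [a * math.sin(w * ti) + b * math.cos(w * ti) for ti in t]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / math.sqrt(sxx * syy)


# Synthetic monthly series (hypothetical): one 3.8-year cycle plus noise.
random.seed(0)
PERIOD = 3.8
t = [i / 12.0 for i in range(1200)]  # 100 years of monthly samples
y = [math.sin(2.0 * math.pi * ti / PERIOD + 0.7) + random.gauss(0.0, 0.5) for ti in t]

split = 600  # train on the first 50 years, validate on the last 50
a, b = fit_sinusoid(t[:split], y[:split], PERIOD)
cc_train = pearson(predict(t[:split], PERIOD, a, b), y[:split])
cc_valid = pearson(predict(t[split:], PERIOD, a, b), y[split:])
print(f"training CC = {cc_train:.2f}, validation CC = {cc_valid:.2f}")
```

The point of the split is that the parameters never see the validation interval, so a validation correlation close to the training value indicates the fit is capturing structure rather than noise.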

Another way to cross-validate is to compare two sets of time-series data collected on behaviors that are potentially related. For example, in the case of ocean tidal data that can be collected and compared across spatially separated geographic regions, the sea-level-height (SLH) time-series data will not necessarily be correlated, but the underlying lunar and solar forcing factors will be closely aligned give or take a phase factor. This is intuitively understandable since the two locations share a common-mode signal forcing due to the gravitational pull of the moon and sun, with the differences in response due to the geographic location and local spatial topology and boundary conditions. For tides, this is a consensus understanding and tidal prediction algorithms have stood the test of time.

In the previous post, cross-validation on distinct data sets was evaluated assuming common-mode lunisolar forcing. One cross-validation was done between the ENSO time-series and the AMO time-series. Another cross-validation was performed for ENSO against PDO. The underlying common-mode lunisolar forcings were highly correlated as shown in the featured figure.  The LTE spatial wave-number weightings were the primary discriminator for the model fit. This model is described in detail in the book Mathematical GeoEnergy to be published at the end of the year by Wiley.

Another common-mode cross-validation possible is between ENSO and QBO, but in this case it is primarily in the Draconic nodal lunar factor — the cyclic forcing that appears to govern the regular oscillations of QBO.  Below is the Draconic constituent comparison for QBO and the ENSO.

The QBO and ENSO models only show a common-mode correlated response with respect to the Draconic forcing. The Draconic forcing drives the quasi-periodicity of the QBO cycles, as can be seen in the lower right panel, with a small training window.

This cross-correlation technique can be extended to what appears to be an extremely erratic measure, the North Atlantic Oscillation (NAO).

Like the SOI measure for ENSO, the NAO is originally derived from a pressure dipole measured at two separate locations, but in this case north of the equator. Given the high frequency of the oscillations, a good assumption is that the spatial wavenumber factors are much higher than is required to fit ENSO. That was indeed the case, as evidenced by the figure below.

ENSO vs NAO cross-validation

Both SOI and NAO are noisy time-series with the NAO appearing very noisy, yet the lunisolar constituent forcings are highly synchronized as shown by correlations in the lower pane. In particular, summing the Anomalistic and Solar constituent factors together improves the correlation markedly, which is because each of those has influence on the other via the lunar-solar mutual gravitational attraction. The iterative fitting process adjusts each of the factors independently, yet the net result compensates the counteracting amplitudes so the net common-mode factor is essentially the same for ENSO and NAO (see lower-right correlation labelled Anomalistic+Solar).

Since the NAO has high-frequency components, we can also perform a conventional cross-validation across orthogonal intervals. The validation interval below is for the years between 1960 and 1990, and even though the training intervals were aggressively over-fit, the correlation between the model and data is still visible in those 30 years.

NAO model fit with validation spanning 1960 to 1990

Over the course of time spent modeling ENSO, the effort that went into fitting to NAO was a fraction of the original time. This is largely due to the fact that the temporal lunisolar forcing only needed to be tweaked to match other climate indices, and the iteration over the topological spatial factors quickly converges.

Many more cross-validation techniques are available for NAO, since there are different flavors of NAO indices available corresponding to different Atlantic locations, and spanning back to the 1800s.

ENSO, AMO, PDO and common-mode mechanisms

The basis of the ENSO model is the forcing derived from the long-period cyclic lunisolar gravitational pull of the moon and sun. There is some thought that ENSO shows teleconnections to other oceanic behaviors. The primary oceanic dipoles are ENSO and AMO for the Pacific and Atlantic. There is also the PDO for the mid-northern-latitude of the Pacific, which has a pattern distinct from ENSO. So the question is: Are these connected through interactions or do they possibly share a common-mode mechanism through the same lunisolar forcing mechanism?

Based on tidal behaviors, it is known that the gravitational pull varies geographically, so it would be understandable that ENSO, AMO, and PDO would demonstrate distinct time-series signatures. In checking this, you will find that the correlation coefficient between any two of these series is essentially zero, regardless of applied leads or lags. Yet the underlying component factors (the lunar Draconic, lunar Anomalistic, and solar modified terms) may potentially emerge with only slight variations in shape, with differences only in relative amplitude. This is straightforward to test by fitting the basic ENSO model to AMO and PDO by allowing the parameters to vary.
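The lead/lag check mentioned above can be sketched as a scan of the correlation coefficient over a range of offsets. The example below is only a self-test of the scan on synthetic data, where one series is a delayed copy of the other; the function names and the 24-sample lag window are hypothetical. For the actual ENSO, AMO, and PDO pairs, the text above reports that such a scan stays near zero at every lag.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / math.sqrt(sxx * syy)


def lagged_correlations(x, y, max_lag):
    """CC between x[t] and y[t + lag] for each lag in [-max_lag, max_lag]."""
    n = len(x)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            xs, ys = x[: n - lag], y[lag:]
        else:
            xs, ys = x[-lag:], y[: n + lag]
        result[lag] = pearson(xs, ys)
    return result


# Hypothetical check: y is x delayed by 5 samples, so the scan should peak at lag 5.
random.seed(1)
x = [random.gauss(0.0, 1.0) for _ in range(500)]
y = [0.0] * 5 + x[:-5]

ccs = lagged_correlations(x, y, max_lag=24)
best_lag = max(ccs, key=lambda k: abs(ccs[k]))
print(f"largest |CC| = {abs(ccs[best_lag]):.2f} at lag {best_lag}")
```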

The following figure is the result of fitting the model to ENSO, AMO, and PDO and then comparing the constituent factors.

First, note that the same parametric model fits each of the time series reasonably well. The Draconic factor underlying both the ENSO and AMO models is almost perfectly aligned, as indicated by the red starred graph, with excursions showing a CC above 0.99. All of the rest of the CCs are in fact above 0.6.

The upshot of this analysis is two-fold. First, consider how difficult it is to fit any one of these time series to a minimal set of periodically-forced signals. Second, the underlying signals are not that different in character; only their combination, in terms of a Laplace’s tidal equation (LTE) weighting, differs, and the shared forcing is what couples them together via a common-mode mechanism. Thus, the teleconnection between these oceanic indices is likely an underlying common lunisolar tidal forcing, just as one would suspect from conventional tidal analysis.

An obvious clue from tidal data

One of the interesting traits of climate science is the way it gives away obvious clues. This recent paper by Iz

Iz, H Bâki. “The Effect of Regional Sea Level Atmospheric Pressure on Sea Level Variations at Globally Distributed Tide Gauge Stations with Long Records.” Journal of Geodetic Science 8, no. 1 (n.d.): 55–71.

shows such a breathtakingly obvious characteristic that it’s a wonder everyone isn’t all over it. The author seems to be understating the feature, which essentially shows that for certain tidal records, the atmospheric pressure (recorded at the tidal measurement location) is pseudo-quantized to a set of specific values. In other words, for a New York City tidal gauge station, there are 12 values of atmospheric pressure between 1000 and 1035 mb that are heavily favored over all other values.

One can see it in the raw data here, where clear horizontal lines are apparent in the data points:

Raw data for NYC station (from Iz, cited above)

and for the transformed data shown in the histogram below, where I believe the waviness in the lines is compensated by fitting to long-period tidal signal factors (such as 18.6 year, 9.3 year periods, etc).

Histogram for transformed data for NYC station (from Iz, cited above)

The author isn’t calling it a quantization, and doesn’t really call attention to it with a specific name other than clustering, yet it is obvious from the raw data and even more from the histograms of the transformed data.

The first temptation is to attribute the pattern to a measurement artifact. These are monthly readings and there are 12 separate discrete values identified, so that connection seems causal. The author says:

“It was shown that random component of regional atmospheric pressure tends to cluster at monthly intervals. The clusters are likely to be caused by the intraannual seasonal atmospheric temperature changes, which may also act as random beats in generating sub-harmonics observed in sea level changes as another mechanism.”
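The link between monthly sampling and 12 favored values is easy to illustrate. In this minimal sketch, a strictly periodic seasonal pressure cycle (annual plus semi-annual terms with made-up amplitudes and phases, not fitted to the NYC record) is sampled once per calendar month. Because the cycle repeats exactly each year, every year revisits the same 12 pressure levels, which would appear as horizontal lines in a scatter plot. Whether this mechanism accounts for the clustering in the paper's data is an assumption here.

```python
import math
from collections import Counter


def seasonal_pressure(month_index):
    """Hypothetical seasonal pressure (mb) for a month counted from January of year 0."""
    phase = 2.0 * math.pi * (month_index % 12) / 12.0
    annual = 12.0 * math.cos(phase + 0.4)
    semiannual = 4.0 * math.cos(2.0 * phase + 1.1)
    return 1017.0 + annual + semiannual


# 100 years of monthly readings, rounded to 0.1 mb like a reported value.
readings = [round(seasonal_pressure(i), 1) for i in range(1200)]
levels = Counter(readings)

print(f"{len(levels)} distinct pressure levels in {len(readings)} readings")
for level, count in sorted(levels.items()):
    print(f"{level:7.1f} mb occurs {count} times")
```

Adding random weather noise on top of the cycle would smear each level into a cluster, and slow modulations would make the lines wavy, but the count of 12 follows directly from the monthly sampling of an annual cycle.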

Nearer the equator, the pattern is not readily evident. The fundamental connection between tidal value and atmospheric pressure is due to the inverse barometric effect:

“At any fixed location, the sea level record is a function of time, involving periodic components as well as continuous random fluctuations. The periodic motion is mostly due to the gravitational effects of the sun-earth-moon system as well as because of solar radiation upon the atmosphere and the ocean as discussed before. Sometimes the random fluctuations are of meteorological origin and reflect the effect of ’weather’ upon the sea surface but reflect also the inverse barometric effect of atmospheric pressure at sea level.”

So the bottom-line impact is that the underlying tidal signal is viably measured even though it is at a monthly resolution and not the diurnal or semi-diurnal resolution typically associated with tides.

Why this effect is not as evident closer to the equator is rationalized by smaller annual amplification:

“Stations closer to the equator are also exposed to yearly periodic variations but with smaller amplitudes. Large adjusted R² values show that the models explain most of the variations in atmospheric pressure observed at the sea level at the corresponding stations. For those stations closer to the equator, the amplitudes of the annual and semiannual changes are considerably smaller and overwhelmed by random excursions. Stations in Europe experience similar regional variations because of their proximities to each other”

So, for the Sydney Harbor tidal data the pattern is not observed:

Sydney histogram does not show a clear delineated quantization

Whereas, I previously showed the clear impact of the ENSO signal on the Sydney tidal data after a specific transform in this post. The erratic ENSO signal (with a huge inverse barometric effect as measured via the SOI readings of atmospheric pressure) competes with the annual signal so that the monthly quantization is obscured. Yet, if the ENSO behavior is also connected to the tidal forcing at these long-period levels, there may be a tidal unification yet to be drawn from these clues.