Tidal Gauge Differential

A climate science breakthrough likely won't come from some massive computation but from a novel formulation that exposes a fundamental pattern (perhaps discovered by deep mining during a machine learning exercise). Over 10 years ago, I wrote a blog post on how one can extract the ENSO signal by doing simple signal processing on a sea-level height (SLH) tidal time-series, in this case at Fort Denison in Sydney harbor.

The formulation/trick is to take the difference between the SLH reading and the reading from 2 years (24 months) prior, as described here
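As a concrete sketch of the differencing trick (illustrative code, not the original analysis; the series here is a synthetic stand-in for a monthly SLH record such as Fort Denison's):

```python
import math

def differential(series, lag=24):
    """Return series[t] - series[t - lag] (monthly samples assumed)."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Toy demonstration: a pure 24-month component cancels exactly under the
# differencing, while an irregular ENSO-like component survives.
n = 240  # 20 years of monthly data
biennial = [math.sin(2 * math.pi * t / 24) for t in range(n)]
enso_like = [math.sin(2 * math.pi * t / 43) for t in range(n)]  # ~3.6-yr cycle
slh = [b + e for b, e in zip(biennial, enso_like)]

diff = differential(slh, lag=24)
# What remains is the lagged difference of the ENSO-like component alone.
residual = max(abs(d - (enso_like[t + 24] - enso_like[t]))
               for t, d in enumerate(diff))
print(residual)  # ~0 up to floating-point error
```

On a real gauge record, the component surviving the 24-month difference is what tracks ENSO; here the cancellation of the biennial term is shown exactly.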

Check the recent blog post Lunar Torque Controls All for context on how it fits into the unified model.

The rationale for this 24-month difference is likely related to the sloshing of the ocean triggered on an annual basis. I think this is a pattern that any ML exercise would find with very little effort; after all, it didn't take me that long to find it. But the point is that the ML configuration has to be open and flexible enough to search for, generate, and test the same formulation. IOW, it may not find it if the configuration, perhaps focused on computationally massive PDEs, is too narrow. That was my comment to an RC post on applying machine learning to climate science; see the following link and subsequent quote:

Nick McGreivy commented:

“ML-based parameterizations have to work well for thousands of years of simulations, and thus need to be very stable (no random glitches or periodic blow-ups) (harder than you might think). Bias corrections based on historical observations might not generalize correctly in the future.”

This same issue arises when using ML to simulate PDEs. The solution is to analytically calculate what the stability condition(s) is (are), then at each timestep add some numerical diffusion that nudges the solution toward satisfying the stability condition(s). I imagine this same technique could be used for ML-based parameterizations.
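A toy illustration of that nudging idea (an assumption-laden sketch, not any production scheme): a centered-difference advection step stands in for an analytically unstable learned update, and adding numerical diffusion at each timestep stabilizes it. The particular choice D = C**2/2 recovers the classical Lax-Wendroff scheme, stable for Courant number C <= 1.

```python
import math

def step(u, C, D):
    """One periodic timestep: centered advection plus added diffusion."""
    n = len(u)
    return [u[i]
            - 0.5 * C * (u[(i + 1) % n] - u[i - 1])
            + D * (u[(i + 1) % n] - 2 * u[i] + u[i - 1])
            for i in range(n)]

n, C = 64, 0.5                     # grid size and Courant number
u0 = [math.sin(2 * math.pi * 4 * i / n) for i in range(n)]

unstable, nudged = list(u0), list(u0)
for _ in range(400):
    unstable = step(unstable, C, D=0.0)     # plain FTCS: grows without bound
    nudged = step(nudged, C, D=C * C / 2)   # diffusion-nudged: stays bounded

print(max(abs(v) for v in unstable))
print(max(abs(v) for v in nudged))
```

The unstabilized run amplifies every mode at every step, while the nudged run remains bounded for the same forcing and timestep.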

QBO Metrics

In addition to the standard correlation coefficient (CC) and RMS error, non-standard metrics that have beneficial cross-validation properties include dynamic time warping (DTW), the complexity-invariant distance (CID), see [1], and a CID-modified DTW. The link above describes my implementation of the DTW metric, but I have yet to describe the CID metric. It's essentially the CC multiplied by a factor that empirically adjusts for the embedded summed distance between data points (i.e. the stretched length) of the time-series, so that the signature or look of two time-series visually match in complexity.

   CID = CC * min(Length(Model, Data))/ max(Length(Model, Data))

The authors of the CID suggest that it’s a metric based on “an invariance that the community seems to have missed”.

And a CID-modified DTW is thus:

   CID-DTW = DTW * min(Length(Model, Data))/ max(Length(Model, Data))
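Here is a minimal sketch of both corrections as described. The stretched-length definition of Length() is my reading of the text above and should be treated as an assumption; note also that Batista et al. [1] scale a distance by max/min >= 1, whereas the min/max form below follows the formulas as written.

```python
import math

def length(series):
    """Stretched length: total point-to-point distance along the series."""
    return sum(math.sqrt(1 + (b - a) ** 2) for a, b in zip(series, series[1:]))

def complexity_factor(model, data):
    lm, ld = length(model), length(data)
    return min(lm, ld) / max(lm, ld)

def cid_cc(cc, model, data):
    """CID-corrected correlation coefficient: CC * min(L)/max(L)."""
    return cc * complexity_factor(model, data)

def cid_dtw(dtw, model, data):
    """CID-modified DTW distance: DTW * min(L)/max(L)."""
    return dtw * complexity_factor(model, data)

# A smooth model of a jagged series has a shorter stretched length, so the
# factor penalizes a mismatch in visual complexity.
t = [i / 10 for i in range(100)]
data = [math.sin(x) + 0.3 * math.sin(7 * x) for x in t]   # jagged
model = [math.sin(x) for x in t]                          # smooth
print(complexity_factor(model, data))  # < 1: complexities differ
```

Two series of equal complexity leave the base metric untouched (factor = 1); the more the complexities diverge, the stronger the correction.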

I have tried this on the QBO model with good cross-validation results, using up-to-date data from https://www.atmohub.kit.edu/data/qbo.dat

These have similar tidal factor compositions and differ mainly in the LTE modulation and phase delay. As discussed earlier, any anomalies in the QBO behavior are likely the outcome of an erratic periodicity caused by incommensurate annual and draconic cycles and exaggerated by LTE.

from https://gist.github.com/pukpr/e562138af3a9da937a3fb6955685c98f

REFERENCES

[1] Batista, Gustavo EAPA, et al. “CID: an efficient complexity-invariant distance for time series.” Data Mining and Knowledge Discovery 28 (2014): 634-669.
https://link.springer.com/article/10.1007/s10618-013-0312-3

Low Dimensions

A key enabling assumption, sometimes called the manifold hypothesis [14], is that the data lie on or near a low-dimensional manifold; for physical systems with dissipation, such manifolds can often be rigorously shown to exist [15–18]. These manifolds enable a low-dimensional latent state representation, and hence, low-dimensional dynamical models. Linear manifold learning techniques, such as principal component analysis, cannot learn the nonlinear manifolds that represent most systems in nature. To do so, we require nonlinear methods, some of which are developed in [19–25] and reviewed in [26].

Floryan, Daniel, and Michael D. Graham. “Data-driven discovery of intrinsic dynamics.” Nature Machine Intelligence 4.12 (2022): 1113-1120.

From <https://www.nature.com/articles/s42256-022-00575-4>


GC22A-04 Can ML beat chaos?

Abstract

Chaos is typically blamed for the lack of predictability beyond a forecasting time window, which is on the order of 10 days for weather forecasting. However, on the one hand, most turbulent and chaotic systems exhibit strong coherence in the flow, such as synoptic events in weather or coherent structures in turbulence. On the other hand, most physical models might have additional structural errors that limit their capacity to correctly forecast beyond a certain time horizon, independent of chaos.

We will show in this presentation that most chaotic and turbulent flows can be predicted at relatively long range, at least longer than expected with both physical models and standard deep learning, using a combination of a reduced-order model (that captures the low-dimensional coherent structures in the flow) and generative AI to obtain the crisp results and details of the flow. We will conclude by stating that current AI-based weather models might not have reached a performance plateau yet, especially at longer time scales, and that physics-based weather models still have room for improvement. Reduced-order models might not be able to beat chaos but can lead to much longer-range prediction than currently expected.

https://agu.confex.com/agu/agu24/meetingapp.cgi/Paper/1522150

20yrs of blogging in hindsight

A 20-year anniversary post at RealClimate.org reminded me that I've been blogging for 20 years + 6 months on topics of fossil fuel depletion and climate change. The starting point was a BlogSpot blog I created in May 2004, where the first post set the stage:


Click on the above to go to the complete archives (almost daily posts) until I transitioned to WordPress and what became the present blog. After 2011, my blogging pace slowed considerably as I started to write in more and more technical terms. Eventually the most interesting and novel posts were filtered down to a set that became the contents of Mathematical Geoenergy: Discovery, Depletion, and Renewal, published in late 2018/early 2019 by Wiley with an AGU imprint.

The arc that my BlogSpot/WordPress blogging activity followed occupies somewhat of a mirror universe to that of RealClimate. I initially started out with an oil depletion focus, and by incrementally understanding the massive inertia that our FF-dependent society had developed, I came to place the climate science aspect into a different perspective and context. After realizing that CO2 does not like to sequester, it became obvious that not much could be done to mitigate the impact of gradually increasing GHG levels, and that it would evolve into a slow-moving train wreck. That's part of the reason why I focused more on research into natural climate variability. In contrast, RealClimate (and all the other climate blogs) continued to concentrate on man-made climate change. At this point, my climate fluid dynamics understanding is at some alternate-reality level, see the last post: still very interesting but lacking any critical acceptance (no debunking either, which keeps it alive and potentially valid).

The oil depletion aspect more-or-less spun off into the PeakOilBarrel.com blog [*] maintained by my co-author Dennis Coyne. That’s like watching a slow-moving train wreck as well, but Dennis does an excellent job of keeping the suspense up with all the details in the technical modeling. Most of the predictions regarding peak oil that we published in 2018 are panning out.

As a parting thought, the RealClimate hindsight post touched on how AI will impact information flow going forward. Having worked on AI knowledgebases for environmental modeling during the LLM-precursor stage circa 2010-2013, I can attest that it will only get better. At the time, we were under the impression that knowledge used for modeling should be semantically correct and unambiguous (with potentially a formal representation and organization, see figure below), and so developed approaches for that here and here (long report form).


As it turned out, lack of correctness is just a stage, and AI users/customers are satisfied with close-enough for many tasks. Eventually, the LLM robots will gradually clean up the sources of knowledge and converge toward semantic correctness. The same will happen with climate models, as machine learning by the big guns at Google, NVIDIA, and Huawei eventually discovers what we have found on this blog over the course of 20+ years.

Note:
[*] In some ways the PeakOilBarrel.com blog is a continuation of the shuttered TheOilDrum.com blog, which closed shop in 2013 for mysterious reasons.

Lunar Torque Controls All

Mathematical Geoenergy

The truly massive scale in the motion of fluids and solids on Earth arises from orbital interactions with our spinning planet. The most obvious of these, such as the daily and seasonal cycles, are taken for granted. Others, such as ocean tides, have more complicated mechanisms than the ordinary person realizes (e.g. ask someone to explain why there are 2 tidal cycles per day). There are also less well-known motions, such as the variation in the Earth's rotation rate of nominally 360° per day, called the delta in Length of Day (LOD), and the slight annual wobble in the Earth's rotation axis. Nevertheless, each one of these is technically well-characterized, and models of the motion include a quantitative mapping to the orbital cycles of the Sun, Moon, and Earth. This is represented in the directed graph below, where the BLUE ovals indicate behaviors that are fundamentally understood and modeled via tables of orbital factors.

The cyan background represents behaviors that have a longitudinal dependence (rendered by GraphViz).

However, the ovals highlighted in GRAY are nowhere near being well understood, in spite of being at least empirically well-characterized via years of measurements. Further, what is (IMO) astonishing is the lack of research interest in modeling these massive behaviors as a result of the same orbital mechanisms that cause tides, seasons, and the variations in LOD. In fact, everything tagged in the chart is essentially a behavior relating to an inertial response to something. That something, as reported in the Earth sciences literature, is only vaguely described, and never as a tidal or tidal/annual interaction.

I don’t see how it’s possible to overlook such an obvious causal connection. Why would the forcing that causes a massive behavior such as tides suddenly stop having a connection to other related inertial behaviors? The answers I find in the research literature are essentially that “someone looked in the past and found no correlation” [1].

Continue reading

NAO and the Median Filter

a random quote

“LLMs run a median filter on the corpus”

The North Atlantic Oscillation (NAO) time-series has always been intimidating to analyze. It appears outwardly like white noise, but enough scientists refer to long periods of positive- or negative-leaning intervals that there must be an underlying autocorrelation embedded in it. Reducing the white noise and extracting the signal requires a clever filtering trick. The desire is for a filter that preserves the edges of the waveform while reducing the noise level. The 5-point median filter does exactly that at a minimal complexity level, as the algorithm is simply expressed. It leaves edges and steep slopes alone, as the median will naturally occur on the slope. That's just what we want in order to retain the underlying signal.
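A minimal 5-point median filter sketch (generic code, not the exact implementation used for the NAO fit):

```python
import statistics

def median5(series):
    """5-point sliding median; endpoints are passed through unchanged."""
    out = list(series)
    for i in range(2, len(series) - 2):
        out[i] = statistics.median(series[i - 2:i + 3])
    return out

# A ramp with one spike: the spike is suppressed, the slope survives,
# since on a monotone run the median falls on the slope itself.
x = [0, 0, 0, 9, 0, 0, 1, 2, 3, 4, 5, 5, 5]
print(median5(x))
```

Note how an isolated monthly extreme (the 9) is removed entirely, while the rising edge of the ramp passes through almost untouched.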

Once applied, the NAO time-series still appears erratic, yet the problematic monthly extremes are reduced, with the median filter suppressing many of them completely. If we then use the LTE model to fit the NAO time-series (with the annual-impulsed tidal forcing used for the AMO as a starting point), a clear correlation emerges. The standing-wave modes are of course all high, in contrast to the one low mode for the AMO, so the higher-frequency cycling is expected, yet the fit is surprisingly good for the number of peaks and valleys it must traverse in the 140+ year historical interval.

Non-autonomous mathematical models

Non-autonomous mathematical formulations differ from autonomous ones in that their governing equations explicitly depend on time or another external variable. In natural systems, certain behaviors or processes are better modeled with non-autonomous formulations because they are influenced by external, time-dependent factors. Some examples of natural behaviors that qualify as non-autonomous include:

1. Seasonal Climate Variation:
Climate patterns, such as temperature changes or monsoon cycles, are influenced by external factors like the Earth’s orbit, axial tilt, and solar radiation, all of which vary over time. These changes make non-autonomous systems suitable for modeling long-term climate behavior.

2. Tidal Forces:
Tidal movements are driven by the gravitational pull of the Moon and the Sun, which vary as the positions of these celestial bodies change relative to Earth. Tidal equations thus have time-dependent forcing terms, making them non-autonomous.

3. Biological Rhythms:
Circadian rhythms in living organisms, which regulate daily cycles such as sleep and feeding, are influenced by the 24-hour light-dark cycle. These external light variations necessitate non-autonomous models.

4. Astronomical Geophysical Cycles:
Systems like the Chandler wobble (the irregular movement of Earth’s rotation axis) or the Quasi-Biennial Oscillation (QBO) in the equatorial stratosphere are influenced by periodic external factors, such as lunar cycles, making them non-autonomous. This also includes systems where lunar or Draconic cycles interact with annual cycles in non-linear ways, as explored in studies of Earth’s rotational dynamics.

5. Oceanographic Dynamical Phenomena:
Non-autonomous formulations are needed to model phenomena such as El Niño, which is influenced by complex interactions between atmospheric and oceanic conditions, themselves driven by seasonal and longer-term climatic variations.

6. Planetary Motion in a Varying Gravitational Field:
In astrophysical systems where a planet moves in the gravitational field of other bodies, such as a multi-body problem where external forces vary in time, non-autonomous dynamics become essential to account for these influences.

In contrast, autonomous systems are self-contained, and their behavior depends only on their internal state variables, independent of any external time-varying influence. Thus, non-autonomous systems often better capture the complexity and variability introduced by time-dependent external factors.

However, many still want to find connections to autonomous formulations as they often coincide with resonant conditions or some natural damping rate.

Autonomous mathematical formulations are characterized by the fact that their governing equations do not explicitly depend on time or other external variables (they can be implicit via time derivatives though). These systems evolve based solely on their internal state variables. Many natural behaviors can be modeled using autonomous systems when external influences are either negligible or can be ignored. Here are some examples of natural behaviors that qualify as autonomous:

1. Radioactive Decay:
The decay of radioactive isotopes is governed by an internal process where the rate of decay depends only on the amount of the substance present at a given moment. The decay equation does not depend on time explicitly, making it an autonomous system.

2. Epidemiological Models (without external intervention):
Simplified models of disease spread, such as the SIR (Susceptible-Infected-Recovered) model, can be autonomous if no external factors (like seasonal effects or interventions) are considered. The evolution of the system depends only on the current number of susceptible, infected, and recovered individuals.

3. Predator-Prey Dynamics (Lotka-Volterra Model):
In the absence of external influences like seasonal changes or human intervention, predator-prey relationships, such as those described by the Lotka-Volterra equations, can be modeled as autonomous systems. The population changes depend solely on the interaction between predators and prey.

4. Chemical Reactions (closed systems):
In a closed system with no external input or removal of substances, the kinetics of chemical reactions can be modeled as autonomous. The rate of reaction depends only on the concentrations of reactants and products at any given time.

5. Newtonian Mechanics of Isolated Systems:
For an isolated mechanical system (e.g., a simple pendulum or two-body orbital system), the equations of motion can be autonomous. The system evolves based solely on the internal energy and forces within the system, without any external time-dependent influences. This relates to general oscillatory systems or harmonic oscillators — the simple harmonic oscillator (such as a mass on a spring) can be modeled autonomously if no external time-varying forces are acting on the system. The system’s behavior depends only on its position and velocity at any point in time. In the classical gravitational two-body problem in celestial mechanics, where two bodies interact only through their mutual gravitational attraction, the motion can be described autonomously. The positions and velocities of the two bodies determine their future motion, independent of any external time-dependent factors.

6. Thermodynamics of Isolated Systems:
In an isolated thermodynamic system, where there is no exchange of energy or matter with the surroundings, the internal state (e.g., pressure, temperature, volume) evolves autonomously based on the system’s internal conditions.

These examples illustrate systems where internal dynamics govern the evolution of the system, and time or external influences do not explicitly appear in the equations. However, in many real-world cases, external factors often come into play, making non-autonomous formulations more appropriate for capturing the full complexity of natural behaviors. A pendulum that is periodically synchronized, for example a child pushed on a swing set, may be formulated either as a forced response in an autonomous set of equations or as a non-autonomous description if the swing pusher carefully guides the cycle.
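The distinction can be made concrete with a pair of toy oscillators (illustrative parameters, simple forward-Euler stepping): the autonomous right-hand side depends only on state, while the non-autonomous one carries an explicit time-dependent forcing term.

```python
import math

def integrate(rhs, x0, v0, dt=0.001, steps=20000):
    """Forward-Euler integration of x'' = rhs(x, x', t); returns final x."""
    x, v, t = x0, v0, 0.0
    for _ in range(steps):
        a = rhs(x, v, t)
        x, v, t = x + dt * v, v + dt * a, t + dt
    return x

def autonomous(x, v, t):
    # x'' + 0.5 x' + x = 0 : the right-hand side never references t
    return -x - 0.5 * v

def nonautonomous(x, v, t):
    # x'' + 0.5 x' + x = sin(0.8 t) : explicit time dependence
    return -x - 0.5 * v + math.sin(0.8 * t)

xa = integrate(autonomous, x0=1.0, v0=0.0)      # decays toward rest
xn = integrate(nonautonomous, x0=1.0, v0=0.0)   # settles onto the forced cycle
print(abs(xa), abs(xn))
```

After the internal transient dies out, the autonomous system sits near its fixed point, while the non-autonomous system keeps cycling at the period of the external forcing.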

This is where the distinctions between autonomous vs non-autonomous and forced vs natural responses should be elaborated.

Understanding the Structure of the General Solution

In the case of a forced linear second-order dynamical system, the general solution to the system is typically the sum of two components:

Homogeneous (natural) solution: This is the solution to the system when there is no external forcing (i.e., the forcing term is zero).

Particular solution: This is the solution driven by the external forcing.

The homogeneous solution depends only on the internal properties of the system (such as natural frequency, damping, etc.) and is the solution when F(t) = 0.

The particular solution is directly related to the forcing function F(t), which can be time-dependent in the case of a non-autonomous system.

So let's consider the autonomous vs non-autonomous context.

Autonomous System: In an autonomous system, even though the system is subject to forcing, the forcing term does not explicitly depend on time but rather on internal state variables (such as x or dx/dt). Here, the particular solution would also be state-dependent and would not explicitly involve time as an independent variable.

Non-Autonomous System: In a non-autonomous system, the forcing term explicitly depends on time, such as F(t) = A sin(w t). This external time-dependent forcing drives the particular solution. While the homogeneous solution remains autonomous (since it’s based on the system’s internal properties), the particular solution reflects the non-autonomous nature of the system.

The key insight is that of the non-autonomous particular solution. Even though a system’s response can have components from the homogeneous solution (which are autonomous in nature), the particular solution in a non-autonomous system will be time-dependent and follow the time-dependence of the external forcing.

So consider the transition from autonomous to non-autonomous: when you introduce a periodic forcing function F(t), the particular solution becomes non-autonomous, even though the overall system response still includes the autonomous homogeneous solution. This results in the system being classified as non-autonomous, as the particular solution carries the time-dependent behavior, despite the autonomous structure of the homogeneous solution.

Summary: A forced response in a linear second-order system can include both autonomous and non-autonomous components. Even though the homogeneous solution remains autonomous, the particular solution introduces non-autonomous characteristics when the forcing term depends explicitly on time. In non-autonomous systems, the forcing introduces time dependence in the particular solution, making the overall system non-autonomous, even though part of the response (the homogeneous solution) is autonomous.
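A numerical check of this decomposition for a damped, sinusoidally forced oscillator (toy parameters; the closed-form steady-state amplitude is the standard result for a linear second-order system): after the homogeneous (natural) transient decays, the response should settle onto the particular solution driven by F(t) = A sin(w t).

```python
import math

z, w0, A, w = 0.2, 1.0, 1.0, 0.5  # damping ratio, natural freq, forcing

# Closed-form steady-state (particular-solution) amplitude for
# x'' + 2*z*w0*x' + w0**2 * x = A*sin(w*t):
predicted = A / math.sqrt((w0**2 - w**2) ** 2 + (2 * z * w0 * w) ** 2)

def rhs(t, x, v):
    return v, A * math.sin(w * t) - 2 * z * w0 * v - w0**2 * x

# Integrate past the transient with RK4 steps, then track the amplitude.
x, v, t, dt, amp = 0.0, 0.0, 0.0, 0.001, 0.0
while t < 100.0:
    k1x, k1v = rhs(t, x, v)
    k2x, k2v = rhs(t + dt/2, x + dt/2*k1x, v + dt/2*k1v)
    k3x, k3v = rhs(t + dt/2, x + dt/2*k2x, v + dt/2*k2v)
    k4x, k4v = rhs(t + dt, x + dt*k3x, v + dt*k3v)
    x += dt/6 * (k1x + 2*k2x + 2*k3x + k4x)
    v += dt/6 * (k1v + 2*k2v + 2*k3v + k4v)
    t += dt
    if t > 60.0:              # homogeneous transient ~ exp(-z*w0*60), tiny
        amp = max(amp, abs(x))

print(predicted, amp)
```

The measured late-time amplitude matches the particular-solution formula, confirming that the surviving (non-autonomous) part of the response follows the time dependence of the forcing.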

Order overrides chaos

Dimensionality reduction of chaos by feedbacks and periodic forcing is a source of natural climate change, by P. Salmon, Climate Dynamics (2024)

Bottom line is that a forcing will tend to reduce chaos by creating a pattern to follow, thus the terminology of “forced response”. This has implications for climate prediction. The first few sentences of the abstract set the stage:

The role of chaos in the climate system has been dismissed as high dimensional turbulence and noise, with minimal impact on long-term climate change. However theory and experiment show that chaotic systems can be reduced or “controlled” from high to low dimensionality by periodic forcings and internal feedbacks. High dimensional chaos is somewhat featureless. Conversely low dimensional borderline chaos generates pattern such as oscillation, and is more widespread in climate than is generally recognised. Thus, oceanic oscillations such as the Pacific Decadal and Atlantic Multidecadal Oscillations are generated by dimensionality reduction under the effect of known feedbacks. Annual periodic forcing entrains the El Niño Southern Oscillation.

In Chapters 11 and 12 in Pukite, P., Coyne, D., & Challou, D. (2019). Mathematical Geoenergy. John Wiley & Sons, I cited forcing as a chaos reducer:

It is well known that a periodic forcing can reduce the erratic fluctuations and uncertainty of a near‐chaotic response function (Osipov et al., 2007; Wang, Yang, Zhou, 2013).

But that's just a motivator. Tides are the key, acting primarily on the subsurface thermocline. Salmon's figure comparing the AMO to Barents Sea subsurface temperature is substantiating in terms of linking two separated regions by something more than a nebulous "teleconnection".

Likely every ocean index has a common-mode mechanism. The tidal forcing by itself is close to providing an external synchronizing source, but requires what I refer to as a LTE modulation to zero in on the exact forced response. Read the previous blog post to get a feel how this works:

As Salmon notes, it's known at some level that an annual/seasonal impulse is entraining or synchronizing ENSO, and likely also PDO and AMO. The top guns at NASA JPL point out that the main lunisolar terms are at monthly, 206-day, annual, 3-year, and 6-year periods, and this is what is used to model the forcing, see the following two charts

Now note how the middle panel in each of the following modeled climate indices does not change markedly. The most challenging aspect is the inherent structural sensitivity of the manifold1 mapping involved in LTE modulation. As the Darwin fit shows, the cross-validation is better than it may appear, as the out-of-band interval does not take much of a nudge to become synchronized with the data. Note also that the multidecadal nature of an index such as AMO may be ephemeral — the yellow cross-validation band does show valleys in what appears to be a longer multidecadal trend, capturing the long-period variations in the tides when modulated by an annual impulse – biennial in this case.

Model config repo: https://gist.github.com/pukpr/3a3566b601a54da2724df9c29159ce16?permalink_comment_id=5108154#gistcomment-5108154


1 The term manifold has an interesting etymology. Phonetically, it is close to "many fold", which is precisely what's happening here: the LTE modulation can fold over the forcing input many times, in proportion to the mode of the standing wave produced. So a higher standing-wave mode will have "many folds", in contrast to the lowest standing-wave mode. At the limit, the QBO with an ostensibly wavenumber=0 mode will have no folds and will be, to first order, a pass-through linear amplification of the forcing, but likely with higher modes mixed in to give the time-series some character.

Common forcing for ocean indices

In Mathematical Geoenergy, Chapter 12, a biennially-impulsed lunar forcing is suggested as a mechanism to drive ENSO. The current thinking is that this lunar forcing should be common across all the oceanic indices, including AMO for the Atlantic, IOD for the Indian, and PDO for the non-equatorial north Pacific. The global temperature extreme of the last year had too many simultaneous concurrences among the indices for this not to be taken seriously.

NINO34

PDO

AMO

IOD – East

IOD-West

Each one of these uses a nearly identical annual-impulsed tidal forcing (shown as the middle green panel in each), with a 5-year window providing a cross-validation interval. Many cross-validation possibilities are available, since the tidal factors are essentially fixed and invariant across all the climate indices.

The approach follows 3 steps as shown below

The first step is to generate the long-period tidal forcing. I go into an explanation of the tidal factors selected in a RealClimate comment here.

Then apply the lagged response of an annual impulse, in this case alternating in sign every other year, which generates the middle panel in the flow chart schematic (and the middle panel in the indexed models above).

Finally, the Laplace’s Tidal Equation (LTE) modulation is applied, with the lower right corner inset showing the variation among indices. This is where the variability occurs — the best approach is to pick a slow fundamental modulation and generate only integer harmonics of this fundamental. So, what happens is that different harmonics are emphasized depending on the oceanic index chosen, corresponding to the waveguide structure of the ocean basin and what standing waves are maximally resonant or amplified.

Note that for a dipole behavior such as ENSO, the LTE modulations will be mirror inverses at the maximally extreme locations, in this case Darwin and Tahiti.
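The three steps above can be sketched schematically as follows. Every constant here (tidal amplitudes, decay rate, LTE fundamental and harmonics) is a placeholder for illustration, not a fitted value from the actual models.

```python
import math

MONTH = 1 / 12  # time step in years

# Step 1: long-period tidal forcing as a sum of lunar cycles
# (periods in years; the amplitudes are illustrative only).
def tidal_forcing(t):
    cycles = [(27.5545 / 365.25, 1.0),   # anomalistic month
              (27.2122 / 365.25, 0.7),   # draconic month
              (206 / 365.25, 0.3)]       # perigee-syzygy cycle
    return sum(a * math.sin(2 * math.pi * t / p) for p, a in cycles)

# Step 2: an annual impulse, alternating in sign every other year
# (biennial), driving a lagged leaky-integrator response.
def impulsed(n_months, decay=0.95):
    out, acc = [], 0.0
    for i in range(n_months):
        acc *= decay
        if i % 12 == 0:                           # once-a-year impulse
            sign = 1 if (i // 12) % 2 == 0 else -1
            acc += sign * tidal_forcing(i * MONTH)
        out.append(acc)
    return out

# Step 3: LTE modulation -- integer harmonics of a slow fundamental,
# applied as a sinusoidal folding of the impulsed forcing.
def lte(forcing, fundamental=2.0, harmonics=(1, 3, 5)):
    return [sum(math.sin(k * fundamental * f) / k for k in harmonics)
            for f in forcing]

index_model = lte(impulsed(140 * 12))   # a 140-year synthetic index
print(len(index_model))
```

Changing which harmonics are emphasized in step 3 is what differentiates one oceanic index from another in this picture, while steps 1 and 2 stay common across basins.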

A machine learning application is free to scrape the following GIST GitHub site for model fitting artifacts.

https://gist.github.com/pukpr/3a3566b601a54da2724df9c29159ce16

Another analysis involved a recursively cycled fit between AMO and PDO: it alternately fit AMO for 2.5 minutes and then PDO for 2.5 minutes, cycling 50 times. This created a common forcing with an optimally shared fit, with the forcing baselined to PDO.

PDO

AMO

NINO34

IOD-East

IOD-West

Darwin

Tahiti

The table above shows the LTE modulation factors for the Darwin and Tahiti model fits. The highlighted blocks show the phase of the modulation, which should differ by π radians for a perfect dipole, along with the higher harmonics associated with it. (The K0 wavenumber = 0 term has no phase, just a sign.) Over the shared modes 1, 45, 23, 36, 18, 39, and 44, the average phase difference is 3.09, close to π (and K0 switches sign).

1.23-(-1.72) = 2.95 
1.47-(-2.05) = 3.52
-2.89-(0.166) = -3.056 
-0.367-(-2.58) = 2.213 
1.59-(-2.175) = 3.765 
0.27 - (-2.84) = 3.11 
-1.87 -1.14 = -3.01 

Average (2.95+3.52+3.056+2.213+3.765+3.11+3.01)/7 = 3.0891
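The same arithmetic in code form, for checking against π:

```python
import math

# Darwin/Tahiti LTE phase pairs (radians) for the shared modes listed above.
pairs = [(1.23, -1.72), (1.47, -2.05), (-2.89, 0.166), (-0.367, -2.58),
         (1.59, -2.175), (0.27, -2.84), (-1.87, 1.14)]

diffs = [abs(a - b) for a, b in pairs]
avg = sum(diffs) / len(diffs)
print(round(avg, 4))   # close to pi, as expected for a mirror-inverse dipole
print(round(math.pi, 4))
```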

Contrast this with the IOD East/West dipole. Only the K0 (wavenumber=0) term shows a reversal in sign. The LTE modulation terms are within 1 radian of each other, indicating much less of a dipole behavior in those terms. It's possible that these sites don't span a true dipole, either by its nature or from the siting of the measurements.

Cross-validating a large interval span on PDO

using CC

using DTW metric, which pulls out more of the annual/semi-annual signal

adding a 3rd harmonic

Complement of the fitting interval. Note that the spectral composition maintains the same harmonics, indicating that the mapped structure is stationary, in the sense that the tidal pattern is not changing and the LTE modulation is largely fixed.

This is the resolved tidal forcing, finer than the annual impulse sampling used on the models above.

Below one can see the primary 27.5545-day lunar anomalistic cycle, mixed with the draconic 27.2122-day (13.606-day) cycle to create the 6/3-year modulation, and the 206-day perigee-syzygy cycle (or the 412-day full cycle, as the 206-day value includes both the full-moon and new-moon antipodal orientations).
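The cycle mixing can be verified directly from the standard beat-period relation, using the 27.5545-day anomalistic, 27.2122-day draconic, and 29.5306-day synodic month values:

```python
# Beat periods of the lunar cycles referenced above (all periods in days).
anomalistic = 27.5545
draconic    = 27.2122
synodic     = 29.5306

def beat(p1, p2):
    """Beat period of two cycles: 1 / |1/p2 - 1/p1|."""
    return p1 * p2 / abs(p1 - p2)

six_year = beat(anomalistic, draconic) / 365.25   # anomalistic x draconic
perigee_syzygy = beat(anomalistic, synodic)       # ~412-day full cycle
print(six_year)            # ~6.0 years
print(perigee_syzygy / 2)  # ~206 days
```

The anomalistic/draconic beat lands almost exactly on 6 years, and half the anomalistic/synodic beat gives the 206-day perigee-syzygy cycle.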


Fundy-mental (continued)

I'm looking at side-band variants of the lunisolar orbital forcing because that's where the data is empirically taking us. I had originally proposed solving Laplace's Tidal Equations (LTE) using a novel analytical derivation published several years ago (see Mathematical Geoenergy, Wiley/AGU, 2019). The takeaway from the math results (given that LTEs form the primitive basis of the GCM-specific shallow-water approximation to oceanic fluid dynamics) was that my solution involved a specific type of non-linear modulation or amplification of the input tidal forcing. However, this isn't the typical diurnal/semi-diurnal tidal forcing; because of the slower inertial response of the ocean volume, the targeted tidal cycles are the longer-period monthly and annual ones. Moreover, as very few climate scientists are proficient in signal processing and all the details of aliasing and side-bands, this is an aspect that has remained hidden (again, thank Richard Lindzen for opening the book on tidal influences and then slamming it shut for decades).

Continue reading