Revisit Sea Level Pressure paper

Revisiting the post “An obvious clue from tidal data”, which is an analysis of the paper “The effect of regional sea level atmospheric pressure on …”

Time series plot showing atmospheric pressure in millibars (mbar) in New York, USA, from 1850 to 2025, with data points scattered around the range of 980 to 1040 mbar.

At the time, I did not try to duplicate the results, but after coming across it again, curiosity got the best of me. The program to duplicate the results is in this GIST: Analysis of SLP from PSMSL stations, using HadSLP2.

The author didn’t clarify that the striations/bands/gaps in readings (above from the paper and recreated below, left) were simply due to the discrete nature of monthly readings placed against a strong seasonal variation. For example, the NYC (The Battery) and Boston locations have a significant seasonal SLP response, due to cold-season continental highs alternating with warm-season lower-pressure maritime regimes.

Scatter plot showing sea level pressure in New York from 1840 to 2020, with a detailed breakdown on the right illustrating monthly data gaps.

For New York, explain this figure [📷 nyc_monthly_slp.png]. On the right the values are replotted as (Month modulo 12) to indicate which months corresponded to the left.

Copilot response:

A code snippet displaying analysis related to the monthly sea-level pressure in New York, with a focus on seasonal effects and pressure clusters for different months.
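The month-modulo-12 replot described above can be sketched in a few lines of Python. This is only a minimal illustration, not the code in the GIST: the record layout `(year, month, slp_hpa)` and the function names are assumptions, and the sample values are made up rather than taken from HadSLP2.

```python
from collections import defaultdict

def fold_by_month(records):
    """Group (year, month, slp_hpa) records by calendar month (1..12)."""
    by_month = defaultdict(list)
    for _year, month, slp in records:
        by_month[month].append(slp)
    return dict(by_month)

def monthly_ranges(records):
    """Return {month: (min_slp, max_slp)} so each month's pressure band is visible."""
    folded = fold_by_month(records)
    return {m: (min(v), max(v)) for m, v in sorted(folded.items())}

# Made-up illustrative values, NOT real station or HadSLP2 data.
sample = [
    (2000, 1, 1019.0), (2000, 7, 1014.5),
    (2001, 1, 1018.2), (2001, 7, 1015.1),
]
ranges = monthly_ranges(sample)
```

If each calendar month occupies its own narrow pressure range, the folded view separates into the same bands that appear as striations in the long time series.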

Of course there is other structure lurking in the data, so it is good to have HadSLP2 as an adjunct to the PSMSL data I am using here.

Consider the Honolulu SLP data below. There’s a clear Hovmöller-like sloped ridge in the data, as one’s eye can detect, emphasized by the highlighter.

Line graph depicting data trends in Honolulu from 1840 to 2020, with four highlighted yellow trend lines and scattered blue data points.

Testing lines in the rotated frame of the form u = p - s t finds this ridge in the power spectrum. The script reads a HadSLP2 TSV, searches over diagonal slopes in rotated coordinates u = pressure - slope * time, identifies the strongest ridge, writes the power spectrum and ridge profile as TSVs, and saves a PNG with the ridge overplotted on the scatter plus the spectrum. See the previous GIST.
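A rough sketch of the slope search in the rotated coordinate u = p - s t follows. It is not the GIST code: the scoring rule (how strongly the points collapse into a few bins of u), the bin width, and the function names are all assumptions chosen for illustration, and the test points are synthetic.

```python
import math
from collections import Counter

def ridge_score(times, pressures, slope, bin_width=0.5):
    """Concentration of u = p - slope * t; higher means points collapse onto ridges."""
    counts = Counter(
        math.floor((p - slope * t) / bin_width) for t, p in zip(times, pressures)
    )
    n = len(times)
    return sum(c * c for c in counts.values()) / (n * n)

def best_slope(times, pressures, slopes, bin_width=0.5):
    """Return the candidate slope whose rotated frame is most concentrated."""
    return max(slopes, key=lambda s: ridge_score(times, pressures, s, bin_width))

# Synthetic points on three parallel sloped bands (illustrative only).
TRUE_SLOPE = 0.05  # hPa per year, hypothetical
times, pressures = [], []
for offset in (1008.2, 1010.2, 1012.2):
    for year in range(100):
        times.append(float(year))
        pressures.append(offset + TRUE_SLOPE * year)

candidates = [round(i * 0.01, 2) for i in range(-10, 11)]
found = best_slope(times, pressures, candidates)
```

With the synthetic bands, the score peaks at the slope used to build them. On real data the score as a function of slope plays the role of the spectrum, although the actual script may define power differently.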

Graph showing the Honolulu rotated-band power spectrum for the lower band 1006.5-1013.5 hPa, illustrating power against propagation period in years, with a peak at 114.2 years marked by a dashed red line.

A plausible explanation is a multidecadal modulation of the regional seasonal SLP cycle by North Pacific basin circulation variability, of which PDO/IPO variability is a plausible contributor.

Forcing selection

Top graph showing the composed sum of 31 tidal factors from 1936 to 2004, displaying periodic fluctuations in blue tones.
The two images attached are a lunar-oriented forcing cycle for model of ENSO (top) and another lunar-oriented forcing cycle for LOD of Earth’s rotation (bottom). Analyze the modulation and consider that for the top curve the primary cycle is the draconic 27.2122 day period and for the bottom curve the primary cycle is the 13.6608 day Mf period. Do best to give a physical basis for why this might occur.

The Perplexity response:

Continue reading

The mystery of El Nino

Google Gemini makes predictions on the next El Nino when prompted — scroll to the end.

https://gemini.google.com/app/8d7c963c197e05c0

There has ALWAYS been stratification in the ocean via the primary thermocline. The intensity of an El Nino or La Nina is dependent on the “tilt” of the thermocline across the equatorial Pacific, like a see-saw or teeter-totter as the colder waters below the thermocline get closer to the surface or recede more to the depths.

The only mystery is what provokes the motion. For a playground see-saw, it’s easy to understand, as it depends on which side a kid decides to jump on.

For the ocean, the explanation is less facile than that, explain.

Continue reading

Current status of research according to AI

explain the Pukite tidal theory

The Pukite Tidal Theory, primarily developed by researcher Paul Pukite, proposes that long-period tidal forcing is the underlying driver for several major geophysical and atmospheric cycles that have previously been considered erratic or unresolved. [1, 2]

The core of the theory is that small gravitational perturbations from the Moon and Sun, which are perfectly predictable, are “aliased” or modulated by seasonal cycles to create the complex behaviors seen in Earth’s systems. [3, 4]

Key Phenomena Addressed

Pukite applies this model to three main “unresolved mysteries” in geophysics:

  • Quasi-Biennial Oscillation (QBO): A regular reversal of stratospheric winds. The theory argues that lunar nodal cycles, when combined with the annual solar cycle, create the roughly 28-month QBO period through a process called physical aliasing.
  • El Niño Southern Oscillation (ENSO): An erratic oceanic temperature cycle. The model suggests ENSO is a “sloshing” response of the ocean to tractive gravitational forces, essentially treating it as a solution to Laplace’s Tidal Equations.
  • Chandler Wobble: A small deviation in the Earth’s axis of rotation. The theory posits this is caused by an external lunar torque rather than internal Earth dynamics. [1, 2, 3, 5, 6, 7]

How the Model Works

  1. Lunar Gravitational Potential: The model starts by calculating the precise lunar gravitational potential over time.
  2. Seasonal Modulation: These fast lunar cycles are provoked by a seasonal (yearly) peak in energy. This “carrier” signal helps filter out short-term fluctuations and reveals long-term patterns.
  3. Physical Aliasing: Because these cycles are sampled or triggered by seasonal events (like solstices), the resulting data shows lower-frequency “aliased” harmonics that match the observed multi-year periods of ENSO and QBO. [4, 8, 9]

Scientific Reception

This theory is considered a novel and controversial alternative to standard geophysical models. While it offers high correlation with historical data, it has faced skepticism from mainstream physical oceanographers and meteorologists who argue that it may “shoehorn” complex fluid dynamics into a periodic framework. Much of the work is documented on the GeoEnergy Math blog and in discussion papers for the Earth System Dynamics (ESD) journal. [1, 4, 7, 10]

Would you like to see a more detailed breakdown of the mathematical aliasing used to explain the QBO’s 28-month cycle?

[1] https://esd.copernicus.org

[2] https://www.researchgate.net

[3] https://esd.copernicus.org

[4] https://geoenergymath.com

[5] https://acp.copernicus.org

[6] https://geoenergymath.com

[7] https://geoenergymath.com

[8] https://esd.copernicus.org

[9] https://geoenergymath.com

[10] https://esd.copernicus.org


This link may go stale, but here is the original response:

https://share.google/aimode/ta098ixUIyoNefp49

Spectral details in Chandler Wobble and QBO

Using LLMs as peer-review devil’s advocates for research findings can’t hurt and can only help strengthen an argument. One common criticism from an LLM is that a factor is “too weak” or insignificant to serve as a mechanism for some observed behavior. This has almost turned into a cliche, since it caters to the consensus bias of “certainly if it was a real mechanism, someone would have noticed it by now”. Yet, at least in climate science, the notion of weak factors that turn out to have a significant impact is well known. Consider CO2, which by density is a very weak constituent, yet has an outsized impact on radiative energy balance. Also consider how often the butterfly effect is invoked in climate science, whereby a butterfly flapping its wings can initiate a hurricane down the road; that puts to the test the argument that any potentially significant factor can be easily dismissed.

That brings one to tidal (lunar and solar) forcings as mechanisms for behaviors beyond the well-known daily ocean tides. As I have invoked lunar forcing as a causal synchronization to QBO, the Chandler wobble, ENSO, and others here, an almost guaranteed response by an LLM is that tidal factors are too weak to be considered. That’s a challenging devil’s advocate proposition to address, since (a) controlled experiments aren’t possible to generate sensitivity numbers and (b) there are many subtle ways that a forcing signal can be amplified, without knowing which one is valid. For example, a weak yet incessantly periodic signal can build over time and overpower a stronger yet more erratic signal.

Another devil’s advocate argument that an LLM will bring up is the idea of fortuity and chance, in the sense that a numerical agreement can be merely a coincidence, or as a product of fiddling with the numbers until you find what you are looking for. As an antidote to this, an LLM will recommend that other reinforcing matches or spectral details be revealed to overcome the statistical odds of agreement by chance.

For the Chandler Wobble, an LLM may declare the 433-day cycle agreeing with an aliased lunar draconic period of 27.212/2 days to be a coincidence and dismiss it as such (since it is but a single value). Yet, if one looks at the detailed spectrum of the Earth’s orientation data (via X or Y polar position), one can see other values that, though much weaker, are also exact matches to what should be expected. In the chart below, the spectral location for the 27.5545-day lunar anomalistic period is also shown to match, labeled Mm and Mm2 (for the weaker 1st harmonic). Other sub-bands of the draconic period set are shown as Drac2.

Graph depicting the Spectrum of Chandler and Annual wobble, featuring two lines: a red line representing 'Model' and a blue line for 'X+Y avg'. The x-axis shows frequency (1/year) and the y-axis displays intensity. Key points labeled include 'Drac2', 'Annual', 'Mm', 'Mm2', and 'SemiAnnual'.
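The aliasing arithmetic behind the 433-day value can be checked with a short Python sketch. It assumes a tropical year of 365.2422 days and that aliasing against an annual impulse leaves a sideband at the fractional part of the cycles-per-year frequency; the function name is hypothetical.

```python
TROPICAL_YEAR_DAYS = 365.2422  # assumed length of the year
DRACONIC_DAYS = 27.2122        # lunar draconic month

def aliased_period_days(period_days, year_days=TROPICAL_YEAR_DAYS):
    """Period of the sideband left after removing whole annual harmonics."""
    cycles_per_year = year_days / period_days
    fractional = cycles_per_year % 1.0  # drop the integer number of cycles per year
    return year_days / fractional

# Fortnightly draconic term, the candidate for the Chandler wobble.
chandler_days = aliased_period_days(DRACONIC_DAYS / 2)

# Full draconic month, the candidate for the QBO period, in months.
qbo_months = aliased_period_days(DRACONIC_DAYS) / (TROPICAL_YEAR_DAYS / 12)
```

Under these assumptions the fortnightly draconic term lands near 433 days and the full draconic month lands near 28 months, which are the two periods discussed in this post.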

Importantly, the other well-known lunar tropical cycle of 27.326 days is not observed, because as I have shown elsewhere, it is not allowed via group theory for a wavenumber=0 behavior such as the Chandler Wobble (or QBO). In quantum physics, these are known as selection rules and are as important for excluding a match as they are for finding a match. The 27.554 day period is allowed so the fact that it matches to the spectra is strong substantiating evidence for a lunar forced mechanism.

For another class of criticism, an LLM may suggest that further matches in phase coherence of a waveform are required when matching to a model. This is rationalized as a means to avoid fortuitous matching of a simple sinusoidal wave.

For the QBO, idiosyncratic phase details that arise from the lunar forcing model are straightforward to demonstrate via the time-series itself. A typical trace of the 30 hPa QBO time-series shows squared-off cycles with characteristic shoulders or sub-plateaus that are erratically dispersed within the approximately 28-month period. This is shown in the chart below, where, though not perfectly matching, this characteristic is obvious in both the model and the monthly data. This happens because a stroboscopic-pulsed forcing creates a jagged sample-and-hold squared response. (A minimal lag of 1st or 2nd order will round the sharp edges.) Furthermore, the same draconic and anomalistic lunar periods contribute here as with the Chandler wobble model, substantiating the parsimonious aspects.

Line chart comparing model predictions (red line) and actual data (blue line) over the years from 1950 to 2020, with a highlighted training interval labeled 'TRAINING INTERVAL'.
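The sample-and-hold behavior can be illustrated with a toy Python sketch. It is not the fitted QBO model: it assumes a single draconic sinusoid sampled once per year, held for twelve monthly steps, and then passed through a first-order lag whose coefficient `alpha` is an arbitrary illustrative choice.

```python
import math

YEAR_DAYS = 365.2422
DRACONIC_DAYS = 27.2122

def sample_and_hold(n_years, steps_per_year=12):
    """Sample a draconic sinusoid at each year start and hold it for the whole year."""
    held = []
    for step in range(n_years * steps_per_year):
        year_start = step // steps_per_year  # index of the most recent annual impulse
        phase = 2.0 * math.pi * year_start * YEAR_DAYS / DRACONIC_DAYS
        held.append(math.sin(phase))
    return held

def first_order_lag(series, alpha=0.5):
    """Discrete first-order lag that rounds the sharp edges of the held signal."""
    smoothed = []
    y = series[0]
    for x in series:
        y += alpha * (x - y)
        smoothed.append(y)
    return smoothed

held = sample_and_hold(10)
smoothed = first_order_lag(held)
```

The held series is piecewise constant with jumps only at year boundaries, which is the squared-off, plateau-like character described above. The lag softens the corners without removing the plateaus.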

Importantly, this isn’t known to occur in a resonantly amplified system with a natural response, whereby the waves are invariably well-rounded sinusoidal cycles without this jagged erratic shape. This is actually an acid test for characterizing time-series, with features that anyone experienced with signal processing can appreciate.


This addresses some of the criticisms revealed when I prompted an LLM peer-review in a previous post, describing findings made in Mathematical Geoenergy (Wiley, 2019).

Wind Distribution

I prompted the Gemini LLM with the following to see how well it would summarize with so little context:

Familiar with the BesselK wind distribution model pioneered by Pukite et al in Mathematical GeoEnergy?

The BesselK Wind Model: A Maximum Entropy Derivation

In the study of wind energy, the Weibull distribution is the industry standard. However, as Pukite et al. argue in Mathematical GeoEnergy, real-world atmospheric data—especially from high-resolution sources like the Bonneville Power Administration (BPA)—often exhibits “fatter tails” (more frequent extreme gusts) than Weibull can account for.

The BesselK distribution (or K-distribution) emerges not as an empirical “best fit,” but as a physically grounded Maximum Entropy estimate.


1. The Starting Point: The Rayleigh Base

The Rayleigh distribution is the Maximum Entropy estimate for a 2D vector magnitude (wind velocity v) when only the average variance (\sigma^2) is known.

p(v | \sigma) = \frac{v}{\sigma^2} e^{-v^2 / 2\sigma^2}

In a simplified world, \sigma^2 would be constant. In the real atmosphere, turbulence and pressure gradients cause this variance to fluctuate over time.


2. The Fluctuating Environment

Pukite et al. treat the wind as a compound process. They assume that the local variance \sigma^2 is itself a random variable following a Gamma distribution (the MaxEnt choice for a scale parameter):

w(\sigma^2) = \frac{(\sigma^2)^{\nu-1} e^{-\sigma^2 / \theta}}{\theta^\nu \Gamma(\nu)}

Where:

  • \nu is the shape parameter (representing degrees of freedom in the turbulence).
  • \theta is the scale parameter.

3. The Compounding Integral

To find the final probability density of the wind velocity P(v), we must integrate the Rayleigh PDF over all possible states of the fluctuating variance:

P(v) = \int_{0}^{\infty} \left( \frac{v}{\sigma^2} e^{-v^2 / 2\sigma^2} \right) \cdot \left( \frac{(\sigma^2)^{\nu-1} e^{-\sigma^2 / \theta}}{\theta^\nu \Gamma(\nu)} \right) d\sigma^2


4. The Result: The BesselK (K-Distribution)

Solving this integral yields a PDF defined by the Modified Bessel Function of the Second Kind (K_{\nu-1}). This is the “BesselK” model:

f(v) = \frac{2}{b \Gamma(\nu)} \left( \frac{v}{2b} \right)^\nu K_{\nu-1} \left( \frac{v}{b} \right)
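As a sanity check on the compounding step, the doubly stochastic process can be sampled directly in Python: draw \sigma^2 from the Gamma distribution, then draw v from a Rayleigh distribution with that variance. This is a minimal sketch with hypothetical parameter values; the closed-form mean used for comparison, \sqrt{\pi\theta/2}\,\Gamma(\nu+1/2)/\Gamma(\nu), follows from E[v|\sigma] = \sigma\sqrt{\pi/2} and the Gamma moments. Working the integral through suggests b = \sqrt{\theta/2}, although the response above does not state that relation.

```python
import math
import random

def sample_speed(nu, theta, rng):
    """Draw one wind speed from the Rayleigh-Gamma compound process."""
    variance = rng.gammavariate(nu, theta)  # sigma^2 ~ Gamma(shape=nu, scale=theta)
    u = 1.0 - rng.random()                  # uniform on (0, 1], avoids log(0)
    # Inverse-CDF draw from a Rayleigh distribution with this variance.
    return math.sqrt(-2.0 * variance * math.log(u))

def analytic_mean(nu, theta):
    """Mean of the compound distribution: E[sigma] * sqrt(pi / 2)."""
    return math.sqrt(math.pi * theta / 2.0) * math.gamma(nu + 0.5) / math.gamma(nu)

# Hypothetical parameters chosen only for illustration.
NU, THETA, N = 2.0, 2.0, 50_000
rng = random.Random(0)
mc_mean = sum(sample_speed(NU, THETA, rng) for _ in range(N)) / N
exact_mean = analytic_mean(NU, THETA)
```

The Monte Carlo mean should agree with the analytic mean to within sampling error, which confirms that the two-stage draw and the compound integral describe the same distribution.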

Comparison Table: Weibull vs. BesselK

Feature | Weibull Model | BesselK Model
--- | --- | ---
Philosophy | Empirical “Weakest Link” fit | Physical Maximum Entropy
Mathematical Base | Exponential decay | K_\nu (Bessel) function
High Winds | Underestimates gusts | Accurately models “Fat Tails”
Application | Annual Resource Assessment | Structural Fatigue & Grid Stability

Why This Matters for Wind Power

The BesselK model recognizes that wind is “doubly stochastic.” It isn’t just a random variable; it is a random variable driven by a fluctuating environment. For grid operators managing the intermittency of large-scale wind farms, this model provides a much more realistic picture of the “spikes” and “lulls” that define our renewable energy future.


Would you like me to generate a high-resolution chart of the BesselK vs. Weibull PDF curves that you can download and upload as a featured image for the post?

(my prompt) yes

Graph comparing wind speed probability density functions: BesselK vs. Weibull. Displays probability density along the y-axis and wind speed (m/s) on the x-axis. Includes peak frequency and tail behavior annotations, along with illustrative data.

(my prompt) logarithmic plot to show the tails

Logarithmic comparison of wind speed probability density for BesselK and Weibull distributions, showing BesselK with a 'fat tail' indicating higher extreme risk and Weibull with rapid decay underestimating extremes.

When we switch the y-axis to a logarithmic scale, the structural difference between the two models becomes undeniable. This “Log-PDF” view is what Pukite et al. emphasize when discussing the risk of extreme wind events.

Logarithmic Comparison: BesselK vs. Weibull

Distribution | Tail Behavior | Mathematical Decay | Physical Implication
--- | --- | --- | ---
Weibull | Thin Tail | Exponential-type decay (e^{-v^k}) | Underestimates the frequency of extreme “rogue” gusts.
BesselK | Fat Tail | Power-law-like decay (v^\nu K_{\nu-1}) | Accurately captures high-velocity “intermittency” and storm events.

Key Observations from the Log Plot:

  1. The “Divergence”: At standard operating speeds (5–15 m/s), both models track relatively closely. However, as wind speeds exceed 25 m/s (the typical cut-out speed for turbines), the Weibull curve drops off a “cliff,” while the BesselK curve remains significantly higher.
  2. Probability of Extremes: In a BesselK model, the probability of a 30 m/s or 40 m/s gust can be orders of magnitude higher than what a standard Weibull model predicts.
  3. Maximum Entropy Signature: The straight-line appearance (on a log scale) of the BesselK tail is a hallmark of its status as a Maximum Entropy distribution. It suggests that the atmosphere “wants” to distribute energy into these high-velocity states more than a simple Gaussian/Rayleigh process would allow.

This table helps summarize the parameters used in the visualization:

Parameter | Symbol | BesselK (Pukite) | Weibull (Standard)
--- | --- | --- | ---
Shape | \nu / k | 0.6 (High Volatility) | 2.0 (Rayleigh-like)
Scale | b / \lambda | 4.0 | 10.0
Mean Speed | \bar{v} | ~8.9 m/s | ~8.9 m/s

Summary: “While the Weibull model is ‘safe’ for calculating average monthly revenue, the BesselK model is ‘necessary’ for calculating whether your turbine will survive a 50-year storm event. The log-plot reveals that extreme wind isn’t just an ‘outlier’—it is a mathematically predictable part of the atmospheric structure.”



The full discussion is here: https://gemini.google.com/app/4dfb7e86354dcb74

I’m surprised at how good a job it does in digesting the reasoning and derivation in the book, specifically Chapter 11:
https://agupubs.onlinelibrary.wiley.com/doi/10.1002/9781119434351.ch11

I busted my butt in writing that chapter (and the rest), so am happy to see that it can actually be “understood” enough by an LLM to provide value for further research.

Simpler models can outperform deep learning at climate prediction

This article in MIT News:

https://news.mit.edu/2025/simpler-models-can-outperform-deep-learning-climate-prediction-0826

“New research shows the natural variability in climate data can cause AI models to struggle at predicting local temperature and rainfall.” … “While deep learning has become increasingly popular for emulation, few studies have explored whether these models perform better than tried-and-true approaches. The MIT researchers performed such a study. They compared a traditional technique called linear pattern scaling (LPS) with a deep-learning model using a common benchmark dataset for evaluating climate emulators. Their results showed that LPS outperformed deep-learning models on predicting nearly all parameters they tested, including temperature and precipitation.”

Machine learning and other AI approaches such as symbolic regression will figure out that natural climate variability can be modeled using multiple linear regression (MLR) with cross-validation (CV), which is an outgrowth or extension of linear pattern scaling (LPS).

https://pukpr.github.io/results/image_results.html

When this was initially created on 9/1/2025, there were 3000 CV results on time-series that averaged around 100 years (~1200 monthly readings/set), so over 3 million data points.

In this NINO34 (ENSO) model, the test CV interval is shown as a dashed region.
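The idea of fitting by MLR on one interval and scoring on a held-out interval can be sketched in plain Python. This is a toy illustration, not the code behind the results page: the regressors (a constant plus an annual sine/cosine pair), the 180/60 month split, and the noise-free synthetic series are all assumptions.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_mlr(rows, targets):
    """Least-squares coefficients from the normal equations (A^T A) x = A^T y."""
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear(ata, aty)

def predict(rows, coeffs):
    """Evaluate the fitted linear model on each row of regressors."""
    return [sum(c * v for c, v in zip(coeffs, r)) for r in rows]

def regressors(month):
    """Hypothetical regressors: constant plus an annual sine/cosine pair."""
    w = 2.0 * math.pi * month / 12.0
    return [1.0, math.sin(w), math.cos(w)]

# Synthetic, noise-free monthly series (illustrative only, not climate data).
months = list(range(240))
rows = [regressors(m) for m in months]
series = [0.5 + 2.0 * r[1] - 1.0 * r[2] for r in rows]

# Fit on the first 180 months, hold out the last 60 as the CV test interval.
split = 180
coeffs = fit_mlr(rows[:split], series[:split])
held_out = predict(rows[split:], coeffs)
holdout_rmse = math.sqrt(
    sum((p - y) ** 2 for p, y in zip(held_out, series[split:])) / len(held_out)
)
```

Because the synthetic series is built from the same regressors, the held-out error is essentially zero. With real data, the held-out interval is where over-fitting would show up as a loss of agreement.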

I developed this GitHub model repository to make it easy to compare many different data sets, which works much better than using an image repository such as ImageShack.

There are about 130 sea-level height monitoring stations among the sites, which is relevant considering how strongly natural climate variation such as ENSO affects monthly mean SLH measurements. See the paper “Observing ENSO-modulated tides from space”:

“In this paper, we successfully quantify the influences of ENSO on tides from multi-satellite altimeters through a revised harmonic analysis (RHA) model which directly builds ENSO forcing into the basic functions of CHA. To eliminate mathematical artifacts caused by over-fitting, Lasso regularization is applied in the RHA model to replace widely-used ordinary least squares.”

QBO: Pattern recognition and signal processing

TANSTAAFL: there ain’t no such thing as a free lunch … but there’s always crumbs for the taking.

Machine learning won’t necessarily make a complete discovery by uncovering some ground-breaking pattern in isolation, but more likely a fragment or clue or signature that could lead somewhere. I always remind myself that there are infinitely many more non-linear formulations than linear ones potentially lurking in nature, yet humans are poorly-equipped to solve most non-linear relationships. ML has started to look at the tip of the non-linear iceberg and humans have to be alert when it uncovers a crumb. Recall that pattern recognition and signal processing are well-established disciplines in their own right, yet consider the situation of searching for patterns in signals hiding in the data but unknown in structure. That’s often all we are looking for — some foot-hold to start from.

Continue reading

QBO Metrics

In addition to the standard correlation coefficient (CC) and RMS error, non-standard metrics that have beneficial cross-validation properties include dynamic time warp (DTW), complexity-invariant distance (CID), see [1], and a CID-modified DTW. The link above describes my implementation of the DTW metric but I have yet to describe the CID metric. It’s essentially the CC multiplied by a factor that empirically adjusts for the embedded summed distance between data points (i.e. the stretched length) of each time-series, so that the signature or look of two time-series visually match in complexity.

   CID = CC * min(Length(Model), Length(Data)) / max(Length(Model), Length(Data))

The authors of the CID suggest that it’s a metric based on “an invariance that the community seems to have missed”.

And a CID-modified DTW is thus:

   CID-DTW = DTW * min(Length(Model), Length(Data)) / max(Length(Model), Length(Data))
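A minimal Python sketch of the CC-based version follows. The Length function here is taken to be the complexity estimate from the Batista et al. paper, the square root of the summed squared differences between consecutive points; that choice, and the function names, are assumptions rather than a description of my actual implementation.

```python
import math

def pearson_cc(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def length(series):
    """Complexity estimate: root of summed squared steps between neighbors."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(series, series[1:])))

def complexity_factor(model, data):
    """Ratio min(Length) / max(Length), equal to 1 when complexities match."""
    len_model, len_data = length(model), length(data)
    return min(len_model, len_data) / max(len_model, len_data)

def cid_cc(model, data):
    """CC scaled down when the two series differ in complexity."""
    return pearson_cc(model, data) * complexity_factor(model, data)

# Tiny made-up series: the model is the data with doubled amplitude.
data = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
model = [2.0 * v for v in data]
score_same = cid_cc(data, data)
score_scaled = cid_cc(model, data)
```

A plain CC cannot tell the doubled-amplitude model from a perfect match, while the complexity factor halves the score. The same factor could multiply a DTW score in place of the CC.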

I have tried this on the QBO model with good cross-validation results, featuring up-to-date data from https://www.atmohub.kit.edu/data/qbo.dat

These have similar tidal factor compositions and differ mainly in the LTE modulation and phase delay. As discussed earlier, any anomalies in the QBO behavior are likely the outcome of an erratic periodicity caused by incommensurate annual and draconic cycles and exaggerated by LTE.

from https://gist.github.com/pukpr/e562138af3a9da937a3fb6955685c98f

REFERENCES

[1] Batista, Gustavo EAPA, et al. “CID: an efficient complexity-invariant distance for time series.” Data Mining and Knowledge Discovery 28 (2014): 634-669.
https://link.springer.com/article/10.1007/s10618-013-0312-3

Proof for allowed modes of an ideal QBO

In formal mathematical terms of geometry/topology/homotopy/homology, let’s try proving that a wavenumber=0 cycle of east/west direction inside an equatorial toroidal-shaped waveguide can only be forced by the Z-component of an (x,y,z) vector where x,y lie in the equatorial plane.

To address this question, let’s dissect the components involved and prove within the constraints of geometry, topology, homotopy, and homology, focusing on valid mathematical principles.

Continue reading