The behavior of complex systems, particularly in fluid dynamics, is traditionally described by high-dimensional systems of equations such as the Navier-Stokes equations. While these models are useful in practice as they stand, they can obscure the simpler mechanisms at play. Notably, ocean modeling already has dimensionality reduction built in, for example through Laplace’s Tidal Equations (LTE), a reduced-order formulation of the Navier-Stokes equations. Furthermore, the topological containment of phenomena like ENSO and QBO within the equatorial toroid, and the ability to further reduce LTE in this confined topology, as described in our text Mathematical Geoenergy, underscore the inherently low-dimensional nature of the dominant geophysical processes. The concept of hidden latent manifolds posits that the observed dynamics of a system do not occupy the entire high-dimensional phase space, but instead evolve on a much lower-dimensional geometric structure (a manifold layer) where the system’s effective degrees of freedom reside. This may also help explain the seeming paradox of the inverse energy cascade, whereby fluid structures appear to stay ordered as the waves become progressively larger, with nonlinear interactions transferring energy up from smaller scales.
Discovering these latent structures from noisy observational data is a central challenge in modern fluid dynamics. Enter the Sparse Identification of Nonlinear Dynamics (SINDy) algorithm, pioneered by Brunton et al. SINDy is an equation-discovery framework designed to identify a sparse set of nonlinear terms that describe the evolution of the system on this low-dimensional manifold. Instead of retaining every term in a large library of candidate basis functions, SINDy uses sparsity-promoting regression (sequentially thresholded least squares in the original formulation, or a penalized method such as LASSO) to winnow the candidates down to the most parsimonious, yet physically meaningful, governing differential equations. The result is a simple, interpretable model that captures the essential physics: the fingerprint of the latent manifold. SINDy is not difficult to apply, since a decent Python library is available, and I have evaluated it as described here.
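To make the sparse-regression idea concrete, here is a minimal NumPy sketch of sequentially thresholded least squares, one common sparsity-promoting regression used with SINDy (rather than LASSO). It does not use the Python library mentioned above, and the test system, the candidate library, and the threshold of 0.05 are assumptions chosen purely for illustration.

```python
import numpy as np

# Synthetic data from a known system (an assumption for illustration):
#   dx/dt = -0.1 x + 2 y,   dy/dt = -2 x - 0.1 y
t = np.arange(0.0, 20.0, 0.01)
x = np.exp(-0.1 * t) * np.cos(2.0 * t)
y = -np.exp(-0.1 * t) * np.sin(2.0 * t)
states = np.column_stack([x, y])

# Estimate the derivatives numerically, as one would with measured data.
derivs = np.gradient(states, t, axis=0, edge_order=2)

# Candidate library: constant, linear, and quadratic terms.
names = ["1", "x", "y", "x^2", "x*y", "y^2"]
theta = np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])


def stlsq(theta, derivs, threshold=0.05, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit."""
    xi = np.linalg.lstsq(theta, derivs, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        for k in range(derivs.shape[1]):
            keep = ~small[:, k]
            if keep.any():
                # Refit each equation using only its surviving library terms.
                xi[keep, k] = np.linalg.lstsq(
                    theta[:, keep], derivs[:, k], rcond=None
                )[0]
    return xi


xi = stlsq(theta, derivs)
for k, lhs in enumerate(["dx/dt", "dy/dt"]):
    terms = [
        f"{xi[j, k]:+.3f} {names[j]}"
        for j in range(len(names))
        if xi[j, k] != 0.0
    ]
    print(lhs, "=", " ".join(terms))
```

On clean data like this, only the four linear terms should survive the thresholding. With noisy observations the threshold and the derivative estimate both need far more care.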
Applying this methodology to Earth system dynamics, particularly the seemingly noisy, erratic, and perhaps chaotic time series of sea-level variation and climate indices, reveals profound simplicity beneath the complexity. The high-dimensional output of climate models or raw observations can be projected onto a model framework driven by remarkably few physical processes. Specifically, as shown in analysis targeting the structure of these time series, the dynamics can be modeled, and the models cross-validated, using the interaction of two fundamental drivers: a forced gravitational tide and an annual impulse.
The forced gravitational tide accounts for the regular, high-frequency, and predictable components of the dynamics. The annual impulse, meanwhile, serves as the seasonal forcing function, representing the integrated effect of large-scale thermal and atmospheric cycles that reset annually. The success of this sparse, two-component model, in which the interaction of these two elements is sufficient to capture the observed dynamics, serves as the ultimate validation of the latent manifold concept. The gravitational tides, together with the integrated annual impulse, are the discovered low-dimensional degrees of freedom. The ability of their coupled solution to cross-validate against the observed, high-fidelity dynamics confirms that the complex, high-dimensional reality of sea-level and climate variability emerges from this simple, sparse, and interpretable set of latent governing principles. This provides a powerful, physics-constrained approach to prediction and understanding, moving beyond descriptive models toward true dynamical discovery.
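As a rough illustration of how two such drivers might interact, the sketch below multiplies a tidal signal by a narrow once-per-year pulse and accumulates the result with a leaky integrator. This is only one possible reading of "tide plus integrated annual impulse", not the model behind the results linked here. The lunar periods are the standard draconic and anomalistic months, while the amplitudes, pulse width, decay factor, and all variable names are hypothetical.

```python
import numpy as np

DAYS_PER_YEAR = 365.2422
t = np.arange(40 * 365, dtype=float)  # daily time axis, in days

# Tidal factor: two lunar monthly cycles (periods in days).
# The amplitudes are illustrative only.
lunar_periods = {"draconic": 27.2122, "anomalistic": 27.5545}
lunar_amps = {"draconic": 1.0, "anomalistic": 0.5}
tide = sum(
    lunar_amps[name] * np.cos(2.0 * np.pi * t / period)
    for name, period in lunar_periods.items()
)

# Annual impulse: a narrow Gaussian pulse recurring once per year
# (the 5-day width is an assumption).
year_phase = (t / DAYS_PER_YEAR) % 1.0
days_from_pulse = np.minimum(year_phase, 1.0 - year_phase) * DAYS_PER_YEAR
impulse = np.exp(-0.5 * (days_from_pulse / 5.0) ** 2)

# The impulse samples the tide once per year ...
sampled = tide * impulse

# ... and a leaky integrator accumulates those samples (decay is assumed).
decay = 0.999
forcing = np.empty_like(sampled)
acc = 0.0
for i, s in enumerate(sampled):
    acc = decay * acc + s
    forcing[i] = acc

# Coarse 365-day block means expose the slow, year-to-year variation.
n_years = t.size // 365
annual = forcing[: n_years * 365].reshape(n_years, 365).mean(axis=1)
print(np.round(annual[:10], 2))
```

Because the lunar months are not commensurate with the year, each annual pulse catches the tide at a different phase. That is why a slowly varying signal can emerge from two fast, regular inputs.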
An entire set of cross-validated models is available for evaluation here: https://pukpr.github.io/examples/mlr/.
This is a mix of climate indices (the first 20) and numbered coastal sea-level stations obtained from https://psmsl.org/.
https://pukpr.github.io/examples/map_index.html
- nino34 — NINO34 (PACIFIC)
- nino4 — NINO4 (PACIFIC)
- amo — AMO (ATLANTIC)
- ao — AO (ARCTIC)
- denison — Ft Denison (PACIFIC)
- iod — IOD (INDIAN)
- iodw — IOD West (INDIAN)
- iode — IOD East (INDIAN)
- nao — NAO (ATLANTIC)
- tna — TNA Tropical N. Atlantic (ATLANTIC)
- tsa — TSA Tropical S. Atlantic (ATLANTIC)
- qbo30 — QBO 30 Equatorial (WORLD)
- darwin — Darwin SOI (PACIFIC)
- emi — EMI ENSO Modoki Index (PACIFIC)
- ic3tsfc — ic3tsfc (Reconstruction) (PACIFIC)
- m6 — M6, Atlantic Nino (ATLANTIC)
- m4 — M4, N. Pacific Gyre Oscillation (PACIFIC)
- pdo — PDO (PACIFIC)
- nino3 — NINO3 (PACIFIC)
- nino12 — NINO12 (PACIFIC)
- 1 — BREST (FRANCE)
- 10 — SAN FRANCISCO (UNITED STATES)
- 11 — WARNEMUNDE 2 (GERMANY)
- 14 — HELSINKI (FINLAND)
- 41 — POTI (GEORGIA)
- 65 — SYDNEY, FORT DENISON (AUSTRALIA)
- 76 — AARHUS (DENMARK)
- 78 — STOCKHOLM (SWEDEN)
- 111 — FREMANTLE (AUSTRALIA)
- 127 — SEATTLE (UNITED STATES)
- 155 — HONOLULU (UNITED STATES)
- 161 — GALVESTON II, PIER 21, TX (UNITED STATES)
- 163 — BALBOA (PANAMA)
- 183 — PORTLAND (MAINE) (UNITED STATES)
- 196 — SYDNEY, FORT DENISON 2 (AUSTRALIA)
- 202 — NEWLYN (UNITED KINGDOM)
- 225 — KETCHIKAN (UNITED STATES)
- 229 — KEMI (FINLAND)
- 234 — CHARLESTON I (UNITED STATES)
- 245 — LOS ANGELES (UNITED STATES)
- 246 — PENSACOLA (UNITED STATES)
Crucially, this analysis does not use the SINDy algorithm, but rather its much more basic predecessor, multiple linear regression (MLR), which I anticipate adapting to SINDy as the model is further refined. Part of the rationale is to maintain a deep understanding of the mathematics, and to provide cross-checking that avoids the perils of over-fitting, the bane of neural network models.
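The cross-checking idea can be sketched with plain least squares: fit the regression on one interval of the record, then score it on an interval the fit never saw. The example below is a minimal sketch on synthetic data, not the code behind the linked models. The regressor periods, coefficients, noise level, and the 70/30 split are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(0.0, 100.0, 1.0 / 12.0)  # monthly samples, in years
periods = [3.8, 6.5, 18.6]             # hypothetical regressor periods (years)


def design_matrix(t, periods):
    """Columns: a constant, then a sine/cosine pair for each period."""
    cols = [np.ones_like(t)]
    for p in periods:
        cols.append(np.sin(2.0 * np.pi * t / p))
        cols.append(np.cos(2.0 * np.pi * t / p))
    return np.column_stack(cols)


A = design_matrix(t, periods)

# Synthetic "observations": known coefficients plus noise.
true_coef = np.array([0.1, 1.0, -0.5, 0.3, 0.8, -0.4, 0.2])
y = A @ true_coef + 0.3 * rng.standard_normal(t.size)

# Fit on the first 70 years only; hold out the rest for validation.
train = t < 70.0
test = ~train
coef = np.linalg.lstsq(A[train], y[train], rcond=None)[0]
pred = A @ coef

r_train = np.corrcoef(pred[train], y[train])[0, 1]
r_test = np.corrcoef(pred[test], y[test])[0, 1]
print(f"training correlation: {r_train:.3f}")
print(f"held-out correlation: {r_test:.3f}")
```

A held-out correlation close to the training correlation suggests the regressors capture real structure. A large drop between the two is the usual sign of over-fitting.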
Also read this introductory-level piece on tidal modeling, which may form the foundation for the latent manifold: https://pukpr.github.io/examples/warne_intro.html. The coastal station at Warnemünde in Germany, on the Baltic Sea, provided a long unbroken interval of sea-level readings, which was used to calibrate the hidden latent manifold that in turn served as a starting point for all the other models. Not every model works as well as the majority (see Pensacola for a sea-level site, and IOD or TNA for climate indices), but these are equally valuable for understanding limitations and for providing a sanity check against an accidental degeneracy in the model fitting process. The use of SINDy in the future will provide additional functionality, such as regularization that will find an optimal common-mode latent layer.

































