Information Theory in Earth Science: Been there, done that

Following up from this post, there is a recent sequence of articles in an AGU journal on Water Resources Research under the heading: “Debates: Does Information Theory Provide a New Paradigm for Earth Science?”

By anticipating all these ideas, you can find plenty of examples and derivations (with many centered on the ideas of Maximum Entropy) in our book Mathematical Geoenergy.

Here is an excerpt from the “Emerging concepts” entry, which indirectly addresses negative entropy:

“While dynamical system theories have a long history in mathematics and physics and diverse applications to the hydrological sciences (e.g., Sangoyomi et al., 1996; Sivakumar, 2000; Rodriguez-Iturbe et al., 1989, 1991), their treatment of information has remained probabilistic akin to what is done in classical thermodynamics and statistics. In fact, the dynamical system theories treated entropy production as exponential uncertainty growth associated with stochastic perturbation of a deterministic system along unstable directions (where neighboring states grow exponentially apart), a notion linked to deterministic chaos. Therefore, while the kinematic geometry of a system was deemed deterministic, entropy (and information) remained inherently probabilistic. This led to the misconception that entropy could only exist in stochastically perturbed systems but not in deterministic systems without such perturbations, thereby violating the physical thermodynamic fact that entropy is being produced in nature irrespective of how we model it.

In that sense, classical dynamical system theories and their treatments of entropy and information were essentially the same as those in classical statistical mechanics. Therefore, the vast literature on dynamical systems, including applications to the Earth sciences, was never able to address information in ways going beyond the classical probabilistic paradigm.”

That is, there are likely many earth system behaviors that are highly ordered, but the complexity and non-linearity of their mechanisms makes them appear stochastic or chaotic (high positive entropy) yet the reality is that they are just a complicated deterministic model (negative entropy). We just aren’t looking hard enough to discover the underlying patterns on most of this stuff.

An excerpt from the Occam’s Razor entry, lifts from my cite of Gell-Mann

“Science and data compression have the same objective: discovery of patterns in (observed) data, in order to describe them in a compact form. In the case of science, we call this process of compression “explaining observed data.” The proposed or resulting compact form is often referred to as “hypothesis,” “theory,” or “law,” which can then be used to predict new observations. There is a strong parallel between the scientific method and the theory behind data compression. The field of algorithmic information theory (AIT) defines the complexity of data as its information content. This is formalized as the size (file length in bits) of its minimal description in the form of the shortest computer program that can produce the data. Although complexity can have many different meanings in different contexts (Gell-Mann, 1995), the AIT definition is particularly useful for quantifying parsimony of models and its role in science. “

Parsimony of models is a measure of negative entropy

Inverting non-autonomous functions

This is an algorithm based on minimum entropy (i.e. negative entropy) considerations which is essentially an offshoot of this paper Entropic Complexity Measured in Context Switching.

The objective is to apply negative entropy to find an optimal solution to a deterministically ordered pattern. To start, let us contrast the behavior of autonomous vs non-autonomous differential equations. One way to think about the distinction is that the transfer function for non-autonomous only depends on the presenting input. Thus, it acts like an op-amp with infinite bandwidth. Or below saturation it gives perfectly linear amplification, so that as shown on the graph to the right, the x-axis input produces an amplified y-axis output as long as the input is within reasonable limits.

Continue reading