Sorry to have to point this out, but it’s not my fault that geophysicists and climatologists can’t perform controlled experiments to test their hypotheses. It’s not their fault either. Nature made gravitational forces so weak and planetary objects so massive that no one can scale the effect down to laboratory size for a carefully controlled experiment. One can always create roughly equivalent emulations, such as a magnetic field experiment (described in the previous blog post), and validate a hypothesized behavior in a controlled lab setting. Yet I suspect that this would not get sufficient buy-in, as it’s not considered the real thing.
And that’s the dilemma. Just as analog emulators will not be trusted by geophysicists and climatologists, scientists from other disciplines will remain skeptical of untestable claims made by earth scientists. If nothing definitive comes out of a thought experiment that others can’t reproduce in a lab, they remain suspicious, as per their education and training.
It should therefore work both ways. As featured in the previous blog post, the model of the Chandler wobble forced by lunar torque needs to be treated fairly: either clearly debunked or considered as an alternative to the hazy consensus. ChatGPT remains open about the model, not the least bit swayed by colleagues or tribal bias. Since the Chandler wobble period predicted by the lunar nodal model (432.7 days) is so close to the cited value of 433 days, the bottom line is that it should be difficult to ignore.
There are other indicators in the observational data that further substantiate this; see Chandler Wobble Forcing. It also makes sense in the context of the annual wobble.
As it stands, the lack of an experiment means a more equal footing for the alternatives, as they are all under equal amounts of suspicion.
Same goes for QBO. No controlled experiment is possible to test out the consensus QBO models, despite the fact that the Plumb and McEwan experiment is claimed to do just that. Sorry, but that experiment is not even close to the topology of a rotating sphere with a radial gravitational force operating on a gas. It also never predicted the QBO period. In contrast, the QBO period predicted by the lunar nodal model (28.4 months) is also too close to the cited value of 28 to 29 months to ignore. This also makes sense in the context of the semi-annual oscillation (SAO) located above the QBO.
Both the Chandler wobble and the QBO have the symmetry of a global wavenumber=0 phenomenon, so only nodal cycles are allowed, both lunar and solar.
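The two nodal-model numbers quoted above (28.4 months and 432.7 days) can be checked with a few lines of arithmetic. This is only a sketch under a stated assumption: that each predicted period is the alias of the lunar draconic (nodal) cycle against an annual impulse, taken here as the fractional part of the number of cycles per year, with the full draconic month used for the QBO and the draconic fortnight for the Chandler wobble. The function name `aliased_period_years` is made up for illustration, and the details of the actual derivation are in the earlier posts, not in this one.

```python
# Mean tropical year and lunar draconic (nodal) month, both in days.
TROPICAL_YEAR = 365.2422
DRACONIC_MONTH = 27.2122


def aliased_period_years(period_days, year_days=TROPICAL_YEAR):
    """Return the period (in years) of a cycle aliased against an annual impulse.

    Assumption for this sketch: the alias frequency is the fractional part
    of the number of cycles completed per year.
    """
    cycles_per_year = year_days / period_days
    alias_frequency = cycles_per_year % 1.0  # cycles per year, between 0 and 1
    return 1.0 / alias_frequency


# QBO: full draconic month aliased against the annual cycle.
qbo_months = 12.0 * aliased_period_years(DRACONIC_MONTH)

# Chandler wobble: draconic fortnight (half the month) aliased the same way.
chandler_days = TROPICAL_YEAR * aliased_period_years(DRACONIC_MONTH / 2.0)

print(f"QBO alias period:      {qbo_months:.2f} months")
print(f"Chandler alias period: {chandler_days:.2f} days")
```

Both results land within a fraction of a percent of the cited observational values, which is the point being argued in the text.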
Next, on to ENSO. As with LOD modeling, this is not wavenumber=0 symmetry, as it must correspond to the longitude of a specific region. No controlled experiment is possible to test out the currently accepted models, premised as being triggered by wind shifts (an iffy cause vs. effect in any case). The mean ENSO period predicted by the tidal LOD-calibrated model (3.80 years modulated by 18.6 years) is too close to the cited value of 3.8 years with ~200 years of paleo and direct measurement to ignore.
In BLUE below is the LOD-calibrated tidal forcing, with linear amplification.
In BLUE again below is a non-linear modulation of the tidal forcing according to the solution of Laplace’s Tidal Equations (LTE), trained on an early historical interval. This is something that a neural network should be able to do, as it excels at fitting non-linear mappings that have a simple (i.e. low complexity) encoding. In this case it may be able to construct a Taylor series expansion of a sinusoidal modulating function.
The neural network’s ability to accurately represent a behavior is explained as a simplicity bias, a confounding aspect of machine learning tools such as ChatGPT and neural networks. The YouTube video below explains the counter-intuitive notion of how a NN with a deep set of possibilities tends to find the simplest solution, and does so without over-fitting the final mapping.
Deep neural networks are thus claimed to have a built-in Occam’s Razor propensity, finding the most parsimonious input-output mappings when applied to training data. This is spot on with what I am doing with the LTE mapping, except that I bypass the NN and use a nonlinear sinusoidal modulation optimally fit on training data by a random search function.
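As a rough illustration of what fitting a sinusoidal modulation by random search on a training interval can look like, here is a minimal sketch. Everything in it is a stand-in: `forcing` is a hypothetical two-sinusoid signal that only borrows the 3.8-year and 18.6-year periods from the text (it is not the LOD-calibrated tidal forcing), the target series is synthetic and generated from known parameters (it is not ENSO data), and the function names and search ranges are assumptions made for the example rather than the actual fitting code.

```python
import math
import random


def forcing(t):
    # Hypothetical stand-in for the tidal forcing, t in years (not real LOD data).
    return math.sin(2 * math.pi * t / 3.8) + 0.5 * math.sin(2 * math.pi * t / 18.6)


def modulate(f, amp, phase):
    # Non-linear sinusoidal modulation applied to a forcing value f.
    return math.sin(amp * f + phase)


def mse(params, f_vals, target):
    # Mean squared error between the modulated forcing and the target series.
    amp, phase = params
    total = sum((modulate(f, amp, phase) - y) ** 2 for f, y in zip(f_vals, target))
    return total / len(target)


def random_search(f_vals, target, n_trials=5000, seed=0):
    # Plain random search: draw (amp, phase) uniformly and keep the best draw.
    rng = random.Random(seed)
    best, best_err = None, float("inf")
    for _ in range(n_trials):
        cand = (rng.uniform(0.0, 10.0), rng.uniform(-math.pi, math.pi))
        err = mse(cand, f_vals, target)
        if err < best_err:
            best, best_err = cand, err
    return best, best_err


# Synthetic series: 100 "years" sampled every 0.1 year, made from known parameters.
TRUE_PARAMS = (4.0, 0.7)
times = [0.1 * i for i in range(1000)]
f_all = [forcing(t) for t in times]
y_all = [modulate(f, *TRUE_PARAMS) for f in f_all]

# Train on the early interval only, then score the held-out later interval.
n_train = 400
best, train_err = random_search(f_all[:n_train], y_all[:n_train])
test_err = mse(best, f_all[n_train:], y_all[n_train:])

print("best (amp, phase):", best)
print("training MSE:", train_err)
print("held-out MSE:", test_err)
```

With only two free parameters, a fit that matches the training interval has little room to do anything else on the held-out interval, which is the low-complexity property the simplicity-bias argument leans on.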
I am tempted to try a NN on the ENSO training set as an experiment and see what it finds.
April 2, 2023
2 thoughts on “Controlled Experiments”
What is simplicity bias in the context of a neural network selecting solutions?
Simplicity bias, in the context of a neural network selecting solutions, refers to the preference of a neural network to learn simpler patterns or representations before more complex ones. This tendency can arise due to the architecture, training methods, or other characteristics of the network.
In the early stages of training, neural networks typically develop simple and general features, such as edge or color detectors. As training progresses, these features become more specialized and complex, allowing the network to learn and recognize more intricate patterns. This process is observed in various types of neural networks, including convolutional neural networks (CNNs) for image recognition and transformers for natural language processing.
Simplicity bias has both advantages and disadvantages:
Advantages:
Generalization: By learning simple features and patterns first, neural networks can generalize better to unseen data. This is because simple features are more likely to be shared across different examples in the training set.
Efficiency: Learning simpler patterns requires fewer computational resources and may lead to faster convergence during training.
Disadvantages:
Overfitting: If a network is biased towards simpler solutions, it might not be able to learn more complex relationships that are crucial for solving specific tasks. This can lead to overfitting, where the network performs well on the training data but poorly on unseen data.
Limited expressiveness: A network with a strong simplicity bias might not be able to represent more complex patterns or relationships, limiting its ability to solve certain tasks.
Understanding and controlling the simplicity bias in neural networks can help improve their performance on various tasks, making them more effective at generalizing to new data and learning complex relationships. Researchers are actively studying the factors that influence simplicity bias in neural networks and exploring techniques to control and exploit this bias for better performance.
Yet researchers such as Ard Louis claim that simplicity bias may not lead to overfitting. Why is that?
Ard Louis and his collaborators argue that simplicity bias in neural networks may not lead to overfitting because it can actually help the networks learn generalizable patterns from the data. In their research, they show that the inductive biases of neural networks, including simplicity bias, can encourage the learning of simpler functions, which often leads to better generalization.
The reasoning behind this claim is that simpler patterns are more likely to be shared across different examples in the training set, and thus, by learning these simple patterns, the network is more likely to generalize well to unseen data. This is related to the concept of Occam’s Razor, which states that, among competing hypotheses, the simplest one is often the best.
However, it is important to note that simplicity bias itself is not a guarantee against overfitting. There are cases where a model can still overfit if it relies too heavily on simple patterns and fails to capture more complex relationships that are essential for solving specific tasks. The key is to strike a balance between learning simple patterns for generalization and learning complex patterns for accurate task performance.
In summary, simplicity bias in neural networks may not always lead to overfitting, as it can help the network learn generalizable patterns. However, the relationship between simplicity bias and overfitting is complex and depends on the specific network architecture, training data, and task at hand. It is crucial to monitor and control the simplicity bias to ensure the right balance between generalization and accuracy in the trained model.
In the context of simplicity bias, why is a gradient descent not as critical and performing random selection may be just as effective?
The idea that random selection may be just as effective as gradient descent in the context of simplicity bias comes from the research by Ard Louis and his collaborators. They argue that, under certain conditions, simpler models can be found just by randomly sampling the weight space of a neural network rather than using gradient-based optimization techniques such as gradient descent.
This idea is grounded in the observation that, for many real-world problems, the set of functions that can be represented by a neural network is heavily biased towards simpler functions. In other words, the space of possible functions that a neural network can learn contains more simple functions than complex ones. As a result, randomly sampling the weight space of a neural network has a higher probability of finding simpler models, which can generalize better to unseen data.
In their work, they demonstrate that random search can indeed find simple models with good generalization performance for some problems. However, it is important to note that this result is not universal, and there are cases where gradient-based optimization is still necessary to find good models efficiently.
Gradient descent and other gradient-based optimization techniques have been shown to work well in practice for training neural networks on a wide range of tasks. These methods take advantage of the structure in the loss landscape and can efficiently find good models by iteratively updating the weights of the network.
In summary, while random selection may be effective in some cases due to simplicity bias, gradient descent and other gradient-based optimization techniques are still critical for efficiently training neural networks on many tasks. The relative effectiveness of random selection versus gradient-based optimization depends on the specific problem and the characteristics of the function space that the neural network can represent.
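The random-sampling argument in the reply above can be tried out on a toy problem. The sketch below is an assumption-laden illustration, not a reproduction of the published experiments: it uses a tiny one-hidden-layer tanh network on three Boolean inputs (so only 256 input-output functions are possible), draws all weights and biases from a unit Gaussian, and counts how often each function shows up. The network size, the weight distribution, and the name `random_function` are all choices made for this example.

```python
import math
import random
from collections import Counter
from itertools import product

# All 8 input points of a 3-bit Boolean function; 2**8 = 256 functions exist.
INPUTS = list(product([0, 1], repeat=3))


def random_function(rng, hidden=8):
    """Sample random weights and return the Boolean function they encode.

    The function is returned as an 8-character string of '0'/'1', one
    character per input point, obtained by thresholding the output at zero.
    """
    w1 = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(hidden)]
    b1 = [rng.gauss(0.0, 1.0) for _ in range(hidden)]
    w2 = [rng.gauss(0.0, 1.0) for _ in range(hidden)]
    b2 = rng.gauss(0.0, 1.0)
    bits = []
    for x in INPUTS:
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(w1, b1)]
        out = sum(w * hi for w, hi in zip(w2, h)) + b2
        bits.append("1" if out > 0 else "0")
    return "".join(bits)


n_samples = 10000
rng = random.Random(0)
counts = Counter(random_function(rng) for _ in range(n_samples))

uniform_count = n_samples / 256  # expected count if all functions were equally likely
print("distinct functions seen:", len(counts), "of 256")
print("count expected under a uniform distribution:", uniform_count)
for func, count in counts.most_common(5):
    print(func, count)
```

If the sampling were unbiased, every function would appear about 39 times in 10000 draws. In runs of this kind a handful of functions usually dominate instead, typically very simple ones such as the constant outputs, which is the sense in which random sampling of weights is said to favor simple functions.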