# Rainfall Variability Solved

I will preface this entry by suggesting that sometimes the statistics are so simple that they fall in your lap.

As a premise, we want to consider whether a simple stochastic model can generate statistical patterns of rainfall events.

We assume a maximum entropy probability distribution function which assumes an average energy of rainfall rate for a storm i:

$$P(E \mid E_i) = e^{\frac{-E}{E_i}}$$

The rationale for this is that the rainfall’s energy is proportional to the rate of the rainfall, since that amount of moisture had to be held aloft by gravity.

$$Rate_i \sim E_i$$

Yet we know that the Ei can vary from storm to storm, so we leave it as a conditional, and then set that as a maximal entropy estimator as well

$$p(E_i) = \alpha e^{- \alpha E_i}$$

then we integrate over the conditional’s range.

$$P(E) = \int_0^\infty P(E \mid E_i) p(E_i) \,dE_i$$

This results again in an integration formula lookup, which leads to the following solution:

$$P(E) = 2 \sqrt{\frac{E}{\bar{E}}} K_1(2 \sqrt{\frac{E}{\bar{E}}})$$

where K1 is the modified BesselK function of the second kind, in this case of order 1, which is found in any spreadsheet program (such as Excel).

Note that this is the same function that we used last week for determining wind speed distributions and that we used in TOC to determine the distribution of terrain slopes for the entire planet as well!  Again this reduces to the simplest model one can imagine — a function with only one adjustable parameter, that of the mean energy of a sampled rainfall rate measurement. In other words the expression contains a single number that represents an average rainfall rate.  That’s all we need, as maximum entropy fills in the rest.

So let us see how it works.  The analysis was compared against this recent paper:
“Can a simple stochastic model generate rich patterns of rainfall events?” by S.-M.Papalexiou, D.Koutsoyiannis, and A.Montanari [2011] and graphed as shown below in Figure 1. The green points constitute the BesselK fit which lies right on top of the blue empirical data set.

 Fig. 1: Cumulative rainfall distribution statistically gathered from several storm events.The BesselK fit assumes a double stochastic process, a MaxEnt within a storm and a MaxEnt across the storms.

From the table below reproduced from the paper, we can see that the mean value used in the BesselK distribution was exactly the same as that from the standard statistical moment calculation

Event# 1 2 3 4 5 6 7 All
SampleSize 9697 4379 4211 3539 3345 3331 1034 29536
Mean(mm/h) 3.89 0.5 0.38 1.14 3.03 2.74 2.7 2.29
StandardDeviation(mm/h) 6.16 0.97 0.55 1.19 3.39 2.2 2 4.11
Skewness 4.84 9.23 5.01 2.07 3.95 1.47 0.52 6.54
Kurtosis 47.12 110.24 37.38 5.52 27.34 2.91 -0.59 91
HurstExponent 0.94 0.79 0.89 0.94 0.89 0.87 0.97 0.89

In contrast, Papalexio try to apply Hurst-Kolmogorov statistics to the problem in search of a power law solution. They claim that the power law tail of -3 is somehow significant. I would suggest instead that the slight deviation in the tail region is likely caused by insufficient sampling in the data set. A slight divergence starts to occur at the 1 part in 5,000 level and there are only 30,000 points in the merged data set, indicating that statistical fluctuations could account for the difference. See Figure 2 below which synthesizes a BesselK distribution from the same sample size, and can clearly duplicate the deviation in the tail.

 Fig. 2: Monte Carlo sampling from a synthetic BesselK distribution reveals the same departure at low probability events.Fluctuations of 1 in 30,000 are clearly seen at this level, so care must be taken to not read fat-tail behavior that may not exist.

Bottomline is that this simple stochastic model beats the other simple stochastic model, which doesn’t seem as simple anymore once we apply an elementary uncertainty analysis to the data.

$$\int_0^\infty 2 K_0(2 \sqrt{y})\frac{1}{y} exp(-\frac{x}{y}) dy = G_{0,3}^{3,0}(x | 0,0,0)$$