This is the second series of posts I’m writing on topics related to what we are covering in our book on heavy-tails (which I discussed in an earlier post). The first was on the catastrophe principle (subexponential distributions) and now we move to one of the most commonly discussed aspects of heavy-tailed distributions: power laws and scale invariance.
Scale invariance in our daily lives
In our daily lives, many things that we come across have a typical size, or “scale,” that we associate with them. For example, the ratio of the maximum to minimum heights and weights that we see in a given day is usually less than 3, so none deviates too much from the population average. In contrast, the ratio of the maximum to minimum income of people we see in a particular day may often be 100 or more! This contrast is a consequence of the fact that light-tailed distributions, such as those of heights and weights, tend to have a “typical scale,” while many heavy-tailed distributions, such as that of incomes, are “scale invariant,” i.e., regardless of the scale on which you look at them, they look the same.
Upon first encounter, scale invariance is a particularly mysterious aspect of heavy-tailed distributions, since it is natural to think of the average of a distribution as a good predictor of what samples will occur. The fact that this is no longer true for scale invariant distributions leads to counter-intuitive properties. For example, consider the old economics joke: “If Bill Gates walks into a bar, on average, everybody in the bar is a millionaire.”
Though initially mysterious, scale invariance is a beautiful and widely-observed phenomenon that has received attention broadly beyond mathematics and statistics, e.g., in physics, computer science, and economics.
For example, fractals give a beautiful view of scale invariance, but more concretely, it is an important phenomenon in both classical and quantum field theory, as well as statistical mechanics. In fact, it is closely tied to the notion of “universality” in physics, which relates to the fact that widely different systems can be described by the same underlying theory.
In the context of network science, scale invariance has received considerable attention. Widely varying networks have been found to have scale invariant degree distributions (and are thus termed “scale-free networks”), and this observation has had a dramatic impact on our understanding of the structural properties of networks. So, clearly, scale invariance is a broad area, but in these posts, we’ll just focus on scale invariance in the context of probability and statistics.
In particular, in this set of posts, I want to talk about the property of “scale invariance” and its connections with “power law” distributions, a.k.a., Pareto distributions. Note that both “scale invariance” and “power law” are often used synonymously with “heavy-tailed,” and thus, it is important to start by pointing out that not all heavy-tailed distributions are scale invariant or power law (though all scale invariant distributions are heavy-tailed, as are all power law distributions).
The main goal of this set of posts is to describe how to generalize and formalize the notions of scale-invariance and power-law as a class of heavy-tailed distributions termed “regularly varying distributions” that is particularly appealing from a mathematical perspective. Further, in order to illustrate the usefulness of this class, I’ll try to highlight a variety of properties and examples of the class.
Scale invariance and power laws
To this point, I have only briefly introduced scale-invariance informally as the property that the distribution looks the “same” regardless of the scale on which it is looked at. A more careful way to say this is that, if the scale (or units) with which the samples from the distribution are measured is changed, then the shape of the distribution is basically unchanged. This is formalized by the following definition.
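Definition: A distribution $F$ with tail $\bar F(x) = 1 - F(x)$ is scale-invariant if there exist an $x_0 > 0$ and a continuous positive function $g$ such that $\bar F(\lambda x) = g(\lambda)\bar F(x)$ for all $x, \lambda$ satisfying $x, \lambda x \ge x_0$.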
To interpret the definition of scale-invariant, one can think of $\lambda$ as the “change of scale” for the units being used. With this interpretation, the definition says that the shape of the distribution remains unchanged, up to a multiplicative factor $g(\lambda)$, if the measurements are scaled by $\lambda$.
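As a hedged, concrete reading of this interpretation, consider changing units from dollars to thousands of dollars. The symbols below are notation assumed here for illustration: $X$ is an income in dollars, $Y$ is the same income in thousands of dollars, $\bar F(x) = P(X > x)$ is the tail, $g$ is the multiplicative factor, and $x_0$ is the threshold past which scale invariance is assumed to hold.

```latex
% Assumed notation: \bar F(x) = P(X > x), and scale invariance means
% \bar F(\lambda x) = g(\lambda) \bar F(x) whenever x, \lambda x \ge x_0.
\begin{align*}
Y &= X / 1000, \\
P(Y > y) &= P(X > 1000\,y) = \bar F(1000\,y) = g(1000)\,\bar F(y),
  \qquad y \ge x_0 .
\end{align*}
```

So, past the threshold, the tail measured in the new units is the old tail multiplied by a constant that depends only on the conversion factor, not on $y$.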
Scale-invariance is a very elegant property, but it is also a fragile one. In particular, it does not hold for most probability distributions. For example, it holds for the Pareto distribution, but does not hold for the Exponential distribution.
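To see that the Pareto is scale-invariant, recall that a Pareto distribution with shape $\alpha > 0$ and minimum value $x_m > 0$ has tail (complementary CDF) $\bar F(x) = (x/x_m)^{-\alpha}$ for $x \ge x_m$. Thus,
$$\bar F(\lambda x) = (\lambda x/x_m)^{-\alpha} = \lambda^{-\alpha}\,\bar F(x)$$
whenever $x, \lambda x \ge x_m$, so the multiplicative factor $g(\lambda) = \lambda^{-\alpha}$ does not depend on $x$.
It is also easy to see that the Exponential distribution is not scale-invariant. Recall that an Exponential distribution with rate $\mu > 0$ has tail $\bar F(x) = e^{-\mu x}$ for $x \ge 0$. Therefore,
$$\bar F(\lambda x) = e^{-\mu \lambda x} = e^{-\mu(\lambda - 1)x}\,\bar F(x).$$
Thus, there is not a choice for the multiplicative factor $g(\lambda)$ that is independent of $x$.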
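The contrast between the two examples can be checked numerically. The following is a minimal sketch, not a definitive implementation: the function names and the parameter values (`alpha=2.0`, `x_m=1.0`, `mu=1.0`, `lam=2.0`) are assumptions chosen for illustration. It computes the ratio of the tail at the rescaled point to the tail at the original point, for several values of $x$.

```python
import math


def pareto_tail(x, alpha=2.0, x_m=1.0):
    """Tail P(X > x) of a Pareto distribution, valid for x >= x_m."""
    return (x / x_m) ** (-alpha)


def exponential_tail(x, mu=1.0):
    """Tail P(X > x) of an Exponential distribution, valid for x >= 0."""
    return math.exp(-mu * x)


def tail_ratios(tail, lam, xs):
    """Return tail(lam * x) / tail(x) for each x in xs."""
    return [tail(lam * x) / tail(x) for x in xs]


lam = 2.0                    # hypothetical change of scale
xs = [1.0, 2.0, 5.0, 10.0]   # hypothetical evaluation points (all >= x_m)

pareto_ratios = tail_ratios(pareto_tail, lam, xs)
exponential_ratios = tail_ratios(exponential_tail, lam, xs)

# For the Pareto tail the ratio is the same at every x (lam ** -alpha);
# for the Exponential tail it keeps shrinking as x grows.
print("Pareto:     ", pareto_ratios)
print("Exponential:", exponential_ratios)
```

With these assumed parameters, every Pareto ratio equals $2^{-2} = 0.25$ up to floating-point error, while the Exponential ratios equal $e^{-x}$ and so change with $x$.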
The previous examples highlight that scale-invariance is an elegant property. But, perhaps surprisingly, it turns out that it is extremely special: distributions with “power-law tails,” i.e., tails that match the Pareto distribution up to a multiplicative constant, are the only scale-invariant distributions. That is, “scale-invariance” can be thought of interchangeably with “power-law.”
This may seem surprising at first, but the proof below highlights the reason for the equivalence pretty clearly.
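Lemma: A distribution $F$ is scale-invariant if and only if it has a power-law tail, i.e., there exist $x_0 > 0$, $c \ge 0$, and $\alpha > 0$ such that $\bar F(x) = c x^{-\alpha}$ for $x \ge x_0$, where $\bar F(x) = 1 - F(x)$.
Proof: A power-law tail is scale-invariant by the same calculation as for the Pareto distribution, with $g(\lambda) = \lambda^{-\alpha}$, so it remains to show the converse. Note that the case where $\bar F$ is identically zero over $[x_0, \infty)$ trivially satisfies the conditions of the lemma (this corresponds to the case $c = 0$).
Excluding the above trivial case from consideration hereon, it is easy to see that $\bar F(x)$ must be non-zero for all $x \ge x_0$. Indeed, if $\bar F(x') = 0$ for some $x' \ge x_0$, then for any $x \ge x_0$,
$$\bar F(x) = g(x/x')\,\bar F(x') = 0,$$
which puts us back in the trivial case. Next, fix $\lambda_1, \lambda_2 > 0$ and pick $x$ large enough that $x$, $\lambda_2 x$, and $\lambda_1 \lambda_2 x$ are all at least $x_0$. Applying the definition of scale-invariance in two ways gives
$$g(\lambda_1 \lambda_2)\,\bar F(x) = \bar F(\lambda_1 \lambda_2 x) = g(\lambda_1)\,\bar F(\lambda_2 x) = g(\lambda_1)\,g(\lambda_2)\,\bar F(x),$$
and since $\bar F(x) \neq 0$, it follows that $g(\lambda_1 \lambda_2) = g(\lambda_1)\,g(\lambda_2)$.
It is well known that the only continuous non-zero functions satisfying the above condition are $g(\lambda) = \lambda^{a}$ for some $a \in \mathbb{R}$. Noting that $\bar F(x) = \bar F(x_0)\,g(x/x_0)$ for all $x \ge x_0$, we conclude that $a < 0$ (since $\bar F$ must be monotonically decreasing, with $\lim_{x \to \infty} \bar F(x) = 0$). Therefore, $\bar F(x) = c x^{-\alpha}$ for $x \ge x_0$, for some $c, \alpha > 0$ (namely $\alpha = -a$ and $c = \bar F(x_0)\,x_0^{\alpha}$).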
We have just seen that all scale-invariant distributions are power-law distributions, a.k.a. distributions with tails matching a Pareto distribution up to a multiplicative constant. This makes scale-invariance a very fragile property that one should not expect to see in reality and, in the strictest sense, that is true. It is quite unusual for the distribution of an observed phenomenon to exactly match a power-law distribution, and thus be scale-invariant. Instead, what tends to be observed in practice is that the body of a distribution is not scale invariant, and the tail of a distribution is only approximately scale-invariant.
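To make "only approximately scale-invariant in the tail" concrete, here is a minimal sketch using a shifted Pareto (Lomax) tail, $P(X > x) = (1 + x)^{-\alpha}$. This distribution is not discussed above; it is an assumption chosen for illustration, as are the function name and the parameter values. The sketch compares the tail ratio at scale `lam` with the pure power-law value `lam ** -alpha`.

```python
def lomax_tail(x, alpha=2.0):
    """Tail P(X > x) of a Lomax (shifted Pareto) distribution, x >= 0."""
    return (1.0 + x) ** (-alpha)


lam, alpha = 2.0, 2.0                  # hypothetical scale change and tail index
xs = [1.0, 10.0, 100.0, 1000.0, 1e6]   # points moving from the body into the tail

# Ratio of the rescaled tail to the original tail at each x.
ratios = [lomax_tail(lam * x, alpha) / lomax_tail(x, alpha) for x in xs]

# Distance from the exact power-law factor lam ** -alpha.
gaps = [abs(r - lam ** (-alpha)) for r in ratios]

for x, r, gap in zip(xs, ratios, gaps):
    print(f"x = {x:>9.0f}  ratio = {r:.6f}  gap = {gap:.2e}")
```

Under these assumptions the ratio depends on $x$ in the body of the distribution, so the definition of scale-invariance fails there, but the gap to the power-law factor shrinks as $x$ grows.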
In the next post in this series, I’ll talk about how to formalize a notion of approximate scale invariance.