This is part II in a series on residual life, hazard rates, and long-tailed distributions. If you haven’t read part I yet, read that first! The previous post in this series highlighted that one must be careful in connecting “heavy-tailed” with the concepts of “increasing mean residual life” and “decreasing hazard rate.”
In particular, there are many examples of light-tailed distributions that are IMRL and DHR. However, if we think again about the informal examples that we discussed in the previous post, it becomes clear that IMRL and DHR are too “precise” to capture the phenomena that we were describing. For example, if we return to the case of waiting for a response to an email, it is not that we expect our remaining waiting time to be monotonically increasing as we wait. In fact, we are very likely to get a response quickly, so the expected waiting time should drop initially (and the hazard rate should increase initially). It is only after we have waited a “long” time already, in this case a few days, that we expect to see a dramatic increase in our residual life. Further, in the extreme, if we have not received a response in a month, we can reasonably expect that we may never receive a response, and so the mean residual life is, in some sense, growing unboundedly, or equivalently, the hazard rate is decreasing to zero. The example of waiting for a subway train highlights the same issues. Initially, we expect that the mean residual life should decrease, because if the train is on schedule, things are very predictable. However, once we have waited a long time beyond when the train was supposed to arrive, it likely means something went wrong, and could mean the train has had some sort of mechanical problem and will never arrive.
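To make the "increase first, then decay to zero" shape of the hazard rate concrete, here is a minimal sketch using a lognormal waiting time. The lognormal is only an illustrative choice on my part (the email and subway examples above are not claimed to follow it), and the function name and the parameters mu and sigma are hypothetical. The hazard rate is computed directly from its definition, density divided by survival probability.

```python
import math


def lognormal_hazard(t, mu=0.0, sigma=1.0):
    """Hazard rate h(t) = f(t) / S(t) for a lognormal(mu, sigma) waiting time.

    Illustrative assumption only: the lognormal is one distribution whose
    hazard rate rises at first and then decays toward zero.
    """
    z = (math.log(t) - mu) / sigma
    # Lognormal density f(t).
    density = math.exp(-0.5 * z * z) / (t * sigma * math.sqrt(2.0 * math.pi))
    # Survival function S(t) = P(X > t), written with erfc for accuracy in the tail.
    survival = 0.5 * math.erfc(z / math.sqrt(2.0))
    return density / survival


# Early on the hazard climbs (a quick reply is likely); after a long wait it
# falls, so the chance of a reply "in the next instant" keeps shrinking.
for t in (0.1, 0.5, 1.0, 10.0, 100.0):
    print(f"t = {t:6.1f}   hazard = {lognormal_hazard(t):.4f}")
```

With these parameters the printed hazard goes up between t = 0.1 and t = 0.5 and then declines for the larger values of t, which is exactly the non-monotone behavior that neither IMRL nor DHR allows.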
This is the third series of posts I’m writing on topics related to what we are covering in our book on heavy-tails (which I discussed in an earlier post). The first two were on the catastrophe principle (subexponential distributions) and power laws (regularly varying distributions). This time I’ll focus on connections between residual life, hazard rate, and long-tailed distributions.
Residual life in our daily lives
Over the course of our days we spend a lot of our time waiting for things — we wait for a table at restaurants, we wait for a subway train to show up, we wait for people to respond to our emails, etc. In such scenarios, we hold on to the belief that, as we wait, the likely amount of remaining time we will need to wait is getting smaller. For example, we believe that, if we have waited ten minutes for a table at a restaurant, the expected time we have left to wait should be smaller than it was when we arrived and that, if we have waited five minutes for the subway, then our expected remaining wait time should be less than it was when we arrived.
In many cases this belief holds true. For example, as other diners finish eating, our expected waiting time for a table at a restaurant drops. Similarly, subway trains follow a schedule with (nearly) deterministic gaps between trains and thus, as long as the train is on schedule, our expected remaining waiting time decreases as we wait. However, a startling aspect of heavy-tailed distributions is that this is not always true. For example, if you have waited a very long time past the scheduled arrival time for a subway train, then it is very likely that there was some failure and the train may take an extremely long time to arrive, and so your expected remaining waiting time has actually increased while you waited. Similarly, if you are waiting for a response to an email and have not heard for a few days, it is likely to be a very long time until a response comes (if it ever does).
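The contrast can be sketched with the mean residual life, E[X - t | X > t], of three textbook distributions. This is an illustration under my own assumptions rather than a model of real restaurants or subways: a uniform wait stands in for the "on schedule" case, an exponential wait is the memoryless benchmark, and a Pareto wait stands in for the heavy-tailed case. The function names and parameters are hypothetical, and each closed form follows from integrating the survival function from t to infinity and dividing by the survival probability at t.

```python
def mrl_uniform(t, b):
    """Mean residual life of Uniform(0, b) after waiting t: it shrinks as we wait."""
    if not 0 <= t < b:
        raise ValueError("need 0 <= t < b")
    return (b - t) / 2.0


def mrl_exponential(t, rate):
    """Mean residual life of Exponential(rate): constant, whatever t is."""
    if rate <= 0:
        raise ValueError("need rate > 0")
    return 1.0 / rate


def mrl_pareto(t, alpha, x_min=1.0):
    """Mean residual life of Pareto(alpha, x_min) for t >= x_min: grows linearly in t."""
    if alpha <= 1:
        raise ValueError("the mean is infinite unless alpha > 1")
    if t < x_min:
        raise ValueError("need t >= x_min")
    return t / (alpha - 1.0)


for t in (1.0, 2.0, 5.0, 9.0):
    print(
        f"waited {t:4.1f}:  uniform {mrl_uniform(t, b=10.0):5.2f}"
        f"   exponential {mrl_exponential(t, rate=0.5):5.2f}"
        f"   pareto {mrl_pareto(t, alpha=2.0):5.2f}"
    )
```

In the uniform column the expected remaining wait falls as time passes, in the exponential column it does not move, and in the Pareto column it grows: the longer you have already waited, the longer you should expect to keep waiting.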
This is part III in a series on scale invariance, power laws, and regular variation, so you should definitely click on over to parts I and II if you haven’t read those yet.
In part II, we showed that the class of regularly varying distributions formalizes the notion of “approximately scale-invariant,” just like power-law tails formalize the notion of scale-invariant. The fact that regularly-varying distributions are exactly those distributions that are asymptotically scale-free suggests that they, in a sense, should be able to be analyzed (at least asymptotically) like they are simply power-law distributions. In fact, this can be formalized explicitly, and regularly varying distributions can be analyzed nearly as if they were power-law (Pareto) distributions as far as the tail is concerned. This makes them remarkably easy to work with and highlights that the added generality from working with the class of regularly-varying distributions, as opposed to working specifically with Pareto distributions, comes without too much added complexity.
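As a small numerical sketch of "asymptotically like a power law," take a tail of the form (1 + log x) * x^(-alpha) for x >= 1. This particular tail, the function names, and the parameter values are my own illustrative assumptions; it is regularly varying because the logarithmic factor is slowly varying. The ratio of the tail at lam * x to the tail at x depends on x, but it settles toward lam^(-alpha), the value an exact Pareto tail would give at every x.

```python
import math


def rv_tail(x, alpha=2.0):
    """Illustrative regularly varying tail P(X > x) = (1 + log x) * x**(-alpha), x >= 1.

    Hypothetical example: a power law multiplied by a slowly varying factor.
    It is a valid (non-increasing) tail whenever alpha >= 1.
    """
    if x < 1:
        raise ValueError("need x >= 1")
    return (1.0 + math.log(x)) * x ** (-alpha)


def rv_scale_ratio(x, lam, alpha=2.0):
    """Ratio P(X > lam * x) / P(X > x); tends to lam**(-alpha) as x grows."""
    return rv_tail(lam * x, alpha) / rv_tail(x, alpha)


lam, alpha = 2.0, 2.0
print(f"Pareto limit lam**(-alpha) = {lam ** (-alpha):.4f}")
for x in (1e1, 1e3, 1e6, 1e12):
    print(f"x = {x:8.0e}   ratio = {rv_scale_ratio(x, lam, alpha):.4f}")
```

The convergence is slow, since the correction term only decays logarithmically, which is a reminder that "asymptotically" can mean very far out in the tail.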
This is part II in a series on scale invariance, power laws, and regular variation, so you should definitely click on over to part I if you haven’t read that. In part I, we talked about a formalization of the notion of scale invariance and showed that a distribution is scale-invariant if and only if it has a power-law tail. This highlights that scale invariance is a very fragile property that one should not expect to see in reality and, in the strictest sense, that is true. It is quite unusual for the distribution of an observed phenomenon to exactly match a power-law distribution, and thus be scale-invariant. Instead, what tends to be observed in practice is that the body of a distribution is not scale-invariant, and the tail of a distribution is only approximately scale-invariant. Thus, it is natural to focus on distributions that have asymptotically scale-invariant tails, rather than imposing exact scale invariance.
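The fragility of exact scale invariance can be checked numerically. The sketch below assumes the formalization in question is that the ratio P(X > lam * x) / P(X > x) depends only on the scaling factor lam and not on x. The helper names and parameter values are hypothetical. A Pareto tail passes this check at every x, while an exponential tail fails it.

```python
import math


def pareto_tail(x, alpha=1.5, x_min=1.0):
    """Pareto tail P(X > x) = (x_min / x)**alpha for x >= x_min."""
    if x < x_min:
        raise ValueError("need x >= x_min")
    return (x_min / x) ** alpha


def exponential_tail(x, rate=1.0):
    """Exponential tail P(X > x) = exp(-rate * x)."""
    return math.exp(-rate * x)


def tail_ratio(tail, x, lam):
    """Ratio P(X > lam * x) / P(X > x) for the given tail function."""
    return tail(lam * x) / tail(x)


lam = 3.0
for x in (2.0, 20.0, 200.0):
    print(
        f"x = {x:6.1f}   pareto ratio = {tail_ratio(pareto_tail, x, lam):.6f}"
        f"   exponential ratio = {tail_ratio(exponential_tail, x, lam):.3e}"
    )
```

The Pareto column stays at lam^(-alpha) regardless of x, while the exponential column collapses as x grows, so rescaling changes what the exponential tail "looks like."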
This is the second series of posts I’m writing on topics related to what we are covering in our book on heavy-tails (which I discussed in an earlier post). The first was on the catastrophe principle (subexponential distributions) and now we move to one of the most commonly discussed aspects of heavy-tailed distributions: power laws and scale invariance.
Scale invariance in our daily lives
In our daily lives, many things that we come across have a typical size, or “scale,” that we associate with them. For example, the ratio of the maximum to minimum heights and weights that we see in a given day is usually less than 3, so none deviates too much from the population average. In contrast, the ratio of the maximum to minimum income of people we see in a particular day may often be 100 or more! This contrast is a consequence of the fact that light-tailed distributions, such as heights and weights, tend to have a “typical scale,” while many heavy-tailed distributions, such as incomes, are “scale invariant,” i.e., regardless of the scale on which you look at them, they look the same.
Upon first encounter, scale invariance is a particularly mysterious aspect of heavy-tailed distributions, since it is natural to think of the average of a distribution as a good predictor of what samples will occur. The fact that this is no longer true for scale invariant distributions leads to counter-intuitive properties. For example, consider the old economics joke: “If Bill Gates walks into a bar, on average, everybody in the bar is a millionaire.”
Though initially mysterious, scale invariance is a beautiful and widely-observed phenomenon that has received attention broadly beyond mathematics and statistics, e.g., in physics, computer science, and economics.
In part I and part II of this post, I went over the conspiracy and catastrophe principles informally and formally… But, since the book we’re writing is on heavy-tails, I figured I’d dwell a little longer on the catastrophe principle before moving on. In particular, I still have to get to the third part of the title: “subexponential distributions.”
In part I of this post, I described the conspiracy and catastrophe principles informally. However, as I mentioned, these principles can be made rigorous, and can serve as powerful analytic tools when studying heavy-tailed and light-tailed distributions.
It is important to note that there is not really one catastrophe principle and one conspiracy principle. Instead, there are many variations of these principles that can be defined and used, each with varying strengths and generality. In this post, I’ll introduce the simplest statements of each in order to highlight how these properties can be formalized. You can see the book for other versions…