This is part II in a series on residual life, hazard rates, and long-tailed distributions. If you haven’t read part I yet, read that first! The previous post in this series highlighted that one must be careful in connecting “heavy-tailed” with the concepts of “increasing mean residual life” and “decreasing hazard rate.”
In particular, there are many examples of light-tailed distributions that are IMRL and DHR. However, if we think again about the informal examples that we discussed in the previous post, it becomes clear that IMRL and DHR are too “precise” to capture the phenomena that we were describing. For example, if we return to the case of waiting for a response to an email, it is not that we expect our remaining waiting time to be monotonically increasing as we wait. In fact, we are very likely to get a response quickly, so the expected waiting time should drop initially (and the hazard rate should increase initially). It is only after we have waited a “long” time already, in this case a few days, that we expect to see a dramatic increase in our residual life. Further, in the extreme, if we have not received a response in a month, we can reasonably expect that we may never receive a response, and so the mean residual life is, in some sense, growing unboundedly, or equivalently, the hazard rate is decreasing to zero. The example of waiting for a subway train highlights the same issues. Initially, we expect that the mean residual life should decrease, because if the train is on schedule, things are very predictable. However, once we have waited a long time beyond when the train was supposed to arrive, it likely means something went wrong, and could mean the train has had some sort of mechanical problem and will never arrive.
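To make the "drops first, then grows" shape concrete, here is a minimal sketch of a toy subway model. Everything in it is an assumption chosen for illustration, not something from the posts: with probability `p` the train is on schedule and the wait is uniform on `[0, b]`, and otherwise something went wrong and the wait is Pareto with minimum `x_min` and tail index `alpha > 1`. The sketch uses the standard identity that the mean residual life at time t is the integral of the survival function from t onward, divided by the survival function at t.

```python
# Toy model (hypothetical parameters): wait ~ Uniform(0, b) with probability p,
# and wait ~ Pareto(x_min, alpha) with probability 1 - p.
P, B, X_MIN, ALPHA = 0.95, 10.0, 10.0, 2.0


def survival(t, p=P, b=B, x_min=X_MIN, alpha=ALPHA):
    """P(wait > t) for the mixture, for t >= 0."""
    s_uniform = max(0.0, 1.0 - t / b)
    s_pareto = 1.0 if t < x_min else (x_min / t) ** alpha
    return p * s_uniform + (1.0 - p) * s_pareto


def tail_integral(t, p=P, b=B, x_min=X_MIN, alpha=ALPHA):
    """Integral of the survival function from t to infinity (needs alpha > 1)."""
    i_uniform = (b - t) ** 2 / (2.0 * b) if t < b else 0.0
    if t < x_min:
        # Survival is 1 up to x_min, then the Pareto tail takes over.
        i_pareto = (x_min - t) + x_min / (alpha - 1.0)
    else:
        i_pareto = x_min ** alpha * t ** (1.0 - alpha) / (alpha - 1.0)
    return p * i_uniform + (1.0 - p) * i_pareto


def mean_residual_life(t):
    """E[wait - t | wait > t] for the toy mixture."""
    return tail_integral(t) / survival(t)


if __name__ == "__main__":
    for t in (0.0, 5.0, 20.0, 40.0):
        print(t, mean_residual_life(t))
```

In this toy model the mean residual life falls while the train could still be on schedule and then grows without bound once only the "something went wrong" component is left, so the distribution is not IMRL even though its tail behaves the way the informal examples suggest.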
This is the third series of posts I’m writing on topics related to what we are covering in our book on heavy-tails (which I discussed in an earlier post). The first two were on the catastrophe principle (subexponential distributions) and power laws (regularly varying distributions). This time I’ll focus on connections between residual life, hazard rate, and long-tailed distributions.
Residual life in our daily lives
Over the course of our days we spend a lot of our time waiting for things — we wait for a table at restaurants, we wait for a subway train to show up, we wait for people to respond to our emails, etc. In such scenarios, we hold on to the belief that, as we wait, the likely amount of remaining time we will need to wait is getting smaller. For example, we believe that, if we have waited ten minutes for a table at a restaurant, the expected time we have left to wait should be smaller than it was when we arrived and that, if we have waited five minutes for the subway, then our expected remaining wait time should be less than it was when we arrived.
In many cases this belief holds true. For example, as other diners finish eating, our expected waiting time for a table at a restaurant drops. Similarly, subway trains follow a schedule with (nearly) deterministic gaps between trains and thus, as long as the train is on schedule, our expected remaining waiting time decreases as we wait. However, a startling aspect of heavy-tailed distributions is that this is not always true. For example, if you have waited a very long time past the scheduled arrival time for a subway train, then it is very likely that there was some failure and the train may take an extremely long time to arrive, and so your expected remaining waiting time has actually increased while you waited. Similarly, if you are waiting for a response to an email and have not heard for a few days, it is likely to be a very long time until a response comes (if it ever does).
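The contrast above can be checked with the textbook closed forms for the expected remaining wait, E[X - t | X > t]. The sketch below is only illustrative: the function names are made up here, and the three distributions (exponential, uniform, Pareto) are assumed stand-ins for "memoryless", "on schedule", and "heavy-tailed" waits rather than models taken from the post.

```python
def mrl_exponential(t, rate):
    """Exponential(rate): memoryless, so the expected remaining wait never changes."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return 1.0 / rate


def mrl_uniform(t, b):
    """Uniform(0, b): the expected remaining wait shrinks linearly as we wait."""
    if not 0 <= t < b:
        raise ValueError("need 0 <= t < b")
    return (b - t) / 2.0


def mrl_pareto(t, x_min, alpha):
    """Pareto(x_min, alpha), alpha > 1: the expected remaining wait grows with t."""
    if alpha <= 1:
        raise ValueError("the mean is infinite unless alpha > 1")
    if t < x_min:
        raise ValueError("this formula is for t >= x_min")
    return t / (alpha - 1.0)


if __name__ == "__main__":
    for t in (2.0, 4.0, 8.0):
        print(t, mrl_exponential(t, 0.5), mrl_uniform(t, 10.0), mrl_pareto(t, 1.0, 3.0))
```

The Pareto case is the surprising one: conditioned on having already waited t, the wait is again Pareto but with minimum t, so the longer we have waited, the longer we should expect to keep waiting.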
In honor of the upcoming Olympics, I figured I’d write a post highlighting something that JK, Bert, and I came up with in the process of writing our book on heavy tails.
One of the topics that is interwoven throughout the book is a connection between “extremal processes” and heavy-tails. In case you’re not familiar with extremal processes, the idea is that the process evolves as the max/min of a sequence of random variables. So, for example,
Of course, the canonical example of such processes is the evolution of world records. So, it felt like a good time to post about them here…
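As a rough illustration of the "max of a sequence of random variables" idea, here is a small sketch that tracks the running maximum of a sequence and the times at which a new record is set. The helper names `running_max` and `record_times` are invented for this example, and the Pareto samples in the usage section are an arbitrary choice of heavy-tailed input, not data from the book.

```python
import random
from itertools import accumulate


def running_max(samples):
    """Extremal process: the n-th entry is max(samples[0], ..., samples[n-1])."""
    return list(accumulate(samples, max))


def record_times(samples):
    """1-based positions at which a sample strictly beats every earlier one."""
    times = []
    best = float("-inf")
    for i, x in enumerate(samples, start=1):
        if x > best:
            best = x
            times.append(i)
    return times


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    samples = [rng.paretovariate(1.5) for _ in range(1000)]
    print("number of records:", len(record_times(samples)))
    print("final record value:", running_max(samples)[-1])
```

For world records in events where lower is better, the same code applies with `min` in place of `max` and the comparison flipped.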
In part I and part II of this post, I went over the conspiracy and catastrophe principles informally and formally… But, since the book we’re writing is on heavy-tails, I figured I’d dwell a little longer on the catastrophe principle before moving on. In particular, I still have to get to the third part of the title: “subexponential distributions.”