I just wanted to say thanks to the dozen or so folks that have emailed corrections for typos in the chapters of our book that I’ve posted so far… And, we’ve now posted another chapter. This chapter is on “multiplicative processes”. For me, multiplicative processes are almost always the answer to the question of “where do heavy-tails come from?” So, this is a pretty important chapter for the book…and for developing intuition about when to expect heavy-tailed distributions to emerge.
You can download the new chapter here.
I look forward to your comments! None of the chapters have been properly edited yet, so I apologize in advance for the typos you’ll surely find…
As the summer winds down, I’m very happy to report that we actually made measurable progress on our book project this summer!
I’ve been working on a book “The Fundamentals of Heavy Tails” for more years than I’d like to admit now, but progress has been very slow — especially since I became department chair. Well, this summer I finally managed to put aside a day a week for book writing and it paid off! We’ve completed drafts of two chapters and have two more that are nearly complete!
Many of the other chapters just require polishing too! So, hopefully, we’ll be able to continue to release chapters during the fall…
In the meantime, I’d love to get comments on the completed chapters, so please take a look and then write to me. You can download chapters here. (We ask that you fill out a form so that we can contact you with updates as we release more chapters and complete the book.)
I wrote last year about the mini-projects I use in my “flagship” course Networks: Structure & Algorithms. We recently finished the first of these miniprojects: rankmaniac.
In rankmaniac, students have to build a system (using Amazon Elastic MapReduce) to compute the nodes of a huge web crawl that have the top 20 pageranks, and to do the computation as quickly as possible. The system gets scored based on the sum of the mean-squared error of ranking of the 20 nodes output x 30sec + the computation time. Thus, there’s a tradeoff between accuracy and computing time, and the goal is to find (and efficiently implement) a parallel algorithm for computing approximate pageranks very quickly…
The students have one week to spend on this and, each night, their current solutions are run and the scores are posted on a public leaderboard. Their grades are not impacted by their ranking (only whether they pass two given milestones), but the team at the top of the leaderboard at the end of the week is crowned “rankmaniac” for the year.
This year’s competition
Every year, students try lots of interesting algorithmic approaches to the problem, but most years, it turns out that system optimizations win the day. For the scale of the problems that I can give them, a clean, simple, efficient, implementation can typically beat a complex algorithm… But, this year was very exciting because the winning team won because of a clever algorithmic optimization!
In particular, they used a nice idea from a paper of Kamvar et al at WWW: Extrapolation Methods for Accelerating PageRank Computations. The key idea in the paper is quite simple and effective: perform quadratic extrapolation to minimize the number of iterations computed by the Power Method. Intuitively, one can think of simply replacing computation steps in the Power Method with extrapolation steps every once in a while in order to speed up computation. You can’t do this too often, or things won’t converge, but you can do it often enough that significant speedups are possible.
In the case of rankmaniac, the extrapolation steps meant that the winning team could run fewer iterations than any of the other teams, which was enough to win the day!
We’re now well into the last third of my Networks course, which focuses on the interaction of economics and networks. We talk about a lot of different topics, but one that the students really seem to enjoy is ad auctions, which tends to take up the last three or four lectures of the class — and also serves as the context for our last competition / mini-project: clickmaniac.
I’ve already blogged once about the first miniproject of my class this term, rankmaniac, and now after a one-week break, it’s time for the second miniproject of the term: pandemaniac.
This is the part of the course that I’ve been looking forward to the most since the beginning of the term. We’ve been prepping the infrastructure for the assignment since the Fall and now, thanks to two impressive undergrads (Max Hirschhorn and Angela Gong), we have a nice system built that will allow the students in the class to seed competing cascades and then watch them play out step by step…but, I’m getting ahead of myself.
As I mentioned in an earlier post, I’m teaching my “flagship” course this winter, a unique networks course that focuses on the structure and economics of networks using both a theoretical and an applied perspective.
One of the unique things about the course is that we have three fun/challenging mini-project competitions: rankmaniac, pandemaniac, and clickmaniac. The first of these is rankmaniac, and it’s going out in class today…which means that the next week is one of my favorite parts of the class.
As I’m writing this, I’m enjoying the quiet solitude that comes from working on campus over a holiday break… I’m preparing for what is my “signature” course: Networks: Structure & Economics, which I’ve called “The ideas behind our Networked World” until this year. It’s a very unique course, and I’ll probably post a bit about it as the term starts. But, for now, I want to write about something I always struggle with as a teacher: class competitions.
I’ll start by saying that ever since I started teaching this class, I’ve always had at least two class-wide competitions, and this year, for the first time, I’ll have three.