I wrote last year about the mini-projects I use in my “flagship” course Networks: Structure & Algorithms. We recently finished the first of these miniprojects: rankmaniac.
In rankmaniac, students have to build a system (using Amazon Elastic MapReduce) to compute the nodes of a huge web crawl that have the top 20 pageranks, and to do the computation as quickly as possible. The system gets scored based on the sum of the mean-squared error of ranking of the 20 nodes output x 30sec + the computation time. Thus, there’s a tradeoff between accuracy and computing time, and the goal is to find (and efficiently implement) a parallel algorithm for computing approximate pageranks very quickly…
The students have one week to spend on this and, each night, their current solutions are run and the scores are posted on a public leaderboard. Their grades are not impacted by their ranking (only whether they pass two given milestones), but the team at the top of the leaderboard at the end of the week is crowned “rankmaniac” for the year.
This year’s competition
Every year, students try lots of interesting algorithmic approaches to the problem, but most years, it turns out that system optimizations win the day. For the scale of the problems that I can give them, a clean, simple, efficient, implementation can typically beat a complex algorithm… But, this year was very exciting because the winning team won because of a clever algorithmic optimization!
In particular, they used a nice idea from a paper of Kamvar et al at WWW: Extrapolation Methods for Accelerating PageRank Computations. The key idea in the paper is quite simple and effective: perform quadratic extrapolation to minimize the number of iterations computed by the Power Method. Intuitively, one can think of simply replacing computation steps in the Power Method with extrapolation steps every once in a while in order to speed up computation. You can’t do this too often, or things won’t converge, but you can do it often enough that significant speedups are possible.
In the case of rankmaniac, the extrapolation steps meant that the winning team could run fewer iterations than any of the other teams, which was enough to win the day!
We’re now well into the last third of my Networks course, which focuses on the interaction of economics and networks. We talk about a lot of different topics, but one that the students really seem to enjoy is ad auctions, which tends to take up the last three or four lectures of the class — and also serves as the context for our last competition / mini-project: clickmaniac.
I’ve already blogged once about the first miniproject of my class this term, rankmaniac, and now after a one-week break, it’s time for the second miniproject of the term: pandemaniac.
This is the part of the course that I’ve been looking forward to the most since the beginning of the term. We’ve been prepping the infrastructure for the assignment since the Fall and now, thanks to two impressive undergrads (Max Hirschhorn and Angela Gong), we have a nice system built that will allow the students in the class to seed competing cascades and then watch them play out step by step…but, I’m getting ahead of myself.
As I mentioned in an earlier post, I’m teaching my “flagship” course this winter, a unique networks course that focuses on the structure and economics of networks using both a theoretical and an applied perspective.
One of the unique things about the course is that we have three fun/challenging mini-project competitions: rankmaniac, pandemaniac, and clickmaniac. The first of these is rankmaniac, and it’s going out in class today…which means that the next week is one of my favorite parts of the class.
I’m very excited to announce that Caltech will introduce a new graduate degree next year: a PhD in Computing and Mathematical Sciences (CMS).
While we didn’t get the approval in time to advertise it before students applied this year, I cannot resist mentioning it right now, since I hope that some of the students that we admit to other programs at Caltech this year will choose to switch over and be part of it…
As I’m writing this, I’m enjoying the quiet solitude that comes from working on campus over a holiday break… I’m preparing for what is my “signature” course: Networks: Structure & Economics, which I’ve called “The ideas behind our Networked World” until this year. It’s a very unique course, and I’ll probably post a bit about it as the term starts. But, for now, I want to write about something I always struggle with as a teacher: class competitions.
I’ll start by saying that ever since I started teaching this class, I’ve always had at least two class-wide competitions, and this year, for the first time, I’ll have three.