Data Markets in the Cloud

Over the last year, while I haven’t been blogging, one of the new directions that we’ve started to look at in RSRG is “data markets”.

“Data Markets” is one of those phrases that means lots of different things to lots of different people.  At its simplest, the idea is that data is a commodity these days — data is bought and sold constantly. The challenge is that we don’t actually understand too much about data as an economic good.  In fact, it’s a very strange economic good and traditional economic theory doesn’t apply…


CMS Faculty Search is Live — Apply today!

I’m very happy to announce that our CMS department faculty search is live.  As in previous years, we’re searching broadly — truly broadly.  We’re looking across applied math and computer science both and expect to be able to make multiple offers.  We’re interested in candidates in a variety of core areas, from distributed systems and machine learning to statistics and optimization (and lots of other areas).  But, more generally, we look for impressive, high-impact work rather than enforcing preconceived notions of what is hot at the moment.  Beyond the core areas of applied math and computer science, we are hoping to see strong applications in areas on the periphery of computing and applied math too — candidates at the interface of EE, mechanical engineering, economics, privacy, biology, physics, etc. are definitely encouraged to apply!  As I said in my recent post, inventing new CS+X fields is something that Caltech excels at — it’s our brand.


Making Sigmetrics a “jourference”

The CFP for this year’s Sigmetrics is now being widely circulated and it includes something very new: it takes Sigmetrics a step towards the hybrid journal/conference model, a.k.a. the jourference model.  This represents the culmination of more than two years of discussions and work by the Sigmetrics board (of which I’m a part), so I’m pretty excited to see how the experiment plays out!

Why go to the jourference model? 

For those who have somehow managed to avoid all the debates about the pluses and minuses of the conference model in CS, I won’t rehash them here.  You can find in-depth discussions here, here, here, and many other places…


(Nearly) A year later

It’s been one year since I started as executive officer (Caltech’s name for department chair) for our CMS department…and, not coincidentally, it’s been almost that long since my last blog post!  But now, a year in, I’ve got my administrative legs under me and I think I can get back to posting at least semi-regularly.

As always, the first post back after a long gap is a news-filled one, so here goes!

Caltech had an amazing faculty recruitment year last year!  Caltech’s claim to fame in computer science has always been pioneering disruptive new fields at the interface of computing: quantum computing, DNA computing, sparsity and compressed sensing, algorithmic game theory, … Well, this year we began an institute-wide initiative to redouble our efforts on this front and it yielded big rewards.  We hired six new mid-career faculty at the interface of computer science!  That is an enormous number for Caltech, where the whole place only has 300 faculty…


The Forgotten Data Centers

Data centers are where the Internet and cloud services live, and so they have been getting lots of public attention in recent years. In technology news and research papers, it’s not uncommon to see IT giants like Google and Facebook publicly discuss and share the designs of the mega-scale data centers they operate. But another important type of data center, the multi-tenant data center (commonly called “colocation” or “colo”), has been largely hidden from the public and rarely discussed (at least in research papers), even though it’s very common in practice and located almost everywhere, from Silicon Valley to the gambling capital, Las Vegas.

Unlike a Google-type data center, where the operator manages both the IT equipment and the facility, a multi-tenant data center is a shared facility where multiple tenants house their own servers in shared space and the data center operator is mainly responsible for facility support (like power, cooling, and space). Although the boundary is blurring, multi-tenant data centers can generally be classified as either wholesale or retail: wholesale data centers (like Digital Realty) primarily serve large tenants, each having a power demand of 500 kW or more, while retail data centers (like Equinix) mostly target tenants with smaller demands.


Reporting from SoCal NEGT

Last week, USC hosted our annual Southern California Network Economics and Game Theory (NEGT) workshop.  (Thanks to David Kempe and Shaddin Dughmi for all the organization this year!)  It’s always a very fun workshop, and really does a great job in ensuring a multidisciplinary community around CS, EE, and Econ in the LA area.  We’ve been doing it for so long now that the faculty & students really know each other well at this point…

As always, there were lots of great talks.  In particular, we had a great set of keynotes again this year.


Introducing DOLCIT

At long last, we have gotten together and created a “Caltech-style” machine learning / big data / optimization group, and it’s called DOLCIT: Decision, Optimization, and Learning at the California Institute of Technology.  The goal of the group is to take a broad and integrated view of research in data-driven intelligent systems. On the one hand, statistical machine learning is required to extract knowledge in the form of data-driven models. On the other hand, statistical decision theory is required to intelligently plan and make decisions given imperfect knowledge. Supporting both thrusts is optimization.  DOLCIT envisions a world where intelligent systems seamlessly integrate learning and planning, as well as automatically balance computational and statistical tradeoffs in the underlying optimization problems.

In the Caltech style, research in DOLCIT spans traditional areas from applied math (e.g., statistics and optimization) to computer science (e.g., machine learning and distributed systems) to electrical engineering (e.g., signal processing and information theory). Further, we will look broadly at applications spanning information and communication systems to the physical sciences (neuroscience and biology) to social systems (economic markets and personalized medicine).

In some sense, the only thing that’s new is the name, since we’ve been doing all these things for years already.  However, with the new name will come new activities like seminars, workshops, etc.  It’ll be exciting to see how it morphs in the future!

(And, don’t worry, RSRG is still going strong — RSRG and DOLCIT should be complementary with their similar research style but differing focuses with respect to tools and applications.)