Simons Workshop on Big Data and Differential Privacy

I recently returned from a workshop on Big Data and Differential Privacy, hosted by the Simons Institute for the Theory of Computing at Berkeley.

Differential privacy is a rigorous notion of database privacy intended to give meaningful guarantees to individuals whose personal data are used in computations, where “computations” is quite broadly understood—statistical analyses, model fitting, policy decisions, release of “anonymized” datasets,…
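For readers who want the formal version (the description above is deliberately informal), the standard definition, due to Dwork, McSherry, Nissim, and Smith, says roughly: a randomized algorithm $M$ is $\varepsilon$-differentially private if, for every pair of datasets $D$ and $D'$ differing in a single individual's record and every set of possible outputs $S$,

$$\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[M(D') \in S].$$

In words: no single individual's data can change the distribution of outcomes by more than a factor of $e^{\varepsilon}$.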

Privacy is easy to get wrong, even when data-use decisions are being made by well-intentioned, smart people. There are just so many subtleties, and it is impossible to fully anticipate the range of attacks and outside information an adversary might use to compromise the information you choose to publish. Thus, much of the power of differential privacy comes from the fact that it gives guarantees that hold up without making any assumptions about the attacks the adversary might use, her computational power, or any outside information she might acquire. It also has elegant composition properties (helping us understand how privacy losses accumulate over multiple computations).
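To make the composition point concrete: in its simplest form, composition says that running $k$ computations that are each $\varepsilon_i$-differentially private on the same data yields a procedure whose total privacy loss is bounded by the sum,

$$\varepsilon_{\text{total}} \;\le\; \varepsilon_1 + \varepsilon_2 + \cdots + \varepsilon_k,$$

so an analyst can set an overall privacy budget up front and spend it across analyses. (Tighter "advanced composition" bounds exist, but this additive version already captures the accounting idea.)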

And, in recent years, the toolkit of algorithms and techniques for preserving differential privacy has grown substantially; many typical computations on large datasets can now be carried out in a differentially private manner, often with very little impact on accuracy.
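As one small illustration of why large datasets help (a minimal sketch in Python, not taken from any workshop talk): releasing the mean of $n$ bounded values via the classic Laplace mechanism requires noise of scale only $(\text{upper} - \text{lower})/(n\varepsilon)$, which shrinks as the dataset grows.

```python
import numpy as np

def private_mean(values, epsilon, lower=0.0, upper=1.0):
    """Release the mean of `values` with epsilon-differential privacy
    via the Laplace mechanism (values are clamped to [lower, upper])."""
    values = np.clip(np.asarray(values, dtype=float), lower, upper)
    n = len(values)
    # Changing one record moves the clamped mean by at most (upper - lower) / n,
    # so that is the sensitivity we must hide with noise.
    sensitivity = (upper - lower) / n
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return values.mean() + noise

# Example: with a million records in [0, 1] and epsilon = 0.1, the noise has
# standard deviation sqrt(2)/(n * epsilon) ≈ 1.4e-5 -- far below typical sampling error.
estimate = private_mean(np.random.rand(1_000_000), epsilon=0.1)
```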

Sound too good to be true? Check out the workshop talks, all of which were recorded and are available here.

If you’re not familiar with differential privacy and some of the recent techniques for preserving it (we’ve gone far beyond simple noise addition!), you might start with my tutorial from the first day.

For those who prefer to read rather than watch, I have a slightly less technical introduction to differential privacy, co-authored with Ori Heffetz.
That paper is written for economists who work with data, but it should be accessible and relevant to a much wider audience.

One of the challenges that the field of differential privacy faces in the coming years is moving from theory to widespread adoption. This was a frequent topic of discussion among workshop participants, and I expect I’ll comment more on it in future posts. The first step toward adoption, though, is awareness; I hope that this blog post contributes a bit to that!
