I recently returned from a workshop on Big Data and Differential Privacy, hosted by the Simons Institute for the Theory of Computing, at Berkeley.

Differential privacy is a rigorous notion of database privacy intended to give meaningful guarantees to individuals whose personal data are used in computations, where “computations” is quite broadly understood—statistical analyses, model fitting, policy decisions, release of “anonymized” datasets,…

Privacy is easy to get wrong, even when data-use decisions are being made by well-intentioned, smart people. There are just so many subtleties, and it is impossible to fully anticipate the range of attacks and outside information an adversary might use to compromise the information you choose to publish. Thus, much of the power of differential privacy comes from the fact that it gives guarantees that hold up without making any assumptions about the attacks the adversary might use, her computational power, or any outside information she might acquire. It also has elegant composition properties (helping us understand how privacy losses accumulate over multiple computations).