Some thoughts on broad privacy research strategy

Let me begin by saying where I think the interesting privacy research question does not lie. The interesting question is not how people and organizations currently behave with respect to private information. Current behaviors reflect culture, legislation, and policy, all of which have proven quite malleable in our current environment. So the interesting question when it comes to private information is: how could and should people and organizations behave, and what options could or should they even have? This is a fundamental and partly normative question, and one that we cannot address without a substantial research effort. Despite being partly normative, this question can usefully suggest directions for even quite mathematical and applied research.

The first thing I’d like to ask is: what do we need to understand better in order to decide how to address this question? I see three relevant types of research that are largely missing:
1. We need a better understanding of the utility that individuals, organizations, and society can derive, and the harm they can incur, from the use of potentially sensitive data.
2. We need a better understanding of what the options for behavior could look like—which means we need to be open to a complete reinvention of the means by which we store, share, buy, sell, track, compute on, and draw conclusions from potentially sensitive data. Thus, we need a research agenda that helps us understand the realm of possibilities, and the consequences such possibilities would have.
3. It is, of course, important to remember the cultural, legislative, and policy context. It’s not enough to understand what people want and what is feasible. If we care about actual implementation, we must consider this broader context.

The first two of these points can and must be addressed with mathematical rigor, incorporating the perspectives of a wide variety of disciplines. Mathematical rigor is essential for a number of reasons, but the clearest one is that privacy is not an area where we can afford to deploy heuristic solutions and then cross our fingers. While inaccurate computations can later be redone for higher accuracy, and slow systems can later be optimized for better performance, privacy, once lost, cannot be “taken back.”

The second point offers the widest and richest array of research challenges. The primary work to address them will involve the development of new theoretical foundations for the technologies that would support these various interactions on potentially sensitive data.

For concreteness, let me give a few example research questions that fall under the umbrella of this second point:
1. What must be revealed about an individual’s medical data in order for her to benefit from and contribute to advances in medicine? How can we optimize the tradeoff of these benefits against potential privacy losses and help individuals make the relevant decisions?
2. When an offer of insurance is based on an individual’s history, how can this be made transparent to the individual? Would such transparency introduce incentives to “game” the system by withholding information, changing behaviors, or fabricating one’s history? What would be the impact of such incentives for misbehavior, and how should we deal with them?
3. How could we track the flow of “value” and “harm” through systems that transport large amounts of personal data (for example, the system of companies that buy and sell information on individuals’ online behavior)? How does this suggest that such systems might be redesigned?

Data, Privacy, and Markets

We’ve posted in the past about some of the work going on in our group related to privacy, and of course there are always lots of news articles popping up. But today I came across a recent animation by Jorge Cham, of PHD Comics fame, that does a very nice job of summarizing one of the interesting directions these days: managing the interaction of personal data with data marketplaces.

Though I haven’t posted about it here yet, this is one of the new directions RSRG is moving in: how does one design a data marketplace that allows the transition from data as a commodity to data as a service? We have already seen computing go from commodity to service with the emergence of cloud infrastructure providers like Amazon EC2 and Microsoft Azure, and I think it won’t be long until data makes the same transition. But in setting up these data marketplaces, how does one manage issues such as privacy? And how does one place a value on pieces of data, which have many different uses?

In any case, enjoy the animation!

QUESTA Special Issue on Cloud Computing

I’ve been meaning to post about this for a while, but better late than never, I guess! Javad Ghaderi, Sanjay Shakkottai, Sasha Stolyar, and I are editing a special issue of QUESTA on Cloud Computing. The issue is devoted to modeling and theoretical analysis of algorithm design, market issues, and performance challenges in cloud systems. So the scope is quite broad but, of course, this being QUESTA, we are interested in papers that develop new analytic tools and techniques for this domain in areas such as stochastic processes and scheduling theory.

The deadline for papers is April 1st, so I apologize for posting this so late. But I hope to see lots of great submissions!

You can find the full details for submission, formatting, etc., here.