Tuesday, December 20, 2011

Is that all that's left?

2011 has come and almost gone, and I've already forgotten most of it. It's always been that way. I can barely remember my own life. No one else will remember it either. Most of humanity has lived and died and left little more lasting traces of its existence than crickets in a summer field.

Despite our collective social fears of data deluge and "the age of big data", the reality is that we're probably the last generation in human history that will disappear with relatively little trace. As I troll the web today, I don't find much about myself: a few dozen YouTube video clips, a few hundred photos, my blog postings, a few thousand media quotes. Frankly, it really doesn't amount to all that much. It's barely a sliver of my life. In the future, digital archeologists will try to understand us, the last lost generation, by making sense of these digital fragments.

The current privacy debates about particular technologies will seem oddly quaint in a few years. I remember a time only a few years ago when serious people thought a spam filter in email must be an invasion of privacy, since a machine was doing the filtering. Now we're debating whether users should click on a pop-up screen for cookies. A decade from now, we'll laugh, I think, about the current fears of digital over-exposure, based on today's trivia: posting a photo to the web, or tweeting, or blogging, or sharing location info with friends, or whatever. Of course, some things shouldn't be published or shared, because they are hurtful or embarrassing. But the scale of data and technology is changing so fundamentally that the importance of a particular piece of data today is almost unknowable.

I'm sure that more and more data will be shared and published, sometimes openly to the Web, and sometimes privately to a community of friends or family. But the trend is clear. Most of the sharing will be utterly boring: nope, I don't care what you had for breakfast today. But what is boring individually can be fascinating in crowd-sourcing terms, as big data analysis discovers ever more insights into human nature, health, and economics from mountains of seemingly banal data bits. We already know that some data sets hold vast information, but we've barely begun to know how to read them yet, like genomes. Data holds massive knowledge and value, even, perhaps especially, when we do not yet know how to read it. Maybe it's a mistake to try to minimize data generation and retention. Maybe the privacy community's shibboleth of data deletion is a crime against science, in ways that we don't even understand yet.

Assuming I live a normal lifespan, I will live to be able to upload my life memories to remote storage. I'll be able to start real-time recording of my experience of life, and to store it, share it, and edit it. My perceptions, thoughts, and memory will be enhanced by machines guided by artificial intelligence. Perhaps it's human vanity, but I want to have the choice to store and share my life, before or after its biological limits are extinguished. I am already losing clear memories of my youth, and of places I've been, and people I've loved. What I've lost is lost forever. There was no back-up disk. That's not my idea of privacy, but privation. I suspect a future privacy debate will discuss whether "memory deletion" is a fundamental human right, or deeply anti-social.

I have no idea what this future will look like, or whether humans and society can adapt to it as quickly as the technology will enable it. But as the year draws to a close, I am grateful for a front row seat, hoping to live long enough to see a world of technologies that will stop me from just disappearing from the planet, without anything more than a few random photos and video clips, as part of the last human generation whose evanescent lives left almost no traces, disappearing from the earth like crickets at the end of summer.

Wednesday, November 23, 2011

Data Protection Officers: on solid ground?

I've worked in the field of privacy long enough to remember a time when almost no companies in the world had privacy officers. Now, almost all big companies do. And soon, Europe's privacy laws are likely to be amended in a way to mandate them, or at least to provide strong incentives to appoint them, which will lead to massive growth in this profession.

But what is a data protection officer? And can we even agree on what to call them? "Data Protection Officer" or "DPO" is a euro-centric title, since Europe long ago invented the concept of "data protection" as an alternative to (not a synonym for) "privacy". Personally, I have long used the title "Global Privacy Counsel", since I think it's useful to express three things that define my job, namely, the topic (privacy), the geographic scope (global) and the functional perspective (namely, counsel, or lawyer). But privacy leaders are often not lawyers, and hence use different monikers, ranging from Chief Privacy Officer to Director of Privacy Engineering, or Director of Privacy Compliance, or Chief Privacy Evangelist, in each case stressing a different functional perspective.

For very large companies, privacy needs to be a cross-functional effort, representing security, engineering, legal, compliance, policy and communications. Personally, I focus on the legal/regulatory/policy sides of privacy. For very large information-based Internet companies, literally hundreds of people work on privacy, across these different functions. For smaller companies, in my opinion, there should be at least one person who is accountable for privacy, in some sense, even if it's not a full-time job.

As Europe is on the verge of mandating "data protection officers", we need to understand what exactly these people will be accountable for. First, it's important to note that the European proposal will probably be modeled on the existing functions in France ("correspondent") and Germany ("Datenschutzbeauftragte"). In these countries, DPOs are responsible for supervising their companies' creation and use of databases of personal data, liaising with government privacy regulators, and providing good privacy advice and guidance. In practice, DPOs in Germany and France are sometimes focused on the legal side, and sometimes on the technical/security side.

In the US, there is a different vision of privacy leaders. At most US companies, lawyers play this role, just as I came to privacy through the legal profession. And we play this role in our capacity as lawyers, namely, providing privacy legal advice to our companies. As privacy lawyers, we provide advice, but are not empowered to make final decisions about whether or not our companies will follow our advice. The companies' executives are the decision-makers, ultimately, not the privacy lawyers. There are of course other models at some US companies, but they're still in the minority.

So, as Europe institutionalizes the role of DPO, it will be important to define what exactly these people will be accountable for, seen from inside and outside their companies. For multinationals, it will take some time to work out how to support their privacy leaders under these different legal regimes as they straddle jurisdictions. And as DPOs are held accountable for certain areas, they too may need protection and indemnification from their companies for personal liability, just like other professions, such as chief financial officers who are mandated by various laws with specific areas of accountability.

I welcome laws in Europe that will help strengthen the role of DPOs in their companies, and will help make DPOs more prevalent across industry. This will be a practical step forward for privacy. But at the same time, it will be important to define what we're accountable for, internally and externally, especially in a field where the very notion of "privacy" is highly subjective, and where the visions of what a privacy leader is supposed to do diverge dramatically, by country, by industry, and by function.

Thursday, September 8, 2011

My Italian Appeal

A lot of you have wondered about the status of the appeal of my Italian conviction. So, here's a short update, just on some logistical points.

There have been some changes to my legal defense team. First, I'd like to congratulate one of the defense team's members, Giuliano Pisapia, on his recent election as Mayor of Milan. Sadly for me, of course, he will be withdrawing from the legal team. But I'm delighted that Giulia Bongiorno and Carlo Blengino have joined my team. Giulia will be fully on board once her work in the Amanda Knox/Raffaele Sollecito appeal winds down.

Preliminary appeal briefs have been filed with the Milan appeals court, but the appeal has not yet been assigned to individual appeals court judges. Once that happens, the judges will decide on a hearing schedule. So, realistically, I am not expecting the hearings to begin until later this fall. I have no insights into how many hearings will be held, nor when they might be held.

Wednesday, September 7, 2011

September 11

September 11, seen 10 years later, changed many things in the world, in geo-political terms. Some people think it changed the nature of privacy too, since it gave rise to the Patriot Act.

I can't think of any topic in the field of privacy that has been more polemicized and politicized and distorted than discussions about the Patriot Act. Most discussions about it are simply factually and legally wrong. I respect Microsoft for blogging and explaining this. It takes courage to talk about this issue, since so many people around the world have passionate reasons to want to resist or restrict the power of (some, all, or just the US) governments to use valid legal process to access data.

Over and over again, I read about people and politicians around the world saying that they want their data to be stored in the cloud (i.e., in a data center) in their country/Continent, so that it's protected from American law enforcement under the Patriot Act. This is a common refrain, for example, in Europe and Canada. Indeed, it has given rise to an entire industry purporting to offer "euro-clouds".

Therefore, it's perhaps surprising for some people to learn that the location of storage of the data has no impact on this issue, with regard to US-headquartered companies. It has limited impact on this issue, with regard to non-US headquartered companies. I won't repeat the legal analysis, since Microsoft's blog did a good job in explaining it.

It's well-known that global cloud-service providers maintain data centers around the world, mostly to ensure that their services operate with efficiency, speed and reliability. But they don't, and can't, operate as tools to evade or circumvent valid US government access to information, whether under the Patriot Act or any of its related/predecessor laws, since the location of data within the cloud is simply not a relevant legal factor. I know that's controversial, but it's also a legal fact, so kudos to Microsoft for saying it publicly.

Monday, September 5, 2011

"The Right to be Forgotten", seen from Spain

I'd like to share some personal musings about an interesting series of court cases pending in Spain, pitting the "right to be forgotten" against the right to freedom of expression. The New York Times reported on this debate recently. In a nutshell, the cases ask the question whether people can demand that search engines delete content from their indexes, even if the content is true and the third-party site that published it clearly has the right to publish it (e.g., newspapers).

Virtually everyone uses search engines to find information on the web. There are way over a trillion pages on the web today. To help people find what they're looking for in the vastness of the web, search engines create giant indexes of the web. Search engines are intermediaries, since they don't create, select or edit the content on the web sites they index. Search engines try to match a user's search query with the search results most likely to be relevant, using complex algorithms to rank the likely relevance of a particular webpage. The vast majority of websites want to appear in search engine indexes, but if they don't want to be included in the index, they can use a simple tool, called robots.txt, to opt out of being indexed by all leading search engines.
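To make the robots.txt mechanism concrete, here is a minimal Python sketch of how a well-behaved crawler might decide whether it may fetch a page. It uses the standard library's `urllib.robotparser`. The robots.txt content, the `example.com` URLs, and the helper name `may_crawl` are all made up for illustration, and real search engines may apply additional rules of their own.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary newspaper site: every crawler
# ("*") is asked to stay out of one section of the archive.
ROBOTS_TXT = """\
User-agent: *
Disallow: /archive/notices/
"""


def may_crawl(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse rules from text, no network access
    return parser.can_fetch(user_agent, url)


# A page under the disallowed path versus an ordinary page on the same site.
print(may_crawl(ROBOTS_TXT, "https://example.com/archive/notices/item-1.html"))
print(may_crawl(ROBOTS_TXT, "https://example.com/news/today.html"))
```

The point of the sketch is that the decision sits with the publisher: the site lists the paths it wants excluded, and the crawler simply honors that list.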

Many websites publish information about people, and sometimes this information can be hurtful to a person's sense of privacy or reputation. For example, government websites or newspapers may publish information about criminal convictions or accusations of medical malpractice. People who feel that information about them was wrongly published by these web sites can always ask them to correct or delete it. But newspapers and government websites usually have published this information legally, or indeed may even be legally obligated to publish it, or may be exercising their rights of freedom of expression. As search engine intermediaries, Google and other search engines play no role in what these web sites publish, or in deciding whether they should revise or remove content based on someone's privacy claim against them.

That's why I think it's wrong that the Spanish Data Protection Authority has launched over a hundred different privacy suits against Google, demanding that Google delete web sites from its index, even though the original websites that published the information (including Spanish newspapers and Spanish official government journals) published that information legally and continue to offer it. The legal question is important: should search engines like Google be responsible for the content of the web sites that they index? Should Google be forced to remove links from its search index, in the name of privacy, even if the websites that published it want to be included in its search index and the content is legal? Should search engines be used to make information harder to find, even if the information is legally published?

I have great sympathy with people who feel their privacy has been invaded by a web site that publishes information about them. But search engines shouldn't be asked to delete links to legal content that is published by a third-party website. These cases have sometimes been referred to as about the "right to be forgotten". In fact, these cases are not about deleting or "forgetting" content, but just about making it harder to find content. These cases would make it impossible for users to use search engines to find content that otherwise continues to exist on the web.

It's not hard to imagine the negative consequences for freedom of expression, if search engines could be ordered to delete links to any website that publishes content about a person that is deemed to have invaded someone's privacy. The debate about privacy v freedom of expression is an important and timeless debate, which is becoming more urgent in the age of the Internet. But it's wrong to try to use search engines to make legal information harder to find. It's wrong to use search engines as an indirect tool of censorship, since European law rightly holds that the publisher of material is responsible for its content. Requiring intermediaries like search engines to censor material published by others would have a profound chilling effect on freedom of expression.

There are better ways to protect privacy online, by remembering that it should be the publisher of content who is responsible for it. Interestingly, the Spanish Data Protection Authority seems to be coming around to this conclusion itself. It recently issued a resolution ordering a website to use the robots.txt protocol to exclude some of its pages from search engine indexes. That's exactly the right approach. Now, the debate will turn to the websites that receive such orders: should they exclude some of their pages from search engine indexes, in the name of privacy, or should they refuse, in the name of freedom of expression? Newspapers worldwide, and in particular their online archives, will soon be in the middle of this debate. I believe that Spanish papers, like El Pais, are now respecting such orders. I would wager that The New York Times wouldn't, based on its reporting in "Two German Killers Demanding Anonymity Sue Wikipedia's Parent".

This is a difficult debate, and I'm sure that different publishers will come to different conclusions about it. That's how it should be.

Tuesday, May 17, 2011

Trying to define “sensitive” data

Privacy laws need to ensure that there is a higher level of privacy protection for everyone’s sensitive personal data. There's universal consensus on that. So, it’s very important for laws to do a good job defining what should be considered “sensitive personal data”. It’s quite instructive to compare Europe’s definition (from 1995) with India’s (from 2011).

The European Data Protection Directive defines them as:

“personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, trade-union membership, and the processing of data concerning health or sex life.”

As I read this list, and having worked with its concepts for years, I find it quite unsatisfying. It is both far too broad, and far too narrow, at the same time. It’s far too broad, because it seems to extend exceptional privacy legal protection to banal and often public things, like “political opinions”, or “racial origin” when any photo of me will show I’m a white dude. And things like “trade union membership” or “racial origins” probably should not be protected by privacy laws, but rather by labor laws or anti-discrimination laws, as they generally already are. But it’s also far too narrow, because the European definition of sensitive personal data fails to include something as strikingly sensitive as, say, genetic data, or biometrics. Granted, the laws in some individual European countries got this right, like France, which already treats biometrics as sensitive. In my opinion, in the future, genetic/biometric data will become the most important category of what should be treated as sensitive, so laws that don’t include biometrics in the category of sensitive data have a big gap. Strangely, European law also does not include sensitive personal financial information in its list of “sensitive” categories.

Now, for comparison, here are India’s just-revised categories of “sensitive” data:

“unless freely available in the public domain or otherwise available under law, SPDI under the Rules is personal information which consists of information relating to:

financial information such as bank account, credit or debit card details as well as other payment instrument details,

physical, physiological and mental health condition,

sexual orientation,

medical records and history,

Biometric information (a defined term including fingerprints, eye retinas and irises, voice and facial patterns, hand measurements and DNA),

Any detail relating to the above when supplied for providing service, and

Any of the information described above received by an organization for processing, stored or processed under lawful contract or otherwise.”

When India drafted its privacy laws, it looked to Europe’s Directive, both for inspiration and to protect its out-sourcing industry. But Europe would do well to look to India for inspiration about how to modernize our data protection concepts. India's list of "sensitive personal data" strikes me as much more modern and relevant to privacy than the legacy of what we have in Europe.

Friday, March 11, 2011

France re-writes the rules of data retention

When Europe introduced a Data Retention Directive in 2006, it struck a very careful political and legal balance between the interests of privacy and the interests of law enforcement and government access to data. The core distinction of the Directive was to impose an obligation on service providers to retain and produce traffic data relating to communications, but to exclude the contents of communications. Notwithstanding this careful balance, the Directive has always been highly controversial. There has been a long debate about whether this Directive, and the balance it struck, is constitutional under national privacy laws, and indeed, last year its German implementation was held unconstitutional by the German Constitutional Court.

Surprisingly, very few people have noticed what just happened in France. The law (decree, technically) adopted a few days ago in France up-ended the careful political/legal balance of the Directive by inserting one little word: "passwords". In other words, passwords are added to the list of "traffic data" that ISPs have to retain and produce to the French police on demand. Interestingly, the version of the law that had been circulating for discussion in France for the last two years, and which was reviewed by the French privacy authority, the CNIL, and by industry associations, did not contain that little word "password". The word "password" was inserted at the last minute, with no public or privacy review, as far as I can tell.

Stop to reflect for just a minute. Why would police want a password and what would they do with it? Well, obviously, they would use it to look at "content" of communications. In other words, a password would grant them access to all the things that the Directive explicitly chose not to subject to Data Retention in the interests of privacy.

All the years of work by privacy advocates have been chucked aside, in one little word. Well, three in French: "mot de passe".

I'm sure legal challenges to this French law will not be far behind. Curiously, only a few lone voices in the press or advocacy community seem to have noticed all this.

Wednesday, March 9, 2011

Foggy thinking about the Right to Oblivion

I was lucky enough to spend a few days in Switzerland working on Street View. And I treated myself to a weekend of skiing too. The weather wasn't great, we had a lot of mountain fog, but then, the entire privacy world seems to be sort of foggy these days.

In privacy circles, everybody's talking about the Right to be Forgotten. The European Commission has even proposed that the "right to be forgotten" should be written into the up-coming revision of the Privacy Directive. Originally a rather curious French "universal right" that doesn't even have a proper English translation (right to be forgotten? right to oblivion? right to delete?), le Droit a l'Oubli is going mainstream. But what on earth is it? For most people, I think it's an attempt to give people the right to wash away digital muck, or delete the embarrassing stuff, or just start fresh. But unfortunately, it's more complicated than that.

More and more, privacy is being used to justify censorship. In a sense, privacy depends on keeping some things private, in other words, hidden, restricted, or deleted. And in a world where ever more content is coming online, and where ever more content is find-able and share-able, it's also natural that the privacy counter-movement is gathering strength. Privacy is the new black in censorship fashions. It used to be that people would invoke libel or defamation to justify censorship about things that hurt their reputations. But invoking libel or defamation requires that the speech not be true. Privacy is far more elastic, because privacy claims can be made on speech that is true.

Privacy as a justification for censorship now crops up in several different, but related, debates: le droit a l'oubli, the idea that content (especially user-generated content on social networking services) should auto-expire, the idea that data collected by companies should not be retained for longer than necessary, the idea that computers should be programmed to "forget" just like the human brain. All these are movements to censor content in the name of privacy. If there weren't serious issues on both sides of the debate, we wouldn't even be talking about this.

Most conversations about the right to oblivion mix all this stuff up. I can't imagine how to have a meaningful conversation (much less write a law) about the Right to Oblivion without some framework to dis-entangle completely unrelated concepts, with completely unrelated implications. Here's my simple attempt to remember the different concepts some people want to forget.

1) If I post something online, should I have the right to delete it again? I think most of us agree with this, as the simplest, least controversial case. If I post a photo to my album, I should then later be able to delete it, if I have second-thoughts about it. Virtually all online services already offer this, so it's unproblematic, and this is the crux of what the French government sponsored in its recent Charter on the Droit a l'Oubli. But there's a big disconnect between a user's deleting content from his/her own site, and whether the user can in fact delete it from the Internet (which is what users usually want to do), more below.

2) If I post something, and someone else copies it and re-posts it on their own site, do I have the right to delete it? This is the classic real-world case. For example, let's say I regret having posted that picture of myself covered in mud, and after posting it on my own site, and then later deleting it, I discover someone else has copied it and re-posted it on their own site. Clearly, I should be able to ask the person who re-posted my picture to take it down. But if they refuse, or just don't respond, or are not find-able, what can I do? I can pursue judicial procedures, but those are expensive and time-consuming. I can go directly to the platform hosting the content, and if the content violates their terms of service or obviously violates the law, I can ask them to take it down. But practically, if I ask a platform to delete a picture of me from someone else's album, without the album owner's consent, and only based on my request, it puts the platform in the very difficult or impossible position of arbitrating between my privacy claim and the album owner's freedom of expression. It's also debatable whether, as a public policy matter, we want to have platforms arbitrate such dilemmas. Perhaps this is best resolved by allowing each platform to define its own policies on this, since they could legitimately go either way.

3) If someone else posts something about me, should I have a right to delete it? Virtually all of us would agree that this raises difficult issues of conflict between freedom of expression and privacy. Traditional law has mechanisms, like defamation and libel law, to allow a person to seek redress against someone who publishes untrue information about him. Granted, the mechanisms are time-consuming and expensive, but the legal standards are long-standing and fairly clear. But a privacy claim is not based on untruth. I cannot see how such a right could be introduced without severely infringing on freedom of speech. This is why I think privacy is the new black in censorship fashion.

4) The Internet platforms that are used to host and transmit information all collect traces, some of which are PII, or partially PII. Should such platforms be under an obligation to delete or anonymize those traces after a certain period of time? And if so, after how long? And for what reasons can such traces be retained and processed? This is a much-debated topic, e.g., the cookies debate, or the logs debate, or the data retention debate, all of which are also part of the Droit a l'Oubli debate, but they are completely different from the categories above, since they focus on the platform's traffic data, rather than the user's content. I think existing law deals with this well, if ambiguously, by permitting such retention "as long as necessary" for "legitimate purposes". Hyper-specific regulation just doesn't work, since the cases are simply too varied.

5) Should the Internet just learn to "forget"? Quite apart from the topics above, should content on the Internet just auto-expire? E.g., should all user posts to social networking be programmed to auto-expire? Or alternatively, should users be given the right to use auto-expire settings? Philosophically, I'm in favor of giving users power over their own data, but not over someone else's data. I'd love to see a credible technical framework for auto-delete tools, but I've heard a lot of technical problems with realizing them. Engineers describe most auto-delete functionalities as 80% solutions, meaning that they never work completely. Just for the sake of debate, on one extreme, government-mandated auto-expire laws would be as sensible as burning down a library every 5 years. Even if auto-expire tools existed, they would do nothing to prevent the usual privacy problems when someone copies content from one site (with the auto-expire tool) and moves it to another (without the auto-expire function). So, in the real world, I suspect that an auto-expire functionality (regardless of whether it was optional or mandatory) would provide little practical privacy protection for users, but it would result in the loss of vast amounts of data and all the benefits that data can hold.

6) Should the Internet be re-wired to be more like the human brain? This seems to be a popular theme on the privacy talk circuit. I guess this means the Internet should have gradations between memory, and sort of hazy memories, and forgetting. Well, computers don't work that way. This part of the debate is sociological and psychological, but I don't see a place for it in the world of computers. Human brains also adapt to new realities, rather well, in fact, and human brains can forget or ignore content, even if the content itself continues to exist in cyberspace.

7) Who should decide what should be remembered or forgotten? For example, if German courts decide German murderers should be able to delete all references to their convictions after a certain period of time, would this German standard apply to the Web? Would it apply only to content that was new on the Web, or also to historical archives? And if it only applied to Germany, or say the .de domain, would it have any practical impact at all, since the same content would continue to exist and be findable by anyone from anywhere? Or to make it more personal, the web is littered with references to my criminal conviction in Italy, but I respect the right of journalists and others to write about it, with no illusion that I should have a "right" to delete all references to it at some point in the future. But all of my empathy for wanting to let people edit-out some of the bad things of their past doesn't change my conviction that history should be remembered, not forgotten, even if it's painful. Culture is memory.

8) Sometimes people aren't trying to delete content, they're just trying to make it harder to find. This motivates various initiatives against search engines, for example, to delete links to legitimate web content, like newspaper articles. This isn't strictly speaking "droit a l'oubli", but it's a sort of end-run around it, by trying to make some content un-findable rather than deleted. This will surely generate legal challenges and counter-challenges before this debate is resolved.

Next time you hear someone talk about the Right to Oblivion, ask them what exactly they mean. Foggy thinking won't get us anywhere.