Friday, May 08, 2009

Privacy in social networks

Like so many other people on the internet, I use several social networks (LinkedIn, Facebook, MySpace, Twitter, Flickr), which means that I have a social profile online. My profile is fairly public, but many people prefer to stay more anonymous on these networks, using the privacy options the networks provide.

Many people, myself included, have had some doubts about whether it is really possible to have privacy while being on a social network. We no longer need to doubt this - we can now say for sure that it isn't really possible. A recent study, which will be presented at IEEE Security & Privacy '09, has demonstrated that it's possible to de-anonymize social networks by using the data that the networks sell to advertisers together with data that is public on the internet.

The study is described in Technology Review:

Unmasking Social-Network Users

Researchers find a way to identify individuals in supposedly anonymous social-network data.

The study asked whether the data that social networks sell, from which personal identifiers have been removed, can be combined with publicly available data (accessible from the internet) to connect the sold data to actual people.

The actual study is available on the internet here: De-anonymizing Social Networks.

As the abstract clearly states, the experiment was fairly successful.

Operators of online social networks are increasingly sharing potentially sensitive information about users and their relationships with advertisers, application developers, and data-mining researchers. Privacy is typically protected by anonymization, i.e., removing names, addresses, etc.

We present a framework for analyzing privacy and anonymity in social networks and develop a new re-identification algorithm targeting anonymized social-network graphs. To demonstrate its effectiveness on real-world networks, we show that a third of the users who can be verified to have accounts on both Twitter, a popular microblogging service, and Flickr, an online photo-sharing site, can be re-identified in the anonymous Twitter graph with only a 12% error rate.

Our de-anonymization algorithm is based purely on the network topology, does not require creation of a large number of dummy "sybil" nodes, is robust to noise and all existing defenses, and works even when the overlap between the target network and the adversary's auxiliary information is small.
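To get a feel for how "based purely on the network topology" can work, here is a toy sketch in Python. It is not the algorithm from the paper, just a simplified illustration of the general idea: start from a few accounts whose identity is already known, then match the remaining anonymous nodes by looking at which already-identified nodes they are connected to. The function name `propagate`, the two small graphs, and the seed accounts are all made up for this example.

```python
def propagate(anon, aux, seeds):
    """Toy re-identification by graph structure alone (illustrative only).

    anon:  anonymized graph, dict of node id -> set of neighbour ids
    aux:   public graph with real names, dict of name -> set of names
    seeds: dict of anon id -> name for the few accounts already identified
    """
    mapping = dict(seeds)
    used = set(mapping.values())
    changed = True
    while changed:
        changed = False
        for node in sorted(anon):
            if node in mapping:
                continue
            # Names of this node's neighbours that are already identified
            known = {mapping[n] for n in anon[node] if n in mapping}
            if not known:
                continue
            # Score each unused name by how many of those neighbours it shares
            scores = {}
            for name in aux:
                if name in used:
                    continue
                shared = len(aux[name] & known)
                if shared > 0:
                    scores[name] = shared
            if not scores:
                continue
            ranked = sorted(scores.items(), key=lambda kv: (-kv[1], str(kv[0])))
            # Skip ambiguous cases where two candidates tie for the best score
            if len(ranked) > 1 and ranked[1][1] == ranked[0][1]:
                continue
            mapping[node] = ranked[0][0]
            used.add(ranked[0][0])
            changed = True
    return mapping


# Hypothetical public graph (names visible) ...
aux = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "carol"},
    "carol": {"alice", "bob", "dave"},
    "dave": {"carol", "erin"},
    "erin": {"dave"},
}
# ... and the same friendships with the names replaced by numbers
anon = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}

# Assume two accounts have been identified by some other means
seeds = {1: "alice", 2: "bob"}

result = propagate(anon, aux, seeds)
print(result)
```

In this tiny example the two known accounts are enough to put names on the other three nodes. Real networks are of course far larger and noisier, which is exactly what the study had to deal with.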

The results are worrisome even for people like me who are fairly public on the internet. Since these data are sold to advertisers, the social networks unwittingly provide them with personal information about me and my friends, even if the networks explicitly say that they won't do that.

The good thing about this study is that the problem is now out in the open, and that there is now a framework for testing the privacy of social networks.
