Altmetric Blog

Altmetrics, privacy and obscurity

Stacy Konkiel, 11th April 2017

Man obscures his identity by holding his hand in front of his face

Not everyone wants their discussions of research tracked / Image CC-BY-SA savageblackout/Flickr

The team here at Altmetric has recently been pondering the extent to which our status as a “Big Data” company (and that of other altmetrics aggregators, by extension) might challenge individuals’ privacy when discussing research on social media.

On the one hand, we mostly track public conversations surrounding research. On the other, many social media users feel that while others can view their conversations in theory, they are having personal conversations that are not intended for anyone but their followers to see (e.g., doctors talking amongst themselves or patient advocates recommending research articles to deal with diseases). Altmetrics aggregators make those assumed-private conversations a lot easier to find.

So, I’ve been thinking about the following questions:

  • Do altmetrics aggregators violate privacy when making online discussions easier to find? Or might they do something else entirely: challenge obscurity?
  • How can “Big Data”–including altmetrics–benefit everyone, in spite of potential challenges to privacy?
  • How might Altmetric amend our own data collection policies in order to respect users’ rights over their own data?

In this post, I distill expert opinions on privacy, obscurity, and Big Data in order to find answers to these questions, and share recent updates that our company has made to our data collection practices.

Altmetric is eager to share our thinking on this important topic in order to hear your thoughts. Please have a read and leave your perspectives in the comments below.

What “privacy” means

Merriam-Webster defines privacy as “the quality or state of being apart from company or observation; freedom from unauthorized intrusion; or secrecy.”

When it comes to online privacy, one school of thought suggests that social media users expect to be able to talk freely (and publicly) with their friends, family, and colleagues without being tracked by governments or corporations. Counterintuitive though it may be, that perspective likely exists because, as Hartzog & Stutzman point out, “one conceptualization of privacy, secrecy, can be seen as antithetical to the notion of social interaction.”

However, at least in the US, no “reasonable expectation of privacy” exists for things made public or shared in a public space. More than one hundred years ago–in the pre-pre-pre-Twitter era–Brandeis and Warren opined, “The right [to privacy] is lost only when the author himself (sic) communicates his production to the public, — in other words, publishes it.”

Altmetrics services tend to respect users’ privacy

If privacy is the right to keep things shared privately a secret, then Altmetric is the altmetrics aggregator that respects individuals’ privacy the most. We do not track private profiles or posts on any social media site. (To be clear, we do track some other, aggregate metrics where they are made public in an anonymous way, e.g., Mendeley bookmarks.)

Other altmetrics services that track private interactions on social media (e.g. “likes” from non-public profiles) do so anonymously, meaning they too respect individuals’ privacy. This is by design, as the social media APIs such aggregators use do not disclose information on the identities of those who have liked or shared research. Altmetrics services can thus never expose private comments and likes–they are just reported as having happened, en masse.

If your private discussions of research are not being shared and associated publicly with you, then your privacy is being respected by all altmetrics aggregators. To date, there have been no known cases of such exposure of private information.

However, we are all guilty of something else.

Altmetrics services can obliterate obscurity

A beam of light

Altmetrics aggregators put the spotlight upon public discussions / CC-BY Blondinrkard Froberg/Flickr

As lawyer Nate Russell explains, “Compared to the classic concept of privacy as a right to be left alone, obscurity is much more like the right to be out in public and not need to hide in seclusion because what one is doing or saying is not being intelligibly processed by any discerning observer.”

Altmetrics aggregators’ entire purpose is to obliterate obscurity: we expose public discussions of research for the whole world to find!

Here’s where our purpose becomes problematic, for some:

“Though we colloquially say we socialize in “public,” in truth our personal interactions are usually enveloped in zones of obscurity, where our identity and personal context are shielded to those we interact or share common space with…For example, the mere act of disclosing information online does not necessarily mean that the individual seeks wide publicity, even if the information disclosed is theoretically available to the Internet at large. Just as an individual shouting from the street corner will only be heard by so many individuals (her audience is limited by architecture, social interaction, and pure physics), the rational online discloser has similar expectations with content shared online.” (Hartzog & Stutzman, 2013)

In other words, “Individuals want to be able to share with their friends and business associates on social media…but they don’t necessarily want all of these activities monitored, tracked, collected, and used by entities they do not know or with whom they have no relationship.”

We understand why it might be uncomfortable for some that we are tracking their conversations. But many believe that the benefits of doing so outweigh the drawbacks.

A trade-off between academics’ desire for obscurity and the benefits of Big Data

“It seems that for privacy hawks, no benefit no matter how compelling is large enough to offset privacy costs, while for data enthusiasts, privacy risks are no more than an afterthought in the pursuit of complete information.” (Polonetsky & Tene, 2013)

Drawbacks to altmetrics

Let’s get this out of the way: there are absolutely drawbacks to any kind of Big Data collection, both in the altmetrics realm and on other fronts.

Often, the biggest drawback is that individuals are not aware that data is being collected about them. We saw this with a recent Facebook study–where users’ reactions to posts were analyzed and reported upon without their informed consent. There’s no easy way around this for altmetrics companies, who end up collecting discussions of research from millions of users and have no practical way of contacting them all.

On the other hand, awareness that you are being tracked might lead to self-censorship. Brookman & Hans explain: “In order to remain a vibrant and innovative society, citizens need room for the expression of controversial—and occasionally wrong—ideas without worry that the ideas will be attributable to them in perpetuity.” We’d argue that quality scholarly discussion of research–especially for junior researchers–needs the same freedoms.

It’s also unclear to what extent the collection of such conversations might be bad for the authors of research. For example, could it damage someone’s career to have negative comments about a recent article collected and shared in a relatively easy-to-access place (e.g. searchable in an altmetrics database)? While there haven’t been any cases of this happening yet, it’s certainly possible.

Moreover, might the absence of data be bad for authors? If no one is talking about your work, does that mean that you’re not doing enough to promote it, or worse, that it’s not worthy of discussion? This dearth of discussion is especially worrisome for researchers in technical or obscure fields, who often publish research that’s invaluable but nonetheless rarely discussed.

Benefits to altmetrics

As Polonetsky & Tene explain, to begin to unpack the benefits of Big Data in an honest way, we must examine “who are the beneficiaries of big data analysis, what is the nature of the perceived benefits, and with what level of certainty can those benefits be realized.”

In their 2013 article for the Stanford Law Review, the duo give examples of who can benefit from Big Data: Individuals (perhaps via Netflix customization, which gives users more interesting suggestions for new films to watch); the Community (which can benefit from the collective insight of thousands of browser crash reports); Organizations (who might use Big Data to optimize their methods for accepting credit card payments); and Society (potentially in the development of fraud detection algorithms for banking).

When examined through an altmetrics lens, the list of beneficiaries might look something like this:

  • Individual scholars (who can use altmetrics data for their own research to apply for grants, to more easily connect with the public, or for other professional advancement strategies);
  • The Scholarly Community (who can use altmetrics for others’ research to identify “trending” articles and areas of research);
  • Organizations (which can use altmetrics to discover as-yet-unrewarded research being done in their institution); and
  • Society (certain sectors of which, such as policymakers and the government, might use altmetrics to understand the impacts of the research they fund).

Polonetsky & Tene provocatively suggest that the benefits of Big Data can in some cases transcend privacy law (for example, when privacy must be breached to help law enforcement solve serious crimes). We’re not so sure that this argument for upending privacy applies to the realm of altmetrics. That is why we’d never expose private discussions of research–there just isn’t a large enough benefit to all to be had.

We do think that the benefits of altmetrics transcend the argument for obscurity. For example, we believe that there’s a lot of value in knowing that federally funded public health research is getting the attention of the public, and even making a change in their daily lives. In that way, it’s useful to expose the conversations around research, even if individuals have not explicitly granted companies like ours permission to track their publicly available conversations.

This argument leads us to Polonetsky & Tene’s final point: that one must be certain that a benefit is being achieved in order to justify the loss of privacy (or, in altmetrics’ case, obscurity). To be honest, the jury is still out. At best, the evidence we have on altmetrics’ benefits overcoming privacy tradeoffs is “anecdata”–a growing number of individual success stories, but nothing that proves that altmetrics will help everyone, all the time. However, we wholeheartedly believe that we’ll be able to prove this in time, as more research is done. Otherwise, we wouldn’t be in the business we are in.

What can individuals do to protect their conversations in the altmetrics era?

We’re big believers in individuals’ rights over how their own data is used by others. That’s why we suggest that those concerned about their conversations being tracked take steps to “obscure” themselves from altmetrics services.

One way to obscure yourself is to adopt a pseudonym for your online persona (like Neuroskeptic does), which some argue makes for a fairer system of open peer review and criticism in academia.

You can also make private or delete any social media posts or profiles that you do not want associated with a piece of research. More on that below.

How Altmetric approaches privacy and obscurity

It’s not only up to individuals to protect their own right to obscurity and privacy online. We think the businesses that benefit from collecting Big Data have a responsibility, too. That’s why we have recently changed our data collection and retention policies.

We want to respect the wishes of those who don’t want to be tracked. But operationalizing that is hard. With millions of social media users discussing research every day, how can we possibly know who wants to be tracked and who doesn’t?

To solve this problem, we took an approach that’s similar to the Katz test (a legal precedent for determining to what extent one’s privacy should be respected): “The person’s precautions taken to exclude others’ access are strong indicators to the expectation of privacy and might be taken into consideration by the court.” (Emphasis ours.)

For some time, our systems design has reflected this principle by only indexing public social media posts, or anonymous, aggregated metrics from other systems like Mendeley.

We’ve also recently updated our data collection policies as follows: individuals who do not wish for a post to be associated with a research output in our systems can make their post private or delete it altogether. The same goes for profiles: simply make your social media profile private and we’ll remove your posts from our database. We’ve already got systems in place that make it quick and easy to remove private and deleted posts from Altmetric details pages. By encoding our respect for users’ privacy in our policies and informing users that these policies exist, we believe we’re doing a better job of protecting users’ privacy.
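For readers who think in code, the rule described above can be sketched roughly as follows. This is a hypothetical illustration only, not Altmetric’s actual implementation: the `Post` record, the `Visibility` states, and the `is_trackable` and `prune_index` helpers are all names assumed for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Visibility(Enum):
    """Hypothetical visibility states for a social media post."""
    PUBLIC = "public"
    PRIVATE = "private"
    DELETED = "deleted"


@dataclass(frozen=True)
class Post:
    """Hypothetical record linking a post to a research output."""
    post_id: str
    research_output: str
    visibility: Visibility
    profile_is_private: bool = False  # assumed flag for the author's profile


def is_trackable(post: Post) -> bool:
    """A post stays indexed only if it is public and its author's profile is public."""
    return post.visibility is Visibility.PUBLIC and not post.profile_is_private


def prune_index(index: list[Post]) -> list[Post]:
    """Drop posts that have been made private or deleted, or whose profile is now private."""
    return [post for post in index if is_trackable(post)]


if __name__ == "__main__":
    index = [
        Post("1", "doi:10.0000/example", Visibility.PUBLIC),
        Post("2", "doi:10.0000/example", Visibility.PRIVATE),
        Post("3", "doi:10.0000/example", Visibility.DELETED),
        Post("4", "doi:10.0000/example", Visibility.PUBLIC, profile_is_private=True),
    ]
    print([post.post_id for post in prune_index(index)])  # ['1']
```

The point of the sketch is that the precautions a user takes (making a post or profile private, or deleting a post) are the only signal needed to decide whether the post should remain associated with a research output.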

What do you think about privacy, obscurity, and altmetrics? How do you think altmetrics services should (or should not) change their services to reflect individuals’ expectations of privacy or obscurity? Do you think altmetrics are beneficial enough that the tradeoffs are worth it? Leave your thoughts in the comments below, or share them on Twitter with the hashtag #altprivacy.
