A while ago, I wrote about the ways that people use Twitter to share scholarly articles, but one thing we didn’t cover is the use of hashtags. Most tweets are sent to share the paper that is mentioned, and so it follows that most hashtags describe a personal reaction or highlight a notable aspect of the paper. However, a question from James Hardcastle inspired us at Altmetric to look into the use of one particular hashtag – #icanhazpdf (or “I Can Haz PDF”). This hashtag indicates that someone is requesting, rather than sharing, a paper – as such, it completely changes the intent of a tweet. Because #icanhazpdf tweets also contain links to papers, we’ve been tracking them for a while as a side effect of our work for publishers and institutions.

#icanhazpdf originally arose as a more efficient way for science journalists and bloggers (who generally lack institutional access to journals) to quickly obtain PDF versions of scholarly articles. The process is simple: requesters tweet a link to the paywalled article along with the #icanhazpdf hashtag. Other users then respond to the request by retrieving the PDF through their own institutional access and e-mailing the file to the requester. Once the PDF has been received, the requester deletes his or her original tweet.

Rightly or wrongly, using #icanhazpdf infringes copyright, but the practice is fiercely defended by many. As such, the hashtag has been the subject of many heated online debates surrounding the legality and morality of the practice (see comments in the previous link). I won’t be commenting on the legal or ethical issues of #icanhazpdf, but I would like to point out some interesting usage pattern data from the Altmetric database.


The usage of #icanhazpdf

I took a look at #icanhazpdf data from Twitter that Altmetric collected over 12 months, from May 2012 to April 2013 (see data on figshare). The graph below (Figure 1) is a timeline showing the number of #icanhazpdf tweets per week.

Figure 1. #icanhazpdf requests from May 2012 to April 2013

In this data set, the number of #icanhazpdf tweets peaked at 55 a week early this year (week 38, 16th January 2013), but was only 6 immediately after Christmas last year (week 35, 26th December 2012). Although this snapshot of activity covers only 12 months, it seems as if the overall usage of the hashtag is slowly increasing. Over the time period examined, we saw a total of 1314 tweets tagged with #icanhazpdf. This came to an average of about 3.6 #icanhazpdf tweets per day, or 25.3 tweets per week.
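The averages can be checked directly from the total reported above. This is only a quick sketch: the 365-day and 52-week lengths for the May 2012 to April 2013 window are assumptions, and the variable and function names are purely illustrative.

```python
# Total #icanhazpdf tweets over the 12 months examined (figure from the text).
TOTAL_TWEETS = 1314

# Assumed lengths of the observation window (not stated in the post).
DAYS_IN_WINDOW = 365
WEEKS_IN_WINDOW = 52


def average_rate(total: int, periods: int) -> float:
    """Return the mean number of tweets per period."""
    return total / periods


per_day = average_rate(TOTAL_TWEETS, DAYS_IN_WINDOW)
per_week = average_rate(TOTAL_TWEETS, WEEKS_IN_WINDOW)

print(f"{per_day:.1f} tweets per day, {per_week:.1f} tweets per week")
```

Under these assumptions, the daily figure rounds to 3.6 and the weekly figure to 25.3.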

What’s interesting here is the fact that there are actually not very many Twitter requests in the grand scheme of things. Compare 1314 #icanhazpdf tweets in 1 year to the roughly 10,000 tweets with links to papers (both closed- and open-access) that are seen by Altmetric per day.
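To put that comparison on a single scale, the sketch below estimates what share of all paper-linking tweets carried the hashtag. It assumes that the rough figure of 10,000 tweets per day held steady across a 365-day year, which is a simplification, and the names are illustrative only.

```python
# Figures taken from the text; the daily volume is described as "roughly" 10,000.
ICANHAZPDF_TWEETS_PER_YEAR = 1314
PAPER_TWEETS_PER_DAY = 10_000

# Assumption: the daily volume was constant over a 365-day year.
DAYS_PER_YEAR = 365

paper_tweets_per_year = PAPER_TWEETS_PER_DAY * DAYS_PER_YEAR
share = ICANHAZPDF_TWEETS_PER_YEAR / paper_tweets_per_year

print(f"Estimated paper-linking tweets per year: {paper_tweets_per_year:,}")
print(f"Share tagged #icanhazpdf: {share:.3%}")
```

On these rough numbers, the hashtag accounts for well under a tenth of one percent of the paper-linking tweets seen in a year.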


The who and where of #icanhazpdf

Who is using #icanhazpdf? I took a random sample of 100 tweeters who had used the hashtag between May 2012 and April 2013 and categorised them by occupation/role by looking at their Twitter user profiles (see anonymised data on figshare). The categories were: Academic (including scientists, post-docs, and research fellows), Business (those affiliated with a commercial organisation), Communicator (including journalists and bloggers), Community, Librarian, Public, Student (undergraduates and graduates), Teacher, and Unknown (not listed or unclear from profile). The pie chart below (Figure 2) shows the breakdown of a sample of 100 tweeters according to occupation/role.

Figure 2. Occupation/role breakdown out of 100 users

The chart suggests that academics (35%), students (24%), and communicators (16%) use #icanhazpdf the most. This isn’t exactly surprising, since it’s been clear from online debates and conversations that it’s these groups of people that tend to defend the usage of #icanhazpdf in spite of copyright infringement issues. Interestingly, even though the hashtag was created with communicators in mind, it appears to have been embraced by a high number of academics and students. These groups presumably have access to certain journals already, and previously might have employed more closed means of obtaining articles (e.g., e-mailing the author or colleagues from other universities). However, the fact that the #icanhazpdf request goes out to complete strangers from all over the web is probably the most appealing factor, since casting a wider net would presumably increase the likelihood of catching the desired paper.

Figure 3. Geographic breakdown out of 100 users

Another interesting insight from the chart is that communities and members of the general public do not frequently request papers with #icanhazpdf. One might have expected that members of the public (notably patient communities), who would only be able to read open-access journals, would be more likely to use such a method to obtain paywalled papers. A possible explanation for the low usage of the hashtag is the lack of awareness amongst members of the public. The hashtag was used in the online science journalism community before it later spread to associated academics and their respective networks. Now, the usage of #icanhazpdf appears to be growing, which is perhaps due to an increase in awareness in different online communities. As such, it will be interesting to continue to follow the hashtag’s usage.

In some regions of the world, institutions may have limited access to journals, despite publisher-driven initiatives like HINARI. Might there be higher usage of #icanhazpdf in developing countries? In order to find out where #icanhazpdf requesters were based, I categorised the same 100 tweeters based on the location they listed in their Twitter profiles. The bar chart on the right (Figure 3) shows the geographic breakdown within the sample. Nearly half of the #icanhazpdf requesters originated from the US, but a large number also came from Great Britain. The low levels of #icanhazpdf usage in other countries could be due to a variety of factors, including, again, lack of awareness of the hashtag.


The value of qualitative data

If tweets are tagged with #icanhazpdf, then what are the implications for altmetrics and (of particular interest to us) the Altmetric score? #icanhazpdf tweets could potentially complicate new metrics that characterise tweets too broadly – “sharing” tweets certainly signal something different to others that are effectively saying “I haven’t read it, but want to”.

However, if you step away from reputation metrics and think in terms of attention instead, then the bias of #icanhazpdf doesn’t matter as much. The act of requesting a PDF still reflects attention (just as sharing a link to an abstract would). Attention is what the Altmetric score is meant to gauge: irrespective of whether the intent is to share or to receive, Altmetric treats an article mention of any kind as a signal of attention.

Since the average daily number of #icanhazpdf tweets is low (3.6 per day according to data from the past year), I would argue that the potential effects of #icanhazpdf tweets on altmetrics data aren’t a huge concern for the vast majority of papers. For qualitative assessments, it’s easy to view the Twitter conversations themselves within article details pages. As always, instead of relying on the numbers alone (e.g., “12 tweeters” for a single article), it’s important to review the qualitative data and adjust impressions about uptake and impact accordingly.

The effects of #icanhazpdf on altmetrics are arguably negligible at the present time, but its mere existence opens up interesting questions about research uptake. For example, is asking for a paper a more certain sign of uptake than sharing the article with people who lack access to the journal? It may just be a simple tag, but it certainly adds another layer of complexity to altmetrics.

A sincere thank-you goes to Bora Zivkovic, blogs editor for Scientific American, who shared many valuable insights with me over a phone conversation, and HT to James Hardcastle for the inspiration.