At a recent conference workshop on altmetrics tools, librarians were compared to drug dealers: getting researchers hooked on tools to help them get the most out of their research. As the momentum around altmetrics in higher education gathers pace and Altmetric for Institutions is increasingly adopted by research institutions, there are huge opportunities for librarians to position themselves as a central support team – albeit promoting good scholarly communications practices rather than drugs – and leading the way to integrate altmetrics in institutions.

I joined the Altmetric team last month as the Training and Implementation Manager, having previously led the Research Support Services team at The London School of Economics Library. Part of our offering included providing altmetrics and bibliometrics support, and being the team researchers and professional departments could look to for advice, analysis and training on emerging metrics. But developing an altmetrics service doesn’t happen overnight, and there are various ways to get out there and be an expert in the field.

So, as I take up my new role and start planning how we can extend our education and outreach support, this seems like an ideal opportunity to introduce myself (hello!) and suggest ten ways librarians can support altmetrics:

1.   Be an expert

Get familiar with the practical applications of altmetrics for researchers and professional staff in institutions. Learn about monitoring altmetrics attention to new papers, embedding on a CV and demonstrating societal impact. Follow the #altmetrics discussion on Twitter and learn how other academic librarians are supporting altmetrics in their institutions. How can research administrators, communications teams and students also make use of altmetrics? Get to know the tools available and reach out to offer support. Researchers will look to you for advice about altmetrics if you’re able to demonstrate that the Library is the place to go with questions!

2.   Embed altmetrics training in existing programmes

Run altmetrics training as part of PhD development programmes and digital literacy sessions, and talk about altmetrics for researchers at faculty away days or departmental meetings. It’s useful to set the scene by offering some background on the altmetrics movement, explaining how altmetrics work alongside existing metrics and demonstrating the practical uses.


3.   Add article level metrics to your institutional repository or resource discovery system

We offer free Altmetric badge embeds for institutional repositories and resource discovery systems; it’s just a line of code, and all the details can be found here. We’re increasingly seeing Altmetric badges in resource discovery systems such as Summon and Primo – see our blog post on this for more details.

4.   Provide altmetrics advice alongside traditional bibliometric analysis

Altmetrics complement existing bibliometric indicators. Have a read around the correlation between altmetrics and citation counts. If you’re already providing bibliometrics advice and benchmarking reports for researchers, add altmetrics to this analysis to demonstrate how the data offers a broader view of attention to research. Talk openly about the limitations of using a single score to assess research papers, and encourage users to dig deeper into the original mentions.

5.   Encourage researchers to adopt open practices

You’re probably already out there talking to researchers about open access and research data management. Include altmetrics in these conversations – for example when discussing the advantages of making papers open access, explain the opportunities for tracking the subsequent altmetrics. You could also mention the potential for tracking the online attention to open access research data. Altmetrics are another great reason for researchers to be open and can help demonstrate what’s in it for them.

6.   Support researchers using altmetrics in grant applications

Talk to researchers and research offices at your institution about adding altmetrics to grant applications and funder reports. Take a look at the Score tab in Altmetric details pages, which puts the Altmetric score of attention in context by ranking against papers of a similar age or other papers in that journal. Reporting back to a funder that a paper is top ranked in that journal according to altmetrics attention is much more meaningful than citing a single number.

7.   Inform collection development

Use altmetrics data to identify emerging research interests in your institution to inform purchasing or renewal decisions. For example, identify papers with high altmetrics attention in a particular field of research and check those are part of your library holdings.

8.   Identify collaboration opportunities

Support researchers in finding potential collaborators by identifying researchers in a particular field with high altmetrics attention. Do this by running a keyword search in the Altmetric Explorer for a specific discipline and identifying authors with a lot of mentions and engagement surrounding their work.

9.   Demonstrate altmetrics uses across disciplines

Altmetrics aren’t just for STEM subjects. Talk to humanities and social science departments during liaison meetings about the opportunities for using altmetrics data to track different types of attention to their outputs. For example, we track attention to scholarly papers in policy documents – valuable for social science researchers in demonstrating real world impact.

10.   Get in touch!

We offer free access to the Altmetric Explorer for librarians – contact us for more information and sign up for a webinar to find out more about Altmetric for Institutions!


It’s starting to feel Christmassy at Altmetric HQ, and we’ve been busy pulling together our annual list of which published research has been attracting the most attention online in 2014.

Based on data we’ve collated over the year, the list takes into account all mentions and shares of articles published from November 2013 onwards in mainstream and social media, blogs, post-publication peer-review forums, bookmarking sites and platforms such as Reddit and YouTube.

Access the full Top 100 list

We’ve excluded editorial, comment and review content, as we wanted to focus specifically on original research. The data can be filtered by discipline, journal, institution, country, and access type – and you can click through to view the Altmetric details page with all of the original mentions and shares for each article.


The 2014 Top 100 in summary

  • The top scoring article (at the point of data collection on the 14th of November 2014), published in PNAS in June 2014, was “Experimental evidence of massive-scale emotional contagion through social networks”.
  • 51 of the articles in the list have been published since the beginning of May this year (the others were published between November 2013 and the end of April 2014)
  • 37 of the articles in the top 100 were published as open access (63 were published under the paywall/subscription model, although some have now been made free)
  • 16 of the Top 100 were published in Nature, 11 in Science, 9 in PloS ONE, 8 in PNAS, and 4 in JAMA
  • 68 of the Top 100 had authors from the United States, 19 had authors from the UK, 10 from Canada, 11 from Germany (the most in Europe), 4 from China, and 9 from South or Central America
  • Top performing institutions include Harvard, Harvard Medical School, and Harvard School of Public Health – whose authors featured on 15 articles in total. Authors from institutions that are part of the University of California System featured on 10 articles. In the UK 3 articles featured authors from the University of Cambridge, and 3 from the University of Oxford.

A closer look
As we’d expect to see, a large proportion of the list reflects research that dominated the mainstream media agenda throughout the year – for example studies on Ebola, a new black hole theory from Stephen Hawking, and research which resulted in the manipulation of Facebook users’ timelines all rank highly.

It’s often the studies which have relevance or can be made easily accessible to a wide audience that receive a lot of coverage, and it is therefore no surprise that studies which fall under medical and health sciences make up 44 of the 100 articles featured.

Geographically, authors from the US, Europe or UK are present on over 45% of articles which made the list (this is partly due to some of our coverage – there is a bias). Over 80% of articles which named a UK author were the result of an international collaboration, whilst in contrast just 46% of articles authored by US researchers featured input from overseas researchers.













Many people have questioned whether or not Open Access articles are more likely to get shared and discussed than those that are published behind a paywall – indeed we recently undertook a small scale study of our own to look at this. At the time the results we got proved positive; however, we do not see the same trend reflected in this list: just 37 of the 100 articles featured were published under an Open Access license.

(An addendum here – in a comment on a recent Scholarly Kitchen post, it was rightly pointed out that due to the much smaller proportion of papers published open access overall, a 37% share in our list actually does represent a higher proportion of OA articles being shared – tying in with what we identified in our original study.)

This may suggest that those sharing these articles do not stop to consider whether or not their peers will be able to read the material, or it may be that the mainstream news agenda is driven to some extent by what journal publishers or institutions choose to highlight through their press efforts.

It’s worth mentioning again that this list is of course in no way a measure of quality of the research, or of the researcher. Studies of the life span of chocolate on hospital wards, and an effort to search the internet for evidence of time travellers both feature high in the list this year – entertaining content which provides a bit of light relief and is quickly distributed. Similarly, a case of unfortunate author error (which we discussed in more detail here) is still generating new attention online now, months after publication and weeks after the error was spotted and rectified.

Nonetheless we hope that this breakdown and the Altmetric data available for each article will offer some insight into which research has captured the public imagination this year, and why that might have been.


I agree with lots on the excellent ImpactStory blog but I really don’t agree with this post arguing that Nature’s new SciShare experiment is bad for altmetrics. It really isn’t. I figured it was worth a post here to explain my thinking.

In my view it’s mildly inconvenient for altmetrics vendors. :) But you can’t really spin it as “bad” in this context beyond that and given that there are good aspects too I think the title of the ImpactStory post is overkill.

I may be biased. We share an office in London with Nature Publishing Group and Digital Science (their sister company) who invested in Altmetric. I also used to work for NPG and have a lot of love for them. I have a lot of respect for Readcube too, who’ve done some awesome things on the technical side. So bear that in mind.

Anyway, I actually agree to an extent with what are maybe the two main points from Stacy and Jason’s post and those are the ones I’ll cover in a bit more detail later.

Here for reference is the list of recommendations that they make:

[NPG should...]

  • Open up their pageview metrics via API to make it easier for researchers to reuse their impact metrics however they want
  • Release ReadCube resolution, referral traffic and annotation metrics via API, adding new metrics that can tell us more about how content is being shared and what readers have to say about articles
  • Add more context to the altmetrics data they display, so viewers have a better sense of what the numbers actually mean
  • Do away with hashed URLs and link shorteners, especially the latter which make it difficult to track all mentions of an article on social media

First off I think the lack of a pageviews API and the look of the altmetrics widget on the ReadCube viewer sidebar are sort of irrelevant – sure, those points are definitely worth making, but these aren’t SciShare things, or even issues unique to Nature.

That same widget has been in the ReadCube HTML viewer (which hundreds of thousands of Google search users have seen on a regular basis) for years and to be fair you’ve got to admit that almost nobody – except for PLoS and eLife AFAIK – has an open pageviews API.

Leaving aside how useful or not pageviews actually are for most altmetrics use cases (I actually have problems with them, as they’re neither transparent nor particularly easy to extract meaning from) I’d love for there to be more APIs available so tool makers had options… but yeah, not really anything to do with SciShare per se.

The final recommendation contains the bits I agree with and it’s worth diving into.

The sharing URL doesn’t include a DOI or other permanent identifier

I’d definitely agree that it’d be useful for the link to include an identifier. It saves us work. That said, lots of publishers (the majority, even) have links without DOIs in them and we have to work round it. It’s not a big deal.

Some hard numbers from just us to back this up – I imagine other providers see similar ratios: there are 2,714,864 articles mentioned at least once in the Altmetric database.

Of them only 813,024 (~ 30%) had a DOI in at least one of the links used by people in a mention (this number will also include people who used links rather than links to the publisher’s platform).

The URLs may break

Exact same deal here: I agree, it’d be nice to encourage the use of a more ‘permanent’ link, and it’d definitely be good to hear somebody clarify what’ll happen to the links after the experiment is over. I’m surprised somebody hasn’t already (update: I should have checked Twitter first, Tom Scott at NPG has said they will be persistent).

But… for whatever reason only a very small fraction of users on social media use links.

We have 11,088,388 tweets that mention a recognized item in the database.

Only 25,132 (0.2%) of them contain a or link (this actually really surprised me, I thought the figure would be more like 10%, but there you go).
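For anyone who wants to check the ratios quoted above, here is a minimal sketch that recomputes them from the raw counts in this post. The variable names are just labels chosen for this example and are not fields from the Altmetric database.

```python
# Counts quoted in this post; the names are illustrative only.
articles_mentioned = 2_714_864      # articles mentioned at least once
articles_with_doi_link = 813_024    # of those, articles with a DOI in at least one link
tweets_total = 11_088_388           # tweets mentioning a recognized item
tweets_with_such_links = 25_132     # of those, tweets containing the link types discussed


def share(part: int, whole: int) -> float:
    """Return part / whole expressed as a percentage."""
    return 100.0 * part / whole


doi_share = share(articles_with_doi_link, articles_mentioned)
tweet_share = share(tweets_with_such_links, tweets_total)

print(f"Articles with a DOI in at least one link: {doi_share:.1f}%")
print(f"Tweets containing such links: {tweet_share:.1f}%")
```

Both results round to the figures given in the text (roughly 30% and 0.2%).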

You could say that SciShare doesn’t help these problems, and you’d be right. It’s really not going to make them noticeably worse though. I think altmetrics has bigger problems with links to content.

Non-SciShare problem A: News outlets not linking to content at all

I didn’t have any inside track on SciShare and we weren’t involved in any of the planning, but I did hear about it a little early when we got asked to help identify news and blog sources for whitelisting purposes (I don’t know if the data we put together is what eventually got used, or if some extra editing was involved).

My first thought was: if it means a single news source starts actually linking to content instead of just mentioning the journal in passing then it’ll probably be worthwhile.

The biggest problem with news outlets and altmetrics is that even when they’re covering a specific article they usually don’t link to it (happily there are a growing number of exceptions, but they’re still just that, exceptions). Publishers usually blame journalists, and journalists blame publishers or the fact that they’re writing for print and online is an afterthought.

We end up having to rely on text mining to track mentions in news sources which works but means we have to balance precision and recall. Anything that helps supplement this with links and make mainstream media data more reliable sounds good to me.

Non-SciShare problem B: Lack of machine readable metadata

This topic came up a lot at the PLoS altmetrics workshop last week.

I’d argue that the single biggest problem for altmetrics vendors when it comes to collecting data is actually insufficient machine readable metadata – and especially identifiers – on digital objects containing or representing scholarly outputs, especially in PDFs.

Incidentally NPG has actually always been a leader here. If you curl a SciShare URL you’ll notice machine readable metadata including the DOI come back.
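To make the metadata point concrete, here is a rough sketch of how a tool might pull a DOI out of a page's machine readable metadata. It assumes the page exposes the DOI in a meta tag named citation_doi (a common convention), dc.identifier or prism.doi; the post doesn't say which tags SciShare pages actually use, and the sample HTML and DOI below are made up.

```python
from html.parser import HTMLParser
from typing import List

# Meta tag names commonly used for DOIs. Assumed here, not confirmed for SciShare pages.
DOI_META_NAMES = {"citation_doi", "dc.identifier", "prism.doi"}
DOI_PREFIXES = ("doi:", "https://doi.org/", "http://dx.doi.org/")


class MetaDOIParser(HTMLParser):
    """Collect DOI values found in <meta name=... content=...> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.dois: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = (attr.get("content") or "").strip()
        if name not in DOI_META_NAMES or not content:
            return
        # Strip a leading "doi:" or resolver URL so only the bare DOI is left.
        for prefix in DOI_PREFIXES:
            if content.lower().startswith(prefix):
                content = content[len(prefix):]
                break
        # DOIs always start with the "10." directory indicator.
        if content.startswith("10."):
            self.dois.append(content)


def extract_dois(html: str) -> List[str]:
    """Return every DOI found in the page's meta tags, in document order."""
    parser = MetaDOIParser()
    parser.feed(html)
    return parser.dois


# Made-up page for illustration only.
SAMPLE_HTML = """
<html><head>
<meta name="citation_title" content="An example article">
<meta name="citation_doi" content="doi:10.1234/example.5678">
</head><body><p>Article text</p></body></html>
"""

print(extract_dois(SAMPLE_HTML))
```

When an identifier is present like this, matching a link back to the article is a lookup rather than a guess.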

Unfortunately it’s not a particularly interesting issue unless you like scholarly metadata. It doesn’t have an exciting open access angle and there’s unfortunately not a hashtag to use to campaign for it, but there probably should be.

We often get asked during demos, webinars and conference sessions for more detail on the news sources we track for mentions of academic research – and in particular how we do it, and how global our coverage is.  

So firstly, what do we track, and why?
We see mentions in mainstream media and news as a crucial part of the wider engagement a piece of research achieves. If an article, book, or dataset is picked up and getting a lot of coverage amongst mainstream media outlets, chances are it has a notable societal significance and the potential to generate further discussion or study.

Being able to provide this data helps researchers:

  • easily see and report on how their work has been communicated by the press to the general public
  • get ideas for new outlets to engage with in future
  • identify where a new approach might be needed in order to ensure their work is fairly and accurately reported

To this end, we maintain a curated list of over 1,000 news outlets, which we are adding to every week. These come from all corners of the globe, and in many different languages. Our current regional coverage is represented here:



You can see a full list of all the news outlets we track on the website.

How do we do it?
We combine a mixture of automated scanning for links and text mining to try and ensure our news coverage is as comprehensive as possible. As long as the news outlet and domain of the content it refers to (for example The Washington Post and an article on are on our whitelist, we’ll pick up the mention.

You can read more about the technical detail in this blog post contributed by one of our developers earlier this year, and this one from our Product Development Manager which gives a little more detail on the text mining.
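As a very rough illustration of the link scanning half of this (not the text mining, and not our production code), the sketch below collects the links in a news page and keeps only those pointing at a whitelisted content domain, provided the news outlet itself is also whitelisted. The domain names, whitelists and sample page are all hypothetical.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urlparse

# Hypothetical whitelists; the real lists are curated and far longer.
NEWS_WHITELIST = {"news.example.com"}
CONTENT_WHITELIST = {"journal.example.org"}


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def host(url: str) -> str:
    """Return the lower-cased hostname of a URL, or '' if it has none."""
    return (urlparse(url).hostname or "").lower()


def find_mentions(news_url: str, news_html: str) -> List[str]:
    """Return links from a whitelisted news page to whitelisted content domains."""
    if host(news_url) not in NEWS_WHITELIST:
        return []
    collector = LinkCollector()
    collector.feed(news_html)
    return [link for link in collector.links if host(link) in CONTENT_WHITELIST]


# Made-up news page for illustration only.
SAMPLE_PAGE = """
<p>A new study <a href="https://journal.example.org/articles/123">published today</a>
has been widely discussed. See also <a href="/about">about us</a> and
<a href="https://other.example.net/post">this post</a>.</p>
"""

print(find_mentions("https://news.example.com/science/story", SAMPLE_PAGE))
```

Relative links and links to domains that aren't on the content whitelist are simply ignored, which is why stories that mention a journal without linking to the article need the text mining approach instead.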

Any mentions we find relating to a particular research output are then displayed on the ‘news’ tab of the Altmetric details page for that piece of research – you’ll see the title of the news piece that mentioned the research and be able to click through to read the original news article in full.

Want to suggest news sources we might not be tracking?
Tell us about it! Please make sure it’s news and not personal opinion, and has a working RSS feed – we’ll add it as long as all checks out ok.

Identifying the right literature to spend time reading has long been a challenge for researchers – often it is driven by table of contents alerts sent straight to an inbox, or a recommendation from a superior or colleague. Libraries have invested in systems to make the most relevant content easily accessible and above all, easily discoverable. But a search in a discovery platform can draw hundreds of results, and it is sometimes difficult just from those to make an informed decision about what might be worth digging further into.

This is where altmetrics might be able to help. Including the Altmetric badges and data for an article within a discovery platform makes it easy for a researcher to determine which of those articles have been generating a buzz or picking up a lot of attention online, and with just a few clicks they can view the full Altmetric details page to identify if the attention is coming from news articles, blogs, policy makers, or being shared a lot on a social network such as Twitter or Facebook.

But it’s not just about what’s popular – it’s about context: this level of detail makes it easy to understand who is talking about the research and what they thought of it. Insight such as this may be particularly useful for younger researchers who are still building their discipline knowledge and looking for new collaborators and wider reading material.

At Altmetric we’re already supporting the implementation of our data and badges in platforms such as Primo (from ExLibris) and Summon (from ProQuest).


There’s a free plugin which can be added to any Primo instance. Anybody can download and install it, enabling their users to see scores and mentions for any articles matched in the system via a new “metrics” tab on the item details page.

Clicking through on the donut brings you to the Altmetric ‘details page’, which displays the original mentions for the article. If you get in touch we can open up the data so that your users can see all of the mentions from each source – otherwise they’ll see just 3 of each type.

You can find the documentation that details the long and short form badge embeds on the Primo Developer Network. Here’s an example of an implementation at Wageningen UR:



Summon clients using a custom interface (like Heidelberg and University of Toronto) can easily integrate the Altmetric badges themselves.

You’ll need to use the JSON API, and as long as the results have identifiers (such as a DOI or PubMed ID) you’ll be able to display the altmetrics data for your articles.

And again, please do let us know once you’ve got them up and running so that we can ensure your users can click through and see all of the mentions for each article (not just the first 3 from each source).
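If you haven’t used the JSON API before, a lookup by identifier might look something like the sketch below. Treat it as an outline under stated assumptions rather than reference code: the base URL and the /doi/ and /pmid/ paths follow the public v1 API as we understand it, the field names in the sample response (score, title, details_url) should be checked against the API documentation, and the sample payload and DOI are made up.

```python
import json
import urllib.error
import urllib.request
from typing import Optional
from urllib.parse import quote

# Assumed base URL of the public v1 API; confirm against the API documentation.
API_BASE = "https://api.altmetric.com/v1"


def lookup_url(id_type: str, identifier: str) -> str:
    """Build a lookup URL for an identifier type such as 'doi' or 'pmid'."""
    return f"{API_BASE}/{id_type}/{quote(identifier, safe='/')}"


def fetch(id_type: str, identifier: str) -> Optional[dict]:
    """Fetch the record for one identifier; return None if nothing is found (HTTP 404)."""
    try:
        with urllib.request.urlopen(lookup_url(id_type, identifier), timeout=10) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise


def summarise(record: dict) -> dict:
    """Pick out the fields a results page is likely to need (field names assumed)."""
    return {
        "title": record.get("title"),
        "score": record.get("score"),
        "details_url": record.get("details_url"),
    }


# Made-up response for illustration; a real response contains many more fields.
SAMPLE_RESPONSE = """
{"title": "An example article", "doi": "10.1234/example.5678",
 "score": 42.5, "details_url": "https://example.org/details/1"}
"""

print(lookup_url("doi", "10.1234/example.5678"))
print(summarise(json.loads(SAMPLE_RESPONSE)))
```

In a discovery interface you would call something like fetch("doi", ...) for each result that carries an identifier, and skip the badge for results where it returns None.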


If you’re running another discovery service and would like to find out if you can integrate the Altmetric badges, please drop us a line and we’ll see what we can do to help.

Articles and other research outputs don’t always get attention for the reasons we might first assume. There’s a reason you shouldn’t ever rely on numbers alone…

This was demonstrated in spectacular form once again this week when the Twittersphere jumped on a recent article that contained a rather unfortunate error – an offhand author comment asking “should we cite the crappy Gabor paper here”?

The article got a lot of attention – it is now one of the most popular items we’ve picked up mentions for this week (here’s another), rocketing to near the top of the rankings for the journal as the error was shared.

Indicators like the attention score we use reflect the fact that lots of people were talking about the article but not that the attention was, and here we’re just guessing, probably unwanted.

This isn’t the first time we’ve seen cases like this. As you would expect articles get attention for all sorts of reasons which aren’t just to do with the quality of the research.

A few favourite examples we’ve come across over the years include this paper authored by a Mr Taco B. Monster – currently claiming an Altmetric score of 485, with almost 600 mentions to date, and also brought to our attention this week was the tale of the disappearing teaspoons – which is still causing quite a stir ten years after it was first published:


Flawed Science
A more serious example of attracting attention for all the wrong reasons which comes to mind is in relation to this article published in Science in 2011. The researchers suggested that a type of bacteria could use arsenic, as opposed to the phosphorus used by all other life on the planet, to generate DNA. The article initially received a huge amount of press attention but other scientists quickly pointed out errors – you can dive into some of the relevant mentions by looking at the Altmetric details page


Similarly, a suggestion that neutrinos may have been measured as travelling faster than the speed of light did not stand up to further scrutiny, although the truth was only uncovered months later following numerous (successful, but flawed) re-tests.



Amongst the blogs, news outlets, general public and other scientists questioning the results coming out of CERN, this article, published just weeks after the original data was made available, generated some impressive altmetrics of its own, most likely due to its humorous abstract. 


Playing politics
Typically we’ll also see a high volume of attention around research that is particularly topical or controversial at the time. An article published in the Lancet this year which examined the privatisation of the NHS in Scotland with relation to a yes or a no vote in the recent referendum received a very high volume of tweets as those in the ‘yes’ campaign shared it to encourage their followers to vote in favour of independence:

We’ll be releasing our Top 100 most mentioned articles for 2014 in a couple of weeks (you can see the results for 2013 here) – it’ll be interesting to explore why and how those that make the list caught the public and academic imagination this year.

One of the things that appealed to me when I joined Altmetric recently was the distinctive visual ‘donut’ that illustrates the various different sources of attention that an article has attracted.

Introducing Altmetric's new bar visualisation.


I really like how the donut’s fixed number of slices forces the eye to appreciate the approximate proportions of an article’s sources. Any visualisation that is more precise, such as a more conventional pie chart, tempts us to look too closely at proportions of one source against another, as well as potentially allowing one particular source which has generated loads of mentions to completely overshadow the others.

I happen to think that including lots of donuts on a single page can look pretty awesome, especially in views such as Altmetric Explorer’s Tiled mode. But when including badges on your own site, it may be the case that what works for our site isn’t quite right for your own.

That’s partly the reason why we’ve always offered a variety of badge styles – donuts in three different sizes, along with smaller badges that contain a simpler button and score. And for most uses of our embeddable badges, I think the range of sizes and popover options should enable you to get badged up really easily and effectively.

Much as I love the donut, though, it may not be appropriate in every situation – but the smaller buttons with just an Altmetric score lack that proportional, at-a-glance view that makes the donuts so appealing.

So now, we offer another visualisation type for your site: the bar. It’s got the same colour scheme as the donut, in the shape of a horizontal strip. Think of it as the cruller to our usual ring donut.

Available in three fixed sizes, bar badges work best when space is at a premium, such as within tabular lists of articles where even the smallest donut would make each row of the table far too deep. We’re using such an arrangement in the summary report pages within our Altmetric for Institutions pages.

The bar visualisation as it appears in Altmetric for Institutions.



As you can see from the screenshot above, the bars’ scoreless display emphasises the proportionate attention each paper has been receiving. You can provide more information even at this level by using our optional popovers with statistical breakdowns – and if your table data clicks through to more details about an article, you can of course continue to use the traditional donut on those pages.


Installing the bar visualisation on your pages

Badge builder

The interactive badge builder on our embeddable badges documentation.

The bars are available right now – all you need to do is specify bar, medium-bar or large-bar as the badge style in your embed code. If you head to our embeddable badges documentation page, you can see for yourself in the interactive badge builder.
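If your article lists are generated server side, a small helper along these lines could produce the badge markup for each row. This is a sketch under assumptions: the altmetric-embed class and the data-badge-type and data-doi attributes reflect the embed markup as we understand it, so check them against the embeddable badges documentation, and note that the page must also load the embed script described there. The helper name and example DOI are made up.

```python
from html import escape

# The three bar styles named in this post.
BAR_STYLES = ("bar", "medium-bar", "large-bar")


def bar_badge_markup(doi: str, style: str = "bar") -> str:
    """Return the HTML for one bar badge (class and attribute names are assumed)."""
    if style not in BAR_STYLES:
        raise ValueError(f"style must be one of {BAR_STYLES}, got {style!r}")
    # Escape the DOI so unusual characters can't break out of the attribute.
    safe_doi = escape(doi, quote=True)
    return (
        f'<div class="altmetric-embed" data-badge-type="{style}" '
        f'data-doi="{safe_doi}"></div>'
    )


# Made-up DOI for illustration only.
print(bar_badge_markup("10.1234/example.5678", "medium-bar"))
```

Restricting the helper to the three bar styles is just a design choice for this example; the same pattern works for the donut and score badges.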

If you’re using our badges on your site already, you can put the new designs to use straight away. If you’re not using our badges yet, now’s the perfect time to try, and it’s really easy – see our documentation for a step-by-step guide.

Let us know how you get on with the new bars, as well as our other badge styles, and send us examples of how you’re including them on your website – we love to see how people are putting Altmetric’s information to use!

You might have seen the article published recently in Nature which looked at the top 100 most highly cited papers from 1900 onwards, based on data from the Thomson Reuters Web of Science Database.

The article highlighted that it is (perhaps unsurprisingly) much older articles that have accrued the majority of citations to date and therefore dominate the list – with more recent breakthroughs and Nobel Prize-winning advances struggling to compete with the 12,119 citations it would take to rank in the top 100.

So we were curious to see what other kind of attention these articles might have been receiving in recent years. At Altmetric our data goes back reliably until November 2012 – meaning that we have been tracking our sources for any mentions of those articles since then. A search in the Altmetric Explorer tells us that since November 2012 we have seen 287 mentions in total for the 52 of the 84 articles in the top 100 list that were listed with a DOI or other unique identifier.

The oldest paper from the list that our database contains a mention of is The attractions of proteins for small molecules and ions, published in 1949 in the Annals of the New York Academy of Sciences. It is the (joint) 3rd oldest article in the top 100 list, and in April 2013 we picked up a mention of it from a Japanese researcher on Twitter:

The tweet went on to be favourited by 2 other Twitter users, both researchers themselves; an interesting example of how core literature is being shared amongst peers online, even decades after publication.

Of the 52 articles we had picked up mentions for, 5 had been mentioned in mainstream news outlets in the last 2 years:

Electric Field Effect in Atomically Thin Carbon Films


Improved patch-clamp techniques for high-resolution current recording from cells and cell-free membrane patches
Pflügers Archiv


van der Waals Volumes and Radii
The Journal of Physical Chemistry


Clinical diagnosis of Alzheimer’s disease


Continuous cultures of fused cells secreting antibody of predefined specificity



 Just one article in the set, A rating scale for depression, had been referenced in a policy document source we track: Lithium or an atypical antipsychotic drug in the management of treatment-resistant depression: a systematic review and economic evaluation - part of the NICE Evidence Search Collection. Our policy documents sources are expanding every week so this statistic may change over time as our coverage grows.

It’s interesting to see that many of these older articles, quite apart from just citations, are still generating attention online. The original data is available below if you’d like to take a look at all of the mentions we’ve seen for each article.

Our full dataset for mentions of these articles is available here.

And the original Nature article can be found here.

Ahead of the recent 1:AM altmetrics conference we ran a hack day at the Macmillan offices in King’s Cross. Lots of exciting ideas and developments came out of the day, and there was one in particular we wanted to share…

Altmetric have been a supporting member of ORCID, an organization which enables researchers to create a unique identifier for themselves that they can then associate with all of their research outputs, since early 2013.

It’s always been possible for our database to capture and store the journal and publisher information for the 2.5 million+ published articles and datasets we’ve seen mentions of online in the last few years, but matching those outputs back to an author represents more of a challenge.

Whether it is a researcher with the same name as another, a different spelling or use of just an initial and surname, or a change of family name, it can be very hard to generate a consistent and reliable record of scholarship. ORCID offers a solution for many of these roadblocks, and its adoption is being encouraged globally by institutions, funders and publishers.

We were therefore very excited when one of the teams at the hack day decided to focus their efforts on building a tool that brings the Altmetric data and ORCID IDs together – meaning you could easily find and browse the altmetrics data for any output that was associated with a specific ORCID ID (i.e. a specific researcher).

A test version of what was built can be found here; feel free to try it out! Just enter the ORCID ID for any researcher and (complete with magical spinning donut) you’ll get back all of the Altmetric data and a breakdown of the mentions for all of that author’s output. It should run without too many issues but please do bear in mind that this was built in a day and hasn’t been rigorously tested – be gentle.

We’ve used our Product Development Manager Jean’s ORCID ID in this example:











We’d love to know what you think and will be looking to build ORCID integration further into Altmetric tools in future.



You can get the dataset and the PDF of this post here:
Adie, Euan (2014): Attention! A study of open access vs non-open access articles. figshare.

There are lots of good reasons to publish in open access journals. Two of the most commonly given ones are the beliefs that OA articles are read more widely and that they generate higher citations (for more on this check out slide 5 of Macmillan’s Author Insights Survey, which is up on figshare).

Do open access articles get higher altmetric counts?

In celebration of Open Access week we decided we’d take a look at some hybrid journals to see if there was any discernible difference in the quantitative altmetrics between their open access and reader pays articles. We picked Nature Communications to look at first as it’s a relatively high volume, multi-disciplinary-within-STM hybrid journal (at least it was during our study period – it has gone fully OA now), selects articles for publication blind to OA / non-OA status and clearly marks up authors, license and subject areas in its metadata. Plus we sit in the same building.

Coincidentally Nature Publishing Group recently commissioned a study from RIN that indicates that the OA articles in Communications get downloaded more often than their reader pays counterparts. So does that hold true when looking at other altmetrics sources?

Prepping the data & first impressions

Using a combination of the Altmetric API and web scraping we pulled together data on all the Communications papers published between 1st October 2013 and 21st October 2014. You can find all of it on figshare.

The short answer is that yes, there does seem to be a significant difference in the attention received. We’re going to cover some of the highlights below, but feel free to take the dataset and delve deeper – there’s only so much we can cover in a blog post.

First let’s characterize the dataset. It contains 2,012 articles of which 1,395 (70%) are reader pays. The bulk of articles – 1,181 (59%) – are tagged ‘Biological sciences’ by the journal. 519 (26%) are ‘Physical sciences’, 193 (10%) ‘Chemical sciences’ and 104 (5%) ‘Earth sciences’. Only 4 of the 2,012 are reviews.

We grouped articles by month of publication so that we can control for the fact that some kinds of altmetric data accrue over time. You can see this clearly in the graph below – the median number of Mendeley readers for articles published in each month is the line in red.

[Figure: median number of Mendeley readers (red) and of unique Twitter accounts (blue) per article, by month of publication]
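If you'd like to try the same kind of grouping yourself, here is a minimal Python sketch of the idea: bucket articles by month of publication, then take the median of each source within each month. The record layout and the numbers are made up for illustration (they are not taken from the figshare dataset), and this is not necessarily how we produced the graph.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records: (publication month, Mendeley readers, unique tweeters).
# These values are invented for illustration only.
articles = [
    ("2013-10", 40, 5),
    ("2013-10", 28, 9),
    ("2013-10", 35, 7),
    ("2014-09", 6, 12),
    ("2014-09", 3, 8),
    ("2014-09", 9, 10),
]


def monthly_medians(records):
    """Return {month: (median readers, median tweeters)}, ordered by month."""
    readers = defaultdict(list)
    tweeters = defaultdict(list)
    for month, n_readers, n_tweeters in records:
        readers[month].append(n_readers)
        tweeters[month].append(n_tweeters)
    return {
        month: (median(readers[month]), median(tweeters[month]))
        for month in sorted(readers)
    }


for month, (med_readers, med_tweeters) in monthly_medians(articles).items():
    print(month, med_readers, med_tweeters)
```

Comparing articles only within the same month means an older paper's head start on slowly accruing sources doesn't get mistaken for a real difference.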

A tangent: every source is different

“Older articles have more” doesn’t hold true for all sources. I’ve plotted the median number of unique Twitter accounts talking about each paper by month of publication above too, in blue. Notice that the median actually trends down very slightly as we look at older papers.

This is because: (1) most tweeting happens very quickly after publication and (2) the Twitter userbase is growing incredibly rapidly so there are more people tweeting papers each month.

Think about it this way: if you compared a paper published in 2009 to a paper published in 2014, the 2009 paper would have lots of citations (accrued over time) and hardly any tweets (as not many researchers were tweeting when it was first published – Twitter was still very new). The 2014 paper would have hardly any citations but lots of tweets (as there is now a large number of tweeting researchers).

This is sometimes addressed in novel ways in altmetrics research: Mike Thelwall’s paper in PLoS One presents one elegant solution to a similar issue.

An initial hypothesis

Let’s get back to OA vs reader pays. Here in the office our initial hypothesis was that there would be an OA advantage for tweets in general as a larger audience would be more inclined to read and tweet the paper, but that the effect would be much less pronounced in Mendeley readership and amongst people who regularly tweet scientific papers.

Here’s the median number of tweeters over time, comparing the two cohorts in each month of publication:

[Figure: median number of tweeters per article by month of publication, OA vs reader pays]

And the median number of Mendeley readers (remember that newer articles won’t have many Mendeley readers yet):

[Figure: median number of Mendeley readers per article by month of publication, OA vs reader pays]

To get a feel for the data we graphed means and 3rd quartiles too. Here’s the mean number of tweeters who regularly tweet scientific papers:

[Figure: mean number of tweeters who regularly tweet scientific papers, OA vs reader pays]

There’s a lot of light blue in these graphs and just eyeballing the data does seem to indicate an advantage for OA papers. But is it significant? Once we establish that we can start considering confounding factors.

If we look at all of the articles published in Q4 ’13 (to give ourselves a decent sized sample) we can compare the two cohorts in detail and do some sanity checking with an independent t-test. We’ll look at average author and reference counts too in case they’re wildly different, which might indicate an avenue for future investigation.

Here are the results:

[Figure: results of the Q4 ’13 comparison between the OA and reader pays cohorts]
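For anyone who wants to repeat this kind of sanity check on the figshare data, here is a rough sketch of an independent two-sample t-test in plain Python. It uses Welch's variant, which doesn't assume equal variances in the two cohorts; that choice is an assumption for the sake of the example, as are the function name and the made-up reader counts. It returns only the t statistic and approximate degrees of freedom, so you'd still need a t-distribution table or a stats package to get a p-value.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for two independent samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    # variance() is the sample variance (n - 1 in the denominator);
    # dividing by n gives the squared standard error of each mean.
    sq_err_a = variance(sample_a) / n_a
    sq_err_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(sq_err_a + sq_err_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (sq_err_a + sq_err_b) ** 2 / (
        sq_err_a ** 2 / (n_a - 1) + sq_err_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical Mendeley reader counts, invented for illustration only.
oa_readers = [23, 30, 18, 25, 27, 21]
reader_pays_readers = [13, 15, 10, 17, 12, 14]

t_stat, dof = welch_t(oa_readers, reader_pays_readers)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

Counts like tweets and readers tend to be heavily skewed, so treat a t-test on them as a sanity check rather than the last word.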

It seems like there is a difference between the two cohorts in the number of tweets, the number of tweets by ‘frequent article tweeters’ and the number of Mendeley readers.

The idea that the effect may be less pronounced for Mendeley doesn’t really hold water – a median of 23 readers for OA articles vs 13 for the reader pays is a pretty big difference.

Interestingly we didn’t see much difference in the number of news outlets or blogs covering papers in the two cohorts. A lot of news coverage is driven by press releases, and on the Nature side there is no preference for OA over reader pays when picking papers to press release (we checked).


If we accept that the articles published as open access did get more Twitter and Mendeley attention the next obvious question is why?

Two things to check spring immediately to mind:

  1. Do authors select open access for their ‘best’ papers, or papers they think will be of broader appeal?
  2. People tweet about life sciences papers more than they do physical sciences ones. Perhaps the OA cohort has a higher number of biomedical papers in it? Notice that the OA cohort also has more authors, on average, than the reader pays cohort. Might that be an indicator of something?

Do authors select only their ‘best’ papers for open access?

It doesn’t seem like we can discount this possibility. Macmillan’s author insight survey (warning: PDF link) has 48% of scientists saying “I believe that research should be OA” as a reason to publish open access, which leaves 52% who presumably have some other reason for wanting to do so. 32% have “I am not willing to pay an APC” as a reason not to go OA. The APC for Nature Communications is $5,200.

Are the higher altmetrics counts a reflection of subject area biases?

[Figure: breakdown of the OA and reader pays cohorts by top level subject area]

There doesn’t seem to be that much difference when we look at top level subjects, though it might be worth pulling out the Earth Sciences articles for a closer look.

That said, some disciplines definitely see more activity than others. If we look only at articles with the keyword ‘Genetics’ across our entire dataset and take the median number of unique tweeters per article for each month of publication, then the ‘median of medians’ is 21 for OA and 6 for reader pays.

Compare that to ‘Chemical Sciences’ where the OA median of medians is only 3, and for reader pays it’s 2.
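The ‘median of medians’ is just a two-step summary: take the median within each month of publication, then take the median of those monthly values. Here is a small Python sketch with invented numbers; the function name and record layout are assumptions for illustration, not our actual code.

```python
from collections import defaultdict
from statistics import median


def median_of_medians(records):
    """records: iterable of (publication month, unique tweeters) pairs."""
    by_month = defaultdict(list)
    for month, tweeters in records:
        by_month[month].append(tweeters)
    # Step 1: one median per month. Step 2: the median of those medians.
    monthly = [median(counts) for counts in by_month.values()]
    return median(monthly)


# Hypothetical tweeter counts for one cohort, invented for illustration only.
cohort = [
    ("2014-01", 4), ("2014-01", 10), ("2014-01", 7),
    ("2014-02", 3), ("2014-02", 5),
    ("2014-03", 12),
]

print(median_of_medians(cohort))
```

Summarising month by month first stops a single busy month with lots of articles from dominating the overall figure.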

Wrapping up

Open access articles, at least those in Nature Communications, do seem to generate significantly more tweets – including tweets from people who tweet research semi-regularly – and attract more Mendeley readers than articles that are reader pays.

It seems likely that the reasons behind this aren’t as simple as just a broader audience. We’ve also only been looking at STM content.

Would we find the same thing in other journals? We deliberately looked within a single journal to account for things like differences in how sharing buttons are presented and to control for different acceptance criteria, and the downside to this is we can’t generalise, only contribute some extra datapoints to the discussion.

We’ll leave further analysis on those fronts as an exercise to the reader. Again, all the data is up on figshare. Let us know what you find out and we’ll follow up with another blog post!