Stacy Konkiel, Director of Research Relations at Altmetric, breaks down five of the most useful and interesting tidbits about Altmetric data.
Though Altmetric’s name is practically synonymous with altmetrics as a larger field of practice, there are a lot of things the larger community may not know about our data and how we track it!
We’ve collected 95.8 million mentions of 12.4 million works to date, across 17 distinct types of data sources.
Altmetric tracks research in all its diverse formats: books, articles, data sets, clinical trials, book chapters, news stories, websites, blog posts, reports, white papers, and virtually any other format in which scholarship can be shared online.
Here’s how it works: we don’t have to be “told” (i.e. by a customer) that a piece of research exists in order to start tracking it. Instead, we watch certain data sources like Twitter and news sites for shared links to particular domains that we’ve whitelisted (e.g. “nature.com”), then follow the links and check the webpage metadata for information in order to verify that what’s being shared is actually a research output.
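To make that flow concrete, here is a minimal sketch of the two checks described above: is the shared link on a whitelisted domain, and does the page's metadata carry an identifier? It is an illustration under stated assumptions, not Altmetric's actual pipeline. The domain list, the `citation_*` metadata names, and the function names are all hypothetical, and the page HTML is passed in as a string rather than fetched.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical allowlist; the post only names "nature.com" as an example.
ALLOWED_DOMAINS = {"nature.com"}

# Assumed metadata names. Many publisher pages expose tags like these,
# but which tags Altmetric actually reads is not stated in the post.
IDENTIFIER_META_NAMES = {"citation_doi", "citation_pmid", "citation_isbn"}


class MetaCollector(HTMLParser):
    """Collects <meta name="..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name, content = attr.get("name"), attr.get("content")
        if name and content:
            self.meta[name.lower()] = content


def is_allowed(url):
    """True if the link's host is an allowlisted domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)


def find_identifiers(url, html):
    """Return identifier metadata for an allowlisted page, or {} if there is none."""
    if not is_allowed(url):
        return {}
    parser = MetaCollector()
    parser.feed(html)
    return {k: v for k, v in parser.meta.items() if k in IDENTIFIER_META_NAMES}
```

In this sketch, a shared link whose page yields an empty result (a privacy policy page, say) would simply be ignored, while a page with identifier metadata would be treated as a research output.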
That’s where persistent identifiers come in–along with the first thing you may not know about Altmetric’s altmetrics.
1. Diverse outputs require diverse persistent identifiers
Across the 23.2 million research outputs Altmetric tracks, we recognize 13 different persistent identifiers, and we use them to verify that any link shared in a source we track actually points to a research object (and not to journal author guidelines or publisher privacy policies, for example).
Though digital object identifiers (DOIs) are assigned to the vast majority of the outputs that we track, we also use other kinds of identifiers like ISBNs to track books or Handles to track repository content.
Table 1 contains a breakdown for the identifiers that we track (as of February 6, 2019); these identifiers are non-exclusive (meaning an article with a DOI can also have a PMID and an ArXiv ID, for example).
| Identifier | Number of outputs |
| Harvard Library Open Metadata | 784,938 |
| National Clinical Trial ID | 45,062 |
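Because identifiers are non-exclusive, a mention that arrives via a DOI and another that arrives via an ArXiv ID may need to be credited to the same output. The sketch below shows one simple way to model that. It is an assumption-laden illustration, not a description of our systems: the `OutputIndex` class, its method names, and the sample identifier values are all hypothetical.

```python
from collections import defaultdict


class OutputIndex:
    """Maps several identifiers to one output and collects mentions per output."""

    def __init__(self):
        self._by_identifier = {}             # (scheme, value) -> output id
        self._mentions = defaultdict(list)   # output id -> list of sources

    def register(self, output_id, identifiers):
        """Attach every known identifier (DOI, PMID, ArXiv ID, ...) to one output."""
        for scheme, value in identifiers.items():
            # DOIs are case-insensitive, so values are normalized to lower case.
            self._by_identifier[(scheme, value.lower())] = output_id

    def record_mention(self, scheme, value, source):
        """Credit a mention to the matching output; return None if nothing matches."""
        output_id = self._by_identifier.get((scheme, value.lower()))
        if output_id is None:
            return None
        self._mentions[output_id].append(source)
        return output_id

    def mention_count(self, output_id):
        return len(self._mentions.get(output_id, []))
```

Registering one article under a DOI, a PMID, and an ArXiv ID means that a tweet citing the DOI and a news story citing the ArXiv ID both count toward the same record.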
It’s important to track a variety of persistent identifiers, because doing so brings diversity to the kinds of research outputs we can track. That leads us to little-known facts #2 and #3.
2. Altmetric can track news stories as outputs
Altmetric tracks attention to news stories that are published in journals, which sets us apart from other altmetrics data providers. 62.6% of these news stories (70,226 of 113,866 total) come from science communication outlet The Conversation; news stories from AAAS and the Nature family of journals comprise the bulk of the remainder.
In contrast to our approach to tracking journal articles, we only track news stories from select publishers, as publisher websites must include relevant metadata that helps our systems differentiate “research outputs” from “news outputs”. That’s why our News tracking is skewed so heavily towards The Conversation and a small group of other publishers.
There have been 10.9 million mentions of news stories to date, from as far back as January 2001. A majority of the attention we’ve tracked for news stories is from January 2011 onward.
3. Altmetric tracks research before it’s technically published via clinical trial records
Altmetric has captured attention data for 45,062 clinical trial records so far–around 206,000 mentions total. The earliest clinical trial record mention is from 2003; the majority of mentions occur from 2013 onward.
Clinical trials receive a greater proportion of their attention from the news than do other kinds of research outputs.
Though many understand altmetrics to just include social media mentions, Altmetric tracks many kinds of data: from social media to public policy, from peer reviews to mentions in syllabi.
Here’s some insider info on two data sources that you may not be aware we track: mentions on Reddit and in university syllabi!
4. Altmetric tracks sharing across Reddit
Reddit is an online social news aggregation and discussion site, organized into topical boards called “subreddits”. Reddit sees an estimated 11 million posts per month, meaning that the number of research outputs posted to Reddit since 2005 (179,274 mentions, or 1.4% of all Altmetric mentions) is a drop in the bucket compared to the other kinds of content shared on the platform.
Altmetric collects Reddit mentions of research differently than other altmetrics data providers. We count only the posts (that is, when links are shared), rather than the related number of comments or upvotes or downvotes a link receives.
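As a rough illustration of that counting rule, the sketch below tallies one mention per link post and deliberately ignores comment and vote totals. The post records and their field names (`output_id`, `num_comments`, `score`) are hypothetical stand-ins, not a real Reddit or Altmetric data format.

```python
from collections import Counter


def count_reddit_mentions(posts):
    """Count one mention per link post; comments and votes are not counted."""
    counts = Counter()
    for post in posts:
        # Each post that shares a link to an output is exactly one mention,
        # however many comments or votes it attracted.
        counts[post["output_id"]] += 1
    return counts


# Hypothetical sample data for illustration only.
posts = [
    {"output_id": "a", "num_comments": 250, "score": 900},
    {"output_id": "a", "num_comments": 0, "score": 1},
    {"output_id": "b", "num_comments": 12, "score": 40},
]
mentions = count_reddit_mentions(posts)
```

Under this rule, a heavily discussed post and an ignored one contribute the same single mention.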
Interesting trends emerge when you look at the research that has been shared on Reddit. The vast majority of research shared on Reddit (77.4%) takes the form of journal articles, followed by news items (21.1%), primarily sourced from Nature News and The Conversation. Books account for a little over 1% of the content shared on Reddit.
Drilling down into the subreddits where research is shared–essentially, communities of interest–there are some fascinating insights to be learnt. A vast majority of mentions occur “in the long tail”; research tends to be shared only a handful of times in each subreddit. Overall, the top five subreddits by mention volume (r/Science, r/citral, r/nsclc, r/todayilearned, and r/statML) comprise only 28.1% of all Reddit mentions for research.
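A "top five" share like the one above is straightforward to compute from a list of mentions labeled by subreddit. The sketch below shows the arithmetic; the function name and input shape are assumptions made for illustration, not our reporting code.

```python
from collections import Counter


def top_share(subreddit_mentions, k=5):
    """Fraction of all mentions that fall in the k most-mentioned subreddits."""
    counts = Counter(subreddit_mentions)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(n for _, n in counts.most_common(k))
    return top / total
```

A low share for the top few subreddits, as in the figures above, is what a long tail looks like: most mentions are spread thinly across many communities.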
5. Altmetric tracks educational impacts via syllabi mentions
Altmetric tracks the educational impact of books through mining Open Syllabus Project data.
The Open Syllabus Project is “an effort to make the intellectual judgment embedded in syllabi relevant to broader explorations of teaching, publishing, and intellectual history.” The OSP team has collected over 1 million syllabi from universities worldwide. Recently, they received a Digital Science Catalyst Grant to expand their work.
We’ve tracked 412,633 books that have been mentioned in syllabi, with each book appearing in an average of 6.9 syllabi. Books mentioned in syllabi were mostly published in the late 1990s and early 2000s, but works as old as the 17th century appear in syllabi.
Want to know more?
Check out the Altmetric Support portal for insider intel on Altmetric’s data sources, the outputs we track, and the metadata and identifiers we use to collect and collate research outputs. You can also email us at firstname.lastname@example.org with any and all questions you have about our data!