This is the third in a series of blog posts on the role Twitter plays in scholarly communication by scientometrics researcher Stefanie Haustein.
It’s in the content and context of tweets that we can often find the most meaning. Unfortunately, most altmetrics research has analyzed counts and correlations rather than tweet content. In this post, I continue to analyze Altmetric data to explain how retweets and hashtags can help us better understand the degree to which users are engaging with research on Twitter.
Of the studies looking at tweet content, one found that the majority of tweets contained the title of the article they linked to or summarized it briefly. Another reported that sentiment was mostly absent and that, although most tweets were neutral, the share of positive and negative tweets differed between disciplines. A number of studies focused on analyzing the use of Twitter-specific affordances, such as retweets (RTs), user mentions (@mentions) and hashtags (keywords following #). These functions were co-created by users and developers to facilitate communication on Twitter. Studies analyzing affordance use among scholars compared different types of users, disciplines, and personal versus professional use. They found, for example, that the majority of tweets sent by astrophysicists were conversational, as 61% were retweets, @replies or @mentions. Most of these mentions referred to Twitter accounts of science communicators, other astrophysicists or organizations. Similarly, 72% of tweets sent by Canadian doctoral students in the social sciences and humanities contained other user names, while conversational tweets were less likely to contain URLs. While retweets and tweets with hashtags or links were mostly of a professional nature, @mentions appeared more often in US professors’ tweets classified as personal, which suggests that when professors discuss their work, they are less likely to address other users than when they tweet about private matters. In fact, more than 90% of @mentions functioned to address another user, while 5% worked as a reference (Honeycutt & Herring, 2009).
Tweet, tweet, retweet
Retweets are not original contributions to discourse on Twitter, but they do have unique meaning. Like most Twitter affordances, retweets originated within the Twitter user base to facilitate forwarding messages and were only later integrated by Twitter developers in the form of the retweet button. Retweets have been described as “internal citations” and, since users forward tweets sent by others, they represent a specific form of information diffusion. Once common in many Twitter bios, the disclaimer “retweets do not equal endorsements” emphasizes that retweets diffuse rather than advocate tweet content. Studies examining academics’ Twitter use found that retweeting was less common than other affordances among a group of astrophysicists and that US professors were more than twice as likely to retweet when tweeting professionally. Researchers were more likely to retweet tweets containing links, but links to papers (such as those captured by Altmetric) were less likely to be retweeted by a group of 28 scholars.
Nevertheless, the analysis of Altmetric Twitter data shows that retweeting seems to be particularly popular on academic Twitter: Half of the 4 million tweets linking to 2015 WoS papers were retweets (see table below). This suggests that a significant amount of academic tweeting activity focuses on information diffusion, which does not involve much engagement. After all, it only takes the click of a button to retweet another user’s post.
A 50% retweet rate is particularly high when compared to general Twitter users (3%) and also exceeds other studies investigating academic users (between 15% and 37%). It should be noted, however, that the figure for general Twitter use analyzed by boyd was based on a random sample of tweets collected in 2009. Among journals with more than 10,000 users tweeting their 2015 articles, the percentage of retweets ranges from 47% for PLOS ONE to 70% for PLOS Biology (see table above). It is also interesting to note that the most active users (based on number of tweets linking to the journal’s articles) often include the journal’s official Twitter account (marked in bold in the table). We will look into who is tweeting scholarly articles in the last post of this mini-series.
As we have already seen in the previous blog post, the table above highlights that biomedical journals are particularly popular, as demonstrated by a high Twitter coverage and average number of tweets per document. With retweets exceeding the number of original tweets per scientific specialty, retweeting was particularly common in Miscellaneous Zoology, General & Internal Medicine, Miscellaneous Clinical Medicine and Ecology. At less than 20%, publications in Solid State Physics, Inorganic & Nuclear Chemistry, Chemical Physics and Applied Chemistry were least likely to be retweeted. Comparing the share of retweets with Twitter coverage for 2015 publications per field, retweets are less common in disciplines with low Twitter activity, which suggests that users in these fields do not tweet as much to diffuse others’ information, possibly because they are not well connected.
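The per-field comparison above boils down to a share of retweets per group. Here is a minimal sketch of that calculation, assuming each tweet has already been labelled with a field and a retweet flag. The function name and the toy records are made up for illustration and are not Altmetric data.

```python
from collections import defaultdict

def retweet_share(tweets):
    """Return {field: share of that field's tweets that are retweets}.

    `tweets` is an iterable of (field, is_retweet) pairs. This input
    shape is an assumption made for this sketch.
    """
    totals = defaultdict(int)
    retweets = defaultdict(int)
    for field, is_retweet in tweets:
        totals[field] += 1
        if is_retweet:
            retweets[field] += 1
    return {field: retweets[field] / totals[field] for field in totals}

# Made-up toy records, NOT real data: (field, is_retweet)
sample = [
    ("Ecology", True), ("Ecology", True), ("Ecology", False),
    ("Solid State Physics", False), ("Solid State Physics", False),
    ("Solid State Physics", False), ("Solid State Physics", False),
    ("Solid State Physics", True),
]

shares = retweet_share(sample)
# A share above 0.5 means retweets outnumber original tweets in that field.
majority_retweet = sorted(f for f, s in shares.items() if s > 0.5)
```

Reporting this share next to the raw tweet count is one simple way to separate diffusion from original tweeting.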
Japanese tweet on mental health and Fukushima study most retweeted
The most retweeted tweet captured by Altmetric until June 2016 was sent by @takebata, a professor at the University of Hyogo in Japan. It was retweeted 7,126 times by 7,057 users. The tweet was written in Japanese, which allows for more detailed discussions within the character limit than languages using Latin characters. It linked to a study in the Community Mental Health Journal showing that giving 50 euros per month to mental health patients in Sweden significantly improved their depression, anxiety and social life.
The most retweeted scholarly article, regardless of who sent the original tweet, appeared in Nature’s Scientific Reports and demonstrates the nuclear contamination of freshwater fish by the Fukushima accident. The publication was retweeted 15,768 times, and 45% of these retweets went to 366 tweets sent by @Lulu__19. Her tweets were also written in Japanese and most retweeting users were from Japan, which is not surprising given that Twitter is the country’s most popular social media platform.
Tweets with conference hashtags enable backchannel discussions
Hashtags enable exchanges among Twitter users with common interests, regardless of whether they follow each other (e.g., #academicswithdogs). They thus allow macro-level communication among users. In academia, hashtags are commonly used at conferences. Tweeting at scholarly conferences has been one of the earliest and most popular uses of Twitter by academics, maybe because tweeting fosters communication among people participating in shared experiences. Almost every scientific conference today has a specific hashtag to connect attendees and remote participants who are not able to attend in person (e.g., #5amconf and #altmetrics15). Apart from increasing the visibility of conference presentations, tweeting at scientific meetings has introduced another level of communication, creating backchannel discussions online. Because tweets with a particular hashtag are easy to collect, there are countless studies analyzing scholarly Twitter use based on tweets with conference hashtags.
Hashtags seem to be less popular among academics on Twitter, maybe because they are less familiar with this feature or they do not wish to expand conversations beyond their personal publics established through their follower networks. Sixty-one percent of surveyed professors rarely or never used a hashtag, while around a quarter of tweets by astrophysicists as well as by Canadian PhD students contained a hashtag. Inferring hashtag use from most other studies is not possible, as data collection itself is often based on specific hashtags.
#science, #cancer, #physics are most popular hashtags
Just below one third of the 24.3 million tweets in the Altmetric database file made available to researchers contained a hashtag. This hashtag use is much higher than the 5% of a random sample of tweets analyzed by boyd and colleagues, but is comparable to other studies on academic tweets. A total of 401,287 unique hashtags were mentioned 12.6 million times, which amounts to a mean occurrence of 31 hashtag uses per unique term (see table below). Since hashtag frequency is extremely skewed, however, the standard deviation is high and the median hashtag frequency is as low as 2. The most popular hashtag occurred 162,754 times, while 169,992 hashtags were used in one tweet only. As few as 3% of hashtags make up 80% of occurrences, with #science (1.3% of hashtag occurrences), #cancer (0.9%), #physics (0.8%), #openaccess, #health (0.7% each), #paper, #oa and #research (0.5% each) appearing most frequently. The occurrence of #oa as well as #openaccess among the most frequent hashtags reflects the known heterogeneity of folksonomies and the need for tag gardening, where tags with different spellings and abbreviations are combined.
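As a quick check on the figures above, 12.6 million occurrences divided by 401,287 unique hashtags gives roughly 31.4, which matches the reported mean of 31. The sketch below shows one way such skew statistics could be derived from a flat list of hashtags. The function name, the returned keys and the toy list are assumptions for illustration, not the procedure used in the study.

```python
from collections import Counter
from statistics import mean, median

def hashtag_skew(hashtags, coverage=0.8):
    """Summarize how unevenly hashtag occurrences are distributed.

    `hashtags` is a non-empty list with one entry per occurrence.
    Returns the mean and median frequency per unique hashtag, the number
    of hashtags used only once, and the share of unique hashtags needed
    to reach `coverage` of all occurrences.
    """
    counts = Counter(tag.lower() for tag in hashtags)
    freqs = sorted(counts.values(), reverse=True)
    total = sum(freqs)

    # Walk down from the most frequent hashtag until coverage is reached.
    running, needed = 0, 0
    for f in freqs:
        running += f
        needed += 1
        if running >= coverage * total:
            break

    return {
        "unique": len(freqs),
        "occurrences": total,
        "mean": mean(freqs),
        "median": median(freqs),
        "singletons": sum(1 for f in freqs if f == 1),
        "share_for_coverage": needed / len(freqs),
    }

# Made-up toy list, NOT real data: one dominant tag and a long tail.
toy = ["science"] * 15 + ["cancer"] * 2 + ["oa", "openaccess", "physics", "health"]
stats = hashtag_skew(toy)
```

In the toy list the mean (3.5) sits well above the median (1) because a single tag dominates. The real data shows the same pattern on a much larger scale.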
Hashtag use is extremely skewed
We also studied a subset of Web of Science articles published in 2015 when analyzing Altmetric’s data. These Web of Science articles were described using 105,705 unique hashtags, with 6% of all hashtags representing 80% of hashtag uses. Each hashtag was mentioned on average 21 times (mean; median: 2) for a total hashtag frequency of 2.2 million. While 33% of tweets contained a hashtag, 46% of all tweeted articles were described with at least one hashtag. Articles were most frequently tagged with #cancer (1.0%), #health, #openaccess, #science (0.9% each), #FOAMed, #Diabetes, #ornithology and #Psychiatry (0.6% each).
The figure on the right shows, on a log-log scale, the number of tweets containing a particular hashtag and the number of distinct users mentioning it (A: all documents in the Altmetric data dump; B: Web of Science articles published in 2015). While in general, a log-linear relationship can be found between the number of occurrences and users, a few popular hashtags are tweeted by a limited number of users, which suggests a small but active group of users. The number of users, documents and journals associated with a hashtag can provide information as to how general and widespread a hashtag is, or how specific and relevant to only a small community.
#StandWithPP, #Fit, #dataviz and #coffee most widely spread hashtags
Looking at the number of unique users per hashtag, the largest discrepancy (among hashtags occurring at least 1,000 times) can be observed for #genomeregulation (1,924 tweets; 10 users), #eprompt (2,281; 17) and #cryptocurrency (4,515; 38), each of which was tweeted more than 100 times per user on average. In contrast, the number of tweets per user was lowest for #Fit (4,818; 4,743), #StandWithPP (Stand with Planned Parenthood; 1,060; 972), #dataviz (1,010; 912), #coffee (1,517; 1,246, see figure above), and #PWSYN (The Patient Will See You Now; 1,017; 834), which indicates widespread adoption among Twitter users. Accordingly, these hashtags point to more general, less scientific topics.
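The comparison above can be reproduced from the tweet and user counts given in the text. This is a minimal sketch: the numbers are the ones reported above, while the function name and the 1,000-tweet default are illustrative choices.

```python
# (hashtag, tweets, distinct users) as reported in the text above
reported = [
    ("#genomeregulation", 1924, 10),
    ("#eprompt", 2281, 17),
    ("#cryptocurrency", 4515, 38),
    ("#Fit", 4818, 4743),
    ("#StandWithPP", 1060, 972),
    ("#dataviz", 1010, 912),
    ("#coffee", 1517, 1246),
    ("#PWSYN", 1017, 834),
]

def tweets_per_user(rows, min_tweets=1000):
    """Average number of tweets per distinct user for each hashtag,
    restricted to hashtags with at least `min_tweets` occurrences."""
    return {
        tag: tweets / users
        for tag, tweets, users in rows
        if tweets >= min_tweets
    }

ratios = tweets_per_user(reported)

# Hashtags pushed by a small, very active group of users
concentrated = [tag for tag, r in ratios.items() if r > 100]

# Hashtags closest to one tweet per user, i.e. most widely spread
widespread = sorted(ratios, key=ratios.get)[:5]
```

A ratio close to 1 means almost every tweet came from a different user. A ratio in the hundreds points to a handful of accounts repeating the same tag.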
How do these findings influence metrics?
When analyzing the Twitter impact of scholarly publications, the way in which they are tweeted should be taken into account. Namely, we should be careful to remember that Twitter is primarily used as a mechanism to diffuse, rather than engage with, research.
The fact that half of all tweets were not original contributions and only involved minimal user engagement—clicking the retweet button—suggests that most tweets simply spread the word about scientific articles, rather than contribute to debate or endorsement of research. Intense discussions about research are the exception rather than the rule. Admittedly, 140 (or, more recently, 280) characters do not provide too much room for in-depth discussions.
The hypothesis that Twitter users diffuse rather than engage with scholarly documents is further supported by the studies that showed that most tweets contain article titles or short summaries and no sentiments. Scholarly Twitter metrics should thus distinguish between different levels of engagement—diffusion, discussion, appraisal—or at least identify the share of original tweets and retweets.
Hashtags are a powerful Twitter feature that enables exchange among users beyond follower networks. It can be argued that hashtag terms carry special meaning, because they connect users interested in the same topic or event and play a central role in retrieving relevant tweets. Even if statistics on hashtag frequency, particularly in combination with other Twitter data, can reveal some information about how users are tweeting, a qualitative analysis of hashtags is much more meaningful. Provided adequate tag gardening combines the various spellings of the same concept, hashtags offer a crowdsourced view of tweet content (and, by extension, article content) and of the context in which scholarly documents are shared on Twitter. Hashtag analysis thus represents a first step towards content analysis and helps to move away from simple counting of mentions towards contextual interpretation of altmetrics data.
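Tag gardening, as described above, can start with something as simple as case folding plus a hand-curated list of variants. The sketch below assumes such a list exists. The SYNONYMS mapping and the sample tags are hypothetical and would need to be built and checked by someone who knows the field.

```python
from collections import Counter

# Hypothetical, hand-curated mapping from variant spellings to one
# canonical tag. This is an assumption, not part of the Altmetric data.
SYNONYMS = {
    "oa": "openaccess",
    "open_access": "openaccess",
}

def garden(tag):
    """Normalize one hashtag: drop the leading '#', lowercase it,
    then map known variants to their canonical form."""
    tag = tag.lstrip("#").lower()
    return SYNONYMS.get(tag, tag)

def gardened_counts(tags):
    """Count hashtag occurrences after gardening."""
    return Counter(garden(t) for t in tags)

# Made-up sample, NOT real data
raw = ["#oa", "#OpenAccess", "#openaccess", "#Diabetes", "#diabetes", "#science"]
counts = gardened_counts(raw)
```

After gardening, #oa and #openaccess are counted as one concept, so frequency rankings reflect topics rather than spellings.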
Stefanie Haustein is an assistant professor at the University of Ottawa’s School of Information Studies, where she teaches research methods and evaluation, social network analysis and knowledge organization. Her research focuses on scholarly communication, bibliometrics, altmetrics and open science. Stefanie co-directs, together with Juan Pablo Alperin, the #ScholCommLab, a research group that analyzes all aspects of scholarly communication in the digital age. Stefanie’s publications can be found on her website. She tweets as @stefhaustein.