Gaining deeper insights into Twitter conversations

Mike Taylor
Head of Data Insights at Digital Science

Altmetric has been tracking mentions of research publications and clinical trials on Twitter for over a decade now, and in that time our collection of Tweets has grown to 200M. While we sometimes see traffic drop (for example, during traditional holiday periods), our data shows that the growth in Twitter activity has been relentless! Even Elon Musk’s acquisition of Twitter at the end of 2022 has barely registered in the volume of people sharing research.

Throughout this time, Altmetric has invested in ways to help you make sense of the volume of conversations on social media, and elsewhere. Nowadays, you can use Explorer to analyse publications – and their social reach – in a variety of ways. We support analysis by funder and institution, subject area and Sustainable Development Goal (SDG), by publication type and over time.

All these tools are invaluable for analysing trends in Altmetric data, but how can you understand not only who is sharing your research, but also how it is being discussed?

Over the last 18 months, Altmetric Data Insights (ADI) has been developing bespoke tools to answer some of these questions. As a result of our work, we now have three new ways to understand Twitter activity – and we’re going to be rolling these out across new attention sources over the coming months.

1. Demographics

Firstly, we can now categorise Twitter accounts into a set of demographics. For example, someone who identifies as a ‘cancer surgeon’ can be classified as ‘oncologist’, ‘surgeon’, ‘health care provider’ – and so on. We can also add more information, so that their sharing history can be analysed too. Obviously some categories are easier to identify than others: an oncologist is a pretty clear-cut description! What’s harder to deal with are descriptions such as “parent to” or “mom of”, especially when they turn up as “cat-mom” or “dad to two unruly kids, two cats and a dog”! For this, we’ve implemented a machine learning algorithm that helps us narrow down our classification to, for example, “parents of (human) children”.
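To make the idea concrete, here is a minimal sketch of how a profile description might be mapped to several demographic labels at once. It is a simple keyword-and-pattern illustration, not the machine learning approach we actually use, and the taxonomy, the patterns and the function name `classify_bio` are all assumptions made up for this example.

```python
from __future__ import annotations

import re

# Hypothetical taxonomy: each keyword maps to one or more demographic labels.
KEYWORD_LABELS = {
    "cancer surgeon": {"oncologist", "surgeon", "health care provider"},
    "oncologist": {"oncologist", "health care provider"},
    "surgeon": {"surgeon", "health care provider"},
}

# "mom of", "dad to", "parent to" and similar phrasings.
PARENT_PATTERN = re.compile(r"\b(?:mom|mum|dad|parent)\s+(?:of|to)\b", re.IGNORECASE)

# Words suggesting that the children in question are human.
HUMAN_CHILD_PATTERN = re.compile(
    r"\b(?:kids?|child(?:ren)?|sons?|daughters?)\b", re.IGNORECASE
)


def classify_bio(bio: str) -> set[str]:
    """Return every demographic label suggested by a profile description."""
    text = bio.lower()
    labels: set[str] = set()

    # Clear-cut descriptions: a plain substring lookup against the taxonomy.
    for keyword, keyword_labels in KEYWORD_LABELS.items():
        if keyword in text:
            labels |= keyword_labels

    # Ambiguous descriptions: only label as a parent when the description
    # also mentions human children, so "cat-mom" is not counted.
    if PARENT_PATTERN.search(text) and HUMAN_CHILD_PATTERN.search(text):
        labels.add("parents of (human) children")

    return labels
```

Hand-written rules like these break down quickly on real profile descriptions, which is why the ambiguous cases are better left to a trained classifier.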

2. Mapping researchers

Our second additional insight has been to map researchers against Twitter accounts. This is something that we battled with for several years – but last year, we had a moment of wisdom (just the one!) – and were able to efficiently and accurately map over half a million accounts. Now, for the first time, we can analyse cohorts of researchers and map them against their online research sharing activity.

3. Sentiment Analysis

Our third (and final!) improvement to understanding the big “what” and “who” questions was to develop and deploy sentiment analysis over our social media coverage. As those of you who have been following this endeavour will know, most research in this space has used off-the-shelf sentiment analysis, which classified most research-linked tweets as neutral. To finally crack this problem, we developed our own sentiment analysis tool. With this, we were able to analyse the content on the basis of whether or not it was recommending the research. In our model, sentiment is labelled across seven levels of support towards the research output being shared, for example:

-3: Strong negative: “This paper is completely biassed”

-2: Weak negative: “This is preprint so buyer beware but hopefully it holds up.”

-1: Unclear negative: “Oh boy”

0: Neutral: “https://t.co/u8hSn3x5Lu”

1: Unclear positive: “COVID-19 diagnosis and management: a comprehensive review. https://t.co/n3WGYvwwHA”

2: Weak positive: “New study from Brazil finds “regular use of ivermectin as a prophylactic agent was associated with significantly reduced COVID-19 infection, hospitalization, and mortality rates.” https://t.co/vRjVHAb09s”

3: Strong positive: “Amazing paper”
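For anyone wanting to work with scores like these, the seven levels above could be represented as an integer enumeration. This is only a sketch: the class name `Sentiment` and the helper `polarity` are hypothetical and do not describe how our tool is implemented.

```python
from enum import IntEnum


class Sentiment(IntEnum):
    """Seven levels of support towards the research output being shared."""

    STRONG_NEGATIVE = -3
    WEAK_NEGATIVE = -2
    UNCLEAR_NEGATIVE = -1
    NEUTRAL = 0
    UNCLEAR_POSITIVE = 1
    WEAK_POSITIVE = 2
    STRONG_POSITIVE = 3


def polarity(score: Sentiment) -> str:
    """Collapse a seven-level score into negative, neutral or positive."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Because the levels are plain integers, they can be averaged, sorted and compared directly, while the names keep reports readable.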

Having these numeric scores means we now have tweet-level sentiment data that we can segment and incorporate in our visualisations and reports. The example below explores the sentiment around cryptocurrency sub-themes:

Combining with the demographics discussed above, we can now, for example, segment by stakeholder groups:

What we found was truly interesting: far from being neutral, the sentiment expressed towards the paper is generally positive, containing discussions and recommendations to read the research. There’s not much negative sentiment – we appreciate this might surprise you! But when people are being negative about research, they tend not to share the research paper or article itself. That said, there are some very polarised conversations linked to research. For example, research into the carbon footprint of cryptocurrency attracts extremes from both sides.
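As a rough illustration of segmenting sentiment by stakeholder group, the sketch below averages tweet-level scores for each demographic label. The record layout (a `demographics` list and a `sentiment` score per tweet), the function name `mean_sentiment_by_group` and the sample records are invented for this example and are not real Altmetric data.

```python
from collections import defaultdict
from statistics import mean


def mean_sentiment_by_group(tweets):
    """Average the -3..3 sentiment score for each demographic label."""
    scores = defaultdict(list)
    for tweet in tweets:
        # An account can carry several labels, so a single tweet
        # contributes to every group its author belongs to.
        for group in tweet["demographics"]:
            scores[group].append(tweet["sentiment"])
    return {group: mean(values) for group, values in scores.items()}


# Invented records, purely to show the expected input shape.
example_tweets = [
    {"demographics": ["oncologist", "health care provider"], "sentiment": 3},
    {"demographics": ["health care provider"], "sentiment": 1},
    {"demographics": ["parents of (human) children"], "sentiment": -2},
]

for group, score in mean_sentiment_by_group(example_tweets).items():
    print(f"{group}: {score}")
```

A mean can hide polarised conversations, where strong positives and strong negatives cancel out, so it is worth looking at the full distribution of scores for each group as well.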

We’ve talked about both the demographics and the sentiment analysis at some length in a webinar that’s available on-demand, and we’ll be talking about the profile mapping later in the year.

While we’re not yet planning on incorporating our work into Altmetric Explorer, we are using this data for research (yet to be published) and for our custom dashboard clients. We also plan to extend this approach to other social media channels, and our extensive news and blog collection.

As always, if you’d like to know more (or just fancy a chat), please contact us, using the link below.

Mike Taylor, Head of Data Insights at Digital Science + Carlos Areia, Data Scientist at Altmetric.

Talk to our friendly team to find out about how we can support you.