Here in the UK, HEFCE (the Higher Education Funding Council for England, which distributes central funding to English universities) is currently running an independent review on the role of metrics in research assessment.
As part of that, a couple of weeks ago the review panel convened a workshop at the University of Sussex: In Metrics We Trust? Prospects & pitfalls of new research metrics. I was lucky enough to attend and thought it was a really useful day, not least because it was a chance to hear some pretty compelling points of view.
I’m excited that altmetrics are in the mix of things being considered, and that time is being taken to carefully assess where metrics in general may be able to help with assessment as well as, probably more importantly, where they can’t.
How can altmetrics be used in a REF-like exercise?
Before anything else, here’s my perspective on the use of altmetrics data in the context of a REF-style formal assessment exercise. (There are lots of other uses within an institution, which we shouldn’t forget: research isn’t all about post-publication evaluation, even if it sometimes feels that way.)
When I say “altmetrics data” I mean the individual blog posts, newspaper stories, policy documents etc. as well as their counts, the number of readers on Mendeley etc. Not just numbers.
- If we’re going to look at impact as well as quality, we must give people the right tools for the job
- Numbers don’t need to be the end goal. They can be a way of highlighting interesting data about an output that is useful for review, with the end result being a qualitative assessment. Don’t think ‘metrics’; think ‘indicators’ that a human can use to do their job better & faster.
- On that note, narratives / stories seem like a good way of addressing a broad concept of impact.
- Altmetrics data can help inform and support these stories in two main ways.
- Figuring out which articles have had impact and in what way, then finding supporting evidence for it manually, takes a lot of effort. How do you know what direction to take the story in? Automatically collected altmetrics indicators could save time and effort, showing areas that are worth investigating further. Once you have discovered something interesting, altmetrics can help you back up a story with the quantitative data.
- They may also highlight areas you wouldn’t otherwise have discovered without access to the data. For example, altmetrics data may surface attention from other countries, sources or subject areas that you wouldn’t have thought to search for.
Using altmetrics data to inform & support: an example
Alice is an impact officer at a UK university. She identifies a research project on, say, the contribution of climate change to flood risk in the UK that is a good candidate for an impact case study.
She enters any outputs – datasets, articles, software, posters – into an altmetrics tool, and gets back a report on the activity around them.
On a primary research paper:
… she can quickly see some uptake in the mainstream media (the Guardian, the New York Times) and magazines (Time, New Scientist). She can see some social media activity from academics involved in the HELIX climate impacts project at Exeter, a Nature News correspondent, the science correspondent for Le Monde and the editor for CarbonBrief.org.
Switching to the policy side she can see that there are two citations tracked from government / NGO sources: a report from the Environment Agency and one from Oxfam.
These are documents from UK organizations that Alice’s institution may have already been tracking manually. But research, even research specifically about the UK, can be picked up worldwide:
In the example above, it was picked up by the AWMF, which is similar to NICE in the UK.
Alice can support her assessment of what it all means with other indicators: by checking to see if it’s normal for papers on anthropogenic climate change and flood risks to get picked up by the international press. She can see how the levels of attention compare to other articles in the same journal.
She can do all this in five minutes. What it doesn’t do is help with the next, more important part. Alice now needs to go and investigate whether anything came of that attention: how the report from the Environment Agency used the article (in this case, only to show that research is still in the early stages), whether the report itself was used at all, and whether anything came out of the interest from journalists. She still needs to speak to the researcher and do the follow-up. The altmetrics data, though, gave her some leads and a running start.
Because she’s supported by the right tools and data she can get relevant data in five minutes.
As time goes on, the relevant tools and data sources will improve, as will our understanding of what kinds of impact signals can be picked up and how. The usefulness of altmetrics will improve with them.
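The comparison step in Alice’s example, checking how a paper’s attention compares to other articles in the same journal, could be sketched roughly as follows. This is only an illustration: the function name and the mention counts are made up, and it is not how any particular altmetrics tool calculates its context figures.

```python
def share_with_less_attention(paper_mentions, journal_mentions):
    """Fraction of comparison articles that received fewer mentions than this paper.

    paper_mentions: mention count for the paper being assessed.
    journal_mentions: mention counts for other articles in the same journal.
    """
    if not journal_mentions:
        raise ValueError("need at least one comparison article")
    lower = sum(1 for m in journal_mentions if m < paper_mentions)
    return lower / len(journal_mentions)


# Hypothetical mention counts for ten other articles in the same journal.
journal = [0, 2, 3, 5, 8, 12, 1, 4, 40, 7]
fraction = share_with_less_attention(25, journal)
print(f"More attention than {fraction:.0%} of the comparison articles")
```

A figure like this only tells Alice whether the attention is unusual for the journal. It says nothing about what the attention led to, which is still the part she has to investigate herself.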
Why would it ever be useful to know how many Facebook shares an article got?
In the example above we talk about news mentions and policy documents. Facebook came up in the panel discussion.
If you have ten papers and the associated Facebook data it would be a terrible, terrible idea for almost any impact evaluation exercise to use metrics as an end point and, say, rank them by the number of times each one was shared, or their total Altmetric score or something. On this we should all be agreed.
However, if nine papers have hardly any Facebook data associated with them, and one has lots, you should check that out and see what the story is by looking at who is sharing it and why, not ignore the indicator on the principle that you can’t tell the impact of a work from a number. The promise of altmetrics here is that they may help you discover something about broader impact that you wouldn’t otherwise have picked up on, or provide some ‘hard’ evidence to back up something you did pick up on some other way.
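To make the distinction concrete, here is a minimal sketch of treating share counts as an indicator rather than a ranking: it flags any paper whose count stands far apart from the rest of the set, so that a person can go and look at who shared it and why. The paper labels, the counts and both thresholds are invented for the example and would need tuning against real data.

```python
from statistics import median


def flag_for_review(share_counts, factor=10, min_shares=20):
    """Return papers whose share count stands out from the rest of the set.

    share_counts: dict mapping a paper label to its number of shares.
    factor, min_shares: arbitrary, illustrative thresholds (assumptions).
    The result is a list of leads for a human to follow up, not a ranking.
    """
    typical = median(share_counts.values())
    cutoff = max(factor * typical, min_shares)
    return [paper for paper, shares in share_counts.items() if shares >= cutoff]


# Hypothetical data: nine papers with hardly any shares, one with lots.
counts = [0, 1, 0, 2, 1, 0, 3, 1, 2, 450]
shares = {f"paper-{i}": n for i, n in enumerate(counts, start=1)}
leads = flag_for_review(shares)
print(leads)
```

Nothing here orders the other nine papers against each other, which is the point: the number is only used to decide where to look next.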
There are lots of ways in which indicators and the underlying data they point to can be used to support and inform assessment. Equally there are many ways you can use metrics inappropriately. In my opinion it would be a terrible waste – of potential, but also time and money – to lump these together with the valid uses and suggest that there is no room in assessment for anything except unsupported (by tools and supplementary data) peer review.
What’s in a name? That which we call a metric…
One opening statement at the workshop that particularly struck a chord with me was from Stephen Curry – you can find a written version on his blog. Stephen pointed out that ‘indicators’ would be a more honest word than ‘metrics’, considering the semantic baggage the latter carries:
I think it would be more honest if we were to abandon the word ‘metric’ and confine ourselves to the term ‘indicator’. To my mind it captures the nature of ‘metrics’ more accurately and limits the value that we tend to attribute to them (with apologies to all the bibliometricians and scientometricians in the room).
I’ve changed my mind about this. Before, I would have suggested that it didn’t really matter, but I now agree absolutely. I still think that debating labels can quickly become the worst kind of navel gazing… but there is no question that they shape people’s perceptions and eventual use (believe me, since starting a company called “Altmetric” I have become acutely aware of naming problems).
Another example of names shaping perception came up at the 1:AM conference: different audiences use the word “impact” in different ways, as shorthand for a particular kind of influence, or as the actual, final impact that work has in real life, or for citations, or for usage.
During the workshop Cameron Neylon suggested that, rather than separate out “quality” and “impact” in the context of REF-style assessment, we should consider just the “qualities” of the work, something he had previously expanded on in the PLoS Opens blog:
Fundamentally there is a gulf between the idea of some sort of linear ranking of “quality” – whatever that might mean – and the qualities of a piece of work. “Better” makes no sense at all in isolation. Its only useful if we say “better at…” or “better for…”. Counting anything in isolation makes no sense, whether it’s citations, tweets or distance from Harvard Yard. Using data to help us understand how work is being, and could be, used does make sense.
I really like this idea but am not completely sold – I quite like separating out “quality” as distinct from other things because frankly some qualities are more equal than others. If you can’t reproduce or trust the underlying research then it doesn’t matter what audience it reached or how it is being put into practice (or rather it matters in a different way: it’s impact you don’t want the paper to have).
Finally, I belatedly realized recently that when most people involved with altmetrics talk about “altmetrics” they mean “the qualitative AND quantitative data about outputs” not “the numbers and metrics about outputs”, but that this isn’t true outside of the field and isn’t particularly intuitive.
We’ve already started talking internally about how to best tackle the issue. Any suggestions are gratefully received!