Developing a Workflow to Identify Inconsistencies in Volunteered Geographic Information: A Phenological Case Study

Overview of attention for article published in PLOS ONE, October 2015

About this Attention Score

  • In the top 5% of all research outputs scored by Altmetric
  • High Attention Score compared to outputs of the same age (93rd percentile)
  • High Attention Score compared to outputs of the same age and source (93rd percentile)

Mentioned by

  • 4 news outlets
  • 1 X user

Citations

  • 18 Dimensions

Readers on

  • 58 Mendeley
  • 1 CiteULike
Title: Developing a Workflow to Identify Inconsistencies in Volunteered Geographic Information: A Phenological Case Study
Published in: PLOS ONE, October 2015
DOI: 10.1371/journal.pone.0140811
Pubmed ID:
Authors: Hamed Mehdipoor, Raul Zurita-Milla, Alyssa Rosemartin, Katharine L. Gerst, Jake F. Weltzin

Abstract

Recent improvements in online information communication and mobile location-aware technologies have led to the production of large volumes of volunteered geographic information. Widespread, large-scale efforts by volunteers to collect data can inform and drive scientific advances in diverse fields, including ecology and climatology. Traditional workflows to check the quality of such volunteered information can be costly and time consuming as they heavily rely on human interventions. However, identifying factors that can influence data quality, such as inconsistency, is crucial when these data are used in modeling and decision-making frameworks. Recently developed workflows use simple statistical approaches that assume that the majority of the information is consistent. However, this assumption is not generalizable, and ignores underlying geographic and environmental contextual variability that may explain apparent inconsistencies. Here we describe an automated workflow to check inconsistency based on the availability of contextual environmental information for sampling locations. The workflow consists of three steps: (1) dimensionality reduction to facilitate further analysis and interpretation of results, (2) model-based clustering to group observations according to their contextual conditions, and (3) identification of inconsistent observations within each cluster. The workflow was applied to volunteered observations of flowering in common and cloned lilac plants (Syringa vulgaris and Syringa x chinensis) in the United States for the period 1980 to 2013. About 97% of the observations for both common and cloned lilacs were flagged as consistent, indicating that volunteers provided reliable information for this case study. 
Relative to the original dataset, the exclusion of inconsistent observations changed the apparent rate of change in lilac bloom dates by two days per decade, indicating the importance of inconsistency checking as a key step in data quality assessment for volunteered geographic information. Initiatives that leverage volunteered geographic information can adapt this workflow to improve the quality of their datasets and the robustness of their scientific analyses.
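
The three steps named in the abstract can be illustrated with a short Python sketch. The abstract does not say which algorithms the authors used, so everything specific here is an assumption made for illustration: PCA for the dimensionality reduction, a Gaussian mixture chosen by BIC for the model-based clustering, and a 1.5 × IQR rule on bloom day of year for flagging. The function name `flag_inconsistent` and its parameters are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler


def flag_inconsistent(context, bloom_doy, n_dims=2, max_clusters=5, seed=0):
    """Return (labels, flags); flags[i] is True if observation i looks inconsistent.

    context   : (n_obs, n_vars) array of environmental variables per location
    bloom_doy : (n_obs,) array of reported bloom dates as day of year
    """
    context = np.asarray(context, dtype=float)
    bloom_doy = np.asarray(bloom_doy, dtype=float)

    # Step 1: dimensionality reduction (PCA on standardized variables; assumed).
    reduced = PCA(n_components=n_dims).fit_transform(
        StandardScaler().fit_transform(context)
    )

    # Step 2: model-based clustering. Fit mixtures with 1..max_clusters
    # components and keep the one with the lowest BIC (assumed criterion).
    models = [
        GaussianMixture(n_components=k, random_state=seed).fit(reduced)
        for k in range(1, max_clusters + 1)
    ]
    best = min(models, key=lambda m: m.bic(reduced))
    labels = best.predict(reduced)

    # Step 3: within each cluster, flag bloom dates outside the
    # 1.5 * IQR fences (an illustrative rule, not the paper's).
    flags = np.zeros(len(bloom_doy), dtype=bool)
    for k in np.unique(labels):
        members = labels == k
        q1, q3 = np.percentile(bloom_doy[members], [25, 75])
        spread = 1.5 * (q3 - q1)
        flags[members] = (bloom_doy[members] < q1 - spread) | (
            bloom_doy[members] > q3 + spread
        )
    return labels, flags
```

The point of clustering first is that a bloom date is only compared with dates reported under similar environmental conditions, so an early bloom at a warm site is not flagged merely because it is early relative to the whole dataset.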

X Demographics

The data shown below were collected from the profile of 1 X user who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 58 Mendeley readers of this research output.

Geographical breakdown

Country          Count   As %
Spain                1     2%
United States        1     2%
Ireland              1     2%
Unknown             55    95%

Demographic breakdown

Readers by professional status          Count   As %
Student > Ph. D. Student                   15    26%
Student > Bachelor                          9    16%
Student > Master                            9    16%
Researcher                                  8    14%
Other                                       3     5%
Other                                       7    12%
Unknown                                     7    12%

Readers by discipline                   Count   As %
Earth and Planetary Sciences               10    17%
Computer Science                            8    14%
Agricultural and Biological Sciences        8    14%
Engineering                                 7    12%
Environmental Science                       5     9%
Other                                      10    17%
Unknown                                    10    17%

Attention Score in Context

This research output has an Altmetric Attention Score of 32. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 04 December 2015.

  • All research outputs: #1,041,534 of 22,830,751 outputs
  • Outputs from PLOS ONE: #14,021 of 194,862 outputs
  • Outputs of similar age: #17,075 of 283,131 outputs
  • Outputs of similar age from PLOS ONE: #352 of 5,543 outputs

Altmetric has tracked 22,830,751 research outputs across all sources so far. Compared to these, this one has done particularly well and is in the 95th percentile: it's in the top 5% of all research outputs ever tracked by Altmetric.
So far Altmetric has tracked 194,862 research outputs from this source. They typically receive a lot more attention than average, with a mean Attention Score of 15.1. This one has done particularly well, scoring higher than 92% of its peers.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 283,131 tracked outputs that were published within six weeks on either side of this one in any source. This one has done particularly well, scoring higher than 93% of its contemporaries.
We're also able to compare this research output to 5,543 others from the same source and published within six weeks on either side of this one. This one has done particularly well, scoring higher than 93% of its contemporaries.
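
The percentile figures quoted above can be reproduced from the rank and total in each comparison. The sketch below assumes the percentile is the share of outputs ranked below this one, rounded down to a whole number. That rounding rule is an assumption that happens to match all four figures on this page, not a documented Altmetric formula, and the helper name `percentile_from_rank` is hypothetical.

```python
def percentile_from_rank(rank: int, total: int) -> int:
    """Whole-number percentage of outputs ranked below position `rank` (1 = best).

    Integer arithmetic avoids floating-point rounding at the boundary.
    """
    return (100 * (total - rank)) // total


# Rank and total for each comparison group, as listed on this page.
contexts = {
    "All research outputs": (1_041_534, 22_830_751),
    "Outputs from PLOS ONE": (14_021, 194_862),
    "Outputs of similar age": (17_075, 283_131),
    "Outputs of similar age from PLOS ONE": (352, 5_543),
}

for label, (rank, total) in contexts.items():
    print(f"{label}: higher than {percentile_from_rank(rank, total)}% of outputs")
```

Under that assumption the four groups give 95, 92, 93 and 93, matching the percentages stated in the text.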