Large, integrated datasets can be used to improve the identification and management of health conditions. However, big data initiatives are controversial because of risks to privacy. In 2014, NHS England launched a public awareness campaign about the care.data project, whereby data from patients' medical records would be regularly uploaded to a central database. Details of the project sparked intense debate across a number of platforms, including social media sites such as Twitter. Twitter is increasingly being used to educate and inform patients and care providers, and as a source of data for health services research. The aim of the study was to identify and describe the range of opinions expressed about care.data on Twitter during the period in which a delay to the project was announced, and to provide insight into the strengths and flaws of the project.
Tweets with the hashtag #caredata were collected using the NCapture tool for NVivo. Methods of qualitative data analysis were used to identify emerging themes. Tweets were coded and analysed in depth within and across themes.
The dataset consisted of 9895 tweets, captured over 18 days during February and March 2014. Retweets (6118, 62 %) and spam (240, 2 %) were excluded. The remaining 3537 tweets were posted by 904 contributors, and coded into one or more of 50 sub-themes, which were organised into 9 key themes. These were: informed consent and the default 'opt-in'; trust; privacy and data security; involvement of private companies; legal issues and GPs' concerns; communication failure and confusion about care.data; delayed implementation; patient-centeredness; and potential of care.data and the ideal model of implementation.
Various concerns were raised about care.data that appeared to be shared by those both for and against the project. Qualitatively analysing tweets enabled us to identify a range of concerns about care.data and how these might be overcome, for example, by increasing the involvement of stakeholders and those with expert knowledge. Our findings also highlight the risks of not considering public opinion, such as the potential for patient safety failures resulting from a lack of trust in the healthcare system. However, caution is advised if using Twitter as a stand-alone data source, as contributors may be weighted more heavily towards one side of a debate than the other. A mixed-methods approach would have enabled us to complement these data with a more representative overview.