A large number of stakeholders have accepted the need for greater transparency in clinical research and, in the context of various initiatives and systems, have developed a diverse and expanding number of repositories for storing the data and documents created by clinical studies (collectively known as data objects). To make the best use of such resources, we assert that it is also necessary for stakeholders to agree and deploy a simple, consistent metadata scheme.
The relevant data objects and their likely storage are described, and the requirements for metadata to support data sharing in clinical research are identified. Issues concerning persistent identifiers, for both studies and data objects, are explored.
A scheme is proposed that is based on the DataCite standard, with extensions to cover the needs of clinical researchers, specifically to provide (a) study identification data, including links to clinical trial registries; (b) data object characteristics and identifiers; and (c) data covering location, ownership and access to the data object. The components of the metadata scheme are described.
The metadata schema is proposed as a natural extension of a widely agreed standard to fill a gap not tackled by other standards related to clinical research (e.g., Clinical Data Interchange Standards Consortium, Biomedical Research Integrated Domain Group). The proposal could be integrated with, but is not dependent on, other moves to better structure data in clinical research.