
Scientific integrity forms the basis of trustworthy science. Researchers bear a great responsibility to meet the requirements of good scientific practice [1] and the FAIR principles [2], which are also anchored in the Higher Education Act of NRW [3]. The FAIR principles impose requirements at a high level of abstraction, both on the actual research data and on the descriptive metadata, so that researchers can find the data, interpret it meaningfully, and reuse it.
The following text outlines Coscine’s approach to applying the FAIR data principles for research data managed in Coscine.
Findability
F1. (Meta)data are assigned a globally unique and persistent identifier.
This principle is applied through the use of Coscine. Coscine implements the FAIR Digital Object concept to represent stored data as concise digital entities. Coscine assigns a handle-based ePIC PID to each project and resource (including data and metadata). Individual files are described with metadata profiles; each RDF triple of a file's metadata includes a PID that leads to the described data by extending the handle URL.
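As a rough illustration of how a file-level identifier can be derived by extending a resource's handle URL, the following sketch builds such URLs. The prefix `21.T00000`, the suffix `example-resource`, and the `@path=` extension syntax are placeholders assumed for this example and are not taken from the Coscine documentation; only `https://hdl.handle.net` is the public Handle proxy.

```python
from urllib.parse import quote

HANDLE_PROXY = "https://hdl.handle.net"  # public Handle System proxy


def resource_pid_url(prefix: str, suffix: str) -> str:
    """Return the resolvable URL for a project or resource PID."""
    return f"{HANDLE_PROXY}/{prefix}/{suffix}"


def file_pid_url(prefix: str, suffix: str, file_path: str) -> str:
    """Extend a resource PID URL so that it points at a single file.

    The '@path=' separator is an assumed convention for this sketch.
    """
    encoded = quote(file_path, safe="")  # percent-encode '/', spaces, etc.
    return f"{resource_pid_url(prefix, suffix)}@path={encoded}"


if __name__ == "__main__":
    # Hypothetical prefix, suffix, and file path.
    print(file_pid_url("21.T00000", "example-resource", "raw/run 01.csv"))
```

Because the file identifier is derived from the resource PID, every file remains addressable without a separate PID registration per file.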
F2. Data are described with rich metadata (see also R1 below).
This principle must be applied primarily by the users. Coscine supports researchers by providing a rich metadata management environment that gives access to discipline-specific metadata profiles. The metadata is automatically checked for completeness against the requirements of each metadata profile. The richness of research metadata (e.g. the choice of metadata profile) lies in the hands of the researchers: they have to select suitable domain-specific metadata profiles and fill the fields with rich metadata.
However, when storage space is requested, reviewers check whether the selected metadata profile is sufficient, in terms of quantity, for the amount of data to be added. If researchers use S3 access to their storage resource and do not fill out metadata profiles in Coscine, all data must be accompanied by at least a README file so that potential re-users can establish whether the data is useful.
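The two checks described above can be sketched as follows. This is a deliberately simplified stand-in, not Coscine's validation: the `Profile` class, its field names, and the README naming rule are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Hypothetical, simplified metadata profile: a name plus required fields."""
    name: str
    required: set[str] = field(default_factory=set)


def missing_fields(profile: Profile, metadata: dict[str, str]) -> list[str]:
    """Return the required fields that are absent or empty, sorted by name."""
    return sorted(
        name for name in profile.required
        if not metadata.get(name, "").strip()
    )


def has_readme(file_names: list[str]) -> bool:
    """Check whether a set of uploaded files contains at least one README."""
    return any(
        path.rsplit("/", 1)[-1].lower().startswith("readme")
        for path in file_names
    )


if __name__ == "__main__":
    base = Profile("base", {"title", "creator"})
    print(missing_fields(base, {"title": "Tensile test", "creator": " "}))
    print(has_readme(["data/README.md", "data/run01.csv"]))
```

A completeness check of this kind only confirms that required fields are filled; whether the values are meaningful remains the researcher's responsibility.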
F3. Metadata clearly and explicitly include the identifier of the data it describes.
This principle is applied by the use of Coscine. In Coscine, individual files are described with metadata profiles; each RDF triple of a file's metadata includes a PID that leads to the described data.
F4. (Meta)data are registered or indexed in a searchable resource.
This principle is partially applied by the use of Coscine. The (meta)data are indexed in an internal search index. Through the FAIR Digital Object realization in Coscine, metadata are linked in the PID record and can be resolved externally. However, fine-grained indexing and findability, e.g. via Google Dataset Search, are not yet supported.
Accessibility
All accessibility principles are implemented through our standard web interface and the Coscine API.
A1. (Meta)data are retrievable by their identifier using a standardized communications protocol.
By resolving the PID of a project or resource, users land on a PID contact page and can request an invitation to the associated project. After a project owner has granted access to the project, users can retrieve the (meta)data via different communication protocols: either via the browser or via our REST API, which is described with the OpenAPI Specification. Depending on the resource type, the data can also be accessed directly via an S3 interface.
A1.1. The protocol is open, free, and universally implementable.
This principle is applied by the use of Coscine. Coscine provides open, free, and universally implementable protocols to access projects, resources, and (meta)data, either via the browser or via our REST API, which is described with the OpenAPI Specification. Depending on the resource type, the data can also be accessed directly via an S3 interface. Our development roadmap also includes a Linked Data Platform compliant API and support for SPARQL queries by Coscine users on their project-related resources.
A1.2. The protocol allows for an authentication and authorization procedure, where necessary.
This principle is applied by the use of Coscine. All Coscine REST endpoints support authentication and authorization. Coscine allows project-specific access rights (read, write). Applications can operate on behalf of users via a bearer token. The mandatory registration of project participants ensures the authentication of all data owners and contributors for each dataset, while the role management enables the definition of user-specific rights.
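A minimal sketch of how an application might attach a bearer token to a REST request is shown below. The base URL `https://coscine.example.org/api` and the path `/projects` are placeholders assumed for illustration; the actual endpoints are defined in the published OpenAPI Specification.

```python
import urllib.request

# Placeholder base URL (assumption); consult the OpenAPI Specification
# for the real endpoint addresses.
API_BASE = "https://coscine.example.org/api"


def build_request(path: str, token: str) -> urllib.request.Request:
    """Create a request that carries the user's bearer token."""
    request = urllib.request.Request(f"{API_BASE}/{path.lstrip('/')}")
    request.add_header("Authorization", f"Bearer {token}")
    request.add_header("Accept", "application/json")
    return request


if __name__ == "__main__":
    req = build_request("/projects", "my-secret-token")
    # Sending the request would be done with urllib.request.urlopen(req);
    # it is omitted here because the URL is only a placeholder.
    print(req.full_url, req.get_header("Authorization"))
```

The token identifies the user, so the server can apply that user's project-specific rights to every call made by the application.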
A2. Metadata are accessible, even when the data are no longer available.
This principle is currently only partially applied by the use of Coscine. Metadata for individual files is versioned and accessible independently of the file. The PID Kernel Record Information for each project and resource remains accessible even after the files have been deleted. This record information contains the following metadata. For private projects: digital object location (Coscine API address) and digital object type (resource or project). For public projects: date created, digital object location (Coscine API address), digital object type (resource or project), license, contact organization (ROR ID), and topic (DFG review board, "Fachkollegium"). The record information can be resolved via handle.net or the FAIR-DOScope.
Individual metadata is not accessible anymore when the associated resource has been deleted. The development of a tombstone for the Coscine PID landing page, including more metadata of deleted public projects and resources, is on our development roadmap.
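The difference between the record information of private and public projects can be sketched as follows. The key names (`digitalObjectLocation` and so on) and the example values are assumptions chosen for readability, not the attribute names used in the actual PID records. The helper `handle_api_url` relies on the public REST interface of the Handle proxy.

```python
from typing import Optional


def kernel_record(
    location: str,
    object_type: str,
    public: bool,
    date_created: Optional[str] = None,
    license_id: Optional[str] = None,
    ror_id: Optional[str] = None,
    topic: Optional[str] = None,
) -> dict[str, str]:
    """Assemble the record information exposed for a project or resource."""
    # Always present, for private and public projects alike.
    record = {
        "digitalObjectLocation": location,
        "digitalObjectType": object_type,
    }
    if public:
        # Additional fields are only exposed for public projects.
        extra = {
            "dateCreated": date_created,
            "license": license_id,
            "contactOrganization": ror_id,
            "topic": topic,
        }
        record.update({k: v for k, v in extra.items() if v is not None})
    return record


def handle_api_url(handle: str) -> str:
    """URL at which the Handle proxy returns a handle record as JSON."""
    return f"https://hdl.handle.net/api/handles/{handle}"


if __name__ == "__main__":
    # All values below are hypothetical.
    print(kernel_record("https://coscine.example.org/api/r/1", "resource",
                        public=False, license_id="CC-BY-4.0"))
    print(handle_api_url("21.T00000/example-resource"))
```

In this sketch a license passed for a private project is simply not exposed, which mirrors the rule that only public projects carry the extended record.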
Interoperability
I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
This principle must be applied primarily by the users. Coscine supports researchers by providing metadata profiles that are machine-accessible thanks to their technical representation and validation via the W3C standards RDF and SHACL. However, if researchers decide to use their own metadata submission formats, they are in charge of choosing formal, accessible, shared, and broadly applicable languages. When storage space is requested, technical reviewers check the chosen metadata languages and may ask for more suitable ones. No checks are implemented for the uploaded data.
I2. (Meta)data use vocabularies that follow FAIR principles.
This principle is partially applied by Coscine but must also be applied by the users. On the project and resource level, Coscine uses ROR IDs to identify participating organizations. Coscine integrates the AIMS platform for the creation of metadata profiles that are linked to terminology services, which promote the use of subject-specific terminology. AIMS is funded by the DFG with the aim of harmonizing metadata profiles and making them interchangeable.
Coscine offers metadata profiles with vocabularies that follow FAIR principles, e.g. the EngMeta profile. However, if users create their own metadata profiles with individual fields, the use of FAIR vocabularies is in their hands. When storage space is requested, technical reviewers check that the chosen metadata profile is appropriately designed with regard to vocabularies. No checks are implemented for the uploaded data.
I3. (Meta)data include qualified references to other (meta)data.
This principle must be applied primarily by the users. Users can add corresponding fields to the metadata profiles to refer to other (meta)data. The RDF metadata model used in Coscine directly links the terminologies used and allows traversal according to W3C linked data standards. The expansion of metadata profiles to include fixed references to other (meta)data is on our Coscine development roadmap.
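To illustrate what traversing qualified references means in an RDF-style model, the sketch below stores metadata as plain (subject, predicate, object) tuples and follows one predicate transitively. It is a pure-Python simplification rather than an RDF library, and the `ex:` identifiers are hypothetical; `dcterms:references` stands for the Dublin Core term of that name.

```python
Triple = tuple[str, str, str]

REFERENCES = "dcterms:references"  # Dublin Core term, written as a prefixed name


def objects(triples: list[Triple], subject: str, predicate: str) -> list[str]:
    """Return all objects linked from `subject` via `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def reachable(triples: list[Triple], start: str, predicate: str) -> list[str]:
    """Follow `predicate` links breadth-first and list every node reached."""
    seen: list[str] = []
    queue = [start]
    while queue:
        node = queue.pop(0)
        for target in objects(triples, node, predicate):
            # Skip already visited nodes so that cycles terminate.
            if target != start and target not in seen:
                seen.append(target)
                queue.append(target)
    return seen


if __name__ == "__main__":
    graph = [
        ("ex:analysis", REFERENCES, "ex:rawData"),
        ("ex:rawData", REFERENCES, "ex:calibration"),
        ("ex:calibration", REFERENCES, "ex:analysis"),  # cycle
    ]
    print(reachable(graph, "ex:analysis", REFERENCES))
```

A qualified reference differs from a bare link in that the predicate states how the two items relate, which is what makes such a traversal meaningful.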
Reuse
R1. (Meta)data are richly described with a plurality of accurate and relevant attributes.
This principle must be applied primarily by the users. The reusability of research metadata (e.g. the choice of metadata profile) lies in the hands of the researchers. By using all elements, types, and attributes of the selected metadata profile, researchers can achieve a comprehensive description of the properties and origin of the research data. Coscine provides such profiles, but the choice of metadata profile is up to the researcher.
However, when storage space is requested, reviewers check whether the selected metadata profile is sufficient, in terms of quantity, for the data to be added. If researchers use S3 access to their storage resource and do not fill out metadata profiles in Coscine, all data must be accompanied by at least a README file so that potential re-users can establish whether the data is useful.
R1.1. (Meta)data are released with a clear and accessible data usage license.
This principle must be applied primarily by the users. Coscine offers the option to add a license to each resource. However, licenses must be chosen by the users and are not mandatory. If a license is chosen and both the project and the resource are public, the license is also added to the PID record information.
R1.2. (Meta)data are associated with detailed provenance.
This principle is partially met by Coscine; the remaining parts must be applied by the users. Technical versioning of metadata is implemented via the API. Technical versioning of data will be implemented for the resource type DataStorage.nrw. The description of the workflow that led to the collection and processing of the data is not enforced by Coscine and is in the hands of the researchers when describing the project, the resource, and the metadata.
R1.3. (Meta)data meet domain-relevant community standards.
This principle must be applied primarily by the users. Coscine offers a broad range of metadata profiles from different domains, with a technical representation and validation via the W3C standards RDF and SHACL. Coscine integrates the AIMS platform for the creation of such metadata profiles, which are linked to terminology services that promote the use of subject-specific terminology. The created metadata profiles can be reused by other researchers and are publicly available under an open license. The project and resource-based metadata as well as the most used metadata profile, “Base”, are based on the discipline-agnostic standard Dublin Core. For the PID record information, a discipline-agnostic metadata set has been agreed on to improve reusability for every community (see A2).
However, as Coscine is a generic platform, researchers must ensure that they select appropriate metadata profiles for their domain or refer to appropriate controlled vocabularies when creating individual metadata profiles.
[1] Leitlinien zur Sicherung guter wissenschaftlicher Praxis. Kodex [Guidelines for Safeguarding Good Research Practice. Code of Conduct]. Deutsche Forschungsgemeinschaft. (2019)
[2] Wilkinson, M., Dumontier, M., Aalbersberg, I. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018 (2016).
[3] NRW HG § 3 Abs. 1 (Higher Education Act of North Rhine-Westphalia, Section 3(1))