Validate and Enhance
Introduction
In this step, you check that your converted data meets the ontological and LOD standards needed for its inclusion in the LINCS Knowledge Graph.
Now that you have converted your data into RDF, we can validate and enhance your data in the same way regardless of what conversion workflow you followed.
Resources Needed
This step is a joint effort between LINCS and your research team. Your team should make an initial attempt at validating and enhancing your converted data. When you think it is ready for the LINCS Knowledge Graph, or if you need help before that point, send your converted data to LINCS and we will do an additional review of the data.
Some basic programming experience (e.g., undergraduate level Python) can make this step easier. LINCS has also made some common validation and enhancement steps easier with the tools discussed below.
The time needed for this step depends on how ready your data is when it comes out of the Implement Conceptual Mapping step. Sometimes there are no errors to fix and it is only a matter of a few hours of checking the data and minting entity Uniform Resource Identifiers (URIs). Other times you will find errors that trace back to your original data or to a certain conversion step, and you will need to spend a few weeks consulting with your team, making edits, and re-checking the data.
| | Research Team | Ontology Team | Conversion Team | Storage Team |
|---|---|---|---|---|
| Handle Data Changes | ✓ | ✓ | | |
| Validate and Enhance Converted Data | ✓ | ✓ | ✓ | ✓ |
| Enhance Converted Data | ✓ | ✓ | ✓ | ✓ |
| Use Tools | ✓ | ✓ | ✓ | ✓ |
Handle Data Changes
If you find errors in this step and want to change your data, you have a few options:
- Change the RDF directly by editing the TTL file or by using the editing software of your choosing.
  - For small changes, this could be done by hand.
  - For bulk changes, we recommend writing a simple script to make the changes or using our Linked Data Enhancement API.
  - Remember that if you make changes to the RDF by hand, you should not re-run the conversion workflow on the same data, or your manual changes may be overwritten.
- Make notes of the changes needed and wait to implement those changes until the data is in ResearchSpace.
- Make the changes to the source data or the conversion step that introduced the error and rerun the conversion workflow until the errors are gone.
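As an illustration of the "simple script" option for bulk changes, here is a minimal sketch that swaps one full URI for another throughout a TTL file. It works on the file text rather than on parsed RDF, so it only catches URIs written in full inside angle brackets (not prefixed names), and it would also rewrite the same string if it happened to appear inside a literal. The function names and file paths are hypothetical; for anything more involved, an RDF library or the Linked Data Enhancement API is likely the safer route.

```python
from pathlib import Path


def replace_iri(ttl_text, old_iri, new_iri):
    """Return (updated_text, number_of_replacements).

    Only exact <old_iri> references are replaced, so longer URIs that
    merely start with old_iri are left untouched.
    """
    old_ref = f"<{old_iri}>"
    new_ref = f"<{new_iri}>"
    return ttl_text.replace(old_ref, new_ref), ttl_text.count(old_ref)


def replace_iri_in_file(src_path, dst_path, old_iri, new_iri):
    """Read src_path, replace the URI, and write the result to dst_path.

    Writing to a separate file keeps the original TTL available for comparison.
    """
    text = Path(src_path).read_text(encoding="utf-8")
    updated, count = replace_iri(text, old_iri, new_iri)
    Path(dst_path).write_text(updated, encoding="utf-8")
    return count
```

Returning the replacement count gives a quick sanity check: a count of zero usually means the URI was written differently than expected (for example, as a prefixed name).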
Validate Converted Data
Below are validation steps you should perform on your converted data. It is best to do these checks on a combined version of your data where all of the triples are in a single TTL file so that you know if there is missing information or a logical inconsistency across the entire dataset.
Entity Labels
Requirements:
- Every URI in your data must have at least one `rdfs:label` value.
  - The exception is if a URI is being used only as the object of an `owl:sameAs` relationship or is of `rdf:type crm:E73_Information_Object`. In these cases, the URI is being used as a link to an external source and is not meant to represent a searchable entity in ResearchSpace.
- When using external vocabulary terms in your data, add `rdfs:label` and `rdf:type` values for those terms from their source vocabularies to your data so that you can query your data without needing to pull from additional sources at the time of querying.
Suggestions:
- You can add additional `rdfs:label` values for a single entity.
- You can add additional labels using `skos:altLabel` or `skos:prefLabel` to specify that there is a label or preference that is specific to your project.
- Whenever possible, include at least one English `rdfs:label` and one French `rdfs:label`.
- Whenever possible, add a language tag for each label and literal value (e.g., `"label"@en` and `"étiquette"@fr`).
- Try to use the same label formats as are used in existing LINCS data. See the LINCS LOD Style Guide (Coming Soon), which will provide formats for such labels.
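The label requirement and the language tag suggestion can be checked with a short script. The sketch below assumes the triples have already been loaded (for example, with an RDF library) into `(subject, predicate, object)` tuples in which URIs are plain strings and literals are instances of a hypothetical `Literal` helper. The function names are illustrative. As a simplification that is an assumption of this example rather than a LINCS rule, predicates and class URIs (objects of `rdf:type`) are not checked.

```python
from typing import NamedTuple, Optional

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
E73 = "http://www.cidoc-crm.org/cidoc-crm/E73_Information_Object"


class Literal(NamedTuple):
    """Hypothetical stand-in for an RDF literal with an optional language tag."""
    value: str
    lang: Optional[str] = None


def uris_missing_labels(triples):
    """Return the sorted URIs that have no rdfs:label and are not exempt."""
    triples = list(triples)
    labelled = {s for s, p, _ in triples if p == RDFS_LABEL}
    info_objects = {s for s, p, o in triples if p == RDF_TYPE and o == E73}
    candidates = set()
    for s, p, o in triples:
        candidates.add(s)
        # URIs that only ever appear as owl:sameAs objects never become
        # candidates, which implements the first exemption.
        if isinstance(o, str) and p not in (OWL_SAME_AS, RDF_TYPE):
            candidates.add(o)
    return sorted(candidates - labelled - info_objects)


def labels_without_language(triples):
    """Return (subject, label text) pairs for labels lacking a language tag."""
    return [
        (s, o.value)
        for s, p, o in triples
        if p == RDFS_LABEL and isinstance(o, Literal) and not o.lang
    ]
```

Running both functions on the combined dataset, rather than on individual files, avoids false alarms for entities whose labels live in a different file.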
Entity Types
Requirements:
- Every URI in your data must have at least one `rdf:type` value declared.
  - The exception is if a URI is being used only as the object of an `owl:sameAs` relationship.
- The rest of the guidelines here will be specific to the conceptual mapping developed for your data.
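A quick way to find URIs without a declared type is sketched below. It assumes the triples are available as `(subject, predicate, object)` tuples where URIs are plain strings and literals are any non-string value; the function name is hypothetical. Class URIs that appear only as the object of `rdf:type` are skipped, which is a simplification made for this example.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def uris_missing_types(triples):
    """Return the sorted URIs that have no rdf:type declared."""
    triples = list(triples)
    typed = {s for s, p, _ in triples if p == RDF_TYPE}
    candidates = set()
    for s, p, o in triples:
        candidates.add(s)
        # Skip literals (non-str objects), class URIs, and owl:sameAs targets.
        if isinstance(o, str) and p not in (RDF_TYPE, OWL_SAME_AS):
            candidates.add(o)
    return sorted(candidates - typed)
```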
URI Validation
Requirements:
- Verify that the URIs in your data follow the correct format for each URI source. See common mistakes in the URIs and Prefixes section of our Data Cleaning Guide.
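Some URI format problems can be caught automatically before a manual review. The sketch below flags characters that Turtle does not allow inside `<...>`, a missing or unexpected scheme, and a missing host name. The `SOURCE_PATTERNS` table is a hypothetical illustration of a per-source rule (here, Wikidata item URIs); the actual expected formats should come from the Data Cleaning Guide, not from this example.

```python
import re
from urllib.parse import urlsplit

# Characters the Turtle grammar does not allow inside <...> IRI references.
_ILLEGAL = re.compile(r'[\x00-\x20<>"{}|^`\\]')

# Hypothetical per-source patterns, keyed by host name (an assumption for
# illustration; extend or replace with the formats your sources require).
SOURCE_PATTERNS = {
    "www.wikidata.org": re.compile(r"^http://www\.wikidata\.org/entity/Q[1-9]\d*$"),
}


def uri_problems(uri):
    """Return a list of human-readable problems found in a single URI."""
    problems = []
    if _ILLEGAL.search(uri):
        problems.append("contains whitespace or a character not allowed in Turtle")
    try:
        parts = urlsplit(uri)
    except ValueError:
        problems.append("cannot be parsed as a URI")
        return problems
    if parts.scheme not in ("http", "https"):
        problems.append("scheme is not http or https")
    if not parts.netloc:
        problems.append("missing host name")
    expected = SOURCE_PATTERNS.get(parts.netloc)
    if expected and not expected.match(uri):
        problems.append("does not match the expected pattern for its source")
    return problems
```

An empty list means none of these checks failed; it does not guarantee that the URI resolves or points to the intended entity.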
Ontological Validation
- Verify that the relationships present in your converted data match the mappings created in your Develop Conceptual Mapping step.
- Check back soon for details on future LINCS tools to help you validate CIDOC CRM data.