Glossary

Application Profile​

A schema composed of metadata elements drawn from one or more namespaces, as well as policies and guidelines related to their use, prepared for a particular application.

Application Programming Interface (API)​

A set of rules and protocols that enables third-party applications to communicate with a web service platform.

Argument​

A series of reasons, statements, or facts in a metadata schema intended to support or establish a point of view, rather than a neutral statement, that describes a person, event, or object.

Art & Architecture Thesaurus (AAT)​

One of five Getty Vocabularies that contains Uniform Resource Identifiers (URIs) for generic terms related to art, architecture, and visual cultural heritage.

Authority File​

A list that contains the authoritative way to reference people, places, things, or concepts, usually as a heading or numeric identifier.

Authority Record​

A stable, persistent Uniform Resource Identifier (URI) for a concept in the Linked Data (LD) ecosystem.

Blank Node​

A subject or object in a Resource Description Framework (RDF) graph for which a Uniform Resource Identifier (URI) or literal is not given.

Canada Foundation for Innovation (CFI)​

A non-profit corporation that invests in research infrastructure at Canadian universities, colleges, research hospitals, and non-profit research institutions.

CIDOC CRM​

A suite of event-centric ontologies for describing data in the cultural heritage domain, developed to link together heterogeneous sets of data managed by museums, galleries, and other heritage institutions.

Classing​

To declare an entity to be an instance of a class using rdf:type within the chosen ontology for the dataset.

Controlled Vocabulary​

A standardized and organized arrangement of words and phrases, which provides a consistent way to describe data.

Conversion​

The process of changing data from one format to another.

Creative Commons (CC)​

A nonprofit organization that provides free licences so people can grant copyright permission to their work in a standardized way.

Crosswalking​

The conceptual process of associating equivalent metadata values or fields from one schema with another.
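
As a minimal sketch of the idea, a crosswalk can be written as a lookup table from source field names to target field names. The source field names (`title`, `author`, `year`, `shelf`) and the sample record are hypothetical; the target names are Dublin Core terms chosen only for illustration.

```python
# Hypothetical crosswalk: source field name -> target field name.
CROSSWALK = {
    "title": "dcterms:title",
    "author": "dcterms:creator",
    "year": "dcterms:date",
}


def crosswalk_record(record, crosswalk):
    """Rename the fields of a record according to the crosswalk.

    Fields with no equivalent in the target schema are returned separately
    so they can be reviewed rather than silently dropped.
    """
    mapped, unmapped = {}, {}
    for field, value in record.items():
        if field in crosswalk:
            mapped[crosswalk[field]] = value
        else:
            unmapped[field] = value
    return mapped, unmapped


# Hypothetical source record for illustration.
source_record = {"title": "Orlando", "author": "Virginia Woolf", "shelf": "B12"}
mapped, unmapped = crosswalk_record(source_record, CROSSWALK)
print(mapped)
print(unmapped)
```

Returning the unmapped fields makes gaps between the two schemas visible, which is usually where the conceptual work of a crosswalk lies.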

Cultural Objects Name Authority (CONA)​

One of five Getty Vocabularies that contains Uniform Resource Identifiers (URIs) for titles, creator attributions, physical characteristics, and depicted subjects concerning works of art, architecture, and visual cultural heritage.

Cypher​

A query language for graph databases that reflects the semantic nature of triples but does so with its own unique syntax and formatting.

DBPedia​

A project that creates publicly available structured data for the Linked Open Data (LOD) Cloud.

Dereferenceable​

An adjective used in relation to Uniform Resource Identifiers (URIs) that can turn from an abstract reference into something more concrete, namely a web resource.

Dewey Decimal Classification System (DDC)​

A library classification system that is commonly used by public libraries and small academic libraries to organize print collections.

Digital Humanities (DH)​

A scholarly field in which digital tools and technologies are used to explore humanities research questions.

Digital Humanities Summer Institute (DHSI)​

An annual digital scholarship training institute held at the University of Victoria.

Digital Object Identifier (DOI)​

A type of Uniform Resource Identifier (URI) that is used to uniquely identify various academic, professional, and government information objects, such as journal articles, research reports, data sets, official publications, and videos.

Domain​

One of two entities in a triple, representing the subject in a subject-predicate-object relationship.

Edge​

A line that connects one node to another in a graph database, representing a relationship between the nodes.

Entity​

A discrete thing, often described as the subject and object (or the domain and range) of a triple (subject-predicate-object).

Event-Oriented Ontology​

An ontology that uses events to connect things, concepts, people, time, and place.

Federated SPARQL Search

A single entry point to access remote SPARQL Endpoints so that a query service can retrieve information from several data sources.

Functional Requirements for Bibliographic Records (FRBR)​

A model developed by the International Federation of Library Associations and Institutions (IFLA) to restructure catalogues around the conceptual structure of WEMI.

Graph Database​

A database that structures information as a graph or network, where a set of resources, or nodes, are connected together by edges that describe the relationships between each resource.

Iconography Authority (IA)​

One of five Getty Vocabularies that contains Uniform Resource Identifiers (URIs) for proper names, relationships, themes, and dates related to iconographical narratives, legendary or fictional characters, historical events, literary works, and performing art.

Inferencing​

The automated discovery of new facts generated from existing triples.
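
One common inference rule can be sketched in a few lines: if an entity is an instance of a class, and that class is a subclass of another, the entity is also an instance of the broader class. The `ex:` identifiers below are hypothetical, and a real reasoner applies many more rules than this one.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def infer_types(triples):
    """Return the new rdf:type triples implied by rdfs:subClassOf.

    Rule: (x rdf:type C) and (C rdfs:subClassOf D) => (x rdf:type D).
    The rule is re-applied until it produces nothing new.
    """
    known = set(triples)
    while True:
        new = {
            (s, RDF_TYPE, d)
            for (s, p, c) in known if p == RDF_TYPE
            for (c2, p2, d) in known if p2 == SUBCLASS_OF and c2 == c
        }
        if new <= known:
            # Only report facts that were not stated in the input.
            return known - set(triples)
        known |= new


# Hypothetical data for illustration.
data = [
    ("ex:woolf", RDF_TYPE, "ex:Novelist"),
    ("ex:Novelist", SUBCLASS_OF, "ex:Writer"),
    ("ex:Writer", SUBCLASS_OF, "ex:Person"),
]
inferred = infer_types(data)
print(sorted(inferred))
```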

Ingestion​

The process by which data is moved from one or more sources to a new destination where it can be stored and further analyzed.

International Image Interoperability Framework (IIIF)​

A set of tools and standards that make digital images interoperable, providing a standardized method of describing and delivering images online.

Internationalized Resource Identifier (IRI)​

An identifier that builds upon the Uniform Resource Identifier (URI) protocol by expanding the set of permitted characters to include most of the Universal Character Set.

JSON​

A human- and machine-readable data interchange format.
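
A short illustration of both halves of that definition, using Python's standard `json` module: the serialized text is readable by a person, and parsing it gives back the same data structure. The record contents are made up for the example.

```python
import json

# Hypothetical record for illustration.
record = {"name": "Virginia Woolf", "born": 1882, "works": ["Orlando", "The Waves"]}

# Serialize to indented JSON text that a person can read.
text = json.dumps(record, indent=2)
print(text)

# Parse the text back into Python objects.
restored = json.loads(text)
```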

Knowledge Graph​

A representation of a set of linked triples that illustrates the relationships between them.

Knowledge Map (ResearchSpace)​

A visualization tool within the ResearchSpace environment that displays the different data entities in the triplestore and how they are connected to other data entities.

Knowledge Pattern (ResearchSpace)​

Predefined graph paths that abstract real-world activities using classes and properties within the ResearchSpace environment.

Library of Congress Classification System (LCC)​

A library classification system that is commonly used by large research and academic libraries to organize print collections.

Linked Data (LD)​

Structured data that is linked with other data through the web and builds upon standard web technologies to share machine-readable data between computers.

Linked Open Data (LOD)​

Linked Data (LD) that is published under an open licence, so anyone can access, use, and reuse it.

Literal​

An object in a triple that does not refer to a resource with a Uniform Resource Identifier (URI), but instead conveys a value, such as text, a number, or a date.

Mapping​

The conceptual process of associating equivalent metadata values or fields from one schema with another.

Metadata​

Structured information that describes or explains an information object so it can be searched for, retrieved, contextualized, validated, preserved, or managed.

Named Entity Disambiguation (NED)​

To assign a unique identity to an entity in a text to differentiate it from another entity that shares the same name.

Named Entity Recognition (NER)​

The process of identifying and categorizing entities (a word or set of words that refer to the same thing) in text.

Named Graph​

An extension of the Resource Description Framework (RDF) data model in which an RDF graph is identified using a Uniform Resource Identifier (URI), thus allowing for the publishing and presentation of metadata about that graph as a whole.

Namespace​

A directory of concepts that are used to identify and refer to entities within a dataset.

Natural Language Data​

Data that is in a free-text format.

Natural Language Processing (NLP)​

A branch of artificial intelligence that involves automatic processing and/or manipulation of speech, text, and other unstructured data forms that represent how humans communicate with each other.

Node​

A representation of an entity or instance to be tracked in a graph database or triplestore, such as a person, object, organization, or event.

NoSQL Database​

A database that is modeled in a way that is different to the tabular relations used in a relational database, such as a document database, key-value store, wide-column store, or graph database.

Object-Oriented Ontology​

An ontology that uses objects to connect things, concepts, people, time, and place.

Ontology​

An abstract, machine-readable model of a phenomenon that captures and structures knowledge of entities, properties, and relationships in a domain so a conceptualization can be shared with and reused by others.

Open Data​

Data that can be accessed, used, and reused by anyone, based on the idea that data should be freely available to everyone to see and make use of, without copyright restrictions.

Open Researcher and Contributor ID (ORCID)​

A not-for-profit organization that provides free Uniform Resource Identifiers (URIs) to researchers so they can be connected to their scholarship and bibliographic output.

Optical Character Recognition (OCR)​

The automatic conversion of images of words to a text file that users can then search and edit.

Property​

A specified relationship between two classes or entities, such as the predicate in a triple (subject-predicate-object).

Property Graph​

A graph where relationships (properties) between entities are named and carry some defined properties of their own, extending the basic graph database of linked triples to show complex connections that describe how different types of metadata relate.

Provenance​

The history of ownership, custody, or location of an object being described or the data describing that object.

QName​

A shorthand string that is used to substitute a Uniform Resource Identifier (URI) reference.
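
For example, `rdf:type` is shorthand for a full URI in the RDF namespace. The sketch below expands such shorthand with a small prefix table; the function name `expand_qname` is an assumption made for this example, while the three namespace URIs are the published W3C ones.

```python
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}


def expand_qname(qname, prefixes=PREFIXES):
    """Expand 'prefix:local' into a full URI using the prefix table."""
    prefix, sep, local = qname.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"Unknown or missing prefix in {qname!r}")
    return prefixes[prefix] + local


print(expand_qname("rdf:type"))
print(expand_qname("skos:prefLabel"))
```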

Quad​

An extension of a triple to include a fourth section that provides context for the triple, such as the Uniform Resource Identifier (URI) of the graph as a whole (subject-predicate-object-context).

Query Variable​

A proxy for the object you are searching for when constructing a SPARQL query.
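
The role of a variable can be illustrated without a SPARQL engine. In the sketch below, any term that starts with `?` is treated as a variable and is bound to whatever value sits in that position of a matching triple. This is a simplified stand-in for what a SPARQL processor does, not SPARQL itself, and the `ex:` identifiers are hypothetical.

```python
def match(pattern, triples):
    """Return one {variable: value} binding per triple that fits the pattern.

    Terms starting with '?' are variables; all other terms must match exactly.
    A variable used twice in the pattern must take the same value both times.
    """
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results


# Hypothetical data for illustration.
triples = [
    ("ex:woolf", "ex:wrote", "ex:orlando"),
    ("ex:woolf", "ex:wrote", "ex:the_waves"),
    ("ex:joyce", "ex:wrote", "ex:ulysses"),
]
books = match(("ex:woolf", "ex:wrote", "?book"), triples)
print(books)
```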

Range​

One of two entities in a triple, representing the object in a subject-predicate-object relationship.

Reconciliation​

The process of ensuring that an entity in a dataset refers to a stable Uniform Resource Identifier (URI), ideally from a stable namespace, to make data more accessible, interoperable, and efficient when searching, storing, and retrieving.

Regular Expression (Regex)​

A syntax that can be used in programming languages to find, manipulate, or replace patterns in texts.
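
A brief sketch using Python's standard `re` module shows both uses: finding a pattern (four-digit years) and replacing one (runs of whitespace). The sample strings are made up for the example, and the year pattern is deliberately simple.

```python
import re

text = "Born in 1882, she published Orlando in 1928 and died in 1941."

# Find: four-digit years from 1000 to 2099, bounded by word edges.
year_pattern = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")
years = year_pattern.findall(text)
print(years)

# Replace: collapse any run of whitespace into a single space.
normalized = re.sub(r"\s+", " ", "Virginia   Woolf\n").strip()
print(normalized)
```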

Reification​

The process of making an abstract concept concrete, such as taking the notion of a relationship and viewing it as an entity or expressing something using a programming language, so it can be programmatically manipulated.
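
In RDF, one way to do this is to describe a triple with the standard reification vocabulary (`rdf:Statement`, `rdf:subject`, `rdf:predicate`, `rdf:object`), so the statement itself becomes an entity that other triples can describe. The sketch below assumes triples are plain Python tuples; the `ex:` identifiers and the function name `reify` are hypothetical.

```python
def reify(triple, statement_id):
    """Describe a triple as an rdf:Statement identified by statement_id."""
    s, p, o = triple
    return [
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", s),
        (statement_id, "rdf:predicate", p),
        (statement_id, "rdf:object", o),
    ]


# Hypothetical data for illustration.
claim = ("ex:woolf", "ex:wrote", "ex:orlando")
statement = reify(claim, "ex:statement1")

# The statement is now an entity, so a source can be attached to it.
statement.append(("ex:statement1", "ex:assertedBy", "ex:catalogue_record_42"))

for t in statement:
    print(t)
```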

Relation Extraction (RE)​

The task of detecting, classifying, and extracting semantic relationships from a text.

Relational Database​

A database that stores data in tabular form (columns and rows), where columns hold attributes of data (such as datatypes), rows hold "records," and relationships are defined between tables using rules.

Research Data Management (RDM)​

The processes and activities performed by researchers throughout the lifecycle of a research project to guide the collection, organization, documentation, storage, accessibility, reusability, and preservation of data.

Resource Description Framework (RDF)​

A standard for Linked Data (LD) that represents information as a series of three-part "statements" called triples, each of which comprises a subject, a predicate, and an object in the form subject-predicate-object.

Resource Description Framework (RDF) Serialization​

A syntax that can be used to write triples, such as Turtle (TTL), RDF/XML, or JSON-LD.

Resource Description Framework Schema (RDFS)​

An extension of the basic Resource Description Framework (RDF) vocabulary that can be used to define the vocabulary (terms) to be used in an RDF graph.

Semantic Narrative (ResearchSpace)​

An interactive document within the ResearchSpace environment that combines textual narrative and Linked Data (LD) to communicate ideas about people, places, and events.

Semantic Web​

The idea of extending the World Wide Web by including additional data descriptors to web-published content so computers can make meaningful interpretations of the published data.

Semi-Structured Data​

Data where there is some structure but not in a way that makes it easy to extract entities and relationships without manual work.

Shape Expressions (ShEx)​

A language for validating and describing Resource Description Framework (RDF) graph structures.

Shapes Constraint Language (SHACL)​

A standard for describing Resource Description Framework (RDF) graphs and validating them against a set of conditions.

Simple Knowledge Organization System (SKOS)​

A standard that provides a way to represent thesauri, taxonomies, and controlled vocabularies following the Resource Description Framework (RDF).

SPARQL Endpoint​

A location on the internet that is identified by a Uniform Resource Locator (URL) and is capable of receiving and processing SPARQL queries, allowing users to access a collection of triples.

SPARQL Protocol and RDF Query Language (SPARQL)​

A query language for triplestores that translates graph data into normalized, tabular data with rows and columns.

Structured Data​

Data that takes the form of spreadsheets, relational databases, JSON files, RDF files, and XML files.

Structured Query Language (SQL)​

A query language for relational databases that allows users to specify what data to return, what tables to look in, what relations to follow, and how to order the data that meets these set conditions.
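
Each part of that definition appears in the query below, run against an in-memory SQLite database through Python's standard `sqlite3` module: `SELECT` names the data to return, `FROM` names the table, `JOIN` follows a relation, `WHERE` sets the condition, and `ORDER BY` sorts the result. The table layout and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical two-table schema: each book row points at one author row.
conn.executescript("""
CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE book (
    id INTEGER PRIMARY KEY,
    title TEXT,
    year INTEGER,
    author_id INTEGER REFERENCES author(id)
);
INSERT INTO author VALUES (1, 'Virginia Woolf'), (2, 'James Joyce');
INSERT INTO book VALUES
    (1, 'The Waves', 1931, 1),
    (2, 'Ulysses', 1922, 2),
    (3, 'Orlando', 1928, 1);
""")

rows = conn.execute(
    """
    SELECT book.title, book.year
    FROM book
    JOIN author ON book.author_id = author.id
    WHERE author.name = ?
    ORDER BY book.year
    """,
    ("Virginia Woolf",),
).fetchall()
conn.close()

print(rows)
```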

Taxonomy​

A system that identifies hierarchical relationships among concepts within a domain.

TEI Data​

Data that follows the guidelines of the Text Encoding Initiative (TEI).

Text Encoding Initiative (TEI)​

An encoding language that supports detailed encoding of complex documents and is used extensively by a number of different Digital Humanities (DH) projects.

Thesaurus​

A structured vocabulary that shows hierarchical, associative, and equivalence relationships between concepts, so users not only will find terms that are broader and narrower than others, but also terms that are synonymous, antonymous, or otherwise related (associated) in a defined manner.

Thesaurus of Geographic Names (TGN)​

One of five Getty Vocabularies that contains Uniform Resource Identifiers (URIs) for names, relationships, place types, dates, and coordinates.

THINC Lab​

A space at the University of Guelph that supports collaborative, interdisciplinary, and digital humanities research.

Triple​

A statement in the form of subject-predicate-object that follows the Resource Description Framework (RDF).

Triplestore​

A NoSQL database that stores triples.

Turtle​

A human- and machine-readable markup language that allows users to serialize triples.

Typing​

To use a standardized path to link any entity to an external vocabulary, thesaurus, or ontology within CIDOC CRM.

Uniform Resource Identifier (URI)​

A reliable and usable way to identify a unique entity so multiple datasets from various sources can communicate that they are all referring to the same thing.

Uniform Resource Identifier (URI) Minting​

The process of creating a new Uniform Resource Identifier (URI) to represent an entity.

Uniform Resource Locator (URL)​

A type of Uniform Resource Identifier (URI) that specifies where a resource is located on the web and how to retrieve it.

Union List of Artist Names (ULAN)​

One of five Getty Vocabularies that contains Uniform Resource Identifiers (URIs) for names, relationships, and biographical information concerning people and corporate bodies related to art, architecture, and other material culture.

Virtual International Authority File (VIAF)​

A service that aggregates the catalogues of many national libraries and miscellaneous authority files.

Vocabulary​

A collection of terms that could be concretely described in an ontology, taxonomy, or thesaurus.

Web Annotation​

An online annotation of a web resource that denotes a connection between different resources.

Web Annotation Data Model (WADM)​

A standard for the formatting and structuring of web annotations.

Web Ontology Language (OWL)​

A knowledge representation language for ontologies that explicitly represents the meaning of terms in vocabularies and the relationships between those terms, as well as between groups of terms.

WEMI​

An acronym standing for Work, Expression, Manifestation, and Item. These terms derive from Functional Requirements for Bibliographic Records (FRBR), where they are the primary means of describing bibliographic records.

Wikibase​

A suite of free and open-source knowledge-base software for storing, managing, and accessing Linked Open Data (LOD), written for and used by the Wikidata project.

Wikidata​

The largest instance of Wikibase, which acts as a central storage repository for structured data used by Wikipedia, by its sister projects, and by anyone who wants to make use of a large amount of open general-purpose data.

Wikimedia Commons​

A media file repository (images, sounds, and video clips) that makes public domain and freely licensed media content available, and which acts as the digital asset manager for all projects of the Wikimedia Foundation.

Wikimedia Foundation​

The umbrella organization that manages Wikipedia, Wikibase, MediaWiki, Wiktionary, and other Wiki projects and chapters.

World Wide Web Consortium (W3C)​

An international community that works to develop Web standards.

XML​

A human- and machine-readable markup language that allows users to create their own tags to describe documents.