FOR SHARE - JULY SNAPSHOT - Data Models and formats - Two pages descriptions

Last updated: 3 July 2023, 09:56

Early draft

Please be aware that the Data Spaces Blueprint content shared in these pages is a very early draft published on 2023-07-01. The current draft is incomplete and the content might still change.

SAVE-THE-DATE 01-10/09/2023: We welcome your feedback to further improve the Data Spaces Blueprint during the public consultation, which will be open from 1 September 2023 until 10 September 2023. Please mark these dates in your calendar and get ready!

Overview

Semantic interoperability is essential for a data space: participants can only derive value from shared data if they understand it in the same way. This calls for a common language that allows digital resources to be understood and exchanged automatically. To semantically annotate the data being shared, a data space initiative requires domain-specific vocabularies that express the semantics, e.g., an ontology with a shared conceptualisation of a particular domain of knowledge.

This building block allows

  1. the interchange of data by providing information about the shared data structure.

  2. the systematic and automated publication of data assets by providing accurate descriptions of their structure.

  3. systematic and automated search by using standardized data structure descriptions.

  4. data integration and mapping of data elements from different sources by using a common vocabulary.

  5. future-proofing and extending unified information models to accommodate new data elements, relationships, and attributes as the domain or industry evolves.

Key elements

For this building block we distinguish three key concepts:

  1. Vocabularies (artifacts): a common language to facilitate semantic interoperability in a data space, including ontologies, data models, schema specifications, mappings and API specifications that can be used to annotate and describe data sets and data services.

  2. Vocabulary provider (role): an entity that is responsible for providing (creating, publishing, maintaining) the vocabularies. In the context of a data space this role is often fulfilled collectively by business communities and delegated to some sort of standards development organisation (SDO).

  3. Vocabulary hub (component): component providing facilities for publishing, editing, browsing and maintaining vocabularies and related documentation.

What to specify for each of these concepts?

For these concepts the scope of this building block is to:

  1. Vocabularies: specify the recommended and/or mandatory open metamodel standards with which vocabularies need to comply, including, but not limited to:

    • RDFS/OWL/SHACL/JSON-LD for semantic web ontologies

    • JSON schema for JSON-oriented data models

    • XML Schema, Schematron for XML-oriented data models

    • CSVW for CSV-oriented tabular data

    • XSLT, R2RML, RML, YARRRML, CSVW for data transformation specifications

  2. Vocabulary providers: this building block does not provide guidance or specifications for the role of vocabulary providers. We recognize that the quality, continuity and support of vocabularies are important and need to be organized in some way, but consider this out of scope.

  3. Vocabulary hubs: this building block will specify the basic functionalities (publish, browse, edit, maintain) that a vocabulary hub component must provide. Furthermore, it will specify the interrelationships of the vocabulary hub with other building blocks / components in the data space.
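As an illustration only, the basic hub functionalities named above (publish, browse, edit, maintain) could be sketched as a small in-memory component that also checks the declared metamodel standard. The class names, method names and `urn:example:` identifiers below are hypothetical and are not prescribed by this building block.

```python
from dataclasses import dataclass

# Metamodel standards taken from the list above; the string labels are an
# assumption made for this sketch.
ALLOWED_METAMODELS = {
    "RDFS", "OWL", "SHACL", "JSON-LD",
    "JSON Schema", "XML Schema", "Schematron", "CSVW",
}


@dataclass
class Vocabulary:
    identifier: str   # hypothetical identifier, e.g. a URN or URL
    title: str
    metamodel: str    # one of ALLOWED_METAMODELS
    content: str      # serialized vocabulary (schema, ontology, ...)
    version: int = 1


class VocabularyHub:
    """Minimal in-memory hub: publish, browse, edit and retire vocabularies."""

    def __init__(self):
        self._store = {}

    def publish(self, vocab):
        # Reject vocabularies that do not declare an accepted metamodel.
        if vocab.metamodel not in ALLOWED_METAMODELS:
            raise ValueError(f"unsupported metamodel: {vocab.metamodel}")
        if vocab.identifier in self._store:
            raise ValueError(f"already published: {vocab.identifier}")
        self._store[vocab.identifier] = vocab

    def browse(self, metamodel=None):
        # Return identifiers, optionally filtered by metamodel standard.
        return sorted(
            v.identifier
            for v in self._store.values()
            if metamodel is None or v.metamodel == metamodel
        )

    def edit(self, identifier, new_content):
        # Replace the content and bump the version number.
        vocab = self._store[identifier]
        vocab.content = new_content
        vocab.version += 1
        return vocab

    def retire(self, identifier):
        # One possible maintenance action: remove a vocabulary from the hub.
        del self._store[identifier]
```

A real hub would add persistence, access control and documentation management; the sketch only shows how the four functionalities relate to each other.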

Key functions

The main functions that the building block needs to fulfill are listed below, with a brief description of each.

Vocabularies

  • Function 1: List the names of the attributes, their descriptions and their data types

  • Function 2: Provide references to the linked data URIs of the different elements of the data models

  • Function 3: Easy description of the data structures (without imposing a technical barrier)
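To make Functions 1 to 3 more concrete, the following minimal sketch describes a data model as a plain list of attributes (name, description, data type, linked data URI) and derives a JSON-LD style `@context` from it. The `Attribute` class, the helper functions and the `https://example.org/vocab/` URIs are hypothetical; only the XML Schema datatype namespace and the `@context`, `@id` and `@type` keywords come from existing standards.

```python
from dataclasses import dataclass

# Namespace of the XML Schema datatypes (xsd:string, xsd:double, ...).
XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class Attribute:
    name: str         # attribute name as used in the payload
    description: str  # human-readable description
    datatype: str     # local name of an XSD datatype, e.g. "double"
    uri: str          # linked data URI of the concept (hypothetical here)


def describe(attributes):
    """Function 1: list names, data types and descriptions in plain text."""
    return [f"{a.name} ({a.datatype}): {a.description}" for a in attributes]


def to_jsonld_context(attributes):
    """Function 2: map each attribute name to its URI and datatype."""
    return {
        "@context": {
            a.name: {"@id": a.uri, "@type": XSD + a.datatype}
            for a in attributes
        }
    }


# Example model; the URIs are placeholders, not real vocabulary terms.
SENSOR_MODEL = [
    Attribute("sensorId", "Identifier of the sensor", "string",
              "https://example.org/vocab/sensorId"),
    Attribute("temperature", "Measured temperature in degrees Celsius",
              "double", "https://example.org/vocab/temperature"),
]
```

Keeping the description as a flat list of four fields per attribute is one way to keep the technical barrier low (Function 3), since such a list can also be maintained in a spreadsheet and converted automatically.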

Vocabulary Hub

  • Function 4: Easy creation of standardized data models (without imposing a technical barrier) when the data assets are not compliant with any standard

  • Function 5: Integration and alignment of multiple data models to address overlapping or related concepts

  • Function 6: Browsing and visualization features to allow for the exploration and understanding of data models
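One possible reading of Function 5 is that two data models are aligned wherever their attributes reference the same concept URI. The sketch below assumes each model is a simple mapping from attribute name to concept URI; the function names, model contents and URIs are hypothetical.

```python
def align(model_a, model_b):
    """Map attribute names of model_a to those of model_b via shared URIs."""
    uri_to_b = {uri: name for name, uri in model_b.items()}
    return {
        name: uri_to_b[uri]
        for name, uri in model_a.items()
        if uri in uri_to_b
    }


def translate(record, mapping):
    """Rename the keys of a record; attributes without a mapping are dropped."""
    return {mapping[key]: value for key, value in record.items() if key in mapping}


# Two hypothetical models describing overlapping concepts.
MODEL_A = {
    "id": "https://example.org/vocab/sensorId",
    "temp": "https://example.org/vocab/temperature",
    "battery": "https://example.org/vocab/batteryLevel",
}
MODEL_B = {
    "sensor": "https://example.org/vocab/sensorId",
    "temperature": "https://example.org/vocab/temperature",
    "location": "https://example.org/vocab/location",
}
```

For example, `translate({"id": "s1", "temp": 21.5, "battery": 80}, align(MODEL_A, MODEL_B))` keeps only the two attributes that both models share. Real alignments typically also need unit conversions and structural mappings, which this sketch leaves out.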

Dependencies and relationships

This building block establishes a common format for data model specifications and representation of data in data exchange payloads. Combined with the Data Exchange building block, this ensures full interoperability among participants.

Furthermore, the data models can be used and referenced in the self-descriptions of data and data service offerings, allowing for better search and discovery of relevant data offerings. This is addressed in the building block Data, Services and Offerings descriptions.
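As a rough sketch of how such a reference could support discovery, the example below assumes that each offering's self-description lists the URIs of the data models it follows under a `conformsTo` key. That key name is borrowed from the Dublin Core term `conformsTo` and is an assumption of this sketch, as are the offerings, URIs and function name.

```python
# Hypothetical self-descriptions; titles and model URIs are placeholders.
OFFERINGS = [
    {"title": "City air quality feed",
     "conformsTo": ["https://example.org/models/air-quality"]},
    {"title": "Harbour weather station",
     "conformsTo": ["https://example.org/models/weather",
                    "https://example.org/models/air-quality"]},
    {"title": "Parking occupancy export"},  # no data model declared
]


def find_offerings(offerings, model_uri):
    """Return the titles of offerings that reference the given data model."""
    return [
        offering["title"]
        for offering in offerings
        if model_uri in offering.get("conformsTo", [])
    ]
```

Offerings that do not declare a data model are simply not found by this kind of search, which illustrates why a consistent reference from the self-description to the vocabulary matters.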

Relevance to the data space

To enable automatic data interchange, the ingestion and description of the data assets of the data space should be automated. Automatic integration with the other building blocks of the data space is therefore a mandatory requirement for its seamless operation.

Additionally, the ‘descriptions of the data structure’ of non-standardized datasets could become an early mechanism for launching potential new standards.