FOR SHARE - JULY SNAPSHOT - Data Interoperability category description
Early draft
Please be aware that the Data Spaces Blueprint content shared in these pages is a very early draft published on 2023-07-01. The current draft is incomplete and the content might still change.
SAVE-THE-DATE 01-10/09/2023: We welcome your feedback to further improve the Data Spaces Blueprint during the public consultation, which will be open from 1 to 10 September 2023. Please mark these dates in your calendar and get ready!
Introduction
A clear understanding of data is crucial for data to be accurately and consistently interpreted and used by various individuals and systems. Uncertainty regarding the meaning of shared data can lead to miscommunication and misinterpretation, resulting in errors and poor decision-making. Consequently, organizations faced with ambiguous data invest significant effort in aligning that data with the formats and structures expected by their IT systems. This process is both time-consuming and expensive, which makes the absence of shared data meaning a major obstacle to data sharing and the realization of data spaces.
The ability of different systems, applications, or individuals to exchange and use data seamlessly is known as data interoperability. It ensures that the data can be shared, understood, and utilized effectively across various technologies, organizations, domains, and data spaces. This building block category is about the semantic layer of data interoperability.
Goal
In data spaces, data interoperability requires that technologies, applications, and members are able to seamlessly exchange data, which allows for improved collaboration and efficiency among individuals, organizations, and systems. In the absence of data interoperability, data formats and the understanding of data vary between organizations, individuals, and systems. This incompatibility hampers the benefits of data exchange, as aligning with ambiguous data becomes an expensive and time-consuming process. Data interoperability therefore requires the adoption of a common language.
The objective of this building block category is to provide a framework, components and best practices that data spaces can use to ensure data interoperability.
Building Blocks
This section provides a brief description of the building blocks within this category.
Data Models & Formats building block: Semantic interoperability is of great importance for a data space. Participants must be able to understand each other in order to derive value from the data. This requires a common language that provides the level of semantic interoperability needed to automatically understand and exchange digital resources. To semantically annotate the data that is being shared, a data space initiative requires domain-specific vocabularies that express the semantics, e.g., an ontology with a shared conceptualization of a particular domain of knowledge.
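As a minimal sketch of what such semantic annotation could look like, the Python snippet below attaches a JSON-LD style context to a plain record, so that each local field name points at a term in a shared vocabulary. The vocabulary IRI https://example.org/vocab/energy#, the field names, and the annotate helper are hypothetical and serve only as illustration; a real data space would reference its own agreed ontology.

```python
import json

# Hypothetical mapping from local field names to terms in a shared,
# domain-specific vocabulary (the IRIs are placeholders, not a real ontology).
CONTEXT = {
    "meterId": "https://example.org/vocab/energy#meterIdentifier",
    "kwh": "https://example.org/vocab/energy#energyConsumedKilowattHour",
}


def annotate(record: dict, context: dict) -> dict:
    """Return a JSON-LD style document: the record plus an @context
    that states which vocabulary term each field refers to."""
    unknown = sorted(set(record) - set(context))
    if unknown:
        # Fields without a vocabulary term would stay ambiguous for receivers.
        raise ValueError(f"no vocabulary term for fields: {unknown}")
    return {"@context": {key: context[key] for key in record}, **record}


annotated = annotate({"meterId": "M-17", "kwh": 12.5}, CONTEXT)
print(json.dumps(annotated, indent=2))
```

Rejecting fields that have no vocabulary term is a design choice made for this sketch: it keeps ambiguous data from being shared silently, at the cost of requiring the vocabulary to be extended first.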
Data Exchange building block: Exchange APIs are a set of rules and guidelines that govern the exchange of data between two or more components in a particular domain or application. This building block covers the specifications of the data plane of data exchange. Recall that the other building block ‘Data Models & Formats’ describes the modelling of the knowledge and information within the considered domain by describing which information plays what role. Data Exchange, on the other hand, describes how these semantic data models are used in the interaction between systems and/or members, and between systems themselves.
Provenance and traceability building block: This building block describes how the data sharing process is monitored within a data space. Especially in data spaces with highly regulated data, it is necessary to make the data sharing process observable. This can be done for legal reasons, to prove that data has been processed only by authorized entities, or for business reasons, to provide a marketplace and billing function through a trusted third party. Observability is important for ensuring that data is used in a way that is consistent with the contractual arrangements and governance frameworks. By making the data sharing process observable, it is possible to monitor data usage and ensure that the data is only accessed by authorized parties. Provenance and traceability relate to the administrative part of the transaction.
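The idea of checking observed transfers against contractual arrangements can be sketched as follows. This is an illustration under stated assumptions, not a prescribed design: the TransferEvent record layout, the unauthorized_events helper, and the dataset and party names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferEvent:
    """One observed data transfer (hypothetical record layout)."""
    dataset: str
    consumer: str
    timestamp: str  # ISO 8601 string, kept as text for simplicity


def unauthorized_events(events, authorized):
    """Return the events whose consumer is not authorized for the dataset.

    `authorized` maps a dataset name to the set of parties allowed by contract.
    """
    return [e for e in events if e.consumer not in authorized.get(e.dataset, set())]


log = [
    TransferEvent("traffic-counts", "city-planner", "2023-07-01T10:00:00Z"),
    TransferEvent("traffic-counts", "ad-broker", "2023-07-01T10:05:00Z"),
]
contracts = {"traffic-counts": {"city-planner"}}

violations = unauthorized_events(log, contracts)
print([e.consumer for e in violations])
```

In this sketch a dataset without any contract entry has no authorized parties, so every transfer of it is reported.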
Interrelationships
An overview of how the building blocks within this category are related. This should include any dependencies, shared functionalities, and potential conflicts with other building blocks.
Semantic annotations in metadata brokering and marketplace
In the context of metadata brokering services, self-descriptions are used to describe the capabilities, functionalities, data resources, and other relevant information about a service. It is of great importance that users have a common understanding of these services without direct interaction. Metadata brokering falls under the category of "Data value creation" building blocks, but it is also connected to other building blocks within this category. It involves creating self-descriptions that include references to common vocabularies to accurately describe the meaning of the data or payload. For instance, the use of DCAT-AP allows for referencing the data resource's structure.
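A self-description of this kind could, for example, be expressed as a JSON-LD document that uses terms from the DCAT and Dublin Core Terms vocabularies, on which DCAT-AP builds. The sketch below is illustrative only: every example.org IRI is a placeholder, and the record does not claim to satisfy all requirements of the DCAT-AP profile.

```python
import json

# Namespace IRIs of the DCAT and Dublin Core Terms vocabularies.
DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"

# Hypothetical data offering; every example.org IRI is a placeholder.
self_description = {
    "@context": {"dcat": DCAT, "dct": DCT},
    "@id": "https://example.org/datasets/traffic-counts",
    "@type": "dcat:Dataset",
    "dct:title": "Hourly traffic counts",
    "dcat:distribution": {
        "@type": "dcat:Distribution",
        "dcat:mediaType": {
            "@id": "https://www.iana.org/assignments/media-types/application/json"
        },
        # Points at the data model that describes the payload structure.
        "dct:conformsTo": {"@id": "https://example.org/models/traffic-count"},
    },
}

print(json.dumps(self_description, indent=2))
```

The dct:conformsTo reference is what lets a consumer look up the structure and meaning of the payload before any direct interaction with the provider.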
Access and usage policies
How to use semantics to describe usage policies in a machine-readable and understandable manner is still the subject of ongoing research. Such vocabularies could provide a common language for expressing access control rules, data usage restrictions, privacy guidelines, and other policies related to data access and handling.
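To make the notion of a machine-readable policy concrete, the following sketch models a policy as structured data that a program can evaluate. The UsagePolicy structure, its two fields, and the is_permitted check are hypothetical simplifications for illustration and are not taken from any existing policy vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsagePolicy:
    """Hypothetical machine-readable policy; not taken from any standard."""
    allowed_parties: frozenset
    allowed_purposes: frozenset


def is_permitted(policy: UsagePolicy, party: str, purpose: str) -> bool:
    """A request is permitted only if both the party and the purpose are listed."""
    return party in policy.allowed_parties and purpose in policy.allowed_purposes


policy = UsagePolicy(
    allowed_parties=frozenset({"university-a"}),
    allowed_purposes=frozenset({"research"}),
)

print(is_permitted(policy, "university-a", "research"))
print(is_permitted(policy, "university-a", "marketing"))
```

Because both sides share the same terms for parties and purposes, the decision can be automated; agreeing on those terms is the role a common policy vocabulary would play.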
Adoption of common vocabularies - governance mechanisms
The implementation of governance mechanisms is essential to ensure that a semantic standard effectively serves the needs of the community and continues to do so.
Future outlook
A list of open challenges, research questions or implementations related to the building block. This includes any high-level technical or conceptual challenges that need to be addressed in order to achieve the building block's main functions, or any research questions that need to be explored further.
Towards a federation of data spaces
In the case of partial harmonization, data space specific transactions need to be translated into harmonized equivalents. This translation process enables interoperable transactions and promotes a common understanding of concepts. It introduces a whole new concept to the context of semantics, and for this building block category there is still a gap in how to achieve a federation of data spaces.
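A simple way to picture such a translation is a term mapping that covers only part of a data space's vocabulary. The sketch below assumes a hypothetical mapping table and field names; it separates the fields that have a harmonized equivalent from those that do not, which is where partial harmonization leaves remaining integration work.

```python
# Hypothetical term mapping from one data space's vocabulary to a harmonized one.
TO_HARMONIZED = {
    "plate": "vehicleRegistration",
    "speed_kmh": "speedKilometersPerHour",
}


def translate(record: dict, mapping: dict) -> tuple:
    """Split a record into harmonized fields and fields with no equivalent."""
    harmonized, untranslated = {}, {}
    for key, value in record.items():
        if key in mapping:
            harmonized[mapping[key]] = value
        else:
            untranslated[key] = value
    return harmonized, untranslated


harmonized, untranslated = translate(
    {"plate": "AB-123-C", "speed_kmh": 48, "road_segment": "N201-7"},
    TO_HARMONIZED,
)
print(harmonized)
print(untranslated)
```

Here road_segment has no harmonized equivalent and is returned separately instead of being dropped, so the receiving side can decide how to handle it.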
Adoption of common vocabularies
There are inherent limitations to the extent of semantic interoperability that can be attained. Each member within a business ecosystem operates with a unique perspective. These differences arise due to jurisdictional variations, diverse domains, distinct business processes, target markets, offered services, and other factors. The high diversity among ecosystem members necessitates flexibility within the semantic standards to allow for some integration efforts. Conversely, lower diversity allows for more stringent semantic standards, fostering greater uniformity and enabling more efficient data sharing and automation.
In any case, the implementation of governance mechanisms is essential to ensure that a semantic standard effectively serves the needs of the community and continues to do so.