FOR SHARE - JULY SNAPSHOT - Data, Services and Offerings - Two pages description
Early draft
Please be aware that the Data Spaces Blueprint content shared in these pages is a very early draft published on 2023-07-01. The current draft is incomplete and the content might still change.
SAVE-THE-DATE 01-10/09/2023: We welcome your feedback to further improve the Data Spaces Blueprint during the public consultation, which will be open from 1 September 2023 until 10 September 2023. Please mark these dates in your calendar and get ready!
Overview
The central action performed in data spaces is the exchange of data between participants. Additionally, data-driven services are provided. Participants take the roles of providers or consumers. Providers need to describe their data or service offerings so that consumers can find out what is available for consumption. To enable interoperability and automation, these descriptions are given as structured metadata records whose subject is the data or service offering itself; thus, the offering becomes “self-describing”. Such a self-description addresses several or all of the concerns related to the content and context of the data, the concepts addressed, the community of trust, communication about the data, and the commodity nature of the data [1]. Self-descriptions of data and service offerings, and also of the participants providing them, end up in → publication and discovery services, where consumers may find or query them. As self-descriptions are created by the providers, it is essential to establish trust in what a provider claims about their offering, e.g., by including in the self-description attestations that independent conformity assessment bodies have made about the artifact described [2].
Key Elements
A self-description should address the following concerns of the data or data-driven service offered. It should include rich information to enhance the comprehensibility of the data [1].
What is the content of the data? This includes:
Data Structure: Description of data objects (entities, tables, records, etc.), their logical groupings and relationships.
Data Format: Format(s) of the data, e.g., FITS, SPSS, HTML, JPEG, and any software required to read the data.
Data Semantics: Business definitions, data modeling entities and attributes
What concepts are addressed by the data?
In what context are the data to be interpreted? This includes:
Data Values: What values are allowed, what reference data is available, what patterns or value ranges the data should follow, what constraints it should meet, etc.
What community is exchanging the data, and how can its participants be trusted?
What facilities, e.g., API endpoints, enable communication between consumer and provider? This includes:
Access: Where and how the data or services can be accessed by others.
What aspects of a commodity does the data or service offering have, e.g., quality, price, or usage policies? (→ marketplaces and usage accounting) This includes:
Data Creation: The creator of the data, and the date and time of its creation.
Rights: Any known intellectual property rights held for the data.
Data Lineage: What is the data source, how was the data created, derived, and/or calculated, how was it transformed, etc.
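The concerns listed above can be pictured as fields of one structured metadata record. The following is a minimal sketch in Python, not a normative schema: the class `SelfDescription`, all field names, the helper `missing_fields`, and the example values (including the `example` URN and URL) are hypothetical and chosen only for illustration.

```python
import json
from dataclasses import dataclass, asdict
from typing import Dict, List


@dataclass
class SelfDescription:
    """Hypothetical self-description record; every field name is illustrative."""
    offering_id: str
    # Content: structure, format, and semantics of the data
    structure: str
    formats: List[str]
    semantics: str
    # Concepts addressed by the data
    concepts: List[str]
    # Context: allowed values, ranges, and other constraints
    value_constraints: Dict[str, str]
    # Community: the participant providing the offering
    provider: str
    # Communication: where and how the offering can be accessed
    access_endpoint: str
    # Commodity: creation, rights, and lineage
    created_by: str
    created_at: str
    rights: str
    lineage: str


def missing_fields(description: SelfDescription) -> List[str]:
    """Return the names of fields that were left empty."""
    return [name for name, value in asdict(description).items() if not value]


example = SelfDescription(
    offering_id="urn:example:offering:air-quality",
    structure="One table with one row per sensor reading",
    formats=["CSV"],
    semantics="Hourly particulate matter readings per station",
    concepts=["air quality", "sensors"],
    value_constraints={"pm10": "non-negative number"},
    provider="Example Provider Ltd.",
    access_endpoint="https://provider.example/api/air-quality",
    created_by="Example Provider Ltd.",
    created_at="2023-07-01T00:00:00Z",
    rights="Assumed for illustration: provider holds all rights",
    lineage="Aggregated from raw sensor readings",
)

# Serialise the record so it can be handed to a catalogue or another participant.
record = json.dumps(asdict(example), indent=2)
```

A provider could run a check such as `missing_fields` before publishing, so that incomplete records are caught early. Which fields are actually mandatory depends on the rules of the data space and is not fixed by this sketch.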
Key functions
Self-descriptions cover a broad functional scope of improving the FAIRness (findability, accessibility, interoperability, and reusability) of services, data, and other digital resources in data spaces. In what follows, the key functions of self-description are explained briefly.
Organisation and description
Describes and orders data resources in repositories.
Search and Retrieval
Allowing resources to be identified and found by relevant criteria
Enabling clustering and matching of similar resources
Providing information on where to locate and how to retrieve a resource
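The three search and retrieval functions above can be sketched over a small in-memory catalogue. This is an illustration under stated assumptions, not the interface of any real publication and discovery service: the record layout and the functions `find`, `similar`, and `locate` are hypothetical.

```python
# Hypothetical catalogue of simplified self-description records.
catalog = [
    {"id": "offer-1", "concepts": {"air quality", "sensors"}, "format": "CSV",
     "endpoint": "https://provider-a.example/api/air"},
    {"id": "offer-2", "concepts": {"air quality", "traffic"}, "format": "JSON",
     "endpoint": "https://provider-b.example/api/city"},
    {"id": "offer-3", "concepts": {"energy"}, "format": "CSV",
     "endpoint": "https://provider-c.example/api/energy"},
]


def find(records, **criteria):
    """Identify resources whose fields equal all given criteria."""
    return [r for r in records
            if all(r.get(key) == value for key, value in criteria.items())]


def similar(records, record, min_shared=1):
    """Match other resources that share at least `min_shared` concepts."""
    return [r for r in records
            if r["id"] != record["id"]
            and len(r["concepts"] & record["concepts"]) >= min_shared]


def locate(records, offering_id):
    """Return the endpoint where a resource can be retrieved, or None."""
    for r in records:
        if r["id"] == offering_id:
            return r["endpoint"]
    return None


csv_offers = [r["id"] for r in find(catalog, format="CSV")]
related = [r["id"] for r in similar(catalog, catalog[0])]
```

Here `csv_offers` contains `offer-1` and `offer-3`, and `related` contains `offer-2`, because it shares the concept "air quality" with `offer-1`.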
Utilisation and Preservation
Helps to track the lifecycle of data resources.
Facilitate Interoperability
A well-established self-description technique (i.e., a metadata model) facilitates interoperability by making it simpler to transfer data and services in data spaces.
Multi-versioning and reuse
Using a self-description approach allows for the maintenance and management of various versions of data or services, which can be utilised in the future.
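One way to picture multi-versioning is a registry that keeps every published version of a self-description and returns either a specific version or the most recent one. The sketch below is hypothetical: the class `VersionedRegistry`, its methods, and the assumption that versions are dot-separated integers such as `1.2.0` are not taken from the blueprint.

```python
from collections import defaultdict


def parse_version(version):
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


class VersionedRegistry:
    """Hypothetical store that keeps all versions of each self-description."""

    def __init__(self):
        # offering id -> {version string -> self-description}
        self._versions = defaultdict(dict)

    def publish(self, offering_id, version, description):
        """Add a new version; already published versions are never overwritten."""
        if version in self._versions[offering_id]:
            raise ValueError(f"{offering_id} {version} is already published")
        self._versions[offering_id][version] = description

    def get(self, offering_id, version=None):
        """Return the requested version, or the latest one if none is given."""
        versions = self._versions.get(offering_id)
        if not versions:
            raise KeyError(offering_id)
        if version is None:
            version = max(versions, key=parse_version)
        return versions[version]


registry = VersionedRegistry()
registry.publish("offer-1", "1.0.0", {"format": "CSV"})
registry.publish("offer-1", "1.10.0", {"format": "Parquet"})
registry.publish("offer-1", "1.2.0", {"format": "JSON"})

latest = registry.get("offer-1")
first = registry.get("offer-1", "1.0.0")
```

Comparing parsed tuples rather than raw strings matters here: as strings, `1.2.0` would sort after `1.10.0`, whereas the numeric comparison correctly treats `1.10.0` as the latest version.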
Dependencies and relationships
As self-descriptions are mere metadata records, they require support from, e.g., → Publication and Discovery services to become operational. By extension, and via their “commodity” concern, they are furthermore related to → Marketplaces and Usage Accounting. Both are part of the → Data Value Creation category. Furthermore, since self-descriptions are attached to data and services in a data space, they have explicit or implicit relations with all the other building blocks of the data space, including business and legal building blocks. These relationships are outlined below.
Business: A self-description provides business-related information (such as the legal name of the organisation offering a service or data in a data space) with every service or dataset offered within the data space ecosystem.
Governance and Legal: The regulatory compliance information can be included in self-describing data and services.
Technology: The critical pieces of information of data and services, including access control parameters, or rights, can be covered in a self-description.
Relevance for the data space
In the context of data spaces, self-description plays a critical role in enabling interoperability, which is a key property outlined by the Data Spaces Business Alliance [3] and acknowledged by all European data space players. As data serves as a fundamental element of a data space, connectors and other crucial instruments must provide self-description in order to allow other participants in the data space to effectively engage.
A comprehensive self-description is necessary to describe an organisation, the Service Provider responsible for the Connector, and the data type and content offered or requested. An accurate and valid self-description is integral to establishing trust between data space participants, given that trust relies heavily on the validity of this information. Additionally, self-descriptions provide reliable data usage policies, which can significantly aid the (semi-)automated negotiations that take place between participants of data spaces.
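To illustrate how usage policies in a self-description could support (semi-)automated negotiation, the following sketch compares a consumer request with the policy of an offering. It is a simplification under stated assumptions: the function `evaluate_request` and the keys `allowed_purposes`, `price`, `requires_attestation`, `purpose`, `max_price`, and `has_attestation` are hypothetical, and a real data space would likely express policies in a dedicated policy language rather than a plain dictionary.

```python
def evaluate_request(policy, request):
    """Return (accepted, reasons) for a consumer request against a usage policy."""
    reasons = []
    if request["purpose"] not in policy["allowed_purposes"]:
        reasons.append("purpose not permitted by the usage policy")
    if request["max_price"] < policy["price"]:
        reasons.append("price exceeds the consumer's budget")
    if policy["requires_attestation"] and not request["has_attestation"]:
        reasons.append("consumer lacks the required attestation")
    return (not reasons, reasons)


# Hypothetical policy taken from an offering's self-description.
policy = {
    "allowed_purposes": {"research", "public reporting"},
    "price": 100,
    "requires_attestation": True,
}

# Hypothetical consumer request.
request = {"purpose": "research", "max_price": 150, "has_attestation": True}

accepted, reasons = evaluate_request(policy, request)
```

In this example the request is accepted and `reasons` is empty. When a request is rejected, the collected reasons could be returned to the consumer, who may then adjust the request or hand the negotiation over to a human.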
References
[1] Reference Architecture Model 4.0 (2022). International Data Spaces Association (IDSA). Link: https://docs.internationaldataspaces.org/knowledge-base/ids-ram-4.0.
[2] ISO/IEC 17000 - Conformity assessment — Vocabulary and general principles. Link: https://www.iso.org/obp/ui/#iso:std:iso-iec:17000:ed-2:v2:en. For a practical introduction in the context of data spaces, see: Gaia-X European Association for Data and Cloud AISBL (2022). Self-Description of Resources, Service Offerings and Participants within Gaia-X Ecosystems. Decentralised, cryptographically secure, interoperable, extendible, and future proof. https://gaia-x.eu/wp-content/uploads/2022/08/SSI_Self_Description_EN_V3.pdf.
[3] Data Spaces Business Alliance (2023). Technical Convergence Discussion Document. Version 2.0. https://data-spaces-business-alliance.eu/wp-content/uploads/dlm_uploads/Data-Spaces-Business-Alliance-Technical-Convergence-V2.pdf