Persistence
The IRS stores two types of data in a persistent way:
- Job metadata
- Job payloads, e.g. AAS shells or submodel data
All of this data is stored in an object store. The currently used implementation is Minio (Amazon S3 compatible). This reduces the complexity of storing and retrieving data. There is also no predefined model for the data; every document can be stored as it is. The downside of this approach is the lack of query functionality: we can only search through the keys of the entries, not through the value data. In the future, another approach or an additional way to index the data might be required.
To let the data survive system restarts, Minio needs to use a persistent volume for the data storage. A default configuration for this is provided in the Helm charts.
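As an illustration of this key-based storage, here is a minimal sketch using the MinIO Java SDK; the endpoint, credentials, and bucket name are placeholders, not the actual IRS configuration.

```java
import io.minio.GetObjectArgs;
import io.minio.MinioClient;
import io.minio.PutObjectArgs;

import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

public class JobStore {

    private final MinioClient client = MinioClient.builder()
            .endpoint("http://localhost:9000")      // placeholder endpoint
            .credentials("accessKey", "secretKey")  // placeholder credentials
            .build();

    // Documents are stored as-is under their key; there is no schema.
    public void storeJobPayload(String jobId, String payloadJson) throws Exception {
        byte[] bytes = payloadJson.getBytes(StandardCharsets.UTF_8);
        client.putObject(PutObjectArgs.builder()
                .bucket("irs-jobs")                 // placeholder bucket
                .object(jobId)
                .stream(new ByteArrayInputStream(bytes), bytes.length, -1)
                .contentType("application/json")
                .build());
    }

    // Retrieval works only by key - there is no query on the payload content.
    public String loadJobPayload(String jobId) throws Exception {
        try (var stream = client.getObject(GetObjectArgs.builder()
                .bucket("irs-jobs")
                .object(jobId)
                .build())) {
            return new String(stream.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```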
Transaction handling
There currently is no transaction management in the IRS.
Session handling
There is no session handling in the IRS, access is solely based on bearer tokens, the API is stateless.
Communication and integration
All interfaces to other systems are using RESTful calls over HTTP(S). Where central authentication is required, a common OAuth2 provider is used.
For outgoing calls, the Spring RestTemplate mechanism is used and separate RestTemplates are created for the different ways of authentication.
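A minimal sketch of this pattern follows, assuming a token supplier is available; the class and method names are illustrative, not the actual IRS implementation.

```java
import org.springframework.boot.web.client.RestTemplateBuilder;
import org.springframework.web.client.RestTemplate;

import java.util.function.Supplier;

public class RestTemplateFactory {

    // A RestTemplate whose outgoing requests carry an OAuth2 bearer token.
    public RestTemplate oAuth2RestTemplate(RestTemplateBuilder builder,
                                           Supplier<String> tokenSupplier) {
        return builder
                .additionalInterceptors((request, body, execution) -> {
                    request.getHeaders().setBearerAuth(tokenSupplier.get());
                    return execution.execute(request, body);
                })
                .build();
    }

    // A plain RestTemplate for endpoints that need no central authentication.
    public RestTemplate noAuthRestTemplate(RestTemplateBuilder builder) {
        return builder.build();
    }
}
```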
For incoming calls, we utilize the Spring REST Controller mechanism, annotating the interfaces accordingly and also documenting the endpoints using OpenAPI annotations.
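For example, an endpoint could be annotated like this; the path, types, and descriptions are illustrative:

```java
import io.swagger.v3.oas.annotations.Operation;
import io.swagger.v3.oas.annotations.responses.ApiResponse;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class JobController {

    @Operation(summary = "Return the job for the given id")
    @ApiResponse(responseCode = "200", description = "Job found")
    @ApiResponse(responseCode = "404", description = "No job with this id")
    @GetMapping("/irs/jobs/{jobId}")
    public String getJob(@PathVariable String jobId) {
        return "job " + jobId; // placeholder for the actual lookup
    }
}
```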
Exception and error handling
There are two types of potential errors in the IRS:
Technical errors
Technical errors occur when there is a problem with the application itself, its configuration or directly connected infrastructure, e.g. the Minio persistence. Usually, the application cannot solve these problems by itself and requires some external support (manual work or automated recovery mechanisms, e.g. Kubernetes liveness probes).
These errors are printed mainly to the application log and are relevant for the healthchecks.
Functional errors
Functional errors occur when there is a problem with the data that is being processed or external systems are unavailable and data cannot be sent / fetched as required for the process. While the system might not be able to provide the required function at that moment, it may work with a different dataset or as soon as the external systems recover.
These errors are reported in the Job response and do not directly affect application health.
Rules for exception handling
Throw or log, don’t do both
When catching an exception, either log the exception and handle the problem or rethrow it, so it can be handled at a higher level of the code. By doing both, an exception might be written to the log multiple times, which can be confusing.
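A short sketch of the rule, with illustrative names:

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ThrowOrLog {

    private static final Logger log = LoggerFactory.getLogger(ThrowOrLog.class);

    void handleLocally() {
        try {
            riskyOperation();
        } catch (IllegalStateException e) {
            // Handled here, so it is logged here - and nowhere else.
            log.warn("Operation failed, falling back to default", e);
        }
    }

    void rethrow() {
        try {
            riskyOperation();
        } catch (IllegalStateException e) {
            // Not logged here: the caller owns handling and logging.
            throw new RuntimeException("Operation failed", e);
        }
    }

    private void riskyOperation() {
        throw new IllegalStateException("boom");
    }
}
```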
Write own base exceptions for (internal) interfaces
By defining a common (checked) base exception for an interface, the caller is forced to handle potential errors, but can keep the logic simple. On the other hand, you still have the possibility to derive various, meaningful exceptions for different error cases, which can then be thrown via the API.
Of course, when using only RuntimeExceptions, this is not necessary - but those can be overlooked quite easily, so be careful there.
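A sketch of the idea, with illustrative names (not actual IRS exceptions):

```java
// Checked base exception for a persistence interface.
public class PersistenceException extends Exception {
    public PersistenceException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Meaningful subclasses for the different error cases.
class DocumentNotFoundException extends PersistenceException {
    DocumentNotFoundException(String key) {
        super("No document found for key " + key, null);
    }
}

class StorageUnavailableException extends PersistenceException {
    StorageUnavailableException(Throwable cause) {
        super("Object store not reachable", cause);
    }
}

// Callers are forced to handle the base type, but the logic stays simple.
interface DocumentStore {
    byte[] load(String key) throws PersistenceException;
}
```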
Central fallback exception handler
There will always be some exception that cannot be handled correctly inside the code - or that was simply unforeseen. A central fallback exception handler is required so all problems are visible in the log and the API always returns meaningful responses. In some cases, this is as simple as an HTTP 500.
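With Spring, such a handler can be sketched as follows; this is illustrative, not the actual IRS handler:

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.RestControllerAdvice;

@RestControllerAdvice
public class FallbackExceptionHandler {

    private static final Logger log = LoggerFactory.getLogger(FallbackExceptionHandler.class);

    // Catches everything no more specific handler has dealt with.
    @ExceptionHandler(Exception.class)
    public ResponseEntity<String> handleUnexpected(Exception e) {
        // Full details go to the log, not to the caller (see the next rule).
        log.error("Unhandled exception", e);
        return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                .body("Internal server error");
    }
}
```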
Don't expose too many exception details over the API
It's good to inform the user why their request did not work, but only if they can do something about it (HTTP 4xx). In case of application problems, you should not expose details of the problem to the caller. This way, we avoid opening potential attack vectors.
Parallelization and threading
The heart of the IRS is the parallel execution of planned jobs. As almost every job requires multiple calls to various endpoints, those are executed in parallel as well to reduce the total execution time for each job.
Task execution is orchestrated by the JobOrchestrator class. It utilizes a central ExecutorService, which manages the number of threads and schedules new Tasks as they come in.
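The orchestration idea can be sketched like this; JobRunner, the pool size, and fetch are illustrative stand-ins, not the actual JobOrchestrator code:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class JobRunner {

    // Central pool shared by all jobs; its size bounds the overall parallelism.
    private final ExecutorService executor = Executors.newFixedThreadPool(8);

    public List<String> runJob(List<String> endpointUrls) {
        // Every endpoint call of the job is submitted to the shared pool...
        List<CompletableFuture<String>> calls = endpointUrls.stream()
                .map(url -> CompletableFuture.supplyAsync(() -> fetch(url), executor))
                .toList();
        // ...and executed in parallel, reducing the job's total execution time.
        return calls.stream().map(CompletableFuture::join).toList();
    }

    private String fetch(String url) {
        return "response from " + url; // placeholder for the actual HTTP call
    }
}
```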
Plausibility checks and validation
Data validation happens at two points:
- IRS API: the data sent by the client is validated to match the model defined in the IRS. If the validation fails, the IRS sends an HTTP 400 response and indicates the problem to the caller (see the sketch after this list).
- Submodel payload: each time a submodel payload is requested via EDC, the data is validated against the model defined in the SemanticHub for the matching aspect type.
- EDC Contract Offer Policy: each time the IRS consumes data over the EDC, the policies of the offered contract are validated. IDs of so-called "Rahmenverträge" (framework agreements) can be added to the IRS Policy Store to be accepted by the IRS. If a contract offer does not match any of the IDs stored in the Policy Store, the offer is declined and no data is consumed.
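The first validation point can be sketched with Bean Validation as follows; the request type and constraint are illustrative (newer Spring versions use the jakarta namespace, older ones javax):

```java
import jakarta.validation.Valid;
import jakarta.validation.constraints.NotBlank;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class RegisterJobController {

    // Illustrative request model with a validation constraint.
    record RegisterJobRequest(@NotBlank String globalAssetId) {}

    @PostMapping("/irs/jobs")
    public String registerJob(@Valid @RequestBody RegisterJobRequest request) {
        // If validation fails, Spring answers with HTTP 400 before this
        // method is invoked, so only valid input arrives here.
        return "registered job for " + request.globalAssetId();
    }
}
```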
Policy Store
The IRS gives its users the ability to manage, create and delete complex policies containing permissions and constraints, in order to obtain the most precise control over access to and use of data received from the EDC provider. Policies stored in the Policy Store serve as the set of allowed restrictions and are checked against every item from the EDC catalog.
The structure of a policy that can be stored can easily be viewed by using the Policy Store endpoints in the published API documentation. Each policy may contain more than one permission, which in turn consists of constraints linked together by AND or OR relationships. This model provides full flexibility and control over stored access and usage policies.
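The described shape (policy, permissions, AND/OR-linked constraints) could be modeled roughly like this; these types are illustrative, not the actual Policy Store model (see the API documentation for the authoritative structure):

```java
import java.util.List;

// A policy may contain several permissions.
record Policy(String policyId, List<Permission> permissions) {}

// A permission consists of constraints linked by AND or OR.
record Permission(String action, ConstraintGroup constraints) {}

record ConstraintGroup(Operator operator, List<Constraint> constraints) {}

enum Operator { AND, OR }

record Constraint(String leftOperand, String rightOperand) {}
```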
Digital Twin / EDC requirements
In order to work with the decentral network approach, the IRS requires the Digital Twin to contain a "subprotocolBody" in each of the submodelDescriptor endpoints. This "subprotocolBody" has to contain the "id" of the EDC asset, as well as the "dspEndpoint" of the EDC, separated by a semicolon (e.g. "subprotocolBody": "id=123;dspEndpoint=http://edc.control.plane/api/v1/dsp").
The "dspEndpoint" is used to request the EDC catalog of the data provider, and the "id" is used to filter for the exact asset inside this catalog.
If the "dspEndpoint" is not present, every available EDC endpoint in the DiscoveryService will be queried until an asset with the given "id" can be found.
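Extracting both values follows directly from the format above; this parser is a minimal sketch, not the actual IRS code:

```java
import java.util.HashMap;
import java.util.Map;

public class SubprotocolBodyParser {

    // Splits "id=123;dspEndpoint=..." into its key-value pairs.
    static Map<String, String> parse(String subprotocolBody) {
        Map<String, String> values = new HashMap<>();
        for (String part : subprotocolBody.split(";")) {
            String[] keyValue = part.split("=", 2);
            if (keyValue.length == 2) {
                values.put(keyValue[0], keyValue[1]);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        var parsed = parse("id=123;dspEndpoint=http://edc.control.plane/api/v1/dsp");
        System.out.println(parsed.get("id"));          // 123
        System.out.println(parsed.get("dspEndpoint")); // http://edc.control.plane/api/v1/dsp
    }
}
```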
Caching
The IRS caches data provided externally to avoid unnecessary requests and reduce execution time.
Caching is implemented for the following services:
Semantic Hub
Whenever a semantic model schema is requested from the Semantic Hub, it is stored locally until the cache is evicted (configurable). The IRS can preload configured schema models on startup to reduce on-demand call times.
Additionally, models can be deployed with the system as a backup to the real Semantic Hub service.
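The caching behaviour can be sketched with Spring's cache abstraction; the cache name and the placeholder fetch are assumptions, not the actual IRS implementation:

```java
import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;

@Service
public class SemanticHubClient {

    // The schema is fetched once and then served from the local cache until
    // eviction; the eviction interval is configurable.
    @Cacheable("semanticModelCache")
    public String getModelJsonSchema(String urn) {
        return fetchFromSemanticHub(urn);
    }

    private String fetchFromSemanticHub(String urn) {
        return "{}"; // placeholder for the actual HTTP call to the Semantic Hub
    }
}
```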
Discovery Service
The IRS uses the Discovery Finder in order to find the correct EDC Discovery URL for the endpoint type BPN. This URL is cached locally.
When the EDC Discovery is requested to return the EDC connector endpoint URLs for a specific BPN, the results are cached as well.
The time to live for both caches can be configured separately as described in the Administration Guide.
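Two caches with independent time-to-live settings could look like this sketch, here using the Caffeine library; the durations are placeholders for the configured values:

```java
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

import java.time.Duration;
import java.util.List;

public class DiscoveryCaches {

    // Caches the EDC Discovery URL obtained from the Discovery Finder.
    private final Cache<String, String> discoveryEndpointCache = Caffeine.newBuilder()
            .expireAfterWrite(Duration.ofHours(24)) // placeholder TTL
            .build();

    // Caches the EDC connector endpoint URLs per BPN.
    private final Cache<String, List<String>> connectorEndpointCache = Caffeine.newBuilder()
            .expireAfterWrite(Duration.ofHours(1))  // placeholder TTL
            .build();

    public List<String> getConnectorEndpoints(String bpn) {
        return connectorEndpointCache.get(bpn, this::fetchConnectorEndpoints);
    }

    private List<String> fetchConnectorEndpoints(String bpn) {
        return List.of(); // placeholder for the actual EDC Discovery call
    }
}
```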
Further information on the Discovery Service can be found in the chapter "System scope and context".
Discovery Process
Digital Twin Registry
The Dataspace Discovery Service handles multiple EDC URLs received for a BPN. This applies to the following scenarios.
Scenario 1: One EDC with multiple DTRs
IRS queries all DTRs for the globalAssetId and will take the first result it gets. If none of the DTRs return a result, IRS will create a tombstone.
Same diagram with a little more detail on the IRS side:
Scenario 2: Multiple EDCs with one DTR
IRS starts a contract negotiation for all registry contract offers in parallel and queries the DTRs for all successful negotiations. The first registry which responds with a DT will be the one used by IRS.
Same diagram with a little more detail on the IRS side:
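The race for the first registry responding with a DT can be sketched as follows; names are illustrative, and tombstone handling for the case that no registry answers is omitted:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class FirstShellWins {

    private final ExecutorService executor = Executors.newFixedThreadPool(4);

    public String queryRegistries(List<String> dtrEndpoints, String globalAssetId) {
        CompletableFuture<String> firstResult = new CompletableFuture<>();
        for (String endpoint : dtrEndpoints) {
            CompletableFuture.supplyAsync(() -> queryDtr(endpoint, globalAssetId), executor)
                    .thenAccept(result -> {
                        // Only a registry that actually returns a DT wins the race.
                        if (result != null) {
                            firstResult.complete(result);
                        }
                    });
        }
        return firstResult.join();
    }

    private String queryDtr(String endpoint, String globalAssetId) {
        return "shell for " + globalAssetId + " from " + endpoint; // placeholder
    }
}
```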
Scenario 3: One EDC with one DTR
Only one EDC is found for the BPN, and the catalog contains only one offer for the DTR. IRS will use this registry and create a tombstone if no DT can be found for the globalAssetId.
Same diagram with a little more detail on the IRS side:
Scenario 4: Multiple EDCs with multiple DTRs
IRS starts a contract negotiation for all the registry offers.
Same diagram with a little more detail on the IRS side:
Scenario 5: Multiple EDCs with no DTRs
IRS starts a contract negotiation for all the registry offers and creates a tombstone since no DTR could be discovered.
Same diagram with a little more detail on the IRS side:
Special Scenario: Same DT in multiple DTRs
IRS will use all registries to query for the globalAssetId and takes the first result which is returned. If no DT could be found in any of the DTRs, IRS will create a tombstone.
Special Scenario: Multiple DTs (with the same globalAssetId) in one DTR
IRS uses the /query endpoint of the DTR to get the DT id based on the globalAssetId. If more than one id is present for a globalAssetId, IRS will use the first one in the list.
EDC
The EndpointDataReferenceStorage is an in-memory local storage that holds records (EndpointDataReferences) by either assetId or contractAgreementId.
When an EndpointDataReference describing the endpoint serving the data is needed, the EndpointDataReferenceStorage is queried by assetId first. This allows reusing an already existing EndpointDataReference (if it is present, valid, and its token is not expired) rather than starting a whole new contract negotiation process.
In case the token is expired, the process is also shortened: a new contract negotiation is not necessary, since the required contractAgreementId can be obtained from the present authCode. This improves request processing time.
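The lookup logic can be sketched like this; EndpointDataReference and the expiry check are simplified stand-ins for the actual EDC types:

```java
import java.time.Instant;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

public class EndpointDataReferenceStorage {

    // Simplified record; per the description above, the contractAgreementId
    // can be obtained from the authCode even after the token has expired.
    record EndpointDataReference(String endpoint, String authCode, Instant tokenExpiry) {
        boolean tokenExpired() {
            return Instant.now().isAfter(tokenExpiry);
        }
    }

    private final Map<String, EndpointDataReference> store = new ConcurrentHashMap<>();

    public void put(String assetId, EndpointDataReference reference) {
        store.put(assetId, reference);
    }

    // Returns a reference that can be reused directly: present and not expired.
    public Optional<EndpointDataReference> getValid(String assetId) {
        return Optional.ofNullable(store.get(assetId))
                .filter(ref -> !ref.tokenExpired());
    }
}
```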