Technical analysis: Sync with deletes + transformation
https://binnenland.atlassian.net/browse/OP-2869 identifies a bug in the current sync mechanism.
In some cases, a delete in the source system leads to too many deletes.
Deltas arrive in small chunks without any context.
To process these deltas, the consumer first attempts to retrieve context for them, such as the datatype.
After the incoming delta and its context have been converted, the transformed result needs to be filtered again so that it does not delete too many triples.
E.g. when the `rdf:type` of a subject was added as context, this statement should not be part of the `DELETE` query.
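The filtering step described above can be sketched roughly as follows. This is a minimal illustration, not the consumer's actual code: the `Triple` shape, the `removeContext` helper and the example data are all hypothetical, and it assumes context triples pass through the transformation unchanged so they can be matched by value.

```typescript
// Hypothetical triple shape; the real consumer may represent triples differently.
interface Triple {
  subject: string;
  predicate: string;
  object: string;
}

// Serialize a triple so that two triples can be compared by value.
const tripleKey = (t: Triple): string =>
  JSON.stringify([t.subject, t.predicate, t.object]);

// Drop every transformed triple that was only present as context,
// so that context statements never end up in the DELETE query.
function removeContext(transformed: Triple[], context: Triple[]): Triple[] {
  const contextKeys = new Set(context.map(tripleKey));
  return transformed.filter((t) => !contextKeys.has(tripleKey(t)));
}

// Context fetched for the subject of the incoming delete (assumed data).
const context: Triple[] = [
  { subject: "ex:subject", predicate: "rdf:type", object: "ex:SomeClass" },
];

// Result of transforming delta + context (assumed to pass both through unchanged).
const transformed: Triple[] = [
  { subject: "ex:subject", predicate: "rdf:type", object: "ex:SomeClass" },
  { subject: "ex:subject", predicate: "ex:name", object: "old name" },
];

// Only the ex:name triple remains as a candidate for deletion.
const toDelete = removeContext(transformed, context);
```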
Mitigation:
Removing the context triples from the transformed graph mitigates the issue.
There are potential problems with this approach which should be taken into consideration when changing the conversion rules:
Deducing `rdf:type` triples in the rules should be avoided. E.g. different rules that each derive an `rdf:type` based on a property can lead to deletion of the `:subject a ex:SomeClass` triple when one triple with a matching property is deleted in the source system, even though another matching triple may still justify that type.
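To make the problem concrete, here is a small sketch of how such rules over-delete. The property names `ex:propA` and `ex:propB`, the `typeRules` map and the `deriveTypes` helper are hypothetical stand-ins for the real conversion rules, which live in the reasoner rather than in code like this.

```typescript
// Hypothetical triple shape used for illustration only.
interface Triple {
  subject: string;
  predicate: string;
  object: string;
}

// Two assumed rules that derive the same rdf:type from different properties.
const typeRules = new Map<string, string>([
  ["ex:propA", "ex:SomeClass"],
  ["ex:propB", "ex:SomeClass"],
]);

// Apply the rules to a set of source triples and return the derived type triples.
function deriveTypes(source: Triple[]): Triple[] {
  const derived: Triple[] = [];
  for (const t of source) {
    const derivedClass = typeRules.get(t.predicate);
    if (derivedClass !== undefined) {
      derived.push({ subject: t.subject, predicate: "rdf:type", object: derivedClass });
    }
  }
  return derived;
}

// Delete delta from the source: only ex:propA is removed.
// ex:propB is assumed to still exist in the source, but the delta does not say so.
const deletedInSource: Triple[] = [
  { subject: "ex:subject", predicate: "ex:propA", object: "a" },
];

// Transforming the delta in isolation yields a delete for the type triple,
// although ex:propB would still justify `ex:subject a ex:SomeClass`.
const derivedDeletes = deriveTypes(deletedInSource);
```

Because the derived type triple is not a context triple, removing the context from the transformed graph does not catch this case.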
Solution
Some possibilities:
- Keep the current approach, keeping the mitigations in mind
- Add more complex filtering logic in the consumer code
  - Potentially very complex logic which duplicates a lot of the reasoning logic
- Do all mappings in JS code
  - Logic in one place
  - Might be very slow, especially for an initial sync
  - Keep the reasoner for the initial sync?
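As a rough idea of what the second option could involve, the sketch below drops a candidate delete whenever the same triple can still be derived from the data that remains in the source. Everything here is an assumption made for illustration: the `Triple` shape, the `keepUnjustifiedDeletes` helper, and in particular the availability of `derivedFromRemainingSource`, which the consumer would have to compute by re-running (part of) the rules.

```typescript
// Hypothetical triple shape used for illustration only.
interface Triple {
  subject: string;
  predicate: string;
  object: string;
}

// Serialize a triple so that two triples can be compared by value.
const tripleKey = (t: Triple): string =>
  JSON.stringify([t.subject, t.predicate, t.object]);

// Keep only the candidate deletes that are no longer derivable
// from what remains in the source after the delete.
function keepUnjustifiedDeletes(
  candidateDeletes: Triple[],
  derivedFromRemainingSource: Triple[],
): Triple[] {
  const stillDerived = new Set(derivedFromRemainingSource.map(tripleKey));
  return candidateDeletes.filter((t) => !stillDerived.has(tripleKey(t)));
}

// Deletes produced by transforming the incoming delta (assumed data).
const candidateDeletes: Triple[] = [
  { subject: "ex:subject", predicate: "rdf:type", object: "ex:SomeClass" },
  { subject: "ex:subject", predicate: "ex:label", object: "old" },
];

// Triples another rule still derives from the remaining source data (assumed data).
const derivedFromRemainingSource: Triple[] = [
  { subject: "ex:subject", predicate: "rdf:type", object: "ex:SomeClass" },
];

// The rdf:type triple is kept in the target; only ex:label is deleted.
const safeDeletes = keepUnjustifiedDeletes(candidateDeletes, derivedFromRemainingSource);
```

Computing `derivedFromRemainingSource` is where the duplication of reasoning logic noted in the list comes from.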