OP-Team: Data Quality Management report (WIP)

Requirements for reports that check data quality in the production environment.

Data quality (source)

The 6 main criteria used to measure data quality are:

  • Accuracy: the data should correctly describe whatever it represents.

  • Uniqueness: data that is supposed to be unique should actually be unique, with no duplicates.

  • Relevancy: the data should meet the requirements for the intended use.

  • Completeness: the data should not have missing values or missing records.

  • Timeliness: the data should be up to date.

  • Consistency: the data should be in the expected format and should give the same results when cross-referenced.

Possible actions:

Check whether we are doing everything we can to ensure that the new data users add to OP is of good quality.

  • Have we put enough validations in place to eliminate formatting issues and human error (e.g. non-existent addresses, phone numbers with missing digits, ...)? - Uniqueness, Accuracy & Consistency

  • Have we put enough restrictions in place to make sure users fill in all the relevant data that needs to be filled in? - Completeness

  • Do we have working agreements in place to ensure users help us keep the data quality high? - Relevancy, Timeliness

Queries created in Loket

Always check if we can copy existing queries from Loket.

Nr | Title | Description
1 | Bestuurseenheden zonder notificatie emailadres in Berichtencentrum | Report listing all bestuurseenheden without a notification email address in Berichtencentrum
2 | List mandatarissen having no person linked | Mandatarissen having no person linked
3 | List persons having two different first names | Persons with their first name and last name
4 | List mandatarissen having no start date | Mandatarissen with their first name and last name, role and bestuurseenheid
5 | List mandatarissen linked to an empty person | Mandatarissen and the URIs of the empty persons
6 | List mandatarissen having two end dates | Mandatarissen with their end dates, first name and last name
7 | List mandatarissen having two start dates | Mandatarissen with their start dates, first name and last name
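As an illustration of how a uniqueness report such as "mandatarissen having two start dates" could be expressed, here is a minimal sketch. It is not the existing Loket query, which should be copied where available. The class mandaat:Mandataris and the predicate mandaat:start are assumptions about how mandatarissen are modelled in the OP data and need to be verified against the actual model.

```sparql
PREFIX mandaat: <http://data.vlaanderen.be/ns/mandaat#>

# Sketch only: assumes mandatarissen are typed mandaat:Mandataris
# and that their start date is stored with mandaat:start.
SELECT ?mandataris (COUNT(DISTINCT ?start) AS ?startCount)
WHERE {
  ?mandataris a mandaat:Mandataris ;
    mandaat:start ?start .
}
GROUP BY ?mandataris
# Keep only mandatarissen that have more than one distinct start date.
HAVING (COUNT(DISTINCT ?start) > 1)
LIMIT 100
```

The same GROUP BY / HAVING pattern would apply to the "two end dates" report by swapping in the end-date predicate. The "no start date" report is a completeness check and would instead use FILTER NOT EXISTS on the start-date triple.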

Scope of administrative units

  • municipality (gemeente)

  • OCMW

  • district

  • province (provincie)
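A report can be limited to this scope by filtering on the classification of the administrative unit. The sketch below is an assumption-laden example: the predicate besluit:classificatie and the classification labels "Gemeente", "OCMW", "District" and "Provincie" are hypothetical and should be checked against the data in OP, which may use a different predicate or match on classification URIs rather than labels.

```sparql
PREFIX besluit: <http://data.vlaanderen.be/ns/besluit#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

# Sketch only: the classification predicate and label values are assumptions.
SELECT DISTINCT ?adminUnit ?label ?classificationLabel
WHERE {
  ?adminUnit a besluit:Bestuurseenheid ;
    skos:prefLabel ?label ;
    besluit:classificatie ?classification .
  ?classification skos:prefLabel ?classificationLabel .
  # Restrict to the four unit types that are in scope.
  FILTER (str(?classificationLabel) IN ("Gemeente", "OCMW", "District", "Provincie"))
}
LIMIT 100
```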

Queries created in tickets

OP-2321: bestuursorganen

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE {
  ?s a <http://data.vlaanderen.be/ns/besluit#Bestuursorgaan> ;
    <http://www.w3.org/2004/02/skos/core#prefLabel> ?label .
}

OP-1760: the query that returns any literal starting with a comma (takes a few seconds):

SELECT ?s ?p ?literal
WHERE {
  ?s ?p ?literal .
  FILTER (isLiteral(?literal) && strstarts(str(?literal), ","))
}

OP-2515: the query to go from the label Gent to its region:

PREFIX adms: <http://www.w3.org/ns/adms#>
PREFIX adres: <https://data.vlaanderen.be/ns/adres#>
PREFIX besluit: <http://data.vlaanderen.be/ns/besluit#>
PREFIX ch: <http://data.lblod.info/vocabularies/contacthub/>    
PREFIX code: <http://lblod.data.gift/vocabularies/organisatie/>
PREFIX dbpedia: <http://dbpedia.org/ontology/>
PREFIX dc_terms: <http://purl.org/dc/terms/>
PREFIX ere: <http://data.lblod.info/vocabularies/erediensten/>
PREFIX euro: <http://data.europa.eu/m8g/>
PREFIX euvoc: <http://publications.europa.eu/ontology/euvoc#>
PREFIX ext: <http://mu.semte.ch/vocabularies/ext/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX generiek: <https://data.vlaanderen.be/ns/generiek#>
PREFIX lblodlg: <https://data.lblod.info/vocabularies/leidinggevenden/>
PREFIX locn: <http://www.w3.org/ns/locn#>
PREFIX mandaat: <http://data.vlaanderen.be/ns/mandaat#>
PREFIX org: <http://www.w3.org/ns/org#>
PREFIX organisatie: <https://data.vlaanderen.be/ns/organisatie#>
PREFIX person: <http://www.w3.org/ns/person#>
PREFIX persoon: <https://data.vlaanderen.be/ns/persoon#>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX regorg: <http://www.w3.org/ns/regorg#>
PREFIX schema: <http://schema.org/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX vcard: <http://www.w3.org/2006/vcard/ns#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX geo: <http://www.opengis.net/ont/geosparql#>


SELECT distinct * {

?adminUnit skos:prefLabel "Gent";
   besluit:werkingsgebied ?location.

?location geo:sfWithin ?region.
?region 
   ext:werkingsgebiedNiveau "Referentieregio"@nl;
   rdfs:label ?regionLabel.

} LIMIT 50

Queries to be created

Overview is on SharePoint: DataQualityManagementReport_queries.xls, sheet 'Queries'

Business Rules to be discussed

Overview is on SharePoint: DataQualityManagementReport_queries.xls, sheet 'To discuss'
