Data Quality
Tim L edited this page Jun 19, 2013 · 40 revisions
Quality data...
- ... is structured similarly to dataset X using uniform vocabulary.
- ... is structured similarly to dataset X.
- ... I [dis]agree with.
- ... I understand.
- ... is complete.
- ... explicitly connects to the data currently portrayed in visual artifact X. (e.g. A book's two pages, currently visible)
- ... explicitly connects to the data portrayed in visual artifact X. (e.g. An entire book, yet to be opened)
- ... explicitly connects to dataset X.
- ... I find interesting.
- ... explicitly connects to other datasets. (i.e. TBL-5)
- ... is in RDF that I can retrieve as a dump.
- ... is in RDF that I can retrieve via SPARQL query.
- ... is in RDF that I can retrieve with dereferenceable URIs.
- ... is in RDF that I can retrieve.
- ... is in RDF. (i.e. TBL-4)
- ... I can retrieve and is machine processable using my own (or open) tools. (i.e. TBL-3)
- ... I can retrieve and is machine processable. (i.e. TBL-2)
- ... I may and can retrieve. (i.e. TBL-1)
- ... I may retrieve. (i.e. open)
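The TBL-1 through TBL-5 levels in the list above build on one another, so they can be read as a ladder. The following is a minimal sketch of that reading, not a definitive scoring method. The `DatasetFacts` record, its field names, and the `tbl_stars` function are hypothetical names introduced for illustration, and the sketch assumes that a level only counts when every lower level also holds.

```python
from dataclasses import dataclass


@dataclass
class DatasetFacts:
    """Hypothetical yes/no answers about one dataset (illustrative only)."""
    may_retrieve: bool = False             # I may retrieve it (open)
    can_retrieve: bool = False             # I can actually retrieve it
    machine_processable: bool = False      # TBL-2
    open_tools: bool = False               # TBL-3: my own (or open) tools suffice
    is_rdf: bool = False                   # TBL-4
    links_to_other_datasets: bool = False  # TBL-5


def tbl_stars(d: DatasetFacts) -> int:
    """Return the highest TBL level reached, assuming levels are cumulative."""
    checks = [
        d.may_retrieve and d.can_retrieve,  # TBL-1
        d.machine_processable,              # TBL-2
        d.open_tools,                       # TBL-3
        d.is_rdf,                           # TBL-4
        d.links_to_other_datasets,          # TBL-5
    ]
    stars = 0
    for ok in checks:
        if not ok:
            break  # a failed level blocks all higher levels
        stars += 1
    return stars


# An openly available, machine-processable dump that is not RDF.
csv_dump = DatasetFacts(may_retrieve=True, can_retrieve=True,
                        machine_processable=True, open_tools=True)
print(tbl_stars(csv_dump))  # 3
```

The finer-grained RDF criteria (dump, SPARQL, dereferenceable URIs) are left out of this sketch; they could be added as extra fields between TBL-4 and TBL-5.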
situate:
- uses vocabulary annotated with vocabulary annotations
- Number of triples for each dataset
- Number of datasets
- Number of interlinks [from each dataset [to each other dataset]]
- Accessible via data dump, accessible via SPARQL query, accessible via crawling
- For each property in a dataset, the number of extra-namespace URI values that are dereferenceable
- Density: Number of extra-namespace URI values in a dataset / The size of the dataset.
- Density: Number of extra-namespace URI values in a dataset / Number of instances of a given class in the dataset.
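The interlink count and the two density measures above can be sketched in a few lines. This is one possible reading under stated assumptions, not a fixed definition: triples are plain `(subject, predicate, object)` string tuples, a dataset's namespace is a URI prefix, "extra-namespace URI values" means object URIs outside that prefix (counted per occurrence), and "size of the dataset" means its number of triples. All function names and the `example.org` data are hypothetical.

```python
from collections import Counter

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def is_uri(value):
    # Crude test (assumption): treat http(s) strings as URIs, the rest as literals.
    return value.startswith(("http://", "https://"))


def extra_namespace_values(triples, namespace):
    """Object URIs that fall outside the dataset's own namespace."""
    return [o for _, _, o in triples if is_uri(o) and not o.startswith(namespace)]


def density_by_size(triples, namespace):
    """Extra-namespace URI values divided by the number of triples."""
    if not triples:
        return 0.0
    return len(extra_namespace_values(triples, namespace)) / len(triples)


def density_by_class(triples, namespace, cls):
    """Extra-namespace URI values divided by the number of instances of cls."""
    instances = {s for s, p, o in triples if p == RDF_TYPE and o == cls}
    if not instances:
        return 0.0
    return len(extra_namespace_values(triples, namespace)) / len(instances)


def interlink_counts(datasets):
    """Count links from each dataset to each other dataset.

    datasets maps a namespace prefix to that dataset's triples. Assumes no
    namespace is a prefix of another.
    """
    counts = Counter()
    for source, triples in datasets.items():
        for _, _, o in triples:
            for target in datasets:
                if target != source and is_uri(o) and o.startswith(target):
                    counts[(source, target)] += 1
    return counts


NS_A = "http://example.org/a/"
NS_B = "http://example.org/b/"
triples_a = [
    (NS_A + "alice", RDF_TYPE, NS_A + "Person"),
    (NS_A + "alice", NS_A + "knows", NS_A + "bob"),
    (NS_A + "alice", "http://www.w3.org/2002/07/owl#sameAs", NS_B + "alice"),
    (NS_A + "bob", RDF_TYPE, NS_A + "Person"),
]

print(density_by_size(triples_a, NS_A))                    # 0.25
print(density_by_class(triples_a, NS_A, NS_A + "Person"))  # 0.5
print(interlink_counts({NS_A: triples_a, NS_B: []})[(NS_A, NS_B)])  # 1
```

Checking whether the extra-namespace values are dereferenceable would need an HTTP request per URI and is left out here.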
Tummarello 2007
- Distribution of URIs over documents
Ding 2005
- Distribution of URIs over documents
- Interlinking
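A "distribution of URIs over documents" gauge can be approximated by counting, for each URI, how many documents mention it, and then tallying how many URIs share each count. The sketch below is only an illustration of that idea and is not taken from the cited works; the input shape (a mapping from document identifier to the URIs it mentions) and the function names are assumptions.

```python
from collections import Counter


def uri_document_counts(documents):
    """Map each URI to the number of documents that mention it.

    documents: dict of document id -> iterable of URIs (hypothetical shape).
    """
    counts = Counter()
    for uris in documents.values():
        counts.update(set(uris))  # count each URI once per document
    return counts


def distribution(counts):
    """Map 'number of documents' -> 'number of URIs appearing in that many'."""
    return Counter(counts.values())


docs = {
    "d1": ["u1", "u2", "u1"],
    "d2": ["u1"],
    "d3": ["u3"],
}
per_uri = uri_document_counts(docs)
print(per_uri["u1"])               # 2
print(dict(distribution(per_uri)))
```

In this toy input, `u1` appears in two documents while `u2` and `u3` each appear in one.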
Wang 2006
- Schema-level gauges