Ermilov's wiki.publicdata.eu CSV2RDF Application
Ermilov et al. presented a wiki-based approach to crowd-sourcing the enhancements of ~9k datasets listed at http://publicdata.eu (WebSci 2012 paper).
A year after its publication, how far has the crowd-sourcing come?
This page provides a summary and review of Ermilov's wiki.publicdata.eu CSV2RDF Application.
Four accounts contributed, and the two non-author accounts provided fewer than ten contributions.
find manual/pages -name "*.ttl" | xargs -L1 grep "wasAttributedTo" | sort -u shows only a handful of contributors:
prov:wasAttributedTo <http://wiki.publicdata.eu/wiki/User:178.25.43.32>;
prov:wasAttributedTo <http://wiki.publicdata.eu/wiki/User:2001:638:902:2010:0:168:35:101>;
prov:wasAttributedTo <http://wiki.publicdata.eu/wiki/User:Iermilov>;
prov:wasAttributedTo <http://wiki.publicdata.eu/wiki/User:IvanErmilov>;
prov:wasAttributedTo <http://wiki.publicdata.eu/wiki/User:Soeren>;
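The same tally can be reproduced in a few lines of Python. This is only a minimal sketch, not part of the original analysis: it assumes the grep output is available as a string, and the names GREP_OUTPUT, ATTRIBUTION, contributors, and is_ip_account are hypothetical. It separates accounts whose user name is an IP address (which is how MediaWiki attributes anonymous edits) from named accounts.

```python
import ipaddress
import re

# Sample input: the attribution lines listed above.
GREP_OUTPUT = """
prov:wasAttributedTo <http://wiki.publicdata.eu/wiki/User:178.25.43.32>;
prov:wasAttributedTo <http://wiki.publicdata.eu/wiki/User:2001:638:902:2010:0:168:35:101>;
prov:wasAttributedTo <http://wiki.publicdata.eu/wiki/User:Iermilov>;
prov:wasAttributedTo <http://wiki.publicdata.eu/wiki/User:IvanErmilov>;
prov:wasAttributedTo <http://wiki.publicdata.eu/wiki/User:Soeren>;
"""

# Capture the URI between angle brackets after prov:wasAttributedTo.
ATTRIBUTION = re.compile(r"prov:wasAttributedTo\s+<([^>]+)>")


def contributors(text):
    """Return the sorted, de-duplicated user names found in the attribution lines."""
    uris = set(ATTRIBUTION.findall(text))
    # Everything after the first "User:" is the wiki user name.
    return sorted(uri.split("User:", 1)[1] for uri in uris)


def is_ip_account(name):
    """True if the user name parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(name)
        return True
    except ValueError:
        return False


names = contributors(GREP_OUTPUT)
anonymous = [n for n in names if is_ip_account(n)]
named = [n for n in names if not is_ip_account(n)]
print("IP-address accounts:", anonymous)
print("named accounts:", named)
```

The sketch cannot tell whether two named accounts belong to the same person; that still requires manual inspection.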
Fifteen terms were reused from nine vocabularies for more than 9,000 datasets. We skip the three non-CURIEs listed below because it is not clear that they are RDF terms.
find manual/pages -name "*.xml.ttl" | xargs -L1 grep "conversion:label" | sed 's/conversion:label//' | grep : | sed 's/^ *"/"/' | grep -v " " | sort -u:
"cgov:fullTimeEquivalentSalary";
"cgov:lowerBound";
"cgov:upperBound";
"dce:date";
"foaf:mbox";
"foaf:name";
"foaf:phone";
"http://dbpedia.org/resource/Category:Ministerial_departments_of_the_United_Kingdom_Government";
"http://statistics.data.gov.uk/id/local-authority/32UC";
"http://www.google.co.uk";
"org:OrganizationalUnit";
"org:organization";
"org:unitOf";
"pc:supplier";
"rdf:type";
"rdfs:comment";
"skos:Amount";
"whois:Job";
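The term and vocabulary counts can be checked mechanically. The following is a rough sketch under stated assumptions: the labels above are copied into a list, a label counts as a CURIE when it has the form prefix:local and the colon is not followed by "//", and the names LABELS, CURIE, and split_curies are hypothetical.

```python
import re
from collections import Counter

# The conversion:label values listed above, without quotes and semicolons.
LABELS = [
    "cgov:fullTimeEquivalentSalary",
    "cgov:lowerBound",
    "cgov:upperBound",
    "dce:date",
    "foaf:mbox",
    "foaf:name",
    "foaf:phone",
    "http://dbpedia.org/resource/Category:Ministerial_departments_of_the_United_Kingdom_Government",
    "http://statistics.data.gov.uk/id/local-authority/32UC",
    "http://www.google.co.uk",
    "org:OrganizationalUnit",
    "org:organization",
    "org:unitOf",
    "pc:supplier",
    "rdf:type",
    "rdfs:comment",
    "skos:Amount",
    "whois:Job",
]

# Heuristic (an assumption): prefix, colon, local name, and no "//" after the colon.
CURIE = re.compile(r"^([A-Za-z][\w-]*):(?!//)(\S+)$")


def split_curies(labels):
    """Split labels into (prefix, local) CURIE pairs and everything else."""
    curies, others = [], []
    for label in labels:
        match = CURIE.match(label)
        if match:
            curies.append((match.group(1), match.group(2)))
        else:
            others.append(label)
    return curies, others


curies, others = split_curies(LABELS)
prefixes = Counter(prefix for prefix, _ in curies)
print("terms:", len(curies))
print("vocabularies:", len(prefixes))
print("skipped non-CURIEs:", len(others))
```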
- Enables community-editable mappings using an existing mechanism (MediaWiki).
- User-invokable reconversion.
Usability:
- The wiki page is hard to use because it is disconnected from the original and resulting data.
- The community has not used the tool, even though it has been available for a year.
Linked Data Best Practices:
- The mappings are not expressed in RDF; they are only expressed as MediaWiki template arguments (and as Sparqlify definitions behind the scenes, but those aren't available for public inspection).
- The mappings are not described with RDF, since it's just a wiki page. They do not refer back to the dataset that they enhance, and they do not refer to the resulting RDF conversion.
Mapping capabilities:
- It can't specify a datatype for a cell's value like conversion:range does (e.g. "85" is an xsd:integer).
- It can't "promote" a cell value to a URI like conversion:range does (e.g. the string "http://www.google.co.nz" becomes the resource <http://www.google.co.nz>).
- It can't type a URI to a given class like conversion:range_template/conversion:subclass_of do (e.g. http://www.google.co.nz is a sioc:Space).
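To make the three missing capabilities concrete, here is a small illustration of the RDF each one would produce for a single cell, written as N-Triples terms. It is a sketch under assumptions, not how csv2rdf4lod or the wiki application implements anything: the function names typed_literal, promote_to_uri, and type_assertion are hypothetical, and only the example values come from the list above.

```python
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SIOC_SPACE = "http://rdfs.org/sioc/ns#Space"


def typed_literal(value, datatype):
    """Render a cell value as an N-Triples literal with an explicit datatype."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"^^<{datatype}>'


def promote_to_uri(value):
    """Render a cell value as an N-Triples IRI instead of a plain string."""
    return f"<{value.strip()}>"


def type_assertion(value, cls):
    """Render a triple stating that the promoted URI is an instance of cls."""
    return f"{promote_to_uri(value)} <{RDF_TYPE}> <{cls}> ."


# 1. Datatype a cell value: "85" as an xsd:integer.
print(typed_literal("85", XSD_INTEGER))

# 2. Promote a cell value to a URI.
print(promote_to_uri("http://www.google.co.nz"))

# 3. Type the promoted URI as a sioc:Space.
print(type_assertion("http://www.google.co.nz", SIOC_SPACE))
```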