
One click data dump

timrdf edited this page Jan 7, 2013 · 35 revisions

What is first

What we will cover

Let's get to it!

cr-full-dump.sh gathers all versioned dataset dump files into a single gzipped N-Triples file that contains all of the RDF data in a csv2rdf4lod node. The URL of this "one-click data download" is intended to appear in the VoID description of the csv2rdf4lod node (e.g. http://healthdata.tw.rpi.edu/void.ttl). TODO: cron already creates the file, but it is not yet published and not yet mentioned in the VoID file.
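To make the idea of "gathering dump files into a single gzipped N-Triples file" concrete, here is a minimal Python sketch. It is not how cr-full-dump.sh is implemented; it only illustrates the technique, assuming the per-version dump files are already serialized as N-Triples. The function name `build_full_dump` and the file names in the usage note are hypothetical.

```python
import gzip


def build_full_dump(dump_files, out_path):
    """Concatenate N-Triples files into one gzipped N-Triples file.

    Assumption: every input file is already valid N-Triples. Because
    N-Triples is line-based (one triple per line), concatenating the
    files line by line yields a valid N-Triples file.

    Returns the number of triple lines written.
    """
    count = 0
    with gzip.open(out_path, "wt", encoding="utf-8") as out:
        for path in dump_files:
            with open(path, encoding="utf-8") as src:
                for line in src:
                    stripped = line.strip()
                    # Skip blank lines and comment lines.
                    if not stripped or stripped.startswith("#"):
                        continue
                    out.write(stripped + "\n")
                    count += 1
    return count
```

For example, `build_full_dump(["a.nt", "b.nt"], "full-dump.nt.gz")` would write both files' triples into one compressed file. The sketch does not rename blank nodes, so blank node labels reused across input files would be merged; a real aggregation would need to handle that.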

Produced by aggregate-source-rdf.sh:

<http://purl.org/twc/health/void>
   void:subset <http://purl.org/twc/health/source/healthdata-tw-rpi-edu/dataset/cr-full-dump/version/latest> .

<http://purl.org/twc/health/source/healthdata-tw-rpi-edu/dataset/cr-full-dump/version/latest>
   a void:Dataset;
   void:dataDump <SOME_FILE>
.

We want to be sneaky and include two more things as part of the dataset itself: another void:dataDump triple, attached directly to the top-level VoID dataset, and a void:inDataset reference from every URI to the dump dataset:

<http://purl.org/twc/health/void>
   void:dataDump <SOME_FILE>
.
<cowboy>        void:inDataset <http://purl.org/twc/health/source/healthdata-tw-rpi-edu/dataset/cr-full-dump> .
dbpedia:Montana void:inDataset <http://purl.org/twc/health/source/healthdata-tw-rpi-edu/dataset/cr-full-dump> .
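The void:inDataset references above could be generated mechanically from the dump. The following Python sketch shows one possible way, under two loudly stated assumptions: the input is N-Triples, and only URIs in the subject position are considered (the text says "every URI", so object URIs would need additional parsing). The function name `in_dataset_triples` is hypothetical and is not part of csv2rdf4lod.

```python
# The VoID vocabulary namespace is http://rdfs.org/ns/void#
VOID_IN_DATASET = "http://rdfs.org/ns/void#inDataset"


def in_dataset_triples(ntriples_lines, dataset_uri):
    """Yield one void:inDataset triple per distinct subject URI.

    Assumption: each input line is a single N-Triples statement.
    Lines that do not start with '<' (blank lines, comments, and
    statements with blank-node subjects) are skipped.
    """
    seen = set()
    for line in ntriples_lines:
        line = line.strip()
        if not line.startswith("<"):
            continue
        end = line.find(">")
        if end == -1:
            continue  # malformed line; ignore it in this sketch
        subject = line[1:end]
        if subject in seen:
            continue  # emit each subject only once
        seen.add(subject)
        yield f"<{subject}> <{VOID_IN_DATASET}> <{dataset_uri}> ."
```

Called with the lines of the full dump and the cr-full-dump dataset URI, this yields triples of the same shape as the dbpedia:Montana example, written out with full URIs instead of prefixed names.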

What is next
