SDV organization
Tim L edited this page Mar 15, 2014 · 63 revisions
SDV organization uses three aspects of a dataset ("source", "dataset", and "version") to organize:
- ... the many datasets that a source may have, and
- ... the many versions that a source may issue for a particular dataset.
Definitions for each of the three aspects:
- Source, the agent (person, organization) providing the dataset.
- Dataset, an abstract portion of all the agent’s data.
- Version, a concrete portion of an agent’s abstract dataset.
The following pages describe the basics of applying "SDV" organization to organize others' datasets.
- Conversion process phase: name
- Conversion process phase: retrieve
- When using the file system to organize data, we use some Directory Conventions.
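As a rough illustration of how the three aspects can map onto a directory tree, the sketch below composes a path from a conversion root and the SDV identifiers. It assumes the layout visible in the `--cr-dataset-dir` examples later on this page (`<root>/<source-id>/<dataset-id>/version/<version-id>`); the function name `sdv_dir` is hypothetical and is not part of csv2rdf4lod.

```python
from pathlib import PurePosixPath


def sdv_dir(conversion_root, source_id, dataset_id=None, version_id=None):
    """Compose a directory path for a source, dataset, or version.

    Assumed layout (inferred from the example paths on this page):
        <conversion_root>/<source_id>/<dataset_id>/version/<version_id>
    """
    path = PurePosixPath(conversion_root) / source_id
    if dataset_id is not None:
        path = path / dataset_id
        if version_id is not None:
            # Versions sit beneath a literal "version" directory.
            path = path / "version" / version_id
    return path


root = "/Users/me/projects/twc-ieeevis/data/source"
print(sdv_dir(root, "ieeevis-tw-rpi-edu", "data-carves", "experiment-1"))
# /Users/me/projects/twc-ieeevis/data/source/ieeevis-tw-rpi-edu/data-carves/version/experiment-1
```

Omitting `version_id` (or `dataset_id`) yields the dataset-level (or source-level) directory.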
The following pages describe how SDV organization is used to automatically create new dataset versions.
- Automated creation of a new Versioned Dataset
- Aggregating subsets of converted datasets
- Secondary Derivative Datasets
- Triggers
Other systems may follow the SDV organization, and receive the aspects using parameters.
- Invoking csv2rdf4lod was the original "SDV situated" process.
  - Uses a shell variable to determine the converter, defaults to edu.rpi.tw.data.csv.CSVtoRDF (here)
  - Invokes it as $csv2rdf with a pile of arguments (here)
- Situating a FAqT Brick into csv2rdf4lod automation
- Situating a visual strategy into csv2rdf4lod automation
  - visual-artifact-uri= with value from "cr-dataset-uri.sh --uri"
  - VSR's Content augmentation
- Situating a data carver session into csv2rdf4lod automation uses the following input arguments:
  - --cr-base-uri=http://ieeevis.tw.rpi.edu (== CSV2RDF4LOD_BASE_URI)
  - --cr-conversion-root=/Users/me/projects/twc-ieeevis/data/source (--cr-data-root is a synonym.)
  - --cr-source-id=ieeevis-tw-rpi-edu
  - --cr-dataset-id=data-carves
  - --cr-version-id=experiment-1 (optional; if omitted, the called system should provide a default)
  - Or, --cr-source-id, --cr-dataset-id, and --cr-version-id can be packed into the param:
    - --cr-dataset-dir=/Users/me/projects/twc-ieeevis/data/source/ieeevis-tw-rpi-edu
    - --cr-dataset-dir=/Users/me/projects/twc-ieeevis/data/source/ieeevis-tw-rpi-edu/data-carves
    - --cr-dataset-dir=/Users/me/projects/twc-ieeevis/data/source/ieeevis-tw-rpi-edu/data-carves/version
    - --cr-dataset-dir=/Users/me/projects/twc-ieeevis/data/source/ieeevis-tw-rpi-edu/data-carves/version/experiment-1
    - --cr-dataset-dir=/Users/me/projects/twc-ieeevis/data/source/ieeevis-tw-rpi-edu/data-carves/version/experiment-1/source
    - --cr-dataset-dir=/Users/me/projects/twc-ieeevis/data/source/ieeevis-tw-rpi-edu/data-carves/version/experiment-1/manual
    - --cr-dataset-dir=/Users/me/projects/twc-ieeevis/data/source/ieeevis-tw-rpi-edu/data-carves/version/experiment-1/automatic
    - --cr-dataset-dir=/Users/me/projects/twc-ieeevis/data/source/ieeevis-tw-rpi-edu/data-carves/version/experiment-1/source/part-1
- Accepting SDV params via XSL:
  - <xsl:param name="cr-base-uri" select="'http://my.com'"/>
  - <xsl:param name="cr-source-id" select="'epa-gov'"/>
  - <xsl:param name="cr-dataset-id" select="'some-dataset'"/>
  - <xsl:param name="cr-version-id" select="'latest'"/>
- Passing SDV params to XSL: cr-sdv.sh --attribute-value
- OPeNDAP += PROV [pingback]
  - Prov.cr_base_uri, Prov.cr_data_root, Prov.cr_source_id, Prov.cr_dataset_id, Prov.cr_dataset_dir
  - https://github.com/tetherless-world/opendap/wiki/Use-case:-mockup-tracer#wiki-processing-data-from-opendap-using-http
  - https://github.com/tetherless-world/opendap/wiki/OPeNDAP-PROV-Module#wiki-configuration
  - Has need to specify a version-naming template.
- SemantEco Annotator
  - TBD
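A system that accepts the packed `--cr-dataset-dir` form needs to recover the individual identifiers from the path. The following is a minimal sketch of one way to do that, assuming the layout implied by the `--cr-dataset-dir` examples above (`<conversion-root>/<source-id>/<dataset-id>/version/<version-id>/...`). The function name `unpack_dataset_dir` is hypothetical; it is not how csv2rdf4lod itself does the unpacking.

```python
from pathlib import PurePosixPath


def unpack_dataset_dir(conversion_root, dataset_dir):
    """Split a --cr-dataset-dir value into (source_id, dataset_id, version_id).

    Aspects that the path does not reach are returned as None, so the
    caller can supply its own default (for example, for the version).
    Raises ValueError if dataset_dir is not beneath conversion_root.
    """
    parts = PurePosixPath(dataset_dir).relative_to(conversion_root).parts
    source_id = parts[0] if len(parts) > 0 else None
    dataset_id = parts[1] if len(parts) > 1 else None
    # The version id is only present after the literal "version" directory.
    has_version = len(parts) > 3 and parts[2] == "version"
    version_id = parts[3] if has_version else None
    return source_id, dataset_id, version_id


root = "/Users/me/projects/twc-ieeevis/data/source"
print(unpack_dataset_dir(
    root,
    root + "/ieeevis-tw-rpi-edu/data-carves/version/experiment-1/source/part-1",
))
# ('ieeevis-tw-rpi-edu', 'data-carves', 'experiment-1')
```

Anything deeper than the version directory (`source`, `manual`, `automatic`, and so on) is ignored here, since those segments do not change which source, dataset, and version the directory belongs to.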