
Script: pcurl.py

timrdf edited this page Jan 3, 2012 · 46 revisions

$CSV2RDF4LOD_HOME/bin/util/pcurl.py is Jim McCusker's reimplementation of pcurl.sh that adds FRBR stacks and HTTP-in-RDF. He has included it as part of csv2rdf4lod-automation. Applications of this utility are described in the following publications:

Usage

bash-3.2$ pcurl.py --help
usage: pcurl.py [--help|-h] [--format|-f xml|turtle|n3|nt] [url ...]

Download a URL and compute Functional Requirements for Bibliographic Resources
(FRBR) stacks using cryptograhic digests for the resulting content.

Refer to http://purl.org/twc/pub/mccusker2012parallel
for more information and examples.

optional arguments:
 url            url to compute a FRBR stack for.
 -h, --help     Show this help message and exit,
 -f, --format   File format for FRBR stacks. One of xml, turtle, n3, or nt.

Example

The following command will retrieve this web page and store it to a file in your current directory.

bash-3.2$ pcurl.py https://github.com/timrdf/csv2rdf4lod-automation/wiki/Script:-pcurl.py

Alongside the retrieved file, the script writes a second file describing its provenance:

bash-3.2$ ls
Script:-pcurl.py		Script:-pcurl.py.prov.ttl
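
The idea of pairing a download with a record of where it came from and what its bytes were can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how pcurl.py is implemented: the helper `save_with_digest`, the `.digest.json` sidecar name, its fields, and the choice of SHA-256 are all hypothetical, and the real script writes an RDF file such as the `.prov.ttl` shown above rather than JSON.

```python
import hashlib
import json
from pathlib import Path


def save_with_digest(content: bytes, path: Path, source_url: str) -> Path:
    """Write content to path, plus a hypothetical JSON sidecar recording its digest."""
    path.write_bytes(content)
    record = {
        "source": source_url,                           # where the bytes came from
        "sha256": hashlib.sha256(content).hexdigest(),  # digest of the exact bytes saved
    }
    # Hypothetical naming scheme; pcurl.py itself appends ".prov.ttl".
    sidecar = path.with_name(path.name + ".digest.json")
    sidecar.write_text(json.dumps(record, indent=2))
    return sidecar
```

Because the digest is computed from the content rather than the file name, the record can still be matched to the file after the file is renamed or moved.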

If something happens to the file you retrieved (e.g., it is renamed or moved), $CSV2RDF4LOD_HOME/bin/util/fstack.py can be used to recognize the association between the file as originally downloaded and the file as it exists now:

bash-3.2$ mv Script\:-pcurl.py script-pcurl.html
bash-3.2$ fstack.py script-pcurl.html
bash-3.2$ ls script-pcurl.html*
script-pcurl.html		script-pcurl.html.prov.ttl
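
The reason a renamed file can still be associated with the original download is that a cryptographic digest depends only on the file's bytes, not on its name or location. The following minimal sketch demonstrates that property. It does not reproduce fstack.py: `content_digest` and `digest_survives_rename` are hypothetical helpers, and SHA-256 is assumed here for illustration only.

```python
import hashlib
import os
import tempfile


def content_digest(path: str) -> str:
    """Return the SHA-256 hex digest of a file's bytes, reading in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def digest_survives_rename() -> bool:
    """Create a file, rename it, and check that its digest is unchanged."""
    with tempfile.TemporaryDirectory() as d:
        old = os.path.join(d, "downloaded-page")
        new = os.path.join(d, "script-pcurl.html")
        with open(old, "wb") as f:
            f.write(b"<html>example content</html>")
        before = content_digest(old)
        os.rename(old, new)  # same bytes, different name
        return before == content_digest(new)
```

Calling `digest_survives_rename()` returns `True`, since renaming leaves the content untouched.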
