Ping the Semantic Web
- It is natural to want to announce any new datasets that you publish (see Conversion process phase: publish).
This page describes how to use http://pingthesemanticweb.com and http://sindice.com/developers/pingApi. Both services accept pointers to Linked Data URIs or files and index them, so that it is easier for others to find your data. Sindice seems to be the leader in this capability.
There are two platforms that accept announcements of semantic web data: Ping the Semantic Web and Sindice.
$CSV2RDF4LOD_HOME/bin/util/ptsw.sh returns the URLs required to notify Ping the Semantic Web about the RDF documents listed.
The ping must be sent from a machine whose IP address is registered at http://pingthesemanticweb.com
http://homepages.rpi.edu/~lebot/lod-links/state-fips-dbpedia.ttl
curl http://pingthesemanticweb.com/rest/?url=http%3A%2F%2Fhomepages.rpi.edu%2F~lebot%2Flod-links%2Fstate-fips-dbpedia.ttl
<response>
<message>Thanks for pinging Ping the Semantic Web.</message>
<flerror>0</flerror>
</response>
http://homepages.rpi.edu/~lebot/lebot.foaf
curl http://pingthesemanticweb.com/rest/?url=http%3A%2F%2Fhomepages.rpi.edu%2F~lebot%2Flebot.foaf
<response>
<message>Thanks for pinging Ping the Semantic Web.</message>
<flerror>0</flerror>
</response>
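The url parameter in the requests above is the percent-encoded address of the RDF document. A minimal POSIX shell sketch of building such a request URL follows. The helper names urlencode and ptsw_ping_url are hypothetical (they are not part of csv2rdf4lod-automation), and the sketch assumes ASCII-only addresses. It only prints the request URL; nothing is sent.

```shell
# Hypothetical helpers; not part of csv2rdf4lod-automation.

# Percent-encode every character outside the RFC 3986 unreserved set.
# Assumes an ASCII-only input string. The function body runs in a
# subshell, so its variables do not leak into the caller.
urlencode() (
  LC_ALL=C
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}        # everything after the first character
    c=${s%"$rest"}     # the first character
    case $c in
      [a-zA-Z0-9.~_-]) out=$out$c ;;
      *) out=$out$(printf '%%%02X' "'$c") ;;
    esac
    s=$rest
  done
  printf '%s\n' "$out"
)

# Build the Ping the Semantic Web request URL for one RDF document.
ptsw_ping_url() {
  printf 'http://pingthesemanticweb.com/rest/?url=%s\n' "$(urlencode "$1")"
}

ptsw_ping_url 'http://homepages.rpi.edu/~lebot/lebot.foaf'
# prints http://pingthesemanticweb.com/rest/?url=http%3A%2F%2Fhomepages.rpi.edu%2F~lebot%2Flebot.foaf
```

The printed URL can then be passed to curl from a registered machine, e.g. curl "$(ptsw_ping_url "$doc")", where doc is a placeholder variable holding the document address.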
Pinging did not work for:
- https://raw.github.com/timrdf/csv2rdf4lod-automation/master/doc/instances/person/lebot.foaf (because of the scheme? the domain? It is the same file as above.)
- http://logd.tw.rpi.edu/source/twc-rpi-edu/file/iogds/version/2011-Nov-15/conversion/twc-rpi-edu-iogds-2011-Nov-15.void.ttl
<response>
<message>Ping the Semantic Web is not allowed to index this URL.</message>
<flerror>1</flerror>
</response>
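In the responses shown on this page, an accepted ping carries flerror 0 and a refused one carries flerror 1. A small sketch of checking that flag in a script is given below. The helper name ptsw_ping_ok is hypothetical, and the check assumes the response always has the shape shown here.

```shell
# Hypothetical helper: succeeds only if the response read from stdin
# contains <flerror>0</flerror>, as in the accepted pings shown above.
ptsw_ping_ok() {
  grep -q '<flerror>0</flerror>'
}

# The refused response from this page, used as sample input.
refused='<response>
<message>Ping the Semantic Web is not allowed to index this URL.</message>
<flerror>1</flerror>
</response>'

if printf '%s\n' "$refused" | ptsw_ping_ok; then
  echo 'ping accepted'
else
  echo 'ping refused'
fi
# prints: ping refused
```

Against the live service, the same check could be fed from curl, e.g. curl -s "$ping_url" | ptsw_ping_ok, where ping_url is a placeholder for the request URL.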
http://sindice.com/developers/pingApi
curl -H "Accept: text/plain" --data-binary http://healthdata.tw.rpi.edu/source/healthdata-tw-rpi-edu/dataset/cr-linksets/version/2013-Jan-08 http://api.sindice.com/v2/ping
1 pings submitted, 1 accepted
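The reply line reports how many pings were submitted and how many were accepted. A hedged sketch of checking that every submitted ping was accepted follows. The helper name sindice_all_accepted is hypothetical, and it assumes the reply always has the form "N pings submitted, M accepted" seen above.

```shell
# Hypothetical helper: reads a reply such as "1 pings submitted, 1 accepted"
# from stdin. Exit status 0 means all pings were accepted, 1 means some were
# not, and 2 means the reply did not have the expected form.
sindice_all_accepted() (
  # Fields: <submitted> pings submitted, <accepted> accepted
  read -r submitted _w1 _w2 accepted _rest || :
  for n in "$submitted" "$accepted"; do
    case $n in
      '' | *[!0-9]*) exit 2 ;;
    esac
  done
  [ "$submitted" -gt 0 ] && [ "$submitted" -eq "$accepted" ]
)

printf '%s\n' '1 pings submitted, 1 accepted' | sindice_all_accepted && echo 'all accepted'
# prints: all accepted
```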
See what Sindice indexed in the last week:
- http://sindice.com/search?q=date:last_week+domain:purl.org/twc/health
- http://sindice.com/search?q=date:last_week+domain:healthdata.tw.rpi.edu
- http://sindice.com/search?q=domain%3Aieeevis.tw.rpi.edu&nq=&fq=date%3Alast_week
http://sindice.com/developers/publishing
cr-pingback.sh can be run from a csv2rdf4lod [data root](csv2rdf4lod automation data root) or from a specific source organization directory (e.g. data/source or data/source/us, respectively); it is also run by cr-cron.sh. It dereferences the "/void" path of the data domain (e.g. http://opendap.tw.rpi.edu/void) and POSTs it to update datahub.io's dataset record. The [environment variable](CSV2RDF4LOD environment variables) CSV2RDF4LOD_PUBLISH_DATAHUB_METADATA_OUR_BUBBLE_ID determines which dataset on datahub.io is updated; e.g. the value twc-ieeevis causes the dataset http://datahub.io/dataset/twc-ieeevis to be updated. The DataFAQs script add-metadata.py processes the VoID description into the JSON structure that CKAN requires and that satisfies the additional requirements for being listed in the lodcloud group.
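The two URL conventions in the paragraph above can be illustrated with a short sketch. The helper names void_url and datahub_dataset_url are hypothetical and are not taken from cr-pingback.sh; they only restate the naming pattern. The two sample values are the separate examples from the text and do not necessarily belong to the same dataset.

```shell
# Hypothetical helpers that restate the URL patterns described above.

# "/void" path of a data domain; a trailing slash on the domain is dropped.
void_url() {
  printf '%s/void\n' "${1%/}"
}

# datahub.io dataset page for a given dataset identifier.
datahub_dataset_url() {
  printf 'http://datahub.io/dataset/%s\n' "$1"
}

CSV2RDF4LOD_PUBLISH_DATAHUB_METADATA_OUR_BUBBLE_ID=twc-ieeevis

void_url 'http://opendap.tw.rpi.edu'
# prints http://opendap.tw.rpi.edu/void
datahub_dataset_url "$CSV2RDF4LOD_PUBLISH_DATAHUB_METADATA_OUR_BUBBLE_ID"
# prints http://datahub.io/dataset/twc-ieeevis
```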
- csv2rdf4lod-automation does not announce new datasets by default; announcing must be enabled by setting the CSV2RDF4LOD environment variables CSV2RDF4LOD_PUBLISH_ANNOUNCE_TO_SINDICE=true and/or CSV2RDF4LOD_PUBLISH_ANNOUNCE_TO_PTSW=true.
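As a sketch, the two variables can be exported in the environment that runs the conversion scripts. The function announce_targets below is a hypothetical illustration of an opt-in check and is not the project's own code; it treats anything other than the value true as disabled, which is an assumption.

```shell
# Opt in to announcements; where these lines live (shell profile, a sourced
# settings file, ...) depends on the local setup.
export CSV2RDF4LOD_PUBLISH_ANNOUNCE_TO_SINDICE=true
export CSV2RDF4LOD_PUBLISH_ANNOUNCE_TO_PTSW=true

# Hypothetical check: print the services that are currently enabled,
# one per line. Unset or non-"true" values count as disabled.
announce_targets() {
  if [ "${CSV2RDF4LOD_PUBLISH_ANNOUNCE_TO_SINDICE:-false}" = true ]; then
    echo sindice
  fi
  if [ "${CSV2RDF4LOD_PUBLISH_ANNOUNCE_TO_PTSW:-false}" = true ]; then
    echo ptsw
  fi
}

announce_targets
# prints:
# sindice
# ptsw
```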