Ping the Semantic Web

Tim L edited this page Jan 14, 2014 · 32 revisions
csv2rdf4lod-automation is licensed under the [Apache License, Version 2.0](https://github.com/timrdf/csv2rdf4lod-automation/wiki/License)

What is first

What we will cover

This page describes Ping the Semantic Web (http://pingthesemanticweb.com) and the Sindice Ping API (http://sindice.com/developers/pingApi), and how to use them. Both services accept pointers to Linked Data URIs or RDF files and index them, which makes it easier for others to find your data. Sindice seems to be the leader in this capability.

Let's get to it!

There are two platforms that accept announcements of semantic web data: Ping the Semantic Web and Sindice.

Platform 1 of 2: Ping the Semantic Web

$CSV2RDF4LOD_HOME/bin/util/ptsw.sh returns the URLs required to notify Ping the Semantic Web about the RDF documents listed.

Pings must be sent from a machine whose IP address is registered at http://pingthesemanticweb.com
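As a rough illustration of what such a ping URL looks like, the sketch below percent-encodes an RDF document URL and appends it to the REST endpoint shown in the examples on this page. The helper names `urlencode` and `ptsw_ping_url` are hypothetical and are not part of ptsw.sh; the sketch assumes plain ASCII URLs and is not a description of how ptsw.sh is actually implemented.

```shell
# Hypothetical helper: percent-encode a string for use as a query value.
# Letters, digits, '-', '.', '_' and '~' are kept; everything else is encoded.
urlencode() (
  LC_ALL=C
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}          # everything after the first character
    c=${s%"$rest"}       # the first character itself
    s=$rest
    case $c in
      [a-zA-Z0-9.~_-]) out=$out$c ;;
      *) out=$out$(printf '%%%02X' "'$c") ;;
    esac
  done
  printf '%s\n' "$out"
)

# Hypothetical helper: build the ping URL for one RDF document URL.
ptsw_ping_url() {
  printf 'http://pingthesemanticweb.com/rest/?url=%s\n' "$(urlencode "$1")"
}

ptsw_ping_url 'http://homepages.rpi.edu/~lebot/lebot.foaf'
```

The printed URL can then be passed to curl (quoted, so the shell does not treat `?` as a wildcard).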

Success 1

http://homepages.rpi.edu/~lebot/lod-links/state-fips-dbpedia.ttl

curl 'http://pingthesemanticweb.com/rest/?url=http%3A%2F%2Fhomepages.rpi.edu%2F~lebot%2Flod-links%2Fstate-fips-dbpedia.ttl'
<response>
	<message>Thanks for pinging Ping the Semantic Web.</message>
	<flerror>0</flerror>
</response>

Success 2

http://homepages.rpi.edu/~lebot/lebot.foaf

curl 'http://pingthesemanticweb.com/rest/?url=http%3A%2F%2Fhomepages.rpi.edu%2F~lebot%2Flebot.foaf'
<response>
	<message>Thanks for pinging Ping the Semantic Web.</message>
	<flerror>0</flerror>
</response>

Failures

Some URLs would not work; a rejected ping returns:

<response>
	<message>Ping the Semantic Web is not allowed to index this URL.</message>
	<flerror>1</flerror>
</response>
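The sample responses suggest that `<flerror>0</flerror>` marks an accepted ping and `<flerror>1</flerror>` a rejected one. Assuming that reading is right, a script could check the result with something like the sketch below; `ptsw_ping_ok` is a hypothetical helper, and the response is hard-coded here rather than fetched.

```shell
# Hypothetical helper: succeed only when the response reports flerror 0.
ptsw_ping_ok() {
  printf '%s\n' "$1" | grep -q '<flerror>0</flerror>'
}

# Sample response copied from the success case on this page.
response='<response>
	<message>Thanks for pinging Ping the Semantic Web.</message>
	<flerror>0</flerror>
</response>'

if ptsw_ping_ok "$response"; then
  echo "ping accepted"
else
  echo "ping rejected"
fi
```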

Platform 2 of 2: Sindice

http://sindice.com/developers/pingApi

curl -H "Accept: text/plain" --data-binary http://healthdata.tw.rpi.edu/source/healthdata-tw-rpi-edu/dataset/cr-linksets/version/2013-Jan-08 http://api.sindice.com/v2/ping

1 pings submitted, 1 accepted
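To announce several datasets, the same request can be repeated once per URL. The sketch below is one way to do that; `sindice_ping` and the `SINDICE_PING_RUN` switch are hypothetical names, and by default the function only prints the curl commands so they can be reviewed before anything is sent.

```shell
# Hypothetical helper: one Sindice ping per URL argument.
# Prints the commands unless SINDICE_PING_RUN=1 is set (an assumed switch).
sindice_ping() {
  for url in "$@"; do
    if [ "${SINDICE_PING_RUN:-0}" = 1 ]; then
      curl -H "Accept: text/plain" --data-binary "$url" http://api.sindice.com/v2/ping
    else
      printf 'curl -H "Accept: text/plain" --data-binary %s http://api.sindice.com/v2/ping\n' "$url"
    fi
  done
}

sindice_ping 'http://healthdata.tw.rpi.edu/source/healthdata-tw-rpi-edu/dataset/cr-linksets/version/2013-Jan-08'
```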

See what Sindice has indexed in the last week:

http://sindice.com/developers/publishing

cr-pingback

cr-pingback.sh can be run from a csv2rdf4lod data root (e.g. data/source) or from a specific source organization directory (e.g. data/source/us). It dereferences the "/void" path of the data domain (e.g. http://opendap.tw.rpi.edu/void) and POSTs the result to datahub.io to update the corresponding dataset. That dataset is determined by the [environment variable](CSV2RDF4LOD environment variables) CSV2RDF4LOD_PUBLISH_DATAHUB_METADATA_OUR_BUBBLE_ID (e.g. twc-ieeevis for http://datahub.io/dataset/twc-ieeevis).
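The two URLs involved can be pictured with the sketch below. `void_url` and `datahub_dataset_url` are hypothetical helpers written only to show the pattern in the examples above; cr-pingback.sh may derive these values differently, and the POST itself is not shown.

```shell
# Hypothetical helper: the "/void" path of a data domain.
void_url() {
  # Drop a trailing slash, if any, before appending /void.
  printf '%s/void\n' "${1%/}"
}

# Hypothetical helper: the datahub.io dataset page for a given ID.
datahub_dataset_url() {
  printf 'http://datahub.io/dataset/%s\n' "$1"
}

# Example values taken from this page; they are independent examples.
CSV2RDF4LOD_PUBLISH_DATAHUB_METADATA_OUR_BUBBLE_ID=twc-ieeevis
void_url 'http://opendap.tw.rpi.edu'
datahub_dataset_url "$CSV2RDF4LOD_PUBLISH_DATAHUB_METADATA_OUR_BUBBLE_ID"
```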

What is next

  • csv2rdf4lod-automation does not announce new datasets by default; announcing must be enabled by setting the CSV2RDF4LOD environment variables CSV2RDF4LOD_PUBLISH_ANNOUNCE_TO_SINDICE=true and/or CSV2RDF4LOD_PUBLISH_ANNOUNCE_TO_PTSW=true.
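For example, the variables could be exported in the shell session that runs the conversion and publishing scripts. Where they are best set (for instance, in a project's own environment script) is an assumption that depends on the local setup.

```shell
# Opt in to announcing newly published datasets (both are off by default).
export CSV2RDF4LOD_PUBLISH_ANNOUNCE_TO_SINDICE=true
export CSV2RDF4LOD_PUBLISH_ANNOUNCE_TO_PTSW=true
```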
