
Querying COCONUT[KG]

COCONUT[KG] supports a variety of queries over a large number of natural products. You can query COCONUT[KG] online through a SPARQL endpoint.

SPARQL endpoint

A SPARQL endpoint is available at https://coconut-kg.aksw.org/sparql. Both the back end and the front end are provided by OpenLink Virtuoso: the back end runs the SPARQL engine, and the front end is an HTTP/SPARQL server with nginx layered on top.

NOTE: We recommend reading this documentation before using the SPARQL endpoint.

SPARQL endpoint details

Endpoint type

The SPARQL endpoint is provided by OpenLink Software's Virtuoso.

Rates and limitations

The endpoint limits the result size, the query time, and the number of parallel connections. The current settings are:

        ResultSetMaxRows           = 25000
        MaxQueryExecutionTime      =   600  (seconds)
        MaxQueryCostEstimationTime =   400  (seconds)
        Connection limit           =    10  (parallel connections per IP address)

ATTENTION: The result size is currently limited to 25000 rows. Larger results are truncated silently: the partial result is returned as if it were complete, and no HTTP error is raised.
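Because truncation is silent, one way to retrieve larger results is to fetch them in pages with LIMIT and OFFSET. The following is a minimal sketch, not part of the endpoint's own tooling: the names paged_query, fetch_all and fetch_page are hypothetical, and the page size of 10000 is an arbitrary value below the 25000-row cap. It assumes the base query has no LIMIT of its own and ends with an ORDER BY clause, since paging without a fixed order is not guaranteed to return each row exactly once.

```python
from typing import Callable, Dict, Iterator, List

Row = Dict[str, dict]


def paged_query(base_query: str, limit: int, offset: int) -> str:
    """Append LIMIT/OFFSET to a query that has no LIMIT of its own."""
    return f"{base_query.rstrip()}\nLIMIT {limit}\nOFFSET {offset}"


def fetch_all(
    fetch_page: Callable[[str], List[Row]],
    base_query: str,
    page_size: int = 10000,  # assumption: any value below the row cap
) -> Iterator[Row]:
    """Yield all rows by requesting one page after another.

    fetch_page is a caller-supplied function (hypothetical) that sends
    a query string to the endpoint and returns the list of bindings.
    """
    offset = 0
    while True:
        rows = fetch_page(paged_query(base_query, page_size, offset))
        yield from rows
        # A short (or empty) page means there is nothing left to fetch.
        if len(rows) < page_size:
            break
        offset += page_size
```

With SPARQLWrapper, fetch_page could be a small function that calls setQuery with the given string and returns the "bindings" list of the converted result. Keep the connection limit in mind and request the pages one after another rather than in parallel.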

SPARQL example queries

The following query returns the formula, name, molecular weight, and SMILES string of up to 10 compounds:

PREFIX coco: <http://coconutKG.aksw.org/ontology#>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>

SELECT DISTINCT ?formula ?name ?weight ?smile WHERE {
  ?compound a coco:Compound . 
  ?compound coco:molecularFormula ?formula .
  ?compound coco:name ?name .

  ?compound coco:hasDescriptors ?descriptor .
  ?descriptor coco:isMolecular ?mdescriptor .
  ?mdescriptor coco:molecularWeight ?weight .

  ?compound coco:isIdentifiedBy ?unique .
  ?unique coco:smiles ?smile .
} 
LIMIT 10

The following query returns the formula and weight of up to 10 compounds whose weight lies strictly between 320.00 and 320.20:

PREFIX coco: <http://coconutKG.aksw.org/ontology#>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>

SELECT DISTINCT ?formula ?weight  WHERE {
  ?compound a coco:Compound . 
  ?compound coco:molecularFormula ?formula .

  ?compound coco:hasDescriptors ?descriptor .
  ?descriptor coco:isMolecular ?mdescriptor .
  ?mdescriptor coco:molecularWeight ?weight .

  FILTER (?weight > 320.00 && ?weight < 320.20) .
} 
LIMIT 10
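If you need the same query for several weight ranges, the query text can be generated from a script. The sketch below is one possible way to do this; the function name weight_range_query and its parameters are hypothetical, while the prefix and property names are taken from the query above.

```python
COCO_PREFIX = "PREFIX coco: <http://coconutKG.aksw.org/ontology#>"


def weight_range_query(low: float, high: float, limit: int = 10) -> str:
    """Build the weight-range query for the open interval (low, high)."""
    if low >= high:
        raise ValueError("low must be smaller than high")
    # float()/int() ensure only numbers are inserted into the query text.
    return f"""{COCO_PREFIX}

SELECT DISTINCT ?formula ?weight WHERE {{
  ?compound a coco:Compound .
  ?compound coco:molecularFormula ?formula .
  ?compound coco:hasDescriptors ?descriptor .
  ?descriptor coco:isMolecular ?mdescriptor .
  ?mdescriptor coco:molecularWeight ?weight .
  FILTER (?weight > {float(low)!r} && ?weight < {float(high)!r})
}}
LIMIT {int(limit)}"""


if __name__ == "__main__":
    print(weight_range_query(320.0, 320.2))
```

The returned string can be pasted into the query GUI or sent to the endpoint from a script.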

You can also query the endpoint without our query GUI, from a script of your own. Below is an example in Python using SPARQLWrapper:

from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper(
    "https://coconut-kg.aksw.org/sparql"
)
sparql.setReturnFormat(JSON)

sparql.setQuery(
    """
    PREFIX coco:  <http://coconutKG.aksw.org/ontology#>
    PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX owl:  <http://www.w3.org/2002/07/owl#>
    PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>
    SELECT DISTINCT ?formula ?name ?weight ?smile WHERE {
        ?compound a coco:Compound .
        ?compound coco:molecularFormula ?formula .
        ?compound coco:name ?name .
        ?compound coco:hasDescriptors ?descriptor .
        ?descriptor coco:isMolecular ?mdescriptor .
        ?mdescriptor coco:molecularWeight ?weight .
        ?compound coco:isIdentifiedBy ?unique .
        ?unique coco:smiles ?smile .
    }
    LIMIT 10
    """
)

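try:
    # Run the query and convert the JSON response into a Python dict.
    ret = sparql.queryAndConvert()

    # Each binding maps a variable name to a dict with "type" and "value" keys.
    for r in ret["results"]["bindings"]:
        print(r)
except Exception as e:
    # Network and query errors are printed instead of stopping the script.
    print(e)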
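The script above prints each binding as a nested dict. To work with plain values instead, the result can be flattened first. The following is a small sketch that relies only on the standard SPARQL JSON results layout ("head" with "vars", "results" with "bindings"); the helper name flatten_bindings is hypothetical.

```python
from typing import Dict, List


def flatten_bindings(result: dict) -> List[Dict[str, str]]:
    """Turn a SPARQL JSON result into a list of {variable: value} dicts."""
    variables = result["head"]["vars"]
    rows = []
    for binding in result["results"]["bindings"]:
        # A variable may be unbound in a row, so only copy what is present.
        rows.append(
            {var: binding[var]["value"] for var in variables if var in binding}
        )
    return rows
```

In the script above, flatten_bindings(ret) would give one dict per row with the keys formula, name, weight and smile. All values are returned as strings, so convert the weight with float() if you want to calculate with it.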

You can find a very simple SPARQL tutorial here: