Problem querying PhySH subject headings using SPARQL: not all rows are returned #149

Open
nloyola opened this issue Nov 24, 2022 · 0 comments

nloyola commented Nov 24, 2022

ARC2 returns incomplete results when querying the PhySH RDF file for disciplines: only 12 of the 18 disciplines are returned.

The PhySH page lists 18 concepts under Discipline. PhySH provides an RDF file for download in its GitHub repository:

https://github.com/physh-org/PhySH

I have taken the RDF file and made it available over HTTP here:

http://nloyola.asuscomm.com:8000/physh.rdf

I'm using the following script to query the disciplines:

<?php
require 'vendor/autoload.php';

$options = getopt("ld");

$config = array(
    /* db */
    'db_host' => 'localhost',
    'db_name' => 'physh_rdf',
    'db_user' => 'user',
    'db_pwd' => 'secret',

    /* store name (= table prefix) */
    'store_name' => 'physh_store',
);

$store = ARC2::getStore($config);

if (!$store->isSetUp()) {
    $store->setUp();
}

if (array_key_exists('l', $options)) {
    $store->query('LOAD <http://nloyola.asuscomm.com:8000/physh.rdf>');
}

if (array_key_exists('d', $options)) {
    $store->dump();
    exit(0);
}

// Print any store errors and abort with a non-zero status; otherwise pass the result through.
function queryCheckError($store, $result) {
    if ($store->getErrors()) {
        print_r($store->getErrors());
        exit(1);
    }
    return $result;
}

function getDisciplines($store) {
    $q = '
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX physh: <https://doi.org/10.29172/>
PREFIX physh_rdf: <https://physh.org/rdf/2018/01/01/core#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT *
WHERE {
   ?s ?p physh_rdf:Discipline .
   ?s dcterms:title ?title .
   #?s physh_rdf:prefLabel ?label .
   #?s dcterms:description ?description .
}
';

    return queryCheckError($store, $store->query($q));
}

function showResult($result) {
    $rows = $result['result']['rows'];
    $numRows = count($rows);
    print("rows: {$numRows}\n");

    print(json_encode($rows, JSON_PRETTY_PRINT) . "\n");

    //print_r($result);
    // foreach ($rows as $k => $v) {
    //     print($k . ": " . json_encode($v, JSON_PRETTY_PRINT) . "\n");
    // }
}

$result = getDisciplines($store);
showResult($result);

When I run this script, it only returns 12 rows.

Running the same SPARQL query with the Rasqal RDF Query Library returns 19 rows. I believe the extra row corresponds to the root entry.
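To narrow down which disciplines are dropped, the titles in the ARC2 rows could be compared against the titles from a reference run such as the Rasqal output. The following is a minimal sketch, not part of ARC2: `missingTitles` is a hypothetical helper, it assumes each row is an associative array with a `title` key (as in the rows printed by `showResult`), and the sample data is placeholder text rather than real PhySH titles.

```php
<?php
// Hypothetical helper (not an ARC2 function): returns the expected titles
// that do not appear in the 'title' column of the given result rows.
function missingTitles(array $rows, array $expectedTitles): array {
    // Collect the 'title' value of every row that has one.
    $found = array_column($rows, 'title');
    // array_diff keeps the original keys, so reindex the result from 0.
    return array_values(array_diff($expectedTitles, $found));
}

// Placeholder data for illustration only; these are not real PhySH titles.
$rows = [
    ['s' => 'urn:example:a', 'title' => 'Title A'],
    ['s' => 'urn:example:c', 'title' => 'Title C'],
];
$expected = ['Title A', 'Title B', 'Title C'];

print_r(missingTitles($rows, $expected));
```

In the script above, `$rows` would be `$result['result']['rows']`, and `$expected` would be the list of titles taken from the reference run.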

My environment: MariaDB as the database server, and PHP 8.1.12 on Debian 11.

Any help with this issue is greatly appreciated.

k00ni added the bug label Nov 24, 2022