Clarification on data format #171
cj2001 started this conversation in WikiKG90M-LSC
Replies: 1 comment 1 reply
-
Hi, the numbers in train_hrt are NOT Wikidata identifiers; they are the node and relation indices in the KG. For example, the first row [0 1182 650146] means that relation 1182 holds between node 0 and node 650146. Node indices range over [0, num_node) and relation indices over [0, num_relation). We currently do not provide the raw text, but will probably release it in the future (as in #147 (reply in thread)). Thank you.
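To make the layout above concrete, here is a minimal sketch in Python. It does not load the real dataset: `toy_train_hrt`, `toy_num_nodes`, `toy_num_relations`, and `check_triples` are hypothetical names, and only the first row is taken from this thread. The remaining rows and both sizes are made up for illustration.

```python
# Toy stand-in for train_hrt: each row is (head, relation, tail).
# Only the first row appears in the thread; the rest is invented.
toy_train_hrt = [
    (0, 1182, 650146),
    (1, 3, 2),
    (2, 7, 0),
]
toy_num_nodes = 650147     # hypothetical: just large enough for this toy data
toy_num_relations = 1183   # hypothetical: just large enough for this toy data


def check_triples(triples, num_nodes, num_relations):
    """Return True if every head/tail lies in [0, num_nodes)
    and every relation lies in [0, num_relations)."""
    for head, relation, tail in triples:
        if not (0 <= head < num_nodes and 0 <= tail < num_nodes):
            return False
        if not (0 <= relation < num_relations):
            return False
    return True


# Unpack the first row into its three roles.
head, relation, tail = toy_train_hrt[0]
print(f"node {head} --relation {relation}--> node {tail}")
print(check_triples(toy_train_hrt, toy_num_nodes, toy_num_relations))
```

The same loop should also work on a 2D integer array with three columns, since iterating over such an array yields one row at a time.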
-
I have a quick question on the data format. The docs say: "Each triplet (head, relation, tail) in WikiKG90M-LSC represents an Wikidata claim, where head and tail are the Wikidata items, and relation is the Wikidata predicate." I took that to mean that these values are Wikidata identifiers (for example, Q42 is Douglas Adams). However, when I look at the format of these triples with print(train_hrs[0:10]), what I get is integers. If I take the head of the first triple, I had assumed it might be something like Q0; however, Q0 does not exist in Wikidata.
So is there a way to map these integers to the text in Wikidata?
Thanks in advance!