Description
It seems to me the following code snippet doesn't work as expected:
I assumed that filtering out duplicate relations meant removing exactly repeated relation triplets (i.e., the subject, the object, and the predicate are all the same). However, this snippet seems to keep only a single predicate for each object pair, with predicates that occur more often having a higher chance of being chosen. This seems unreasonable to me, and it makes the following snippet redundant:
To accommodate multiple labels for each object pair, I think we have to change L148-L156 to the following:
if self.filter_duplicate_rels:
    # Filter out dupes!
    assert self.split == 'train'
    old_size = relation.shape[0]
    all_rel_sets = defaultdict(set)
    for (o0, o1, r) in relation:
        all_rel_sets[(o0, o1)].add(r)
    relation = [(k[0], k[1], v) for k, vs in all_rel_sets.items() for v in vs]
    relation = np.array(relation, dtype=np.int32)
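To make the difference concrete, here is a minimal standalone sketch of the two behaviours. It is not the repository's code: the function names `dedupe_triplets` and `sample_one_per_pair` are hypothetical, plain tuples stand in for the rows of the `relation` array, and the sampling variant is only my reading of the current behaviour as described above.

```python
import random
from collections import defaultdict


def dedupe_triplets(relation):
    """Drop exactly repeated (subject, object, predicate) triplets.

    Every distinct predicate of an object pair is kept (the proposed behaviour).
    """
    rel_sets = defaultdict(set)
    for o0, o1, r in relation:
        rel_sets[(o0, o1)].add(r)
    # sorted() only makes the output order deterministic for this demo.
    return [(o0, o1, r) for (o0, o1), rs in rel_sets.items() for r in sorted(rs)]


def sample_one_per_pair(relation, rng):
    """Keep one randomly chosen predicate per object pair.

    Assumption: this mirrors the behaviour described in the issue, where a
    predicate that occurs more often is more likely to be picked.
    """
    rel_lists = defaultdict(list)
    for o0, o1, r in relation:
        rel_lists[(o0, o1)].append(r)
    return [(o0, o1, rng.choice(rs)) for (o0, o1), rs in rel_lists.items()]


# Pair (0, 1) has predicate 5 twice and predicate 7 once; pair (2, 3) has only 4.
relation = [(0, 1, 5), (0, 1, 5), (0, 1, 7), (2, 3, 4)]

deduped = dedupe_triplets(relation)
sampled = sample_one_per_pair(relation, random.Random(0))
```

With this input, `deduped` keeps both `(0, 1, 5)` and `(0, 1, 7)`, whereas `sampled` holds exactly one triplet for the pair `(0, 1)`, so one of its two labels is lost.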