The instructions for training ColBERT say that it requires a JSONL triples file with a [qid, pid+, pid-] list per line, yet the example given is triples="/path/to/MSMARCO/triples.train.small.tsv". However, triples.train.small.tsv contains the question, answer, and passage as text. Which format should I use?
Answered by okhat on Jan 5, 2023
What is the training data format? The introduction is not clear.
Both formats work. I suggest sticking to the JSONL format.
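For reference, here is a minimal sketch of what a JSONL triples file in the [qid, pid+, pid-] layout could look like. It assumes integer IDs that refer to entries in your own queries and collection files. The helper names, the file name, and the ID values are made up for illustration and are not part of ColBERT.

```python
import json


def write_triples(triples, path):
    """Write one JSON list [qid, positive_pid, negative_pid] per line."""
    with open(path, "w", encoding="utf-8") as f:
        for qid, pos_pid, neg_pid in triples:
            f.write(json.dumps([qid, pos_pid, neg_pid]) + "\n")


def read_triples(path):
    """Read the file back as a list of [qid, positive_pid, negative_pid] lists."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Hypothetical IDs: query 0 has positive passage 17 and two sampled negatives.
example_triples = [(0, 17, 42), (0, 17, 98), (1, 5, 61)]

if __name__ == "__main__":
    write_triples(example_triples, "triples.example.jsonl")  # hypothetical file name
    for row in read_triples("triples.example.jsonl"):
        print(row)
```

Each line of the resulting file is a standalone JSON list, so the file can be streamed line by line without loading everything at once.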
Answer selected by
okhat