
total_terms_in_collection != sum of doclengths in robust04 queries only #21

@cmacdonald

For Robust04 queries only, the sum of the doclens is 167686911, while total_terms_in_collection=174540872 in the ciff file. Why is the latter larger? This affects the avgdoclength, and hence the BM25 scores.
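To put a number on the discrepancy, here is a rough sketch that compares the two totals and shows how the choice of avgdoclength shifts a BM25 term-frequency component. Only the two totals come from this issue. `NUM_DOCS`, `K1`, `B`, the example `tf` and `doclen`, and the helper `bm25_tf_component` are hypothetical values picked for illustration, and the formula is one common BM25 variant, not necessarily the one any particular toolkit implements.

```python
# Figures quoted in this issue.
SUM_DOCLENS = 167_686_911
TOTAL_TERMS_IN_COLLECTION = 174_540_872

# Hypothetical values for illustration only, NOT read from the ciff file.
NUM_DOCS = 500_000
K1, B = 0.9, 0.4


def bm25_tf_component(tf, doclen, avgdl, k1=K1, b=B):
    """TF part of one common BM25 variant (the IDF factor is omitted)."""
    norm = 1.0 - b + b * (doclen / avgdl)
    return tf * (k1 + 1.0) / (tf + k1 * norm)


# Absolute and relative gap between the two totals.
gap = TOTAL_TERMS_IN_COLLECTION - SUM_DOCLENS
ratio = TOTAL_TERMS_IN_COLLECTION / SUM_DOCLENS

# The avgdoclength ratio equals `ratio` whatever the document count is.
avgdl_from_doclens = SUM_DOCLENS / NUM_DOCS
avgdl_from_header = TOTAL_TERMS_IN_COLLECTION / NUM_DOCS

# Same hypothetical document and term, scored under each avgdoclength.
score_doclens = bm25_tf_component(tf=3, doclen=400, avgdl=avgdl_from_doclens)
score_header = bm25_tf_component(tf=3, doclen=400, avgdl=avgdl_from_header)

print(f"gap in terms:        {gap}")
print(f"header / doclen sum: {ratio:.4f}")
print(f"tf component (sum of doclens): {score_doclens:.4f}")
print(f"tf component (header total):   {score_header:.4f}")
```

The header total is about 4% larger than the sum of the doclens, so the avgdoclength derived from it is about 4% larger too. Under this variant a larger avgdoclength makes every document look relatively shorter, which nudges its scores upward.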
