Open
Description
Once we have the CSV, we should check whether any of the grants are obvious duplicates.
Command: npm run check knight
To start, we can just grab the list of all grants made by the funder (since we know that ID is good) -- that way we don't have to depend on #1.
Heuristics:
- Same grant amount as an existing grant
- Same grant start and/or end date as an existing grant
- Same / similar name (maybe use a textual similarity function)
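For the name heuristic, one possible textual similarity function is the Sørensen-Dice coefficient over character bigrams. The sketch below is only an illustration of that idea, not something already in the codebase: `normalizeName`, `bigramCounts`, and `nameSimilarity` are hypothetical names, and the normalization is ASCII-only, so it would need adjusting if grant names contain accented characters.

```typescript
// Hypothetical helper: lowercase, replace non-alphanumerics with spaces, collapse whitespace.
// ASCII-only on purpose; accented letters are treated as separators in this sketch.
function normalizeName(name: string): string {
  return name
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

// Count how often each two-character sequence occurs in the text.
function bigramCounts(text: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (let i = 0; i < text.length - 1; i++) {
    const gram = text.slice(i, i + 2);
    counts.set(gram, (counts.get(gram) ?? 0) + 1);
  }
  return counts;
}

// Sørensen-Dice coefficient on bigrams: 0 means no shared bigrams, 1 means identical after normalization.
function nameSimilarity(a: string, b: string): number {
  const left = normalizeName(a);
  const right = normalizeName(b);
  if (left === right) return 1;
  if (left.length < 2 || right.length < 2) return 0;

  const leftGrams = bigramCounts(left);
  const rightGrams = bigramCounts(right);
  let shared = 0;
  leftGrams.forEach((count, gram) => {
    shared += Math.min(count, rightGrams.get(gram) ?? 0);
  });

  // Each string of length n has n - 1 bigrams.
  return (2 * shared) / (left.length - 1 + (right.length - 1));
}
```

A cutoff for "similar enough" (say, 0.8) would have to be tuned against real grant names; that number is a guess, not a recommendation.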
If a match is found, we should add a duplicate column, maybe with a score (0, 1, 2, something like that) and a link to the potential duplicate(s)
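As a rough sketch of how the score and links could be computed, the snippet below counts how many heuristics match for each existing grant. Everything here is an assumption for illustration: the `Grant` fields, the `DuplicateCheck` shape, the function names, using grant IDs in place of links, and the default threshold of 2. The name check is a plain case-insensitive comparison, which could be swapped for a fuzzy similarity function.

```typescript
// Hypothetical grant shape; the real CSV columns may differ.
interface Grant {
  id: string;
  name: string;
  amount: number;
  startDate: string; // assumed ISO date string, e.g. "2020-01-01"
  endDate: string;
}

// Hypothetical contents of the "duplicate" column for one grant.
interface DuplicateCheck {
  score: number;   // highest number of heuristics matched by any existing grant (0 to 3)
  links: string[]; // IDs of existing grants at or above the threshold
}

// Count how many of the three heuristics match between two grants.
function countMatches(candidate: Grant, existing: Grant): number {
  let matches = 0;
  if (candidate.amount === existing.amount) matches += 1;
  if (
    candidate.startDate === existing.startDate ||
    candidate.endDate === existing.endDate
  ) {
    matches += 1;
  }
  if (candidate.name.trim().toLowerCase() === existing.name.trim().toLowerCase()) {
    matches += 1;
  }
  return matches;
}

// Compare one grant against the funder's existing grants.
function findDuplicates(
  candidate: Grant,
  existing: Grant[],
  minScore = 2 // assumed threshold for listing a grant as a potential duplicate
): DuplicateCheck {
  let best = 0;
  const links: string[] = [];
  for (const other of existing) {
    if (other.id === candidate.id) continue; // skip the grant itself if it is already in the list
    const matches = countMatches(candidate, other);
    best = Math.max(best, matches);
    if (matches >= minScore) links.push(other.id);
  }
  return { score: best, links };
}
```

Counting matches keeps the score easy to read in a CSV cell, at the cost of treating all heuristics as equally strong; weighting the name match more heavily is one alternative.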