-
Notifications
You must be signed in to change notification settings - Fork 19
Standardize CVR identification across audit logs and RCTab output #913
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Feature Request: Add Standardized Unique Identifiers for Cast Vote Records #911
Replace UUID-based identification with a more consistent ID system that uses the same identifier across audit logs and RCTab CVR output. The ID is primarily derived from the computedId (tabulator-batch-record) with fallback to suppliedId when computedId is unavailable. Unlike UUIDs which change between tabulations, this approach ensures the same ballot maintains the same identifier across multiple tabulations, making it easier to track specific ballots across different runs. Changes: - Remove UUID generation for CVR identification - Add getPrimaryId() method to centralize ID logic - Update audit log and RCTab CVR to use the same ID format - Change ID delimiter from pipe to dash for better readability - Add TODO to review legacy getId() usage Note: This change ensures that CVR IDs are consistent between audit logs, RCTab CVR output, and repeated tabulations, making it easier to track individual ballots across different output formats and multiple runs.
…ture/issue-911-rctab-cvr-primary-id
...ces/network/brightspots/rcv/test_data/conversions_from_ess/conversions_from_ess_expected.csv
Show resolved
Hide resolved
…ture/issue-911-rctab-cvr-primary-id
null. Null handling is now handled where getSuppliedId is called.
| Stream.of(tabulatorId, batchId, Integer.toString(recordId)) | ||
| .filter(s -> s != null && !s.isBlank()) | ||
| .collect(Collectors.joining("|")); | ||
| .collect(Collectors.joining("-")); // using a dash since this is an Id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand why - vs | is important, but the comment makes me think it matters? Is it because it will be used in filenames? If so, perhaps that should be explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's an aesthetic choice. Originally I had used pipes as they aren't really used in the wild (certainly less so than -), and I wanted to signal that it was computed. I'm going to revert to pipe to signal that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we discovered another reason why pipes won't work, but I forget what it was. Can we change the comment to be the reason you discovered?
6669937 to
d15c79d
Compare
artoonie
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some [nit]s but this looks good
| Stream.of(tabulatorId, batchId, Integer.toString(recordId)) | ||
| .filter(s -> s != null && !s.isBlank()) | ||
| .collect(Collectors.joining("|")); | ||
| .collect(Collectors.joining("-")); // using a dash since this is an Id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we discovered another reason why pipes won't work, but I forget what it was. Can we change the comment to be the reason you discovered?
Key Changes:
CastVoteRecordidentifier logicgetId- guaranteed unique identifier per ballot across tabulations. coalesce(computedId,suppliedId)getSuppliedId- vendor provided Id. Not reliably unique. NullablegetIdthroughout audit log andrctab_cvr.csvrctab_cvr.csvnow explicitly includes both.getIdprovides the"RCTab CVR Id"andgetSuppliedIdprovides the"Vendor Id"(formerly"Ballot Id")computedIddelimiter from pipe (|) to dash (-) for improved readabilityTesting Needed:
rctab_cvr.csvoutputgetIdmethod is consistent across multiple tabulations