
@MoralCode
Contributor

@MoralCode MoralCode commented Oct 29, 2025

Description
A collaborative change built with the other Augur maintainers (@sgoggins and @ABrain7710) that adds the PostgreSQL COALESCE function to our bulk insert methods so that existing values in the database cannot be overwritten with NULL. This was inspired by @razekmh's solution in #3342.

This was implemented without a parameter to toggle the behavior off because we couldn't think of a scenario in which one would intentionally overwrite data with NULL in Augur.

This PR fixes #3317

This supersedes #3342
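
As a rough illustration of the behavior described above (not the actual Augur change, which lives in the PostgreSQL bulk insert helpers), here is a minimal sketch that uses Python's built-in sqlite3 module in place of PostgreSQL. The table layout, the cntrb_login key, and the upsert_keep_existing helper are assumptions made up for this example, and SQLite 3.24 or newer is assumed for ON CONFLICT support.

```python
import sqlite3


def upsert_keep_existing(conn, rows):
    """Upsert rows, keeping the stored email when the incoming one is NULL.

    Hypothetical helper for illustration only; it is not Augur's
    bulk_insert_dicts.
    """
    conn.executemany(
        """
        INSERT INTO contributors (cntrb_login, cntrb_email)
        VALUES (:cntrb_login, :cntrb_email)
        ON CONFLICT (cntrb_login) DO UPDATE
        SET cntrb_email = COALESCE(excluded.cntrb_email, contributors.cntrb_email)
        """,
        rows,
    )
    conn.commit()


def stored_email(conn, login):
    """Return the email currently stored for the given login."""
    row = conn.execute(
        "SELECT cntrb_email FROM contributors WHERE cntrb_login = ?", (login,)
    ).fetchone()
    return row[0]


# In-memory database with a simplified, assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contributors (cntrb_login TEXT PRIMARY KEY, cntrb_email TEXT)"
)

# The first insert stores the email.
upsert_keep_existing(conn, [{"cntrb_login": "alice", "cntrb_email": "alice@example.com"}])

# A later record for the same key carries no email; COALESCE keeps the old value.
upsert_keep_existing(conn, [{"cntrb_login": "alice", "cntrb_email": None}])
email_after_null = stored_email(conn, "alice")

# A later record with a real value still overwrites as usual.
upsert_keep_existing(conn, [{"cntrb_login": "alice", "cntrb_email": "alice@new.example"}])
email_after_update = stored_email(conn, "alice")
```

The design point is that COALESCE returns its first non-NULL argument, so an incoming NULL falls back to the value already stored, while non-NULL incoming values update the row as before.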

Notes for Reviewers
Should be all set to merge once the original poster of the linked issue confirms that this still fixes the issue.

Signed commits

  • Yes, I signed my commits.




def bulk_insert_dicts(logger, data: Union[List[dict], dict], table, natural_keys: List[str], return_columns: Optional[List[str]] = None, string_fields: Optional[List[str]] = None, on_conflict_update:bool = True) -> Optional[List[dict]]:

[pylint] reported by reviewdog 🐶
W0621: Redefining name 'logger' from outer scope (line 19) (redefined-outer-name)
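
For context, pylint raises W0621 when a function parameter (or local variable) reuses a name that is already defined at module level. The sketch below reproduces the pattern with hypothetical function names; it is not taken from the Augur code base and does not imply that the PR changed the parameter name.

```python
import logging

# Module-level logger, analogous to the outer-scope name pylint refers to.
logger = logging.getLogger(__name__)


def insert_rows(logger, rows):
    # pylint would flag this parameter with W0621 because it shadows
    # the module-level `logger` defined above.
    logger.info("inserting %d rows", len(rows))
    return len(rows)


def insert_rows_renamed(log, rows):
    # Renaming the parameter is one way to avoid the shadowing warning.
    log.info("inserting %d rows", len(rows))
    return len(rows)
```

Both functions behave identically at runtime; the warning concerns readability and the risk of accidentally using the wrong object, not correctness.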

@sgoggins sgoggins self-assigned this Oct 29, 2025
@sgoggins sgoggins added discussion Seeking active feedback, usually for items under active development bug-fix Fixes a bug labels Oct 29, 2025
@MoralCode MoralCode force-pushed the null_cntrb_email branch 2 times, most recently from 436b6d3 to f4f0501 Compare October 29, 2025 16:53



def bulk_insert_dicts(logger, data_input: Union[List[dict], dict], table, natural_keys: List[str], return_columns: Optional[List[str]] = None, string_fields: Optional[List[str]] = None, on_conflict_update:bool = True) -> Optional[List[dict]]:

[pylint] reported by reviewdog 🐶
W0621: Redefining name 'logger' from outer scope (line 19) (redefined-outer-name)

@giordano
Contributor

I did a quick smoke test and it looks like this PR fixes our problem with some commits, like UCL/rsd-engineeringcourse@fffaf9a, not being attributed to any user, as is currently happening. The result of SELECT c.* FROM augur_data.contributors AS c WHERE cntrb_email IS NOT NULL now fills up quickly and no longer shrinks over time. Thank you!

@MoralCode
Contributor Author

Sweet! Because I've made modifications since we wrote this, and it impacts a critical part of the code, I'd like to get @ABrain7710's input on this before merging.

@MoralCode MoralCode mentioned this pull request Oct 29, 2025
1 task
@MoralCode MoralCode added this to the v0.91.0 Release milestone Oct 29, 2025
MoralCode and others added 2 commits October 30, 2025 15:31
…pped

Signed-off-by: Adrian Edwards <adredwar@redhat.com>
Co-authored-by: Mahmoud Abdelrazek <44040283+razekmh@users.noreply.github.com>
Co-authored-by: Andrew Brain <andrewbrain2019@gmail.com>
…bject as well.

Co-Created by: gpt-5 via cursor

Signed-off-by: Adrian Edwards <adredwar@redhat.com>
Member

@sgoggins sgoggins left a comment

LGTM

@MoralCode MoralCode merged commit a1a1339 into main Nov 3, 2025
15 checks passed
@MoralCode MoralCode deleted the null_cntrb_email branch November 3, 2025 18:37
Development

Successfully merging this pull request may close these issues.

Contributors information being removed from the augur.contributors table during data collection

5 participants