@hagenw
Member

@hagenw hagenw commented May 13, 2025

Closes #466

Fixes audformat.Database.update() for the case of misc tables. The error was not covered by our previous test. I'm still not completely sure why, but to save time I added a new test for the particular problem.

The error was due to us looking only at self.tables for existing tables in the database that should be updated. I changed it to list(self) to include misc tables as well.
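To illustrate the kind of membership check described above, here is a minimal sketch with a toy class. It is not audformat code: `ToyDatabase`, `check_all_tables` and `make_pair` are hypothetical names, tables are plain dicts, and the only assumption carried over from the description is that iterating over the database yields the IDs of regular and misc tables alike. In this toy model the missed misc table is silently overwritten; the actual failure in audformat may have looked different.

```python
class ToyDatabase:
    """Toy stand-in for a database with regular and misc tables (hypothetical)."""

    def __init__(self, tables=None, misc_tables=None):
        self.tables = dict(tables or {})
        self.misc_tables = dict(misc_tables or {})

    def __iter__(self):
        # Iterating yields the IDs of regular and misc tables alike.
        yield from self.tables
        yield from self.misc_tables

    def __getitem__(self, table_id):
        if table_id in self.tables:
            return self.tables[table_id]
        return self.misc_tables[table_id]

    def update(self, other, *, check_all_tables):
        # check_all_tables=False mimics looking only at self.tables,
        # check_all_tables=True mimics looking at list(self).
        existing = list(self) if check_all_tables else list(self.tables)
        for kind in ("tables", "misc_tables"):
            target = getattr(self, kind)
            for table_id, rows in getattr(other, kind).items():
                if table_id in existing:
                    # Table is known: merge the new rows into it.
                    target[table_id] = {**target.get(table_id, {}), **rows}
                else:
                    # Table is treated as new: copy it over as-is.
                    target[table_id] = dict(rows)
        return self


def make_pair():
    db = ToyDatabase(misc_tables={"misc": {"a": 1.0, "b": 2.0}})
    other = ToyDatabase(misc_tables={"misc": {"c": 3.0}})
    return db, other


db, other = make_pair()
db.update(other, check_all_tables=False)
print(sorted(db["misc"]))  # ['c']: rows "a" and "b" are lost in this toy model

db, other = make_pair()
db.update(other, check_all_tables=True)
print(sorted(db["misc"]))  # ['a', 'b', 'c']
```

The only difference between the two runs is which collection the existence check consults, which is the essence of the change described here.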

Summary by Sourcery

Fix misc table handling in Database.update by including them in the update loop and add a corresponding test.

Bug Fixes:

  • Support merging miscellaneous tables with new columns in Database.update.

Tests:

  • Add test_update_misc_table to verify merging of misc tables when new columns are introduced.

@sourcery-ai
Contributor

sourcery-ai bot commented May 13, 2025

Reviewer's Guide

Enhanced Database.update() to correctly merge miscellaneous tables by checking all existing table IDs and added a targeted test to cover this scenario.

File-Level Changes

Change: Include misc tables in Database.update() merging logic
Details:
  • Aggregated other.misc_tables and other.tables in iteration
  • Replaced membership check self.tables with list(self) to detect misc tables
  • Merged or copied tables based on updated existence check
Files: audformat/core/database.py

Change: Add test for updating misc tables with new columns
Details:
  • Introduced test_update_misc_table to simulate two databases with misc tables
  • Set up new column addition in misc tables and concatenated expected DataFrame
  • Asserted that update merges DataFrames correctly and preserves the source DB
Files: tests/test_database.py

Assessment against linked issues

Issue: #466
Objective: When combining two databases with a misc table using audformat.Database.update(), the misc table should be combined in the end as well.


@hagenw hagenw force-pushed the fix-update-misc-table branch from 99df383 to f9c53db Compare May 13, 2025 09:59
@codecov

codecov bot commented May 13, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.0%. Comparing base (2872a9a) to head (f9c53db).
Report is 1 commit behind head on main.

Additional details and impacted files
Files with missing lines Coverage Δ
audformat/core/database.py 100.0% <100.0%> (ø)

@hagenw hagenw marked this pull request as ready for review May 13, 2025 10:03
@hagenw hagenw requested a review from frankenjoe May 13, 2025 10:03
Contributor

@sourcery-ai sourcery-ai bot left a comment

Hey @hagenw - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟢 General issues: all looks good
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟢 Complexity: all looks good
  • 🟢 Documentation: all looks good


@hagenw
Member Author

hagenw commented May 13, 2025

One thing I realized when debugging the existing test is that update() can also change the content of the table that is used for the update, which I thought should not be the case. For example, at the beginning we have:

db["misc"].df=        float
idx
a    0.137095
b    0.783854
other1["misc"].df=        float labels
idx
b    0.281209      c
c    0.875555      b

But after the first failed attempt we have:

db["misc"].df=        float
idx
b    0.783854
c         NaN
a    0.137095
other1["misc"].df=        float labels
idx
b    0.281209      c
c    0.875555      b
a         NaN    NaN

which means other1["misc"].df did also change.

But this is most likely another issue and not related to what we try to fix here.
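A regression check for this kind of side effect can snapshot the source before the merge and compare it afterwards. The following is only a sketch with plain dicts standing in for the two tables; `align_in_place`, `align_on_copy` and `leaves_source_untouched` are hypothetical helpers, not audformat functions, and the values are taken from the float column above.

```python
import copy


def align_in_place(target, source):
    """Extend both row mappings to the union of their keys (None = missing)."""
    for key in list(source):
        target.setdefault(key, None)
    for key in list(target):
        source.setdefault(key, None)  # side effect: the source grows too


def align_on_copy(target, source):
    """Same alignment, but only a private copy of the source is touched."""
    align_in_place(target, copy.deepcopy(source))


def leaves_source_untouched(align):
    """Return True if calling align() does not modify its source argument."""
    target = {"a": 0.137095, "b": 0.783854}
    source = {"b": 0.281209, "c": 0.875555}
    snapshot = copy.deepcopy(source)
    align(target, source)
    return source == snapshot


print(leaves_source_untouched(align_in_place))  # False
print(leaves_source_untouched(align_on_copy))   # True
```

Working on a copy is one possible way to avoid the mutation; whether that is the right fix for audformat is a separate question.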

@frankenjoe
Collaborator

One thing I realized when debugging the existing test is that update() can also change the content of the table that is used for the update, which I thought should not be the case

I had a look and I found this line here. Here we see that the index of the table that is used for the update is indeed extended. I agree that this is somewhat unexpected behavior. I created #469.

@frankenjoe
Collaborator

The error was not covered by our previous test. I'm still not completely sure why, but to save time I added a new test for the particular problem.

The test function that currently covers Database.update() is already awfully long, so it's a good idea anyway to add a new, specific test.

@frankenjoe
Collaborator

Since the suggested change fixes the issue I will merge now.

@frankenjoe frankenjoe merged commit 0818ee0 into main May 14, 2025
12 checks passed
@frankenjoe frankenjoe deleted the fix-update-misc-table branch May 14, 2025 11:46
@hagenw
Member Author

hagenw commented May 14, 2025

The error was not covered by our previous test. I'm still not completely sure why, but to save time I added a new test for the particular problem.

The test function that currently covers Database.update() is already awfully long, so it's a good idea anyway to add a new, specific test.

Yes, during the last year I also figured out that there are better ways to write long tests that depend on the state of an object like a database. One way is to use a class which ensures that the state of the database is the same for each of the individual tests. One example can be found at https://github.com/audeering/audbcards/blob/a6a5cc1b356633965c1e39a1200cb61a5aae7c0b/tests/test_dataset.py#L507-L588.
But it would indeed take some effort to rewrite our tests in audformat accordingly.
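As a rough sketch of the class-based idea, the following uses the standard library's unittest, which may differ from the approach taken in the linked audbcards test. `build_database` is a hypothetical factory and the database is just a nested dict; the point is only that `setUp` rebuilds the same initial state before every test method.

```python
import unittest


def build_database():
    """Hypothetical factory returning a fresh toy database as a plain dict."""
    return {"misc": {"a": 0.1, "b": 0.7}}


class TestMiscTableUpdate(unittest.TestCase):
    def setUp(self):
        # Runs before every test method, so each test starts from the same state.
        self.db = build_database()

    def test_add_rows(self):
        self.db["misc"].update({"c": 0.8})
        self.assertEqual(sorted(self.db["misc"]), ["a", "b", "c"])

    def test_starts_from_initial_state(self):
        # Unaffected by test_add_rows, whichever order the tests run in.
        self.assertEqual(sorted(self.db["misc"]), ["a", "b"])
```

Saved to a file, this could be run with `python -m unittest`. Each test then stays short and can fail independently, instead of being one step in a long function that carries state from one assertion to the next.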

Development

Successfully merging this pull request may close these issues.

MiscTable is not handled correctly by audformat.Database.update()

3 participants