Fix to_segmented_index() for pandas 3.0 #488

hagenw · 2026-01-22T11:27:40Z

Closes #487

This adds tests and fixes for the new default in pandas==3.0.0, which uses `timedelta64[s]` instead of `timedelta64[ns]`.

This required the following fixes:

| Location | Fix |
| --- | --- |
| `audformat.utils.to_segmented_index()` | Convert ends to `timedelta64[ns]` before iloc assignment |
| `audformat.utils.union()` | Normalize timedelta dtypes to `timedelta64[ns]` in all code paths |
| `audformat.utils.intersect()` | Normalize timedelta dtypes to `timedelta64[ns]` |
| `audformat.utils.set_index_dtypes()` | Add `.astype(dtype)` after `pd.to_timedelta()` for empty levels |
| `audformat.segmented_index()` | Call `set_index_dtypes` to ensure `timedelta64[ns]` |
| `audformat.testing.add_table()` | Remove unnecessary `pd.to_timedelta()` call |
| `audformat.utils.hash()` | Enforce `object` dtype for string columns to get same hash under Python 3.14 |
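
The first fix in the table can be illustrated with a minimal sketch. It is not the actual audformat code: `fill_missing_ends` is a hypothetical helper, and the example values are made up. It only shows the idea of casting to `timedelta64[ns]` before assigning sub-second durations to `NaT` positions.

```python
import pandas as pd


def fill_missing_ends(ends: pd.Series, durations: pd.Series) -> pd.Series:
    """Fill NaT entries of ``ends`` with the matching ``durations``.

    Hypothetical helper for illustration, not part of audformat.
    """
    # Cast to nanosecond resolution first, so sub-second durations
    # are not truncated and the dtypes match on assignment.
    ends = ends.astype("timedelta64[ns]")
    durations = durations.astype("timedelta64[ns]")
    # Boolean mask of the positions that still need an end time.
    missing = ends.isna().to_numpy()
    ends.iloc[missing] = durations.to_numpy()[missing]
    return ends


# Ends stored with second resolution, the second entry is missing.
ends = pd.Series([pd.Timedelta(seconds=1), pd.NaT]).astype("timedelta64[s]")
# File durations with sub-second precision (made-up values).
durations = pd.Series(pd.to_timedelta([3000, 2500], unit="ms"))

filled = fill_missing_ends(ends, durations)
print(filled.dtype)
print(filled.iloc[1])
```

Casting both inputs keeps the helper independent of whichever resolution pandas infers by default.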

Summary by Sourcery

Ensure segmented index duration handling preserves sub-second precision with pandas 3.0 and later.

Bug Fixes:

Fix to_segmented_index to always use nanosecond timedelta precision, so assigning high-precision durations no longer raises a TypeError or emits a FutureWarning with pandas 3.0.

Tests:

Add regression test verifying to_segmented_index correctly handles sub-second duration values when the index uses second-level timedelta precision and that no FutureWarning is raised.

sourcery-ai · 2026-01-22T11:27:46Z

Reviewer's Guide

Adjusts audformat.utils.to_segmented_index() to preserve nanosecond timedelta precision under pandas 3.0’s new default of second precision, and adds a regression test to guard against FutureWarnings and precision loss when filling NaT segment ends from file durations.

File-Level Changes

| Change | Details | Files |
| --- | --- | --- |
| Ensure to_segmented_index preserves sub-second timedelta precision under pandas 3.0 and add a regression test for the behavior. | Add a pytest that constructs a MultiIndex with timedelta64[s] start/end levels, applies to_segmented_index with sub-second file durations, and asserts no warnings are raised and precise end times are preserved. Update the logic that replaces NaT entries in the segmented index end level to first convert the ends array to a Series and then cast it to timedelta64[ns] before assigning duration values, preventing dtype incompatibilities and precision loss with pandas 3.0 timedelta defaults. | `tests/test_utils.py` `audformat/core/utils.py` |
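
The regression check described above (no warning, precise end time) could look roughly like the following sketch. It does not call audformat. `run_without_future_warning` and `fill_last_end` are hypothetical names, and the standard `warnings` module stands in for whatever pytest mechanism the real test uses.

```python
import warnings

import pandas as pd


def run_without_future_warning(func, *args):
    """Call ``func`` and turn any FutureWarning into an exception."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", category=FutureWarning)
        return func(*args)


def fill_last_end(ends: pd.Series, duration: pd.Timedelta) -> pd.Series:
    """Hypothetical stand-in for the NaT replacement step."""
    # Cast to nanosecond resolution before assigning the duration.
    ends = ends.astype("timedelta64[ns]")
    ends.iloc[-1] = duration
    return ends


# Ends with second resolution and a missing last entry (made-up values).
ends = pd.Series([pd.Timedelta(seconds=1), pd.NaT]).astype("timedelta64[s]")
duration = pd.Timedelta(milliseconds=2500)

result = run_without_future_warning(fill_last_end, ends, duration)
print(result.iloc[-1])
```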

Assessment against linked issues

| Issue | Objective | Addressed | Explanation |
| --- | --- | --- | --- |
| #487 | Update audformat.utils.to_segmented_index() so it correctly handles pandas 3.0.0's changed default timedelta precision (seconds instead of nanoseconds), preserving sub-second precision and avoiding type/warning issues. | ✅ | |
| #487 | Add tests that verify to_segmented_index() works with pandas 3.0.0, including sub-second duration handling and absence of FutureWarning/TypeError due to timedelta dtype incompatibilities. | ✅ | |

Possibly linked issues

Pandas 3.0.0 breaks timedelta precision #487: The PR updates to_segmented_index() to use nanosecond timedelta precision, fixing the pandas 3.0.0 regression.

sourcery-ai

Hey - I've left some high level feedback:

- Before calling `ends = ends.astype("timedelta64[ns]")`, consider guarding with a check that `ends` has a timedelta-like dtype (e.g., using `is_timedelta64_dtype`) to avoid surprising failures if the index level type changes in the future.
- In `test_to_segmented_index_timedelta_precision`, you can simplify and make the duration comparison more robust by using `pandas.testing.assert_index_equal` (or `assert_series_equal`) for `result_ends` vs `expected_ends` instead of the manual loop.
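
Both suggestions can be sketched together as follows. This is only an illustration under assumptions: `to_ns_if_timedelta` is a hypothetical helper, the values are made up, and `result_ends` / `expected_ends` merely reuse the names from the comment above.

```python
import pandas as pd
from pandas.api.types import is_timedelta64_dtype


def to_ns_if_timedelta(values: pd.Series) -> pd.Series:
    """Cast to nanosecond resolution only for timedelta-like values."""
    if is_timedelta64_dtype(values.dtype):
        return values.astype("timedelta64[ns]")
    # Leave any other dtype untouched instead of failing on the cast.
    return values


# Made-up end times with second resolution.
ends = pd.Series(pd.to_timedelta([1, 2], unit="s")).astype("timedelta64[s]")

result_ends = pd.Index(to_ns_if_timedelta(ends))
expected_ends = pd.to_timedelta([1, 2], unit="s").astype("timedelta64[ns]")

# One call compares values and dtype, replacing a manual loop.
pd.testing.assert_index_equal(result_ends, expected_ends)
```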

hagenw · 2026-01-22T15:05:57Z

We now have too many changes here, and they have introduced a lot of new issues that were not present before, so it would be better to tackle the updates step by step in several pull requests.

hagenw added 2 commits January 22, 2026 12:25

Add failing test

a4ee7c8

Fix to_segmented_index() for pandas 3.0

7e0c5a6

sourcery-ai bot reviewed Jan 22, 2026

hagenw added 12 commits January 22, 2026 13:29

Fix warning in audformat/core/testing.py

922aed0

Add test

6fc6cc4

Update union

c0251bb

Update segmented_index()

ff680a7

Add test for segmented_index()

0888d0d

Update audformat.utils.intersect()

2c1e205

Try to fix Python 3.14 errors

a529fd3

Fix Database.dtypes_of_categories

3565659

Fix tests

7761dd4

Normalize categorical dtypes to object

624945a

Further fixes to hash

daf6627

Update tests

3ac3fa5

hagenw marked this pull request as draft January 22, 2026 15:05
