Description
opened on Jan 3, 2024
Code
DataFusion issue with complete details: apache/datafusion#8696
arrow-datafusion runs cargo test on a Windows runner as part of CI. On Rust 1.74.1 (and below), the check takes under 30 minutes. After upgrading to Rust 1.75.0, it takes over 3 hours, with no other code changes on our side. This only seems to affect Windows; the Linux and macOS tests don't appear to be affected.
After debugging, I found that the regression occurs between toolchains nightly-2023-10-29 (rust e5cfc5547) and nightly-2023-10-30 (rust 608e9682f).
We are running on the GitHub Actions runner windows-latest.
The expected run, on toolchain nightly-2023-10-29, is here: https://github.com/apache/arrow-datafusion/actions/runs/7394674719/job/20116418078
- Commit: apache/datafusion@c85e67c
- Duration of 23m 43s
When I bump to toolchain nightly-2023-10-30, with no other code changes: https://github.com/apache/arrow-datafusion/actions/runs/7394848426/job/20116910586
- Commit: apache/datafusion@2e99e21
- Duration of 3h 22m 44s
Slow tests
The slowness occurs primarily in two tests.
tpcds_planning
On the good run (before regression):
- Log: https://github.com/apache/arrow-datafusion/actions/runs/7394674719/job/20116418078#step:5:1886
- Duration: 8m 15s
On the bad run (after regression):
- Log: https://github.com/apache/arrow-datafusion/actions/runs/7394848426/job/20116910586#step:5:1888
- Duration: 1h 24m 59s
sqllogictest
On the good run (before regression):
- Log: https://github.com/apache/arrow-datafusion/actions/runs/7394674719/job/20116418078#step:5:5691
- Duration: 6m 3s (calculated from the timestamps from start of run, until start of next test run)
On the bad run (after regression):
- Log: https://github.com/apache/arrow-datafusion/actions/runs/7394848426/job/20116910586#step:5:5717
- Duration: 1h 46m
These two tests are the only ones with a significant delta; the rest don't seem to be affected by the upgrade.
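To put the durations above side by side, here is a minimal sketch that turns each before/after pair into a slowdown factor. The helper names (to_secs, slowdown) are made up for illustration and are not part of DataFusion or its CI; the numbers are the durations quoted in this report.

```rust
/// Convert an hours/minutes/seconds duration into total seconds.
fn to_secs(h: u64, m: u64, s: u64) -> u64 {
    h * 3600 + m * 60 + s
}

/// Slowdown factor: how many times longer the "after" run took.
fn slowdown(before_secs: u64, after_secs: u64) -> f64 {
    after_secs as f64 / before_secs as f64
}

fn main() {
    // (label, before, after), using the durations reported above.
    let runs = [
        ("whole CI job", to_secs(0, 23, 43), to_secs(3, 22, 44)),
        ("tpcds_planning", to_secs(0, 8, 15), to_secs(1, 24, 59)),
        ("sqllogictest", to_secs(0, 6, 3), to_secs(1, 46, 0)),
    ];
    for (label, before, after) in runs {
        println!(
            "{}: {}s -> {}s ({:.1}x slower)",
            label,
            before,
            after,
            slowdown(before, after)
        );
    }
}
```

By this arithmetic the whole job is roughly 8.5x slower, tpcds_planning roughly 10.3x, and sqllogictest roughly 17.5x, so the two tests regress more sharply than the job as a whole.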
Version it worked on
Ran fast with Rust 1.74.1 (and nightly-2023-10-29)
Version with regression
Ran slow on Rust 1.75.0 (and nightly-2023-10-30)
Additional context
Apologies if the example is too large to easily determine where the issue is. I'll try to reduce this to a smaller MRE, but I don't have a Windows machine to test on locally, so I have had to check via CI.