Description
opened on Jun 18, 2024
Observations
When running a few of the notebooks on pandas 2.x, errors like the following can be observed:
TypeError: Could not convert string 'BerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlinBerlin' to numeric
-- https://github.com/crate/cratedb-examples/actions/runs/8975962618/job/24651748395?pr=430#step:6:1839
-- https://github.com/crate/cratedb-examples/actions/runs/8975962618/job/24651748121?pr=430#step:6:825
References
- Time Series: Modernize notebooks to use recent versions of pandas and SQLAlchemy #387
- Update to pandas 2.2 in /topic/timeseries (blocked by PyCaret) #477
Evaluations
- It looks like it is a data shape error.
- Apparently, Google Colab has strictly required pandas 2.x since 2024-05-13?
Thoughts
Maybe the way the notebooks work with pandas needs an update for recent pandas 2.x versions? The string repetition flaw reminds me of the famous »Wat« talk by Gary Bernhardt. ;]
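One plausible explanation, not yet confirmed against the notebooks themselves: since pandas 2.0, reductions such as `DataFrame.mean()` default to `numeric_only=False`, so string columns are no longer dropped silently. Computing a mean then sums the strings first, which would produce exactly this kind of repeated-string value before the numeric conversion fails. The sketch below uses a hypothetical frame (the column names `city` and `temperature` are made up for illustration) to show the behaviour and two possible ways around it.

```python
import pandas as pd

# Hypothetical data; column names are illustrative, not taken from the notebooks.
df = pd.DataFrame(
    {
        "city": ["Berlin", "Berlin", "Berlin"],
        "temperature": [10.0, 12.0, 14.0],
    }
)


def mean_all_columns(frame):
    """Return the column means, or the error text if the reduction fails."""
    try:
        return frame.mean()
    except TypeError as ex:
        return f"TypeError: {ex}"


# On pandas 2.x this is expected to yield a TypeError message, because the
# string column takes part in the reduction. Older pandas 1.x releases
# dropped such columns instead (with a deprecation warning).
outcome = mean_all_columns(df)
print(outcome)

# Option 1: ask pandas to consider numeric columns only.
numeric_means = df.mean(numeric_only=True)

# Option 2: select the numeric columns explicitly before reducing.
explicit_means = df[["temperature"]].mean()

print(numeric_means)
print(explicit_means)
```

If this is indeed the cause, the affected notebook cells would need either `numeric_only=True` or an explicit column selection wherever they aggregate frames that carry string columns.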
Workaround
As a temporary measure, the tests no longer include the affected notebooks. They are skipped per cfd1a6c, as part of the relevant modernization patch.
Time Series: Skip testing notebooks not compatible with pandas 2.x
- exploratory_data_analysis.ipynb
- time-series-decomposition.ipynb
They are not ready for pandas 2.x yet, and block others from being
upgraded.
Originally posted by @amotl in #430 (comment)