Description
This is reproducible with the current latest pandas release, 1.5.2.
In Python, the `zipfile.Path` class is intended to act similarly (but not exactly identically!) to `pathlib.Path`. The latter is accepted by pandas, but the former is not.
Steps to reproduce:
- Create a zip file named `foo.zip` with one CSV file in it named `bar.csv`.
- Create a path object pointing directly to that CSV file in the zip file: `zp = zipfile.Path('foo.zip', 'bar.csv')`
- Pass that path object (`zp`) to `pandas.read_csv()` as the path argument.
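The steps above can be sketched as a small script. The temporary directory, the CSV contents, and the helper name `read_with_pandas` are made up for illustration; only `foo.zip`, `bar.csv`, and the `zipfile.Path(...)` call come from the steps themselves.

```python
import tempfile
import zipfile
from pathlib import Path

# Step 1: create foo.zip containing a single CSV member, bar.csv.
# The temporary directory and the CSV contents are illustrative only.
zip_file = Path(tempfile.mkdtemp()) / "foo.zip"
with zipfile.ZipFile(zip_file, "w") as zf:
    zf.writestr("bar.csv", "a,b\n1,2\n3,4\n")

# Step 2: path object pointing directly at the CSV inside the archive.
zp = zipfile.Path(zip_file, "bar.csv")


def read_with_pandas(path_obj):
    """Step 3 (hypothetical helper): hand the path object to pandas.read_csv()."""
    import pandas  # imported lazily so steps 1 and 2 run without pandas

    return pandas.read_csv(path_obj)
```

Calling `read_with_pandas(zp)` with pandas 1.5.2 installed is the call that produces the `ValueError` described in this report.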
Because of this part of your code (lines 446 to 452 in 3b09765), Python raises `ValueError: Invalid file path or buffer object type: <class 'zipfile.Path'>`.
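For context, a stdlib-only sketch of why `zipfile.Path` probably fails that check. It rests on an assumption that is not confirmed by the lines referenced above: that pandas accepts an object if it either implements `os.PathLike` or exposes a `read` or `write` method. The helper name `describe` is hypothetical.

```python
import os
import pathlib
import tempfile
import zipfile

# Build a small archive so that both kinds of path object exist.
zip_file = pathlib.Path(tempfile.mkdtemp()) / "foo.zip"
with zipfile.ZipFile(zip_file, "w") as zf:
    zf.writestr("bar.csv", "a,b\n1,2\n")

zp = zipfile.Path(zip_file, "bar.csv")


def describe(obj):
    """Report the traits the check is assumed (not confirmed) to look at."""
    return {
        "is_pathlike": isinstance(obj, os.PathLike),
        "has_read": hasattr(obj, "read"),
        "has_write": hasattr(obj, "write"),
        "has_open": hasattr(obj, "open"),
    }


pathlib_info = describe(zip_file)  # a pathlib.Path
zipfile_info = describe(zp)        # a zipfile.Path

print("pathlib.Path:", pathlib_info)
print("zipfile.Path:", zipfile_info)
```

`pathlib.Path` implements `os.PathLike`, so it can be converted to a plain string path. `zipfile.Path` is neither path-like nor file-like in that sense; it only offers `open()`, `read_text()` and `read_bytes()`.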
EDIT:
I'm aware that `pandas.read_csv()` offers the `compression` argument and can read compressed CSV files on its own. But this doesn't help in my case. I'm using pandas as a backend for a higher-level API that reads data files; pandas is just one part of it. One shortcoming of pandas here is that it cannot deal with ZIP files containing multiple CSV files.
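A possible stopgap for the multi-CSV case, sketched under assumptions: it opens each member through `zipfile.Path.open()` (binary mode requires Python 3.9 or newer) and passes the resulting file handle to `pandas.read_csv()`, which does accept file-like objects. The helper name `read_zipped_csv`, the second member `baz.csv`, and the CSV contents are hypothetical.

```python
import tempfile
import zipfile
from pathlib import Path


def read_zipped_csv(zp, **kwargs):
    """Hypothetical helper: open the archive member and give pandas the handle."""
    import pandas  # imported lazily so the archive handling runs without pandas

    with zp.open("rb") as handle:  # binary handle; pandas does the decoding
        return pandas.read_csv(handle, **kwargs)


# Illustrative archive holding two CSV members.
zip_file = Path(tempfile.mkdtemp()) / "foo.zip"
with zipfile.ZipFile(zip_file, "w") as zf:
    zf.writestr("bar.csv", "a,b\n1,2\n")
    zf.writestr("baz.csv", "a,b\n3,4\n")

# Collect every CSV member at the top level of the archive.
members = [p for p in zipfile.Path(zip_file).iterdir() if p.name.endswith(".csv")]
```

With pandas installed, `frames = {p.name: read_zipped_csv(p) for p in members}` would then load each member into its own DataFrame. This only works around the problem in calling code; it does not make `pandas.read_csv()` accept `zipfile.Path` directly, which is what this issue asks for.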
`pathlib.Path` and `zipfile.Path` are both part of the Python standard library, and IMHO pandas should be able to deal with both.