Allow passing HEADERS=[AUTO/FORCE/DISABLE] to st_read() to control xlsx behavior. #114

Merged
merged 3 commits into from
Aug 11, 2023

Conversation

Maxxen
Member

@Maxxen Maxxen commented Aug 11, 2023

This PR makes it possible to pass a HEADERS=[AUTO/FORCE/DISABLE] option to st_read() to control whether the first row is read as a header row when importing xlsx. We do this by setting a thread-local config option programmatically before opening the dataset, and resetting it after binding. Previously you had to set this GDAL config option through an environment variable or an external configuration file, but now it's possible to control it per-query and from within duckdb.

Example:

FROM ST_read('./output.xlsx', open_options = ['HEADERS=FORCE']);

Note that this workaround will no longer be necessary as of GDAL 3.8 (scheduled for November...?), since the XLSX options can then be passed natively as open options instead of configuration options.
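
The set-then-reset approach described above can be sketched in plain Python. This is only an illustration of the pattern, not the extension's actual code: the `_config` store and the `get_option`, `scoped_config_option` and `open_dataset` helpers are hypothetical stand-ins for GDAL's thread-local configuration and dataset opening. `OGR_XLSX_HEADERS` is assumed here to be the GDAL configuration key behind the HEADERS option.

```python
import threading
from contextlib import contextmanager

# Hypothetical stand-in for GDAL's thread-local configuration store.
_config = threading.local()


def get_option(key, default=None):
    """Return the calling thread's value for key, or default if unset."""
    return getattr(_config, "options", {}).get(key, default)


@contextmanager
def scoped_config_option(key, value):
    """Set a thread-local option for the with-block, then restore the old state."""
    if not hasattr(_config, "options"):
        _config.options = {}
    options = _config.options
    missing = object()
    previous = options.get(key, missing)
    options[key] = value
    try:
        yield
    finally:
        # Reset even if opening the dataset raised.
        if previous is missing:
            options.pop(key, None)
        else:
            options[key] = previous


def open_dataset(path):
    """Hypothetical 'open' that reads the option the way a driver would."""
    return {"path": path, "headers": get_option("OGR_XLSX_HEADERS", "AUTO")}


with scoped_config_option("OGR_XLSX_HEADERS", "FORCE"):
    dataset = open_dataset("./output.xlsx")
```

Because the option lives in thread-local state and is restored in a `finally` block, other threads and later queries on the same thread should not see the override.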

@davidjbrennan

That's great news @Maxxen.

Are you also able to include the only other configuration option, "FIELD_TYPES=[STRING/AUTO]"?

This would make for a good work-around until the release of GDAL 3.8.

@Maxxen
Member Author

Maxxen commented Aug 11, 2023

You're right, good catch. I've added a FIELD_TYPES=[AUTO/STRING] option as well now.
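
With both options supported, translating `open_options` entries into GDAL configuration options might look roughly like the sketch below. The `split_xlsx_options` function and the `XLSX_OPTIONS` table are hypothetical, and the extension itself may parse options differently. The config keys `OGR_XLSX_HEADERS` and `OGR_XLSX_FIELD_TYPES` are assumed to be the GDAL names behind the two options.

```python
# Assumed mapping from st_read() open option names to GDAL XLSX config keys
# and the values each one accepts.
XLSX_OPTIONS = {
    "HEADERS": ("OGR_XLSX_HEADERS", {"AUTO", "FORCE", "DISABLE"}),
    "FIELD_TYPES": ("OGR_XLSX_FIELD_TYPES", {"AUTO", "STRING"}),
}


def split_xlsx_options(open_options):
    """Separate XLSX-specific entries from the remaining open options.

    Returns (config, passthrough): config maps GDAL config keys to values,
    passthrough keeps every other entry untouched.
    """
    config, passthrough = {}, []
    for entry in open_options:
        name, sep, value = entry.partition("=")
        spec = XLSX_OPTIONS.get(name.upper())
        if not sep or spec is None:
            # Not one of the XLSX options: hand it on unchanged.
            passthrough.append(entry)
            continue
        config_key, allowed = spec
        if value.upper() not in allowed:
            raise ValueError(f"invalid value for {name}: {value!r}")
        config[config_key] = value.upper()
    return config, passthrough


config, rest = split_xlsx_options(["HEADERS=FORCE", "FIELD_TYPES=STRING"])
```

On the query side this would presumably correspond to passing both entries in one list, for example `open_options = ['HEADERS=FORCE', 'FIELD_TYPES=STRING']`.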

@wonb168

wonb168 commented Aug 11, 2023

Even with these options, the result is still wrong for sheet1 in my attached Excel file. Could you please check how to make it work for sheet1? Thanks.