
st_read continually fails if first attempt was unable to connect to target host #221

gallodustin opened this issue Dec 28, 2023 · 9 comments

@gallodustin

Below is a minimal repro of the issue I am seeing with the Python SQLAlchemy DBAPI when attempting to use st_read to load Excel files. In this case I'm using an in-memory DuckDB database with a local SeaweedFS S3 store, which runs as a web service in a Docker container.

import sqlalchemy

connect_args = {'config': {'s3_region': 'myregion', 's3_access_key_id': 'mykey', 's3_secret_access_key': 'mysecret', 's3_endpoint': 'hostname:port', 's3_url_style': 'path', 's3_use_ssl': False}}
engine = sqlalchemy.create_engine('duckdb:///:memory:', connect_args=connect_args)
# manually bring down the s3 host
try:
    with engine.connect() as conn:
        conn.execute("INSTALL spatial; LOAD spatial;")
        result = conn.execute("SELECT * FROM st_read('http://hostname:port/bucket/file.xlsx', layer='sheet1')") # IO Error: GDAL Error (1): Could not resolve host: hostname
        # subsequent identical conn.execute here also gives the GDALOpen() called recursively error
finally:
    engine.dispose()
# manually start the s3 service at host:port where bucket/file.xlsx exists
try:
    with engine.connect() as conn:
        conn.execute("INSTALL spatial; LOAD spatial;")
        result = conn.execute("SELECT * FROM st_read('http://hostname:port/bucket/file.xlsx', layer='sheet1')") # IO Error: GDAL Error (1): GDALOpen() called on http://hostname:port/bucket/file.xlsx recursively
finally:
    engine.dispose()

What I am trying to illustrate here is that if st_read fails to connect to the target host on the first attempt, all subsequent attempts also fail, even once the host becomes responsive again and the file exists at the specified location. In other words, if I don't simulate a network outage (bringing my S3 service down before the first attempt and back up afterwards), all of these st_read calls work fine.

To me it's particularly problematic that this still occurs after I call engine.dispose(), because in that case I expect a "fresh" connection with no knowledge of the past failure. But even without engine.dispose(), I would prefer that st_read try again rather than throw the "called recursively" error, to handle the (common) real-world case where connectivity was temporarily lost and we want to retry.
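For concreteness, this is the kind of retry loop I would like to be able to rely on (just a sketch; the retry count and sleep are arbitrary, and the host and credentials are placeholders as above):

import time

import sqlalchemy

connect_args = {'config': {'s3_region': 'myregion', 's3_access_key_id': 'mykey', 's3_secret_access_key': 'mysecret', 's3_endpoint': 'hostname:port', 's3_url_style': 'path', 's3_use_ssl': False}}

for attempt in range(3):  # arbitrary retry count
    # recreate the engine each attempt so no state from a failed open survives
    engine = sqlalchemy.create_engine('duckdb:///:memory:', connect_args=connect_args)
    try:
        with engine.connect() as conn:
            conn.execute("INSTALL spatial; LOAD spatial;")
            result = conn.execute("SELECT * FROM st_read('http://hostname:port/bucket/file.xlsx', layer='sheet1')")
            break  # success, stop retrying
    except Exception:
        time.sleep(5)  # give connectivity a chance to come back
    finally:
        engine.dispose()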

I'm seeing this on duckdb version 0.9.1, but have observed the same behavior on 0.9.3-dev1996.

@Maxxen
Member

Maxxen commented Dec 29, 2023

Hi! Thanks for reporting this issue!
The filesystem abstractions have gone through a lot of changes since the last release (and since the latest development release was issued), and I suspect this has already been fixed. See e.g. #182 and #218 for related Excel issues.

You can get the latest spatial version (built every time a PR is merged to main in this repo) for DuckDB 0.9.1 by executing:

FORCE INSTALL spatial FROM 'http://nightly-extensions.duckdb.org';

Note that this is a newer version than the one you get if you use DuckDB 0.9.3.

Also, do you have the httpfs extension installed?
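You can check with something like this (a quick sketch using a plain duckdb connection; the duckdb_extensions() table function reports install/load state):

import duckdb

con = duckdb.connect()
# duckdb_extensions() lists every known extension with its install/load state
print(con.execute(
    "SELECT extension_name, installed, loaded "
    "FROM duckdb_extensions() "
    "WHERE extension_name IN ('httpfs', 'spatial')"
).fetchall())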

@gallodustin
Author

> Also, do you have the httpfs extension installed?

Based on the docs, I didn't think there was a need to install it explicitly. Is that incorrect? This issue aside, I am in general able to connect to S3 using st_read, as in my example above.

I went ahead and explicitly installed httpfs, as well as the latest version of spatial as you suggested, and I'm now seeing a different issue. The first attempt at st_read returns an "Unknown file type" error, and then all subsequent attempts give a "GDALOpen() called on [file] recursively" error. This is all with the S3 host live and able to receive connections; I never brought it down. Here is the new code example:

import sqlalchemy

connect_args = {'config': {'s3_region': 'myregion', 's3_access_key_id': 'mykey', 's3_secret_access_key': 'mysecret', 's3_endpoint': 'hostname:port', 's3_url_style': 'path', 's3_use_ssl': False}}
engine = sqlalchemy.create_engine('duckdb:///:memory:', connect_args=connect_args)
try:
    with engine.connect() as conn:
        conn.execute("INSTALL httpfs; LOAD httpfs;")
        conn.execute("FORCE INSTALL spatial FROM 'http://nightly-extensions.duckdb.org'; LOAD spatial;")
        conn.execute("SELECT * FROM st_read('http://hostname:port/bucket/file.xlsx', layer='sheet1')") # IO Error: Unknown file type
        conn.execute("SELECT * FROM st_read('http://hostname:port/bucket/file.xlsx', layer='sheet1')") # IO Error: GDAL Error (1): GDALOpen() called on http://hostname:port/bucket/file.xlsx recursively
finally:
    engine.dispose()

@Maxxen
Member

Maxxen commented Dec 29, 2023

Alright, thanks for the thorough investigation! In general you do/will need the httpfs extension; AFAIK it doesn't autoload properly when spatial tries to issue a remote call, but that might also have changed. I'll look into this properly when I'm back from vacation in a couple of weeks!

@Maxxen
Member

Maxxen commented Jan 10, 2024

So I'm investigating this more now. My working theory is that the "called recursively" errors are due to us being a little too aggressive about erroring and throwing exceptions when we can't determine the file type, which doesn't give GDAL a chance to do its proper cleanup, and that in turn causes the "called recursively" errors. I think I have a fix for this, as it's pretty simple.
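Roughly, the failure mode I have in mind looks like this (an illustrative Python sketch, not the actual GDAL/spatial code):

_open_in_progress = set()  # stands in for GDAL's internal re-entrancy guard

def gdal_open(path, fetch):
    if path in _open_in_progress:
        raise IOError(f"GDALOpen() called on {path} recursively")
    _open_in_progress.add(path)
    result = fetch(path)  # if this raises (unknown file type, DNS failure, ...)
    _open_in_progress.discard(path)  # ...this line never runs, so the guard
    return result                    # stays set and every retry fails above

In the sketch the fix is just a try/finally around the fetch so the guard is always cleared; the real fix is along the same lines, i.e. don't throw before GDAL has had a chance to clean up.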

What I'm still trying to figure out is why you get the unknown file type error. I feel like we have all the cases covered there, but there must be something rewriting the path so that it's no longer recognizable as a remote file. Are you on Windows, perhaps?

@gallodustin
Author

No, I am running Arch Linux with kernel version "Linux 6.6.10-arch1-1".

@gallodustin
Author

Also, looking into this again just now: I still see the "unknown file type" error on duckdb 0.9.1 together with the latest version of spatial from nightly. But if I now use pip to upgrade duckdb to 0.9.3.dev2452, I instead see a failure when trying to run FORCE INSTALL spatial FROM 'http://nightly-extensions.duckdb.org'; LOAD spatial;.

This may be expected as you work on a resolution, but I just thought I'd point it out.

E               sqlalchemy.exc.OperationalError: (duckdb.duckdb.HTTPException) HTTP Error: Failed to download extension "spatial" at URL "http://nightly-extensions.duckdb.org/8793ff3a2e/linux_amd64_gcc4/spatial.duckdb_extension.gz"
E               Extension "spatial" is an existing extension.
E               
E               Are you using a development build? In this case, extensions might not (yet) be uploaded.
E               [SQL: FORCE INSTALL spatial FROM 'http://nightly-extensions.duckdb.org'; LOAD spatial;]
E               (Background on this error at: https://sqlalche.me/e/14/e3q8)

@gallodustin
Author

gallodustin commented Jan 10, 2024

One more, happier observation: the same set of steps actually worked without issue on duckdb 0.9.2 together with the latest spatial from nightly.

Edit: it even works with the default version of spatial I get from a plain INSTALL spatial. If I try to load a nonexistent file like "file.abcdef", the second failure is the "recursive" error on 0.9.1 but not on 0.9.2, regardless of whether I use the default or nightly version of spatial.
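For reference, the nonexistent-file check was roughly this (plain duckdb rather than SQLAlchemy, with placeholder host and port as before):

import duckdb

con = duckdb.connect()
con.execute("INSTALL httpfs; LOAD httpfs;")
con.execute("INSTALL spatial; LOAD spatial;")
for attempt in (1, 2):
    try:
        con.execute("SELECT * FROM st_read('http://hostname:port/bucket/file.abcdef')")
    except Exception as e:
        # on 0.9.1 the second attempt reports the "called recursively" error;
        # on 0.9.2 both attempts report the same underlying open failure
        print(f"attempt {attempt}: {e}")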

@Maxxen
Member

Maxxen commented Jan 11, 2024

Hmm, I think maybe you needed to reload to have the new extension apply. Nevertheless I've fixed the issue with getting stuck on the recursive errors in #227, so if this doesn't occur again I think we can close this as fixed.

@gallodustin
Author

Not sure if I totally understand what you mean by needing to reload, but I still see the same Extension "spatial" is an existing extension error if I try the test again today on duckdb-0.9.3.dev2490. As long as that is, as the error message suggests, just a matter of the extension not yet having been uploaded for that newest build, I agree we should be okay.
