Skip to content
This repository has been archived by the owner on Sep 20, 2023. It is now read-only.

Optimizing SQLAlchemy's get_columns #75

Closed
mrshu opened this issue Feb 6, 2019 · 4 comments
Closed

Optimizing SQLAlchemy's get_columns #75

mrshu opened this issue Feb 6, 2019 · 4 comments

Comments

@mrshu
Copy link

mrshu commented Feb 6, 2019

Hey @laughingman7743, it's @mrshu again!

As I said in laughingman7743/PyAthena#63, thanks a ton for putting together these two wrapper packages!

I have just a very small beef with PyAthenaJDBC -- aside from slowness mostly caused by necessity of running JVM, it takes nearly forever to get columns using SQLAlchemy in the setup I am using. After digging a bit deeper, I found that this is actually by design, and that the get_columns method starts with the following comment.

# information_schema.columns fails when filtering with table_schema or table_name
# when specifying a name that does not exist in table_schema or table_name.

I went ahead to test it out and can report that the following query

SELECT
  table_schema,
  table_name,
  column_name,
  data_type,
  is_nullable,
  column_default,
  ordinal_position,
  comment
FROM information_schema.columns
WHERE 
  table_schema = 'non_existent_schema'
  AND table_name = 'non_existent_table'

does not fail but returns zero rows. I am not sure if something changed since when the get_columns query has been introduced in #14, but I was wondering whether you would be up for making it a bit more efficient?

Once again, I am happy to put together a PR that does this, I'd just like to make sure it fits into your perspective on what should PyAthenaJDBC look like as well.

Thanks again!

@laughingman7743
Copy link
Owner

Since the exception occurs when I implement that, I once acquired the information of all the columns and filtered it on the program side.
Just like PyAthena's implementation, filtering with a where clause sounds nice. I will try it.

@mrshu
Copy link
Author

mrshu commented Feb 6, 2019

Thanks for your response @laughingman7743, please feel free to let me know if I can be of any help!

@mrshu
Copy link
Author

mrshu commented Feb 7, 2019

@laughingman7743 Let me just reiterate once again that I am happy to put together a PR that would use filtering on the SQL level, as I suggested above.

Thanks!

@laughingman7743
Copy link
Owner

I tried implementing it. Please check the following branches.
#76

laughingman7743 added a commit that referenced this issue Feb 9, 2019
…qlalchemy

Optimize get_columns method in SQLAlchemy (fix #75)
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants