-
Notifications
You must be signed in to change notification settings - Fork 527
SNOW-1963078 Port _upload / _download / _upload_stream / _download_st… #2198
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SNOW-1963078 Port _upload / _download / _upload_stream / _download_st… #2198
Conversation
…ream methods to the public connector ### Description - The same implementation has been adopted in SP connector and porting it back to public connector is needed based on [this decision](https://docs.google.com/document/d/1KegQLzyxB5G2Viti5Tr8fD24IPjIKU3cCZtRUUWqm4k) - These methods (_upload / _download / _upload_stream / _download_stream) are necessary to run Stored Proc with public connector ([this](https://github.com/snowflakedb/Stored-Proc-Python-Connector/blob/1e2519f3ddc71547642b007b712aae59ef00dfff/src/snowflake/connector/cursor.py#L808-L833) is how Stored Proc connector implements these methods. The implementation of these methods for public Python connector and SP connector should be exactly the same. It appears a bit different at the moment only because we haven't cleaned up the parameter protection yet) - They are helpful for non-SP sessions as well - The Python Stored Proc has its own logic of file operation parser and stream downloader, so we abstract away the detailed implementation of FileOperationParser and StreamDownloader on the server side - We do not prioritize file operation parser and stream downloader implementation for non-stored-proc here, because it is a non-goal for stored proc fork removal project (this is why the `parse_file_operation` and `download_as_stream` methods are set to raise NotImplemented error ### Testing - existing unit tests and integ tests to make sure the new change does not break them - local testing to show that the new code works for Python SP connection - a new sanity-check test to assert that the needed file utils are present (i.e. parse_file_operation and download_as_stream)
| # check SNOW-1218851 for long term improvement plan to refactor ocsp code | ||
| atexit.register(self._close_at_exit) | ||
|
|
||
| # Set up the file operation parser and stream downloader. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To reviewers (especially @sfc-gh-sfan ) SP connector will also need to set connection.rso_id attribute, in order for SP file operation to work as intended (reference: SP connector code). I decided to do the attribute setting on the server side (something like this PoC code), since rso_id is not a general concept for non-SP connections.
The upside of this approach is that we can abstract away unnecessary/SP-specific concept from public connector
The downside (which I don't think will really happen) is that, it seems as if rso_id attribute does not exist in connection (if we only read public connector code). So let's say, if other future developer tries to introduce a seemingly-new attribute rso_id on connection, they may think, "okay, rso_id is not an existing attribute, so I can add and use it freely, without worrying about side effects". But in fact, rso_id would be an existing attribute of connection by that time, and messing the attribute up will affect some Python SP behaviors. But I think the chance is very very slim, because I would be very surprised if any future dev needs to introduce another attribute, and it happens to be literally rso_id
DESCRIPTION.md
Outdated
|
|
||
| # Release Notes | ||
| - v3.14.1(TBD) | ||
| - introduce several file operation utilities for easier internal dev |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need release note given the added functions are all internal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, reverted the changelog
| use_s3_regional_url=self._use_s3_regional_url, | ||
| unsafe_file_write=self._unsafe_file_write, | ||
| ) | ||
| elif self._stage_location_type == STORED_PROC_FS: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have some override mechanism internally so that we don't have to expose this? We could potentially override _create_file_transfer_client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested and yes, we can override _create_file_transfer_client on the server side for Stored Proc. Does this look good 088b553? It stills defines the STORED_PROC_FS label though. We need a file system category to bypass the check here
| if self._stage_location_type not in VALID_STORAGE: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I can actually append STORED_PROC_FS to VALID_STORAGE list on the server side. This way we don't need to expose the tag STORED_PROC_FS at all. Let me go with this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
This reverts commit abb6f6b.
…we can do the insertion to `VALID_STORAGE` on the fly on the server side
…m/_download_stream methods - the unit tests are more like sanity check of the implementation, to make sure certain methods are hit vs certain methods are not
sfc-gh-mmishchenko
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…y are not yet available there
# Conflicts: # src/snowflake/connector/connection.py
…ream methods to the public connector
Description
NO-CHANGELOG-UPDATES
_upload/_download/_upload_stream/_download_streammethods (along withdirect_file_operation_utils.py, which provides needed utils for the implementation of the 4 methods) from SP connector to public connector (reference: a similar change on SP connector: https://github.com/snowflakedb/Stored-Proc-Python-Connector/pull/194). The same implementation has been adopted in SP connector and porting it back to public connector is needed based on this decision_upload/_download/_upload_stream/_download_stream) are necessary to run Stored Proc with public connector (this is how Stored Proc connector implements these methods. The implementation of these methods for public Python connector and SP connector should be exactly the same. It appears a bit different at the moment only because we haven't cleaned up the parameter protection yet)parse_file_operationanddownload_as_streammethods are set to raise NotImplemented errorTesting
_uploadand_download_stream, no compression_uploadand_download_stream, with compression_upload_streamand_download, no compression_upload_streamand_download, with compressionPlease answer these questions before submitting your pull requests. Thanks!
What GitHub issue is this PR addressing? Make sure that there is an accompanying issue to your PR.
Fixes #NNNN
Fill out the following pre-review checklist:
Please describe how your code solves the related issue.
Please write a short description of how your code change solves the related issue.
(Optional) PR for stored-proc connector: