-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement client base and storage endpoints #23
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I obviously did not read the whole thing, because, well, I can't read Python very well. Great job @fnesveda anyway! This is a start of something great for Apify (and the world).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general looks quite good, left a bunch of comments to address / discuss. Thanks for kicking this out!
Thanks for the thorough review, @mhamas! I've addressed your comments, I agreed with most things and updated the code accordingly, the rest we already talked about on Friday. Can you please take another look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thx @fnesveda
content_type = 'application/json; charset=utf-8' | ||
|
||
if 'application/json' in content_type and not _is_file_or_bytes(value) and not isinstance(value, str): | ||
value = json.dumps(value, indent=2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, at least add big TODO here. From my previous experience the difference was couple of X for large JSONs. Thx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! Most of the stuff I would object to has already been mentioned, so I only have a few nitpicks.
src/apify_client/clients/resource_clients/dataset_collection.py
Outdated
Show resolved
Hide resolved
"""Initialize the DatasetClient.""" | ||
super().__init__(*args, resource_path='datasets', **kwargs) | ||
|
||
def get(self) -> Optional[Dict]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(applies to all the other clients like this too)
Maybe we could override the [] operator and allow extracting these values directly using the DatasetClient instance? It would be a little more user friendly and wouldn't require much additional work. We could also easily do the same for the setter use-case of [].
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this would be wrong conceptually - when you call client.dataset('someDatasetId')
, you're getting another client, not the dataset itself. Implementing []
would mix these two concepts together, and I think in the end it would be just more confusing to users.
Also, we'd have to download the whole resource again and again for every property we would be accessing. I think that would be weird as well, when you call a function like .get()
, you kinda intuitively expect that it might take some time, but when you access something via the []
operator, you intuitively expect it to be already there and accessible quickly.
We can revisit this later, but for this first version I'd skip it and stick just to the functions.
I've implemented the base of the client and the storage endpoints. It's modeled after the javascript client so the usage is similar. There's still a lot of documentation missing, and I need to implement some integration tests for these, but it should be good as a start of the project.