datastore: async/await support #2
Comments
Cool! Not quite what I was asking for here though; ideally I'd like to use the datastore library rather than gRPC directly. Once gRPC supports asyncio, is it likely that the datastore library will make Meanwhile it seems that the gRPC Python API already has a Edit: As for my usage, it's for a hobby project I've got on Google App Engine (https://vimhelp.org) which I'm looking to migrate from Python 2.7 to Python 3.7/3.8. It's currently using ndb, but it's not much code and I'd like to migrate it to whatever is the nicest way of doing things :) |
Big +1! I recently commented on this on googleapis/google-cloud-python#3329 (comment) I'm also using the google cloud datastore library on Google App Engine (Python 3). |
@dannymilsom Reviews and comments are welcome. Also, can you provide more details? What is the most important value of providing an AsyncIO API for you? E.g. improving performance, compatibility with existing AsyncIO applications, or better programming practice... |
+1 too. https://docs.djangoproject.com/en/3.0/topics/async/ Django is planning to support fully async views and middleware soon, and I would very much like to use datastore on app engine! |
@lidizheng Hi! We build lots of web apps on GCP - particularly Google App Engine with Datastore. Historically this has centred around Django, which has always been WSGI only, but there are now more and more Python web frameworks moving to support ASGI (we are actively using FastAPI, and Django even has it on the 3.0 roadmap). Leveraging the async API in the cloud datastore library would almost definitely enable better performance and lower running costs on cloud services. |
Just to also add that this seems to be a popular feature request from the community based on older issues - see googleapis/google-cloud-python#3103 |
Django async support has now been implemented. Any update on this? @crwilcox |
BUMP for this issue |
+1 |
I expect we will be adding Async Support to datastore in the near future, though development hasn't started yet on datastore. Currently, we have a dev release for We are also working to add Async to our REST clients, with the first being |
@crwilcox Do you have any news about this feature? |
@ndavydovdev I do not. The 2.0.0 version of datastore has async surface, but it is at the generated proto layer, so while you could use it, it isn't necessarily as ergonomic as the sync datastore surface currently. The state of the main branch reflects that today as does 2.0.0dev1 on pypi.org. |
https://pypi.org/project/google-cloud-datastore/2.0.0/ contains async surfaces at the generated layer. You can construct a client using |
@crwilcox Is the generated client considered stable, both in terms of functionality, as well as interface going forward? I'm asking because I don't see any reference to it in any official doc, and I'd like to know if we can rely on it for the long run. |
Hi @dolev-isp this may be leaking a bit of implementation detail :) Happy to elaborate. The short version is you can rely on the generated layer, and we will version using https://semver.org/ to determine the next release number. We have been doing that with this library so far in fact. As far as plans, we don't currently have a plan to add a handwritten generated client. We very recently published Firestore with an async interface, but want to better understand use before implementing in datastore. More on the library, and generated vs handwritten code. If you look at the current client:
The docs at https://googleapis.dev/python/datastore/latest highlight the handwritten layer as it is tailored/crafted to better suit datastore use cases. This isn't to say the generated layers aren't usable clients. The vast majority of Cloud Client Libraries are generated and are considered stable GA surface. The underlying generator is the same for the different Python client libraries. You will find the generated client is very similar to the handwritten layer but the surface is different. Though if you do want async today this is the way you can get it. The interface is a bit more verbose to be sure.
|
Hi @crwilcox thanks for the in-depth reply, and for adding the async support! I have to say it would be really awesome if you would be able to add async functions to the handwritten client library, which is really easy and straightforward to use. Our needs are very straightforward too... We would be glad to see the different client methods in an async version, in a similar way to the way they existed in the NDB library, for example. So you'd have client.get_async(), client.get_multi_async(), client.put_async(), etc. An async version of the Query Iterator would be awesome too, although a bit more complicated I guess (and at a lesser priority in our case for now)... Do you think this is something that can be added to the near-term roadmap? |
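To make the request above concrete, here is a minimal sketch of the call shape being asked for. `FakeAsyncClient` and its in-memory store are hypothetical stand-ins, and the method names `get_async`, `get_multi_async` and `put_async` are taken only from the wording of the comment; none of this is the datastore library's actual API.

```python
import asyncio


class FakeAsyncClient:
    """Hypothetical in-memory stand-in; NOT the google-cloud-datastore API."""

    def __init__(self, store):
        self._store = dict(store)

    async def get_async(self, key):
        # Simulate a network round trip without blocking the event loop.
        await asyncio.sleep(0.01)
        return self._store.get(key)

    async def get_multi_async(self, keys):
        # Issue all lookups concurrently; gather keeps the input order.
        return await asyncio.gather(*(self.get_async(k) for k in keys))

    async def put_async(self, key, value):
        await asyncio.sleep(0.01)
        self._store[key] = value


async def demo():
    client = FakeAsyncClient({"a": 1, "b": 2})
    await client.put_async("c", 3)
    return await client.get_multi_async(["a", "b", "c", "missing"])


result = asyncio.run(demo())
```

The point of the shape is that several lookups can be awaited together, so a request handler spends its waiting time on other work rather than on one blocking call at a time.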
Hi @crwilcox I've been trying to implement the outer async api for my project with the parts that are already made and using firestore as a template, but I'm facing an issue when running in pytest. If I have two identical tests that have just
The first test passes through this part as expected, the second test freezes on |
@ArcLightSlavik are you using |
@crwilcox Yeah I am
Python 3.7 |
Fixed by upgrading |
@crwilcox 🤞 Will this feature get merged in soon? I'd be thrilled to use it on my app ❤️ |
@gnagel development for this feature has not started at this time. |
@crwilcox Would you be willing to accept a PR on it? |
@pmlanger I would be alright with that, but I want to be clear this may be a large work item. We recently did this for Firestore and I think it was in the area of 1000-2000 lines of change. A lot of this is in duplicating test coverage and altering it to cover slightly different async surfaces. At a high level the work to accomplish this:
https://github.com/googleapis/python-firestore could certainly be referenced to see what we would have in mind for this work. The way the datastore library is built is that it has a handwritten layer (https://github.com/googleapis/python-datastore/tree/master/google/cloud/datastore) that is placed over a generated client. We use this generated layer in almost all of our Google Cloud libraries. Our database products are unusual in that we tend to layer over the generated client to combine multiple RPCs and make a better user experience. That said, some calls aren't as complicated, and you might be able to move some of this into your app by using our generated surface, which is at https://github.com/googleapis/python-datastore/tree/master/google/cloud/datastore_v1. The large difference between the two layers is that the generated client takes a |
@crwilcox Thanks for the details and the references! I actually wrote a slightly restricted version of that handwritten 'async datastore client' for a (non-open source) project a few weeks ago that had been using the ('synchronous') I know it's not a small feat, but as long as there is not too much of a rush, I'd like to help. |
I appreciate the question. I just want to be clear about the scale. I'd hate for that to sneak up on anyone. The underlying gRPC library has an async path, used by the generated async layer, and using that would be better than wrapping the sync surface behind async. I don't think anyone from our team is going to get to this before April at the soonest, as we have a few other projects we are focused on. Realistically it may even be after that. I do think adding the surface is valuable though; it just isn't "top of the stack" at the moment. |
Great - I appreciate your being clear on this. Honestly, even if no one were using it, I wouldn't mind doing this just for fun and to clean up my application :-)
I am not 100% sure what you mean by "better than wrapping the sync surface behind async". But if you mean not to import helpers etc. from the sync portion (Client,Batch,Query/Iterator,..) into the async one, we are on the same page. I do get the idea of factoring out the common portions (e.g., creating protobufs in the correct way for each operation), and using them from both the sync and (to be created) async clients. |
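The factoring idea in the comment above can be sketched roughly as follows. Everything here is hypothetical and for illustration only: `LookupRequest` stands in for a real protobuf message, and `SyncClient`, `AsyncClient` and the two transports are invented names, not classes from the datastore library. The only point is that request construction lives in one shared helper while each client owns its own transport call.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class LookupRequest:
    """Hypothetical request message; stands in for a real protobuf."""
    project: str
    keys: tuple


def build_lookup_request(project, keys):
    # Shared, transport-agnostic logic: validate and normalise once.
    if not keys:
        raise ValueError("at least one key is required")
    return LookupRequest(project=project, keys=tuple(keys))


class SyncClient:
    def __init__(self, project, transport):
        self.project = project
        self._transport = transport  # a plain callable

    def get_multi(self, keys):
        return self._transport(build_lookup_request(self.project, keys))


class AsyncClient:
    def __init__(self, project, transport):
        self.project = project
        self._transport = transport  # a coroutine function

    async def get_multi(self, keys):
        return await self._transport(build_lookup_request(self.project, keys))


DATA = {"k1": "v1", "k2": "v2"}


def sync_transport(request):
    return [DATA.get(k) for k in request.keys]


async def async_transport(request):
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return [DATA.get(k) for k in request.keys]


sync_result = SyncClient("demo", sync_transport).get_multi(["k1", "k2"])
async_result = asyncio.run(
    AsyncClient("demo", async_transport).get_multi(["k1", "k2"])
)
```

Neither client imports the other; both depend only on the shared helper, which is the separation described in the comment.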
I think we are on the same page. I was trying to say that there is a gain to using gRPC's async support instead of wrapping the sync variety in an async wrapper. |
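A small standard-library sketch of the distinction drawn above, under stated assumptions: `blocking_lookup` and `native_lookup` are hypothetical stand-ins for a synchronous and an asynchronous RPC, with `time.sleep` and `asyncio.sleep` simulating the network wait. No gRPC or datastore code is involved.

```python
import asyncio
import threading
import time


def blocking_lookup(key):
    # Stand-in for a synchronous RPC: blocks whichever thread runs it.
    time.sleep(0.05)
    return key, threading.get_ident()


async def wrapped_lookup(key):
    # Async wrapper over sync: the blocking call is pushed to a worker
    # thread, so every in-flight call ties up one thread from the pool.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_lookup, key)


async def native_lookup(key):
    # Native async: the wait is cooperative and stays on the loop thread.
    await asyncio.sleep(0.05)
    return key, threading.get_ident()


async def main():
    loop_thread = threading.get_ident()
    keys = ["a", "b", "c"]
    wrapped = await asyncio.gather(*(wrapped_lookup(k) for k in keys))
    native = await asyncio.gather(*(native_lookup(k) for k in keys))
    return loop_thread, wrapped, native


loop_thread, wrapped, native = asyncio.run(main())
```

Both variants return the same keys, but the wrapped calls run on executor threads while the native ones never leave the event loop thread. With a wrapper, concurrency is capped by the thread pool size; with native async it is not.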
Are there any updates about this feature? If you have plans to implement this, I am ready to help in my free time. I bet it's a much-wanted and important feature for all Python developers who use google cloud datastore in production. So maybe we can break this feature down into tasks and start the development process with people (like me) who are interested in it |
I'd like to leave a plus one here. Given that Firestore in Datastore mode is still a useful and recommended setup for some use cases, it seems less than ideal to have the Datastore client be less functional. Are there any plans to start on this work? |
Couldn't wait any longer. Tried aiogcd. Works well! |
We released async support in v2.4.0 but forgot to come back and close this issue: https://github.com/googleapis/python-datastore/releases/tag/v2.4.0 Apologies for the delay (but based on the lack of comments on this issue since late 2021, I think folks figured it out, ha). Thanks! |
@meredithslota that release notes the new async |
It would be awesome if the datastore library supported async/await style operations, e.g.:

Something similar was previously suggested in googleapis/google-cloud-python#40. That was folded into googleapis/google-cloud-python#557, which isn't really the same thing, especially since ndb does not support async/await, with no plans to add this support (see googleapis/python-ndb#289).