Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Datastore Key's reference is missing location identifier in app argument in to_legacy_urlsafe() #3597

Closed
michaelenglo opened this issue Jul 10, 2017 · 9 comments · Fixed by #4635
Assignees
Labels
api: datastore Issues related to the Datastore API. type: question Request for information or clarification. Not an issue.

Comments

@michaelenglo
Copy link

@dhermes I put this as a new issue because I thought people might have a problem with this too. Please feel free to edit/remove this issue if it is not appropriate to put it here.

I was running pip install --upgrade google-cloud-datastore to get latest version that has the key.to_legacy_urlsafe() implemented in #3491. I tested the method and I found out that the urlsafe generated from the method is still not the same as datastore's urlsafe.

I have implemented my own function to generate the urlsafe key before the update was released and so far it worked properly for me (I can share my code if anybody wants). So I compared my function with the to_legacy_urlsafe and I found the problem:

reference = _app_engine_key_pb2.Reference(
        app=self.project,
        path=_to_legacy_path(self._path),  # Avoid the copy.
        name_space=self.namespace,
        )

In the app argument, the location identifier is missing unlike in my function where I concatenated the "s~" at the beginning of the string. I think this is the cause of the problem why it was not getting the urlsafe correct, although I don't understand why it needs the location identifier in front of the project name in the first place.

Do you think I should send a PR to fix this?

@dhermes dhermes added the api: datastore Issues related to the Datastore API. label Jul 10, 2017
@dhermes
Copy link
Contributor

dhermes commented Jul 10, 2017

@michaelenglo I'll let @jonparrott handle this. I voiced this concern when writing the original PR and he said we shouldn't worry about it.

I would suggest that you rely on from_legacy_urlsafe much more than to_legacy_urlsafe, i.e. these are really meant as a way to go FROM legacy TO "modern", not the other direction.

@dhermes
Copy link
Contributor

dhermes commented Jul 10, 2017

As for sending a PR, there is no "right" way to fix this. The app may begin with dev~, s~, e~ and possibly other strings I am unaware of. There is no way to determine which of these is appropriate based on the project string.

@theacodes
Copy link
Contributor

Because there's no "right" way to fix this, I think we should just document that the string returned by our to_legacy_urlsafe is equivalent, but not identical to the key returned by ndb.

@michaelenglo
Copy link
Author

I am a bit concerned about this because urlsafe play an important role in our app. Currently in the implementation of our Document Control app, every document file has its information (in datastore), and its blob (in cloud storage). The way we name each document is by its urlsafe key (since we thought it's always unique, we decided to use it instead of creating app-specific identifier). Now, If we cannot guarantee that both services in Std and Flex generate the same urlsafe key consistently, then there is a chance that in the future we can have corrupted files lost in the database.

Is using urlsafe as unique name for objects something that we should avoid? If yes, then what is the proper way for mapping entities to blob objects in cloud storage?

And I wonder why if NDB has a method to create a ursafe key that is identical from the one in Datastore, what prevents the Client Library from having one?

@dhermes
Copy link
Contributor

dhermes commented Jul 18, 2017

@michaelenglo You can get around this by just using the same identifier here that is using in App Engine standard. It's fairly like your application uses s~ as the project ID. You can deserialize a protobuf from a urlsafe value to find out.

@dhermes
Copy link
Contributor

dhermes commented Jul 18, 2017

And I wonder why if NDB has a method to create a ursafe key that is identical from the one in Datastore, what prevents the Client Library from having one?

ndb runs in App Engine, so it has access to the application ID according to App Engine, which contains the prefix (s~, e~, etc.). If you manually add the correct prefix, you'll get the same values (see above).

@lukesneeringer lukesneeringer added the type: question Request for information or clarification. Not an issue. label Aug 7, 2017
@lukesneeringer
Copy link
Contributor

Hey @michaelenglo,
I just wanted to follow up on this item. I think we are probably leaning toward leaving this as-is (except maybe adding some documentation), but I wanted to give you an opportunity to follow up on it first.

@dhermes
Copy link
Contributor

dhermes commented Aug 10, 2017

@michaelenglo And if you want truly identical behavior in your application, you should determine which prefix your app uses (it's almost certainly s~) and then

manually add the correct prefix

via datastore.Client(project='s~your-app-id')

@rashilaqburst
Copy link

Where can I get the location prefix of my app? Is there a way to programmatically find it? Or is there a list published anywhere?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: datastore Issues related to the Datastore API. type: question Request for information or clarification. Not an issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants