
Functional test - Add datastore query tests #281

Merged (5 commits) on Oct 27, 2014

Conversation

@dhermes (Contributor) commented Oct 22, 2014

This implements projection, offset and group_by on datastore.query.Query. I looked and didn't see any equivalent bugs filed for them.

As new Query features are outside the scope of functional / regression testing, I am going to peel them off into a separate PR and then rebase this one onto it after that is reviewed. However, please feel free to dig into this review.
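
For context, here is a rough sketch of the intended usage of those features. The method names and the clone-and-chain style mirror the companion PR and the existing limit()/filter() methods, so treat this purely as an illustration rather than the final API:

    # Hypothetical usage; exact signatures / behavior come from the companion PR.
    query = dataset.query('Character')
    query = query.projection(['name', 'alive'])  # fetch only these properties
    query = query.group_by(['alive'])            # one result per distinct 'alive' value
    query = query.offset(2)                      # skip the first two matches
    entities = query.fetch()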

Some other things to notice:

  • The value of more_results from storage.Connection.run_query should not be thrown away. #280 was filed as a result of this.

  • In gcloud-node, fetching on a query returns a new query with the cursor set. We set the cursor after a query is completed, but this is not start_cursor so a user is still required to create a new cursor via:

    results = query.fetch()
    new_query = query.with_cursor(query.cursor())

    Also note that in tests like test_limit_queries, #280 ("The value of more_results from storage.Connection.run_query should not be thrown away") is most relevant, and the gcloud-node implementation could also benefit from the more_results boolean returned from a connection.

  • Raising a RuntimeError on cursor() access feels wrong. In reality, we'd like access to both the start cursor and the end cursor; the one set on Query isn't one we really want set there, but rather one that should be returned as part of the fetch results.

  • Running these tests very often (e.g. 3 times a minute) resulted in very flaky results for the filter tests. I tried to address this by doing bulk adds / deletes in a transaction, but it was still flaky. /cc @pcostell

  • The gcloud-node implementation seems to allow Dataset.save to take a list of entities and save them in a transaction. This may be worth adding.

  • We haven't implemented keys_only queries though I'm unsure if the Cloud Datastore API supports this.

  • Getting just the dict from two Entity objects for comparison should be easier. I suppose we could implement __eq__ here (see the sketch after this list).

  • The code in test_query_paginate_with_offset is much more convoluted than the gcloud-node equivalent. This is mostly because runQuery does a lot under the covers in node land.

  • I am unclear why the group_by query only returns two entities. Maybe someone can fill me in?
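
As a rough illustration of the Entity comparison point above, a minimal sketch assuming Entity keeps subclassing dict (the helper names are hypothetical, not existing code):

    def entity_to_dict(entity):
        # Entity subclasses dict, so this strips it down to just the stored
        # properties, ignoring the key / dataset bookkeeping.
        return dict(entity)

    def entities_equal(entity_a, entity_b):
        # An Entity.__eq__ could do much the same, probably also comparing keys.
        return entity_to_dict(entity_a) == entity_to_dict(entity_b)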

@dhermes (Contributor, Author) commented Oct 22, 2014

@silvolu Seems like a transient failure but seeing the same issue on #282

@tseaver added the "api: datastore" label on Oct 22, 2014
@dhermes (Contributor, Author) commented Oct 22, 2014

@tseaver Rebased this PR on top of #282 to make the relationship more clear.

@pcostell (Contributor) commented:

Running these tests very often (i.e. 3 times a minute) resulted in very flaky results for the filter tests. I tried to address this by doing bulk adds / deletes in a transaction but it still was flaky. /cc @pcostell
The gcloud-node implementation seems to allow Dataset.save to take a list of entities and save them in a transaction. This may be worth adding.

The issue is that queries are eventually consistent. We've found that for testing queries it is easiest to just put all of the test data into a single entity group. Then you can run an ancestor query on them.

We haven't implemented keys_only queries though I'm unsure if the Cloud Datastore API supports this.

Datastore just exposes this as a projection of __key__. I think this is probably the better way to expose it as it will allow users to project both __key__ and other properties.
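
With the projection support being added on the Python side, a keys-only query would then presumably be expressed as a projection on the special __key__ property, e.g. (hypothetical usage, not confirmed against the final API):

    query = dataset.query('Character')
    keys_only = query.projection(['__key__'])  # project only the key
    results = keys_only.fetch()                # entities with keys but no other properties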

I am unclear why the group_by query only returns two entities. Maybe someone can fill me in?

GROUP BY is designed to only return one result per grouped value. So you'll get one result for alive = True and one for alive = False. Really, you should never be allowed to project more values than you've group_by'd (done effectively here as a SELECT *). This can be used to implement SELECT DISTINCT if your projection is equal to your group by. Then you'd expect to see two results, True and False. Over-projecting will in this case return a random (not really random, but based on sort order) entity in this list. We plan to support MIN/MAX in the backend so that you can say, for example,

SELECT alive, MAX(occurrences) FROM Character GROUP BY alive

This would let you see if a dead character or alive character appeared in more episodes. (Also, spoilers?!?!)
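
And a sketch of the SELECT DISTINCT case described above, again using the projection / group_by methods being added on the Python side (hypothetical usage, treat as a sketch):

    # Roughly "SELECT alive FROM Character GROUP BY alive", i.e. project
    # exactly the properties you group by.
    query = dataset.query('Character')
    distinct_alive = query.projection(['alive']).group_by(['alive'])
    results = distinct_alive.fetch()  # expect two results: alive=True and alive=False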

@pcostell (Contributor) commented:

The code in test_query_paginate_with_offset is much more convoluted than the gcloud-node equivalent. This is mostly because runQuery does a lot under the covers in node land.

I think actually it's better to let the user do more work here. I believe the normal use case for pagination is something like this:

  1. User goes to website

  2. We run:

    curr_cursor = request.get('next_page', None)
    query = makeQuery().limit(page_size)
    if curr_cursor:
        query.start_cursor(curr_cursor)
    results = query.fetch()
    next_page_link = url + "&next_page=" + query.cursor()
    prev_page_link = url + "&prev_page=" + curr_cursor
  3. Return list to user with prev and next page

In this case, there isn't much benefit to returning the next query to run to the user as they are going to need to reconstruct it themselves in a separate request.

@dhermes force-pushed the functional-test-part2 branch 2 times, most recently from ccec7be to f26dfc5 on October 22, 2014 21:54
@dhermes (Contributor, Author) commented Oct 22, 2014

@pcostell I was referring to paging with offsets instead of cursors, which is obviously frowned upon.

As for the ancestor query, this doesn't seem to be done in the node tests. Are those tests flaky?

Also, what is the recommended way to implement ancestor queries? It seems we'll need to do this before using them.

@tseaver (Contributor) commented Oct 22, 2014

Query objects support filtering by ancestor, e.g. to clear items from a parent:

from gcloud.datastore.key import Key
from gcloud.datastore.query import Query

# Restrict the query to descendants of the parent key, then delete each match.
query = Query(datastore, 'Item')
parent = Key.from_path(['Parent', 'parent_name'])
for item in query.ancestor(parent).fetch():
    item.delete()

@tseaver (Contributor) commented Oct 22, 2014

Travis failure:

lint runtests: commands[1] | python run_pylint.py
************* Module regression.datastore
E:199,20: Unexpected keyword argument 'dataset' in constructor call (unexpected-keyword-arg)
E:263,22: Unexpected keyword argument 'dataset' in constructor call (unexpected-keyword-arg)
Pylint failed on test and demo code with status 2.

'regression/datastore.py' needs to sync w/ #276 (keys no longer take a
'dataset' argument in 'init', 'from_path', or 'from_protobuf').

@dhermes force-pushed the functional-test-part2 branch 2 times, most recently from 6bdc25d to 15cd053 on October 23, 2014 01:21
@coveralls
Coverage remained the same when pulling 15cd053 on dhermes:functional-test-part2 into 2da8654 on GoogleCloudPlatform:master.

@coveralls
Coverage remained the same when pulling 621b0e9 on dhermes:functional-test-part2 into 2da8654 on GoogleCloudPlatform:master.

dhermes added a commit to dhermes/google-cloud-python that referenced this pull request Oct 23, 2014
@dhermes (Contributor, Author) commented Oct 23, 2014

@pcostell I reworked this code to use queries on data that had persisted for a long time instead of querying immediately after insert.

Running this twice in Travis, both filter queries (test_query_multiple_filters and test_query_simple_filter) failed:
https://travis-ci.org/dhermes/gcloud-python/builds/38797026#L290-L303
https://travis-ci.org/dhermes/gcloud-python/builds/38796835#L460-L473

Is there something I'm missing on the node side? They don't seem to have issues passing
https://travis-ci.org/GoogleCloudPlatform/gcloud-node/builds/38839607#L581-L582
https://travis-ci.org/GoogleCloudPlatform/gcloud-node/builds/38657797#L584-L585

It seems these empty query results (0 results where 4 filtered results were expected) are due to a "tablet that isn't hot", but my understanding of HRD is very likely outdated.

Also, FWIW I reworked to make them all ancestor queries under Eddard and they gave the same transient / flaky failures (on my local machine).

dhermes added a commit to dhermes/google-cloud-python that referenced this pull request Oct 23, 2014
@coveralls
Coverage remained the same when pulling 972a796 on dhermes:functional-test-part2 into 09859db on GoogleCloudPlatform:master.

@pcostell (Contributor) commented:

Interesting. You're almost certainly guaranteed to hit eventual consistency problems. I'm not sure why the node ones don't fail, perhaps they've just gotten lucky so far.

If you're issuing ancestor queries, you are guaranteed to get strong consistency. So I think the flakiness might be coming from somewhere else.

When you say you are running locally, do you mean issuing commands from your local computer to a production Cloud Datastore dataset? Or are you also running locally using gcd? If you use gcd you can set the consistency using --consistency=1.0 so that all queries are strongly consistent. You could use this to determine if your queries are flaky due to eventual consistency or some other reason.

@dhermes (Contributor, Author) commented Oct 23, 2014

Sorry, I meant running on my local machine (via tox -e regression with this PR) but targeting a live Cloud Datastore dataset.

I'll double-check the flakiness with ancestor queries and commit it / point you at the commit if I can reproduce it.

dhermes added a commit to dhermes/google-cloud-python that referenced this pull request Oct 23, 2014
Labels: api: datastore (Issues related to the Datastore API), testing