Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(mango): add allow_fallback to control falling back to other indexes on _find #4792

Merged
merged 3 commits into from
Oct 18, 2024

Conversation

pgj
Copy link
Contributor

@pgj pgj commented Oct 4, 2023

This is an alternative to #4710 based on the results of the related discussion.

It is not always beneficial for the performance if the Mango query planner tries to assign an index to the selector. User-specified indexes may save the day, but since they are only hints for the planner, fallbacks may still happen.

Introduce the allow_fallback flag which can be used to tell if falling back to other indexes is acceptable when an index is explicitly specified by the user or when only the "catch-all" all_docs index could be picked. When set to false, give up on planning and return an HTTP 400 response right away. This way the user has the chance to learn about the requested but missing index, optionally create it and try again.

By default, fallbacks are allowed to maintain backwards compatibility. It is possible to set allow_fallback to true but currently it coincides with the default behavior hence becomes a no-op in practice.

Fixes #4511

Testing recommendations

The changes could be verified by the following commands:

make eunit apps=mango
make mango-test

This is a non-breaking change, so there should be no observable difference in the behavior for the users. When allow_fallback is set to false and use_index is set to some index, it may return HTTP 400 depending on the suitability of the given index.

Checklist

  • Code is written and works correctly
  • Changes are covered by tests
  • Documentation changes were made in the src/docs folder

@pgj pgj mentioned this pull request Oct 4, 2023
4 tasks
@pgj pgj force-pushed the feat/mango/find/use_index_allow_fallback branch from fcb727c to f13501e Compare October 4, 2023 20:48
@pgj pgj force-pushed the feat/mango/find/use_index_allow_fallback branch from f13501e to 449edf0 Compare October 5, 2023 15:10
Copy link
Contributor

@nickva nickva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Nicely done, @pgj!

Perhaps we could get a few additional approvals from anyone involved in the original discussion #4710

@pgj
Copy link
Contributor Author

pgj commented Oct 5, 2023

Yes, I agree. @rnewson, @willholley, @ricellis please comment if you have the time.

@ricellis
Copy link

ricellis commented Oct 5, 2023

I was under the impression that allow_fallback was going to apply to all _find requests not just those with use_index. I know @willholley disagreed with that overloading, but I thought that was only related to the cluster toggles. I didn't think there was an objection to disallowing all_docs fallback as long as applied consistently.

@pgj
Copy link
Contributor Author

pgj commented Oct 9, 2023

I do not want to respond on behalf of @willholley but in my understanding, in his respective comment he opposed to extending the allow_fallback mechanism to _all_docs: "I'm less keen on overloading it to removing the use of the _all_docs index". He also included a historical reference to suggest that it would not give us much: "Mango did originally prevent the _all_docs fallback unless the query contained a clause that would require it. [..] lots of users just blindly added a { _id: { "$gt": 0 } } clause to [..] work around the restriction."

@rnewson
Copy link
Member

rnewson commented Oct 11, 2023

if allow_fallback: false will allow fallback, then it will not be merged.

@pgj
Copy link
Contributor Author

pgj commented Oct 11, 2023

@rnewson do you mean that we shall extend this change to disabling the _all_docs fallback as well?

@rnewson
Copy link
Member

rnewson commented Oct 11, 2023

Yes. Remember this an opt-in in the request body by the user. They are asking us to not fallback; they want an error instead of inefficient execution. I do not understand @willholley's objection to this.

@willholley
Copy link
Member

I don't think this issue / PR has anything to do with query efficiency - only fixing the unclear semantics of use_index.

I'm open to a discussion about if we want to / how to prevent use of inefficient indexes or, but that's much broader problem space in which I don't see _all_docs as any different to user-defined indexes.

@pgj pgj force-pushed the feat/mango/find/use_index_allow_fallback branch from 449edf0 to 322a278 Compare September 30, 2024 12:41
@pgj pgj changed the title feat(mango): add allow_fallback for user-specified indexes on _find feat(mango): add allow_fallback to control falling back to other indexes on _find Oct 8, 2024
@rnewson
Copy link
Member

rnewson commented Oct 8, 2024

Hi,

_find was briefly discussed in the last couchdb meeting.

I've personally gone off the idea of yet more flags inside the /dbname/_find request body that intend to coerce it to make fewer bad decisions, I would not like to see it happen without a new developer vote.

I see a fork in the road for /dbname/_find. It takes one of two forks as I see it;

  1. CouchDB's indexing will get richer, or mango will get smarter (capable of using multiple indexes for a query, using information from one to more efficiently skip through another). In other words, to become what it looks like it is on the outside, but is not on the inside.

  2. It continues to be just a naive selector between existing indexes, albeit indexes with the very significant benefit of not requiring a javascript callout.

If we take the first path, then neither the new allow_fallback parameter or the existing use_index parameter would be necessary, we'd gladly deprecate them.

If we take the second path, then we're making a bad idea worse.

I would prefer to add _find at the same url places where index-specific queries were performed prior to Mango. i.e, /dbname/_design/ddocname/_view/viewname and /dbname/_design/ddocname/_search/indexname. Requests to those endpoints will take a json request body, treat it as a selector, and either reject with a 400 Bad Request if the index is unsuitable, or proceed with the query against the index in the URL. use_index in that request body would generate a 400 Bad Request.

@pgj
Copy link
Contributor Author

pgj commented Oct 8, 2024

I was under the impression that allow_fallback is a reasonable compromise to treat these problems, especially it was the one you proposed and there seem to be an agreement around it. That said, I do not agree that it would make things worse. I really like that it goes well with the previously established principles of the top-level _find endpoint.

I believe that _find shall always be declarative in nature and as such, making it work like _search makes less sense. If users need something that Search can do, they can always choose to use that instead.

Now we have a tool at hand at least that can save users from running badly planned queries, which could be useful even after the enhancements you suggested in the first point above have been implemented. We could always deprecate it later, once we have a better view at which one of the branches has been eventually taken or it is deemed to be unnecessary.

Copy link
Member

@garrensmith garrensmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice improvement that will improve the mango user experience.

@jaydoane
Copy link
Contributor

This is a nice improvement that will improve the mango user experience.

👋 @garrensmith!

@nickva
Copy link
Contributor

nickva commented Oct 16, 2024

I liked @rnewson's idea of a very simple/predictable view that uses the selector syntax instead of javascript so tried this: #5310. It's even simpler than proposed, as it just uses the _design/.../_view/... pattern that we know (and love?).

As for this particular PR, I think it's fine. We already got _find with it's good/bad parts, so adding more control over index selection seem reasonable.

As for the future of _find, it's hard to say. Will we get a super-duper query planner like Postgres? Probably not any time soon. But I could see adding incremental improvements such as being able to merge two or more indexes to get the $or working properly (we already merge index results in fabric anyway!), so wouldn't like to preemptively neuter _find just yet.

Anyway just my $0.02.

Also 👋 to @garrensmith!

It is not always beneficial for the performance if the Mango query
planner tries to assign an index to the selector.  User-specified
indexes may save the day, but since they are only hints for the
planner, fallbacks may still happen.

Introduce the `allow_fallback` flag which can be used to tell if
falling back to other indexes is acceptable when an index is
explicitly specified by the user.  When set to `false`, give up
on planning and return an HTTP 400 response right away.  This way
the user has the chance to learn about the requested but missing
index, optionally create it and try again.

By default, fallbacks are allowed to maintain backwards
compatibility.  It is possible to set `allow_fallback` to `true`
but currently it coincides with the default behavior hence becomes
a no-op in practice.

Fixes apache#4511
The `binary:split/2` function will split the input only at the
first occurrence of the pattern which is not suitable for
identifying each part of the index name, therefore validating it.
Add the `global` option to fully break the index name into parts
and let the related validation logic work as expected.
@pgj pgj force-pushed the feat/mango/find/use_index_allow_fallback branch from 9c33dc4 to c92c389 Compare October 18, 2024 13:53
@rnewson
Copy link
Member

rnewson commented Oct 18, 2024

@pgj had a chat just now to go over our differences. I'm happy to see allow_fallback:true|false merged, and will do when CI finishes successfully.

I do want us to follow up with either the /db/_design/foo/_find/bar or /db/_design/foo/_view/bar that takes a selector, as I think those are easier to advocate for, and to see in logs and client libraries, to help users migrate to a safer place where they can't accidentally fall into seriously suboptimal query performance.

I also think allow_fallback should default to false in a future CouchDB major release.

@rnewson rnewson merged commit 4388283 into apache:main Oct 18, 2024
24 checks passed
@pgj pgj deleted the feat/mango/find/use_index_allow_fallback branch October 18, 2024 15:28
@pgj
Copy link
Contributor Author

pgj commented Oct 18, 2024

Thank everyone who participated in the discussion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Mango queries should not use other index when use_index is specified
7 participants