Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

matchtree: capture Stats before pruning #607

Merged
merged 1 commit into from
Jun 29, 2023
Merged

matchtree: capture Stats before pruning #607

merged 1 commit into from
Jun 29, 2023

Conversation

keegancsmith
Copy link
Member

We now call updateStats upto twice per shard search. The intention of this is to capture statistics before pruning the matchtree. Previously we would of done work in creating a matchtree but would then prune those items away and would then never capture those statistics.

In practice that work was reading just one or two varint (the size of a posting list) so likely had minimal impact on the reported statistics. However, in the next commit we want to introduce a statistic which is recorded even if we generate a noMatchTree.

The main technical part of this commit is ensuring all existing updateStats functions can be called twice without overcounting.

Test Plan: go test

Part of https://github.com/sourcegraph/sourcegraph/issues/54950

We now call updateStats upto twice per shard search. The intention of
this is to capture statistics before pruning the matchtree. Previously
we would of done work in creating a matchtree but would then prune those
items away and would then never capture those statistics.

In practice that work was reading just one or two varint (the size of a
posting list) so likely had minimal impact on the reported statistics.
However, in the next commit we want to introduce a statistic which is
recorded even if we generate a noMatchTree.

The main technical part of this commit is ensuring all existing
updateStats functions can be called twice without overcounting.

Test Plan: go test
@@ -150,6 +154,7 @@ func (i *ngramDocIterator) prepare(nextDoc uint32) {
func (i *ngramDocIterator) updateStats(s *Stats) {
i.iter.updateStats(s)
s.NgramMatches += i.matchCount
i.matchCount = 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch to reset this

Copy link
Contributor

@gl-srgr gl-srgr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm; I was curious about the other iterators not in this PR and it looks like this covers all of the relevant updateStats() implementations.

@keegancsmith keegancsmith merged commit 93f7b0c into main Jun 29, 2023
@keegancsmith keegancsmith deleted the k/ngram-stats branch June 29, 2023 10:17
Copy link
Member

@stefanhengl stefanhengl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants