Skip to content

fix(nh-spec): update histogram_quantile and histogram_fraction docs #2676

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 12 additions & 6 deletions docs/specs/native_histograms.md
Original file line number Diff line number Diff line change
Expand Up @@ -1933,10 +1933,17 @@ the minimum observation will often be in the lowest bucket.
`histogram_quantile` treats observations of value `NaN` (which SHOULD NOT
happen, see [above](#special-cases-of-observed-values)) effectively as
observations of `+Inf`. This follows the rationale that `NaN` is never less
than any value that `histogram_quantile` returns and is consistent with how
classic histograms usually treat `NaN` observations (which end up in the `+Inf`
bucket in most implementations). (TODO: The correct implementation of this
behavior still needs to be verified by tests.)
than any value that `histogram_quantile` returns. As long as the quantile
falls into an existing bucket we return the quantile calculated as if `NaN`
observations were observed as `+Inf` and also issue an info level annotation
to let the user know that results are skewed due to `NaN`. This is consistent
with how classic histograms usually treat `NaN` observations (which end up in
the `+Inf` bucket in most implementations). If the quantile falls above all
existing buckets, we return `NaN` due to the fact that we want to be consistent
with `histogram_fraction`, since `histogram_fraction` considers `NaN`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe turn this comma into a semicolon or period? It's hard to dissect this sentence currently.

observations to be outside any interval, `NaN` observations cannot be
considered in the `+Inf` bucket. We also return an info level annotation
specific to this case.

The following functions have been introduced specifically for native
histograms:
Expand Down Expand Up @@ -1985,8 +1992,7 @@ aligned with the bucket boundaries in the histogram. `+Inf` and `-Inf` are
valid boundary values and useful to estimate the fraction of all observations
above or below a certain value. However, observations of value `NaN` are always
considered to be outside of the specified boundaries (even `+Inf` and `-Inf`).
(TODO: Verify the correct implementation of this behavior with tests.) Whether
the provided boundaries are inclusive or exclusive is only relevant if the
Whether the provided boundaries are inclusive or exclusive is only relevant if the
provided boundaries are precisely aligned with bucket boundaries in the
underlying native histogram. In this case, the behavior depends on the precise
definition of the schema of the histogram.
Expand Down