Commit

make dev
johnkerl committed Aug 26, 2023
1 parent 44e3a62 commit fb3e3d1
Showing 5 changed files with 65 additions and 49 deletions.
28 changes: 16 additions & 12 deletions docs/src/manpage.md
@@ -2652,7 +2652,7 @@ MILLER(1) MILLER(1)
mean([4,5,7,10]) is 6.5

meaneb
(class=stats #args=1) Returns the error bar for arithmetic mean of values in an array or map, assuming the values are independent and identically distributed. Returns empty string AKA void for empty array/map; returns error for non-array/non-map types.
(class=stats #args=1) Returns the error bar for arithmetic mean of values in an array or map, assuming the values are independent and identically distributed. Returns empty string AKA void for array/map of length less than two; returns error for non-array/non-map types.
Example:
meaneb([4,5,7,10]) is 1.3228756

@@ -2731,8 +2731,7 @@ MILLER(1) MILLER(1)
(class=stats #args=2,3) Returns the given percentiles of values in an array or map. Returns empty string AKA void for empty array/map; returns error for non-array/non-map types. See examples for information on the three option flags.
Examples:

Defaults are to not interpolate linearly, to produce a map keyed by percentile name, and to sort
the input before computing percentiles:
Defaults are to not interpolate linearly, to produce a map keyed by percentile name, and to sort the input before computing percentiles:

percentiles([3,4,5,6,9,10], [25,75]) is { "25": 4, "75": 9 }
percentiles(["abc", "def", "ghi", "ghi"], [25,75]) is { "25": "def", "75": "ghi" }
@@ -2741,36 +2740,41 @@ MILLER(1) MILLER(1)

percentiles([3,4,5,6,9,10], [25,75], {"output_array_not_map":true}) is [4, 9]

Use "interpolate_linearly" (or shorthand "il") to do linear interpolation -- note this produces
,error on string inputs:
Use "interpolate_linearly" (or shorthand "il") to do linear interpolation -- note this produces error values on string inputs:

percentiles([3,4,5,6,9,10], [25,75], {"interpolate_linearly":true}) is { "25": 4.25, "75": 8.25 }

The percentiles function always sorts its inputs before computing percentiles. If you know your input
is already sorted -- see also the sort_collection function -- then computation will be faster on
large input if you pass in "array_is_sorted":
The percentiles function always sorts its inputs before computing percentiles. If you know your input is already sorted -- see also the sort_collection function -- then computation will be faster on large input if you pass in "array_is_sorted" (shorthand: "ais"):

x = [6,5,9,10,4,3]
percentiles(x, [25,75], {"array_is_sorted":true}) gives { "25": 5, "75": 4 } which is incorrect
percentiles(x, [25,75], {"ais":true}) gives { "25": 5, "75": 4 } which is incorrect
x = sort_collection(x)
percentiles(x, [25,75], {"array_is_sorted":true}) gives { "25": 4, "75": 9 } which is correct
percentiles(x, [25,75], {"ais":true}) gives { "25": 4, "75": 9 } which is correct

You can also leverage this feature to compute percentiles on a sort of your choosing. For example:

Non-sorted input:

x = splitax("the quick brown fox jumped loquaciously over the lazy dogs", " ")
x is: ["the", "quick", "brown", "fox", "jumped", "loquaciously", "over", "the", "lazy", "dogs"]
Percentiles are taken over the original positions of the words in the array -- "dogs" is last
and hence appears as p99:

Percentiles are taken over the original positions of the words in the array -- "dogs" is last and hence appears as p99:

percentiles(x, [50, 99], {"oa":true, "ais":true}) gives ["loquaciously", "dogs"]

With sorting done inside percentiles, "the" is alphabetically last and is therefore the p99:

percentiles(x, [50, 99], {"oa":true}) gives ["loquaciously", "the"]

With default sorting done outside percentiles, the same:

x = sort(x) # or x = sort_collection(x)
x is: ["brown", "dogs", "fox", "jumped", "lazy", "loquaciously", "over", "quick", "the", "the"]
percentiles(x, [50, 99], {"oa":true, "ais":true}) gives ["loquaciously", "the"]
percentiles(x, [50, 99], {"oa":true}) gives ["loquaciously", "the"]

Now sorting by word length, "loquaciously" is longest and hence is the p99:

x = sort(x, func(a,b) { return strlen(a) <=> strlen(b) } )
x is: ["fox", "the", "the", "dogs", "lazy", "over", "brown", "quick", "jumped", "loquaciously"]
percentiles(x, [50, 99], {"oa":true, "ais":true})
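
The option flags documented in this hunk can be mimicked with a short Python sketch. This is not Miller's implementation: the index rules (truncating `p/100 * n` for the default mode, interpolating at `p/100 * (n-1)` for the linear mode) are assumptions inferred from the documented example values, and the keyword arguments simply borrow the option names from the help text.

```python
def percentiles(values, ps, interpolate_linearly=False,
                output_array_not_map=False, array_is_sorted=False):
    """Illustrative stand-in for the documented percentiles behavior."""
    if not values:
        return ""  # the help text documents void for empty input

    # Sort unless the caller promises the input is already in the desired order.
    data = list(values) if array_is_sorted else sorted(values)
    n = len(data)

    def pick(p):
        if interpolate_linearly:
            # Assumed rule: fractional position in [0, n-1], blend the neighbors.
            findex = max((p / 100.0) * (n - 1), 0.0)
            i = int(findex)
            if i + 1 >= n:
                return data[n - 1]
            frac = findex - i
            return data[i] + frac * (data[i + 1] - data[i])
        # Assumed rule: truncate p/100 * n and clamp to a valid index.
        i = int((p / 100.0) * n)
        return data[min(max(i, 0), n - 1)]

    results = [pick(p) for p in ps]
    if output_array_not_map:
        return results
    return {str(p): r for p, r in zip(ps, results)}


words = "the quick brown fox jumped loquaciously over the lazy dogs".split(" ")

print(percentiles([3, 4, 5, 6, 9, 10], [25, 75]))
print(percentiles([3, 4, 5, 6, 9, 10], [25, 75], interpolate_linearly=True))
# Original word order is kept, so the last word is the p99:
print(percentiles(words, [50, 99], output_array_not_map=True, array_is_sorted=True))
# Pre-sorting by word length makes the longest word the p99:
by_length = sorted(sorted(words), key=len)
print(percentiles(by_length, [50, 99], output_array_not_map=True, array_is_sorted=True))
```

Passing `array_is_sorted=True` on unsorted input reproduces the "incorrect" result shown in the help text, which is the point of the flag: the caller, not the function, is responsible for the ordering.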
28 changes: 16 additions & 12 deletions docs/src/manpage.txt
@@ -2631,7 +2631,7 @@ MILLER(1) MILLER(1)
mean([4,5,7,10]) is 6.5

meaneb
(class=stats #args=1) Returns the error bar for arithmetic mean of values in an array or map, assuming the values are independent and identically distributed. Returns empty string AKA void for empty array/map; returns error for non-array/non-map types.
(class=stats #args=1) Returns the error bar for arithmetic mean of values in an array or map, assuming the values are independent and identically distributed. Returns empty string AKA void for array/map of length less than two; returns error for non-array/non-map types.
Example:
meaneb([4,5,7,10]) is 1.3228756

@@ -2710,8 +2710,7 @@ MILLER(1) MILLER(1)
(class=stats #args=2,3) Returns the given percentiles of values in an array or map. Returns empty string AKA void for empty array/map; returns error for non-array/non-map types. See examples for information on the three option flags.
Examples:

Defaults are to not interpolate linearly, to produce a map keyed by percentile name, and to sort
the input before computing percentiles:
Defaults are to not interpolate linearly, to produce a map keyed by percentile name, and to sort the input before computing percentiles:

percentiles([3,4,5,6,9,10], [25,75]) is { "25": 4, "75": 9 }
percentiles(["abc", "def", "ghi", "ghi"], [25,75]) is { "25": "def", "75": "ghi" }
@@ -2720,36 +2719,41 @@ MILLER(1) MILLER(1)

percentiles([3,4,5,6,9,10], [25,75], {"output_array_not_map":true}) is [4, 9]

Use "interpolate_linearly" (or shorthand "il") to do linear interpolation -- note this produces
,error on string inputs:
Use "interpolate_linearly" (or shorthand "il") to do linear interpolation -- note this produces error values on string inputs:

percentiles([3,4,5,6,9,10], [25,75], {"interpolate_linearly":true}) is { "25": 4.25, "75": 8.25 }

The percentiles function always sorts its inputs before computing percentiles. If you know your input
is already sorted -- see also the sort_collection function -- then computation will be faster on
large input if you pass in "array_is_sorted":
The percentiles function always sorts its inputs before computing percentiles. If you know your input is already sorted -- see also the sort_collection function -- then computation will be faster on large input if you pass in "array_is_sorted" (shorthand: "ais"):

x = [6,5,9,10,4,3]
percentiles(x, [25,75], {"array_is_sorted":true}) gives { "25": 5, "75": 4 } which is incorrect
percentiles(x, [25,75], {"ais":true}) gives { "25": 5, "75": 4 } which is incorrect
x = sort_collection(x)
percentiles(x, [25,75], {"array_is_sorted":true}) gives { "25": 4, "75": 9 } which is correct
percentiles(x, [25,75], {"ais":true}) gives { "25": 4, "75": 9 } which is correct

You can also leverage this feature to compute percentiles on a sort of your choosing. For example:

Non-sorted input:

x = splitax("the quick brown fox jumped loquaciously over the lazy dogs", " ")
x is: ["the", "quick", "brown", "fox", "jumped", "loquaciously", "over", "the", "lazy", "dogs"]
Percentiles are taken over the original positions of the words in the array -- "dogs" is last
and hence appears as p99:

Percentiles are taken over the original positions of the words in the array -- "dogs" is last and hence appears as p99:

percentiles(x, [50, 99], {"oa":true, "ais":true}) gives ["loquaciously", "dogs"]

With sorting done inside percentiles, "the" is alphabetically last and is therefore the p99:

percentiles(x, [50, 99], {"oa":true}) gives ["loquaciously", "the"]

With default sorting done outside percentiles, the same:

x = sort(x) # or x = sort_collection(x)
x is: ["brown", "dogs", "fox", "jumped", "lazy", "loquaciously", "over", "quick", "the", "the"]
percentiles(x, [50, 99], {"oa":true, "ais":true}) gives ["loquaciously", "the"]
percentiles(x, [50, 99], {"oa":true}) gives ["loquaciously", "the"]

Now sorting by word length, "loquaciously" is longest and hence is the p99:

x = sort(x, func(a,b) { return strlen(a) <=> strlen(b) } )
x is: ["fox", "the", "the", "dogs", "lazy", "over", "brown", "quick", "jumped", "loquaciously"]
percentiles(x, [50, 99], {"oa":true, "ais":true})
2 changes: 1 addition & 1 deletion docs/src/reference-dsl-builtin-functions.md
@@ -1030,7 +1030,7 @@ mean([4,5,7,10]) is 6.5

### meaneb
<pre class="pre-non-highlight-non-pair">
meaneb (class=stats #args=1) Returns the error bar for arithmetic mean of values in an array or map, assuming the values are independent and identically distributed. Returns empty string AKA void for empty array/map; returns error for non-array/non-map types.
meaneb (class=stats #args=1) Returns the error bar for arithmetic mean of values in an array or map, assuming the values are independent and identically distributed. Returns empty string AKA void for array/map of length less than two; returns error for non-array/non-map types.
Example:
meaneb([4,5,7,10]) is 1.3228756
</pre>
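
The documented example value is consistent with the usual standard error of the mean, the square root of the sample variance (with an n-1 denominator) divided by n, which would also explain why fewer than two values yields void. A minimal Python sketch under that assumption follows. The formula is inferred from the example, not taken from Miller's source, and the Python `meaneb` helper is illustrative only.

```python
import math


def meaneb(values):
    """Standard error of the mean: sqrt(sample_variance / n)."""
    n = len(values)
    if n < 2:
        # Mirrors the documented void (empty string) result: the n-1
        # denominator below is undefined for fewer than two values.
        return ""
    mean = sum(values) / n
    sample_variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return math.sqrt(sample_variance / n)


print(meaneb([4, 5, 7, 10]))  # approximately 1.32288, matching the documented example
print(repr(meaneb([7])))      # empty string for a single value
```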
28 changes: 16 additions & 12 deletions man/manpage.txt
@@ -2631,7 +2631,7 @@ MILLER(1) MILLER(1)
mean([4,5,7,10]) is 6.5

meaneb
(class=stats #args=1) Returns the error bar for arithmetic mean of values in an array or map, assuming the values are independent and identically distributed. Returns empty string AKA void for empty array/map; returns error for non-array/non-map types.
(class=stats #args=1) Returns the error bar for arithmetic mean of values in an array or map, assuming the values are independent and identically distributed. Returns empty string AKA void for array/map of length less than two; returns error for non-array/non-map types.
Example:
meaneb([4,5,7,10]) is 1.3228756

@@ -2710,8 +2710,7 @@ MILLER(1) MILLER(1)
(class=stats #args=2,3) Returns the given percentiles of values in an array or map. Returns empty string AKA void for empty array/map; returns error for non-array/non-map types. See examples for information on the three option flags.
Examples:

Defaults are to not interpolate linearly, to produce a map keyed by percentile name, and to sort
the input before computing percentiles:
Defaults are to not interpolate linearly, to produce a map keyed by percentile name, and to sort the input before computing percentiles:

percentiles([3,4,5,6,9,10], [25,75]) is { "25": 4, "75": 9 }
percentiles(["abc", "def", "ghi", "ghi"], [25,75]) is { "25": "def", "75": "ghi" }
@@ -2720,36 +2719,41 @@ MILLER(1) MILLER(1)

percentiles([3,4,5,6,9,10], [25,75], {"output_array_not_map":true}) is [4, 9]

Use "interpolate_linearly" (or shorthand "il") to do linear interpolation -- note this produces
,error on string inputs:
Use "interpolate_linearly" (or shorthand "il") to do linear interpolation -- note this produces error values on string inputs:

percentiles([3,4,5,6,9,10], [25,75], {"interpolate_linearly":true}) is { "25": 4.25, "75": 8.25 }

The percentiles function always sorts its inputs before computing percentiles. If you know your input
is already sorted -- see also the sort_collection function -- then computation will be faster on
large input if you pass in "array_is_sorted":
The percentiles function always sorts its inputs before computing percentiles. If you know your input is already sorted -- see also the sort_collection function -- then computation will be faster on large input if you pass in "array_is_sorted" (shorthand: "ais"):

x = [6,5,9,10,4,3]
percentiles(x, [25,75], {"array_is_sorted":true}) gives { "25": 5, "75": 4 } which is incorrect
percentiles(x, [25,75], {"ais":true}) gives { "25": 5, "75": 4 } which is incorrect
x = sort_collection(x)
percentiles(x, [25,75], {"array_is_sorted":true}) gives { "25": 4, "75": 9 } which is correct
percentiles(x, [25,75], {"ais":true}) gives { "25": 4, "75": 9 } which is correct

You can also leverage this feature to compute percentiles on a sort of your choosing. For example:

Non-sorted input:

x = splitax("the quick brown fox jumped loquaciously over the lazy dogs", " ")
x is: ["the", "quick", "brown", "fox", "jumped", "loquaciously", "over", "the", "lazy", "dogs"]
Percentiles are taken over the original positions of the words in the array -- "dogs" is last
and hence appears as p99:

Percentiles are taken over the original positions of the words in the array -- "dogs" is last and hence appears as p99:

percentiles(x, [50, 99], {"oa":true, "ais":true}) gives ["loquaciously", "dogs"]

With sorting done inside percentiles, "the" is alphabetically last and is therefore the p99:

percentiles(x, [50, 99], {"oa":true}) gives ["loquaciously", "the"]

With default sorting done outside percentiles, the same:

x = sort(x) # or x = sort_collection(x)
x is: ["brown", "dogs", "fox", "jumped", "lazy", "loquaciously", "over", "quick", "the", "the"]
percentiles(x, [50, 99], {"oa":true, "ais":true}) gives ["loquaciously", "the"]
percentiles(x, [50, 99], {"oa":true}) gives ["loquaciously", "the"]

Now sorting by word length, "loquaciously" is longest and hence is the p99:

x = sort(x, func(a,b) { return strlen(a) <=> strlen(b) } )
x is: ["fox", "the", "the", "dogs", "lazy", "over", "brown", "quick", "jumped", "loquaciously"]
percentiles(x, [50, 99], {"oa":true, "ais":true})
28 changes: 16 additions & 12 deletions man/mlr.1
@@ -3962,7 +3962,7 @@ mean([4,5,7,10]) is 6.5
.RS 0
.\}
.nf
(class=stats #args=1) Returns the error bar for arithmetic mean of values in an array or map, assuming the values are independent and identically distributed. Returns empty string AKA void for empty array/map; returns error for non-array/non-map types.
(class=stats #args=1) Returns the error bar for arithmetic mean of values in an array or map, assuming the values are independent and identically distributed. Returns empty string AKA void for array/map of length less than two; returns error for non-array/non-map types.
Example:
meaneb([4,5,7,10]) is 1.3228756
.fi
@@ -4131,8 +4131,7 @@ percentile(["abc", "def", "ghi", "ghi"], 90) is "ghi"
(class=stats #args=2,3) Returns the given percentiles of values in an array or map. Returns empty string AKA void for empty array/map; returns error for non-array/non-map types. See examples for information on the three option flags.
Examples:

Defaults are to not interpolate linearly, to produce a map keyed by percentile name, and to sort
the input before computing percentiles:
Defaults are to not interpolate linearly, to produce a map keyed by percentile name, and to sort the input before computing percentiles:

percentiles([3,4,5,6,9,10], [25,75]) is { "25": 4, "75": 9 }
percentiles(["abc", "def", "ghi", "ghi"], [25,75]) is { "25": "def", "75": "ghi" }
@@ -4141,36 +4140,41 @@ Use "output_array_not_map" (or shorthand "oa") to get the outputs as an array:

percentiles([3,4,5,6,9,10], [25,75], {"output_array_not_map":true}) is [4, 9]

Use "interpolate_linearly" (or shorthand "il") to do linear interpolation -- note this produces
,error on string inputs:
Use "interpolate_linearly" (or shorthand "il") to do linear interpolation -- note this produces error values on string inputs:

percentiles([3,4,5,6,9,10], [25,75], {"interpolate_linearly":true}) is { "25": 4.25, "75": 8.25 }

The percentiles function always sorts its inputs before computing percentiles. If you know your input
is already sorted -- see also the sort_collection function -- then computation will be faster on
large input if you pass in "array_is_sorted":
The percentiles function always sorts its inputs before computing percentiles. If you know your input is already sorted -- see also the sort_collection function -- then computation will be faster on large input if you pass in "array_is_sorted" (shorthand: "ais"):

x = [6,5,9,10,4,3]
percentiles(x, [25,75], {"array_is_sorted":true}) gives { "25": 5, "75": 4 } which is incorrect
percentiles(x, [25,75], {"ais":true}) gives { "25": 5, "75": 4 } which is incorrect
x = sort_collection(x)
percentiles(x, [25,75], {"array_is_sorted":true}) gives { "25": 4, "75": 9 } which is correct
percentiles(x, [25,75], {"ais":true}) gives { "25": 4, "75": 9 } which is correct

You can also leverage this feature to compute percentiles on a sort of your choosing. For example:

Non-sorted input:

x = splitax("the quick brown fox jumped loquaciously over the lazy dogs", " ")
x is: ["the", "quick", "brown", "fox", "jumped", "loquaciously", "over", "the", "lazy", "dogs"]
Percentiles are taken over the original positions of the words in the array -- "dogs" is last
and hence appears as p99:

Percentiles are taken over the original positions of the words in the array -- "dogs" is last and hence appears as p99:

percentiles(x, [50, 99], {"oa":true, "ais":true}) gives ["loquaciously", "dogs"]

With sorting done inside percentiles, "the" is alphabetically last and is therefore the p99:

percentiles(x, [50, 99], {"oa":true}) gives ["loquaciously", "the"]

With default sorting done outside percentiles, the same:

x = sort(x) # or x = sort_collection(x)
x is: ["brown", "dogs", "fox", "jumped", "lazy", "loquaciously", "over", "quick", "the", "the"]
percentiles(x, [50, 99], {"oa":true, "ais":true}) gives ["loquaciously", "the"]
percentiles(x, [50, 99], {"oa":true}) gives ["loquaciously", "the"]

Now sorting by word length, "loquaciously" is longest and hence is the p99:

x = sort(x, func(a,b) { return strlen(a) <=> strlen(b) } )
x is: ["fox", "the", "the", "dogs", "lazy", "over", "brown", "quick", "jumped", "loquaciously"]
percentiles(x, [50, 99], {"oa":true, "ais":true})