Import StatsBase into Statistics #2
Draft: nalimilan wants to merge 349 commits into StatsBase2021 from nl/weightedstats
…an AbstractArray (#5810)
still not clear what to do with floating point in StepRange
It turns out there are many cases that are drastically easier this way. Tests not passing yet, but getting there.
And add a simple test for hist()
Also update test suite to follow suit
Also split out parseint loops in strings.jl to better separate out exception types
Also some minor reorganization
- Implement sum(fn, A, region)
- Implement Base.sumabs across dimensions
- Use pairwise summation/BLAS for sums across first dimension
Also fix a bounds violation when computing the variance of a vector with a single element
* relax type definition of middle: this adds support for computing the median of unitful types, see PainterQubits/Unitful.jl#202
* update docs and add tests for middle on non-reals
Since `cor` takes a square root, I think it is safe to assume that the result should be floating point. An example of the current surprising behavior:
```
julia> cor([im])
true
```
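A minimal sketch of where a `true` result can come from. It only checks type-level facts in Base and assumes (this is not stated in the commit message) that the one-element method derived its result from the real part of the element type. The names `T`, `before`, and `after` are illustrative.

```julia
# Type-level facts only; this is not the actual `cor` source.
T = eltype([im])               # `im` is a Complex{Bool}, so T == Complex{Bool}

# Assumption: the result was built as a "one" of the real element type.
before = one(real(T))          # real(Complex{Bool}) is Bool, and one(Bool) is `true`

# Promoting to a floating-point type first gives a floating-point one instead.
after = one(float(real(T)))    # float(Bool) is Float64
```

Under that assumption, `before` is the Boolean `true` seen in the example, while `after` is the floating-point `1.0`.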
When `sort=false`, we only partially sort the input, so `NaN`/`missing` is not guaranteed to be in the last position. Also avoid throwing errors for non-`Number` types, for which `isnan` may not be defined.
There is no reliable way to know from the array eltype alone whether entries support `isnan` or not. Better to leave it to the compiler to optimize out the `isa Number` check when possible.
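The two commit messages above can be illustrated with a short sketch. `has_nan` is a hypothetical helper written for this example, not the function changed by the commits. It guards `isnan` with an `isa Number` check on each entry, so it works on arrays whose entries do not define `isnan`, and it scans the whole array instead of looking only at the last position.

```julia
# Hypothetical helper, named for illustration only.
# The `isa Number` guard avoids calling `isnan` on entries that do not support it.
has_nan(v) = any(x -> x isa Number && isnan(x), v)

v = [3.0, NaN, 1.0, 2.0]

# A partial sort only guarantees that the requested position holds the right
# element; it makes no promise about where the `NaN` ends up.
partialsort!(v, 2)

second_smallest = v[2]
found = has_nan(v)                   # scans every entry, not just the last one
strings_ok = !has_nan(["a", "b"])    # no error although `isnan("a")` is undefined
```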
Great effort, guys! I'd love to see the StatsBase functionality in the Statistics stdlib. What about weighted sampling (https://juliastats.org/StatsBase.jl/stable/sampling/#Sampling-from-Population-1)? Will this go into Statistics as well? (It also has semantic overlap with Random, I guess.)
nalimilan force-pushed the nl/weightedstats branch 2 times, most recently from 1277ff9 to 71cde77 (September 25, 2021 15:38)
nalimilan changed the title from "Import weighted stats and moments from StatsBase to Statistics" to "Import StatsBase into Statistics" (Sep 25, 2021)
nalimilan force-pushed the nl/weightedstats branch from b3e9325 to 29b230f (September 25, 2021 22:10)
nalimilan force-pushed the nl/weightedstats branch from 29b230f to 020a810 (September 26, 2021 19:09)
Equivalent of JuliaLang/julia#31395. See original discussions there. See #87 for current design discussion.
This PR is against the StatsBase branch, as the idea is to port all features we want from StatsBase to it before we clean up the rest (and possibly purge the history from features we don't want). Only then will we be able to merge it with a clean history (including StatsBase's) into master. (I created the StatsBase branch using `git merge master --allow-unrelated-histories -s ours` after fetching the StatsBase history from its repo.)

Progress:
- `wsum`
- `moment`
- `mean_and_var`, `mean_and_std`, `zscore`/`zscore!`, merged `nquantile` with `quantile`
- `mean_and_cov`; skip `scattermat`?
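For readers unfamiliar with the first item, here is a minimal sketch of what a weighted sum computes. `wsum_sketch` is a hypothetical name used for illustration; it is not the StatsBase or Statistics implementation, and it ignores the weight types and dimension arguments the real function may accept.

```julia
# Hypothetical sketch; not the StatsBase or Statistics implementation.
function wsum_sketch(x::AbstractVector, w::AbstractVector)
    length(x) == length(w) ||
        throw(DimensionMismatch("x and w must have the same length"))
    # Start from a zero of the product type so the accumulator is type-stable.
    s = zero(eltype(x)) * zero(eltype(w))
    for (xi, wi) in zip(x, w)
        s += xi * wi   # each value contributes in proportion to its weight
    end
    return s
end

total = wsum_sketch([1.0, 2.0, 3.0], [0.5, 0.25, 0.25])
```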