@TomAugspurger
Contributor

This fixes an issue in the old factorization method, which didn't properly account for missing values. For example,

[B, B, NA, NA, A, B]

should factorize as [0, 0, -1, -1, 1, 0], with -1 marking the missing values. Previously, we didn't handle NA, so it was assigned a code of its own and the result was [0, 0, 1, 1, 2, 0].
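As a rough illustration of the intended behavior (not the code in this PR), here is a minimal pure-Python sketch. The function name `factorize_with_na` and the use of `None` to stand in for NA are assumptions made for the example; the actual change uses Numba and may differ in its details.

```python
def factorize_with_na(values, na_value=None):
    """Return (codes, uniques); missing values get the sentinel code -1.

    Hypothetical helper for illustration only, not the PR's implementation.
    """
    codes = []
    uniques = []
    seen = {}  # value -> integer code, in order of first appearance
    for value in values:
        if value is na_value:
            # Missing values never get a code of their own.
            codes.append(-1)
            continue
        if value not in seen:
            seen[value] = len(uniques)
            uniques.append(value)
        codes.append(seen[value])
    return codes, uniques


codes, uniques = factorize_with_na(["B", "B", None, None, "A", "B"])
print(codes)    # [0, 0, -1, -1, 1, 0]
print(uniques)  # ['B', 'A']
```

Codes are assigned in order of first appearance, so B gets 0 and A gets 1, and NA is skipped when the uniques are built.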

Numba gave a 285x speedup (after JIT warmup) on a benchmark with 10,000 values.

@TomAugspurger TomAugspurger merged commit df6853a into ContinuumIO:master Mar 14, 2018
@TomAugspurger TomAugspurger deleted the factorize-fix branch March 14, 2018 18:59