Things to do:
- treat `missing` as a special value that is not pooled, probably with level `0`. This would work the same as in CategoricalArrays.jl; the benefit is that two `PooledArray`s that differ only in whether they allow `Missing` could share a pool
- add locking for `setindex!`, but make sure that we support batch operations of adding levels (both in `setindex!` and in e.g. `copyto!`); this will allow us to fully drop Copy-On-Write and never copy pool and invpool by default; tentatively, `unsafe_setindex!` would be an alternative that does not use the lock
- stress in the documentation that using `invpool` is not safe if other threads are potentially modifying it (this should not be a problem)
- add `droplevels!` to DataAPI.jl and to PooledArrays.jl (this also requires a change in CategoricalArrays.jl); this function would reduce pool and invpool to only the used levels and at the same time make a fresh copy of them (as a way to detach pool and invpool between `PooledArray`s)
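
To make the list above concrete, here is a rough, self-contained sketch of how the pieces could fit together. It is a toy model, not the PooledArrays.jl implementation: the type `ToyPooled` and the functions `unlocked_ref!`, `setvalues!`, `getvalue`, `sharepool`, and `toy_droplevels!` are hypothetical names invented for this illustration. It assumes ref `0` stands for `missing`, that one lock guards a shared pool and invpool, and that a batch write takes the lock once.

```julia
# Toy illustration only; all names here are hypothetical, not PooledArrays.jl API.
mutable struct ToyPooled{T}
    refs::Vector{UInt32}        # 0 is reserved for `missing`
    pool::Vector{T}             # ref => value
    invpool::Dict{T,UInt32}     # value => ref
    lck::ReentrantLock          # guards pool and invpool
end

ToyPooled{T}(n::Integer) where {T} =
    ToyPooled{T}(zeros(UInt32, n), T[], Dict{T,UInt32}(), ReentrantLock())

# Look up (or add) a level WITHOUT locking; the caller must hold `p.lck`.
function unlocked_ref!(p::ToyPooled, v)
    get!(p.invpool, v) do
        push!(p.pool, v)
        UInt32(length(p.pool))
    end
end

# Batch write: the lock is taken once for the whole batch of new levels.
function setvalues!(p::ToyPooled, vals, inds)
    lock(p.lck) do
        for (i, v) in zip(inds, vals)
            # `missing` never enters the pool; it is stored as ref 0.
            p.refs[i] = ismissing(v) ? UInt32(0) : unlocked_ref!(p, v)
        end
    end
    return p
end

getvalue(p::ToyPooled, i::Integer) =
    iszero(p.refs[i]) ? missing : p.pool[p.refs[i]]

# A second array that shares pool, invpool, and lock (no copy-on-write).
sharepool(p::ToyPooled) = ToyPooled(copy(p.refs), p.pool, p.invpool, p.lck)

# Keep only the used levels and detach from any array sharing the old pool.
function toy_droplevels!(p::ToyPooled{T}) where {T}
    used = sort!(unique(filter(!iszero, p.refs)))
    newpool = lock(() -> T[p.pool[r] for r in used], p.lck)
    remap = Dict{UInt32,UInt32}(old => UInt32(new) for (new, old) in enumerate(used))
    map!(r -> iszero(r) ? r : remap[r], p.refs, p.refs)
    p.pool = newpool
    p.invpool = Dict{T,UInt32}(v => UInt32(i) for (i, v) in enumerate(newpool))
    p.lck = ReentrantLock()     # fresh lock for the fresh pool
    return p
end

a = ToyPooled{String}(4)
setvalues!(a, ["x", missing, "y", "x"], 1:4)
b = sharepool(a)
setvalues!(b, ["z"], 2:2)       # the new level lands in the pool shared with `a`
toy_droplevels!(a)              # `a` now owns a fresh pool with only used levels
```

In this sketch `unlocked_ref!` plays roughly the role the tentative `unsafe_setindex!` would play: it skips the lock and leaves safety to the caller.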
I think this design is better than a global pool. It will still cost us a bit in the H2O benchmarks, but at least we avoid a global pool that is not reclaimable.
@nalimilan + @quinnj : any additional comments on this?