Description
With the prospect of representing missing data as `Union{T, Null}` in Julia 0.7, we should decide what will happen to `Nullable`. I think the consensus is that, being container-like, `Nullable` is appropriate to represent "the software engineer's null", as opposed to "the data analyst's null", a.k.a. missing values. In other words, `Nullable` offers three properties which `Union{T, Null}` does not:
- It does not automatically propagate missingness.
- It requires explicit handling of the possibility that a value may be null, even when it isn't (contrary to `Union{T, Null}`, where code may work when a value is of type `T` but not when it is of type `Null`, a case which might not have been properly anticipated/tested).
- It allows distinguishing `Nullable{Nullable{T}}` from `Nullable{T}` (contrary to `Union{Union{T, Null}, Null} == Union{T, Null}`), which is useful when you need to tell "no value" apart from "null value". Such situations arise e.g. when doing a dictionary lookup (`tryget`, cf. Get dict value as a nullable #13055 and Add get(coll, key) for Associative returning Nullable #18211); when parsing a string via `tryparse` to a value which could either be of type `T`, null, or invalid; or when wrapping a value which could either be of type `T` or null in a `Nullable` before returning it from a function.
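
To make points 1 and 3 concrete, here is a minimal sketch. It assumes a Julia 1.x session, where `nothing`/`Nothing` stand in for the `Null` discussed here, `missing` plays the role of the propagating "data analyst's null", and `Some` acts as a `Nullable`-like wrapper. The `lookup` helper is hypothetical and only illustrates the dictionary case; it is not a Base function.

```julia
# Assumption: Julia 1.x. `Nothing` stands in for `Null`, `Some` for a Nullable-like wrapper.

# Point 1: a missing value propagates silently, a wrapper does not.
propagated = missing + 1              # no error is raised; the result is itself missing
wrapper_errors = try
    Some(1) + 1                       # no `+` method accepts a Some, so this throws
    false
catch err
    err isa MethodError
end

# Point 3: nested unions collapse, nested wrappers do not.
unions_collapse = Union{Union{Int, Nothing}, Nothing} == Union{Int, Nothing}
wrappers_nest = Some{Some{Int}} != Some{Int}

# A lookup can therefore tell "no such key" from "key present, value is null".
# `lookup` is a hypothetical helper written for this example.
lookup(d::AbstractDict, k) = haskey(d, k) ? Some(d[k]) : nothing

d = Dict("a" => 1, "b" => nothing)
found_null = lookup(d, "b")           # wrapped null: the key exists, its value is null
not_found = lookup(d, "c")            # bare null: the key is absent
```

With a plain `Union{T, Null}` return type, the last two results would be indistinguishable, which is the situation the third point describes.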
The first two features are the ones which turned out to be annoying when working with missing data, but which can provide additional safety for general programming. A detailed discussion of the advantages and drawbacks of these approaches can be found in the `Nullable` Julep.
Given that, several paths can be taken for `Nullable` in Julia 0.7:
- Make `Nullable{T}` a (deprecated) type alias for `Union{T, Null}`. This would have the advantage that Julia would have a single concept of null/missing values, but without the advantages of the three points above. Checks that code is correctly prepared to handle null values could still be done by a linter.
- Make `Nullable{T}` a (deprecated) type alias for `Union{Some{T}, Null}`, with `Some{T}` a wrapper around a value of type `T` which would behave essentially like `Nullable{T}` currently. Applying a function on the value would require using `f.(x)`, `broadcast` or pattern matching, so that missingness would never propagate without explicitly asking for it. The advantages would be those of the three points above, at the cost of two different representations of missingness (or almost two, since the `Null` type would be used in both cases).
- Deprecate `Nullable` in Base and move it as-is to a package. A possible variant would be to rename it to `Option` in order to prevent confusion with `Null` and be more consistent with other languages like Scala, Rust or Swift (IIRC, `Nullable` was originally called `Option` and lived in the Options.jl package). The main advantage of this approach is to have a single representation of null values in Base, and in particular to avoid setting the design of `Nullable` in stone in 1.0. The main issue is that no code in Base would be able to use `Nullable`, which implies in particular changing `tryparse` to return `Union{T, Null}`, and that no correct `tryget` method could be implemented for dicts (Get dict value as a nullable #13055). OTOH this could help increase consistency with e.g. `match`, which returns `Union{T, Void}`, and uses of `Nullable` are not so widespread in Base.
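
The second option could look roughly like the sketch below. It assumes Julia 1.x, where `Some` exists, `Nothing` stands in for `Null`, and `tryparse` returns `nothing` on failure. The names `Opt`, `lift` and `parse_opt` are hypothetical, introduced only for illustration. `lift` plays the role that `f.(x)` or `broadcast` would play in the proposal: the function is applied to the wrapped value only when the caller asks for it.

```julia
# Assumption: Julia 1.x. `Opt` is a hypothetical alias mirroring Union{Some{T}, Null}.
const Opt{T} = Union{Some{T}, Nothing}

# Hypothetical helper: apply `f` inside the wrapper, and pass a null through untouched.
lift(f, x::Some) = Some(f(something(x)))
lift(f, ::Nothing) = nothing

# Hypothetical tryparse-like function returning the wrapped form.
function parse_opt(::Type{T}, s::AbstractString) where {T}
    v = tryparse(T, s)                # `nothing` when the string cannot be parsed
    return v === nothing ? nothing : Some(v)
end

doubled = lift(x -> 2x, parse_opt(Int, "21"))    # wrapped result of doubling 21
skipped = lift(x -> 2x, parse_opt(Int, "oops"))  # parse failed, so nothing is applied
```

Calling `2 * parse_opt(Int, "21")` directly would throw a `MethodError`, so a possibly-null result cannot be used without unwrapping or lifting, which is the safety property this option is meant to preserve.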
EDIT: added mention of point 3. in the first series of bullets.