Add hooks for selecting the set of files for a table scan; also add an option for empty string -> null conversion #68
a6dd5a3 to 0adc99b
including a filesystem prefix.
Add doc strings for these new public methods.
6f39327 to dcbe683
Consider setHadoopFileSelector(hadoopFileSelector: HadoopFileSelector) and unsetHadoopFileSelector(): Unit = { hadoopFileSelector = None }
Why? One method means less code to write and maintain.
It just looks a little odd to me to set using an Option, i.e. setHadoopFileSelector(maybeAHadoopFileSelector), instead of setting with an actual instance and explicitly clearing rather than setting to None. It makes sense for the underlying this.hadoopFileSelector to be an Option (maybe there, maybe not), but a caller who sets or removes the hadoopFileSelector has a concrete idea of what should be done. Wrapping that concreteness in a maybe doesn't make obvious sense or improve readability at the call site of the set/unset.
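To make the suggestion concrete, here is a minimal sketch of the two-method shape. The HadoopFileSelector trait body, its selectFiles method, and the SelectorHolder class are hypothetical stand-ins for illustration only; the actual interface and owning class in this PR are not shown in the thread.

```scala
// Hypothetical stand-in: the real HadoopFileSelector in the PR may look different.
trait HadoopFileSelector {
  def selectFiles(tableName: String): Option[Seq[String]]
}

// Hypothetical owner of the selector, used only to illustrate the setter shape.
class SelectorHolder {
  // Internal state stays an Option: a selector may or may not be present.
  private var hadoopFileSelector: Option[HadoopFileSelector] = None

  // Callers pass a concrete instance; the Option wrapping happens here.
  def setHadoopFileSelector(selector: HadoopFileSelector): Unit = {
    hadoopFileSelector = Some(selector)
  }

  // Clearing is an explicit, separately named operation.
  def unsetHadoopFileSelector(): Unit = {
    hadoopFileSelector = None
  }

  // Read access for illustration.
  def currentSelector: Option[HadoopFileSelector] = hadoopFileSelector
}
```

The trade-off is the one raised in the thread: two methods are slightly more code to maintain than a single Option-taking setter, but the call site then states its intent directly.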
You could also separate this into two cases, which may make the code maintenance with upstream changes a little easier.
case oi: HiveVarcharObjectInspector if emptyStringsAsNulls => ...
case oi: HiveVarcharObjectInspector =>
  (value: Any, row: MutableRow, ordinal: Int) =>
    row.setString(ordinal, oi.getPrimitiveJavaObject(value).getValue)
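The guarded-case split can be tried out without Hive on the classpath. The sketch below uses hypothetical stand-in types (Inspector, VarcharInspector, StringInspector) and a simplified String => String unwrapper rather than the real HiveVarcharObjectInspector and MutableRow. The body of the emptyStringsAsNulls branch is an assumption, since the suggestion above leaves it elided.

```scala
// Hypothetical stand-ins for Hive's object inspectors; not the real Hive API.
sealed trait Inspector
final class VarcharInspector extends Inspector
final class StringInspector extends Inspector

object Unwrappers {
  // Picks the conversion once per inspector, so the flag is not re-checked per row.
  def unwrapperFor(oi: Inspector, emptyStringsAsNulls: Boolean): String => String =
    oi match {
      // Guarded case: only taken when the option is enabled.
      // Assumed behavior: empty strings become null.
      case _: VarcharInspector if emptyStringsAsNulls =>
        (value: String) => if (value.isEmpty) null else value
      // Unguarded case: left identical to the pre-existing behavior.
      case _: VarcharInspector =>
        (value: String) => value
      case _ =>
        (value: String) => value
    }
}
```

Keeping the unguarded case untouched is what may ease merging upstream changes: only the added guarded case is specific to this change.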
@markhamstra