perf[vortex-array]: don't materialize const arrays in to_arrow_dictionary#6265
perf[vortex-array]: don't materialize const arrays in to_arrow_dictionary#6265joseph-isaacs merged 1 commit intodevelopfrom
Conversation
There was a problem hiding this comment.
Can we add a test that 1 its correct (equal to the expected value) and 2 that the values count of the array dict is 1
| Ok(dict) => return dict_to_dict(dict, codes_type, values_type, ctx), | ||
| Err(array) => array, | ||
| }; | ||
| let array = match array.try_into::<ConstantVTable>() { |
There was a problem hiding this comment.
Is there a better way to match on vtables? The other thing I though of doing is an if/else chain on array.is
There was a problem hiding this comment.
Not now, but we should add this.
joseph-isaacs
left a comment
There was a problem hiding this comment.
happy to merge once you have those tests
|
TFTR! Will work on adding some tests |
933e1fc to
3484ff1
Compare
Merging this PR will degrade performance by 32.7%
Performance Changes
Comparing Footnotes
|
| Ok(()) | ||
| } | ||
|
|
||
| #[test] |
There was a problem hiding this comment.
Mind writing these test in terms of assert_eq(expected, actual)
#[test]
fn dict_to_dict_with_null_codes() -> VortexResult<()> {
let dict = DictArray::try_new(
PrimitiveArray::from_option_iter(vec![Some(0u8), None, Some(1)]).into_array(),
VarBinViewArray::from_iter_str(["a", "b"]).into_array(),
)?;
let actual = execute(dict.into_array(), &dict_type(DataType::UInt8, DataType::Utf8))?;
let expected: DictionaryArray<UInt8Type> =
vec![Some("a"), None, Some("b")].into_iter().collect();
assert_eq!(expected, actual.as_dictionary::<UInt8Type>().clone());
Ok(())
}
There was a problem hiding this comment.
Its less code and tests more stuff
There was a problem hiding this comment.
Good point, will also use rstest
3484ff1 to
3e0dd74
Compare
…nary Executing a Constant array into an arrow dictionary would materialize the full array and then perform the cast. Since we only have one value, we can just cast that value and create the codes array. This results in 15-20% performance improvement on our real-world queries. Signed-off-by: Alfonso Subiotto Marques <alfonso.subiotto@polarsignals.com>
3e0dd74 to
00403b6
Compare
…nary (#6265) Executing a Constant array into an arrow dictionary would materialize the full array and then perform the cast. Since we only have one value, we can just cast that value and create the codes array. This results in a 15% performance improvement on our real-world queries. Signed-off-by: Alfonso Subiotto Marques <alfonso.subiotto@polarsignals.com>
Executing a Constant array into an arrow dictionary would materialize the full array and then perform the cast. Since we only have one value, we can just cast that value and create the codes array. This results in a 15% performance improvement on our real-world queries.