Conversation

@4ertus2

@4ertus2 4ertus2 commented Jul 14, 2025

In Parquet files, Decimal values can be stored as FixedLengthByteArrays or ByteArrays, and both physical types can be encoded with the DeltaByteArray encoding.

This PR adds support for reading such Decimal columns.
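For readers unfamiliar with the layout, here is a minimal Python sketch of how a byte-array-backed Parquet decimal maps to a value. It is not the code from this PR. It assumes the raw bytes hold the unscaled value as a big-endian two's-complement integer, as the Parquet format describes for decimals, and the helper name `decode_parquet_decimal` is hypothetical.

```python
from decimal import Decimal


def decode_parquet_decimal(raw: bytes, scale: int) -> Decimal:
    """Hypothetical helper: turn big-endian two's-complement bytes into a Decimal.

    Assumes `raw` is the unscaled integer of one value and `scale` comes
    from the column's decimal logical type.
    """
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    sign = 1 if unscaled < 0 else 0
    digits = tuple(int(d) for d in str(abs(unscaled)))
    # Building from a (sign, digits, exponent) tuple is exact and does not
    # depend on the decimal context precision, so 38-digit values survive.
    return Decimal((sign, digits, -scale))


# Fixed-length case: every value in the column has the same width (4 bytes here).
print(decode_parquet_decimal(bytes([0x00, 0x00, 0x30, 0x39]), scale=2))  # 123.45

# Variable-length case: the value uses only as many bytes as it needs.
print(decode_parquet_decimal(bytes([0xFF]), scale=2))  # -0.01
```

The same decoding applies to both physical types; only the way the byte width is determined differs.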

liujiayi771 and others added 7 commits July 14, 2025 07:44
fix decimal avg function precision issue

Alchemy-item: [[6020] Spark sql avg agg function support decimal](oap-project#511 (comment)) commit 1/1 - 475f069
Alchemy-item: [[oap] Register merge extract companion agg functions without suffix](oap-project#512 (comment)) commit 1/1 - c1e41a3
Signed-off-by: Yuan <yuanzhou@apache.org>

Alchemy-item: [[11771] [11772] Fix smj result mismatch issue](oap-project#514 (comment)) commit 1/1 - 8c77615
Alchemy-item: [[11067] Support scan filter for decimal in ORC](oap-project#513 (comment)) commit 1/1 - 60fce69
Alchemy-item: [ [5962] Support struct schema evolution matching by name](oap-project#805 (comment)) commit 1/1 - e3d77fd
…rtType

Alchemy-item: [[13620] fix: Add config for requested type check in ReaderBase::conve…](oap-project#855 (comment)) commit 1/1 - 924fc2e
@prestodb-ci prestodb-ci force-pushed the update branch 4 times, most recently from 62f96b6 to ec7dc4e Compare July 18, 2025 00:45
@zhouyuan
Collaborator

zhouyuan commented Jul 18, 2025

Hi @4ertus2, this is a good feature to have. Would you please submit this patch to Meta/Velox instead? OAP/Velox is more of a mirror, and we usually don't do PR reviews or comments here. Thanks.

Cc @rui-mo

@4ertus2
Author

4ertus2 commented Jul 20, 2025

There are some legal restrictions on doing that from Russia. It would be great if you could merge it into your fork first, so that it is published in a repo that has no direct relation to Meta.

After that, you are free to share it with any other repo yourselves. I'm not interested in committing directly to Meta's repo.

@prestodb-ci prestodb-ci force-pushed the update branch 3 times, most recently from 7ea7352 to 151e9ef Compare July 22, 2025 00:45
@rui-mo
Collaborator

rui-mo commented Jul 22, 2025

@4ertus2 Thanks for extending the decimal reader support.

> FixedLengthByteArrays and ByteArrays could be encoded with DeltaByteArray encoding.

Would you mind providing a test with this type of Parquet file? It could be added to https://github.com/oap-project/velox/blob/update/velox/dwio/parquet/tests/reader/ParquetTableScanTest.cpp. Thanks.
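As background for such a test, the following Python sketch illustrates the idea behind DeltaByteArray: each value is stored as the number of leading bytes it shares with the previous value, plus the remaining suffix. It is a simplified illustration only. It assumes the prefix lengths and suffixes have already been decoded into plain lists (in a real file they are themselves delta-encoded), and the function name `reconstruct_delta_byte_array` is hypothetical.

```python
from typing import List


def reconstruct_delta_byte_array(
    prefix_lengths: List[int], suffixes: List[bytes]
) -> List[bytes]:
    """Hypothetical helper: rebuild values from shared-prefix lengths and suffixes."""
    if len(prefix_lengths) != len(suffixes):
        raise ValueError("prefix_lengths and suffixes must have the same length")
    values: List[bytes] = []
    previous = b""
    for prefix_len, suffix in zip(prefix_lengths, suffixes):
        if prefix_len < 0 or prefix_len > len(previous):
            raise ValueError("prefix length exceeds the previous value")
        # Reuse the leading bytes of the previous value, then append the suffix.
        current = previous[:prefix_len] + suffix
        values.append(current)
        previous = current
    return values


# Three 4-byte decimal payloads that share leading bytes.
values = reconstruct_delta_byte_array(
    prefix_lengths=[0, 3, 2],
    suffixes=[b"\x00\x00\x30\x39", b"\x3a", b"\x31\x00"],
)
for value in values:
    print(value.hex())
```

Decimal payloads with many identical leading bytes (sign extension or zero padding) share long prefixes, which is presumably why a writer would pick this encoding for them.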

@4ertus2
Author

4ertus2 commented Jul 22, 2025

I cannot share the samples I tested on; they contain confidential information.

I also asked for a sample in apache/parquet-testing#89, but no one has offered to help with it yet.

@prestodb-ci prestodb-ci force-pushed the update branch 2 times, most recently from 351ef91 to 6e833ba Compare July 24, 2025 00:45
@rui-mo
Collaborator

rui-mo commented Jul 24, 2025

@4ertus2 Thanks for sharing the context.

@prestodb-ci prestodb-ci force-pushed the update branch 6 times, most recently from 344fde6 to 7a31679 Compare July 30, 2025 04:45
@prestodb-ci prestodb-ci force-pushed the update branch 9 times, most recently from c5d9aac to 9f0ef60 Compare October 26, 2025 04:46
@jinchengchenghh jinchengchenghh force-pushed the update branch 2 times, most recently from 704b7d6 to 9f74a76 Compare October 29, 2025 10:18
@prestodb-ci prestodb-ci force-pushed the update branch 9 times, most recently from 07f098d to 537e695 Compare November 4, 2025 04:45
@prestodb-ci prestodb-ci force-pushed the update branch 3 times, most recently from e963905 to b228025 Compare November 7, 2025 11:21
@prestodb-ci prestodb-ci force-pushed the update branch 5 times, most recently from cc6eb06 to 7dfc0d2 Compare November 15, 2025 04:45
5 participants