Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add extension hooks for encoding and decoding UDAFs and UDWFs #11417

Merged
merged 2 commits into from
Jul 16, 2024

Conversation

joroKr21
Copy link
Contributor

@joroKr21 joroKr21 commented Jul 11, 2024

Which issue does this PR close?

Closes #11422.

Rationale for this change

The rationale for this change is to be able to run parameterized aggregation UDFs in a distributed architecture.
E.g. one use case is implementing an sample UDF (akin to array_agg but with a limit on the number of elements collected).

What changes are included in this PR?

  • Extend the interface of LogicalExtensionCodec to handle aggregation UDFs and window UDFs.
  • Extend the interface of PhysicalExtensionCodec to handle aggregation UDFs.
    • I couldn't figure out physical window UDFs so those are omitted.
  • Handle ignore_nulls and distinct flags better.

Are these changes tested?

Yes, added roundtrip tests for encoding and decoding.

Are there any user-facing changes?

Yes, users can now override the new encoding and decoding functions.

Copy link
Contributor

@avantgardnerio avantgardnerio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@thinkharderdev thinkharderdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

@@ -239,53 +240,52 @@ struct ComposedPhysicalExtensionCodec {
codecs: Vec<Arc<dyn PhysicalExtensionCodec>>,
}

impl ComposedPhysicalExtensionCodec {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI @lewiszlw

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@alamb
Copy link
Contributor

alamb commented Jul 13, 2024

Thanks for the reviews @thinkharderdev and @avantgardnerio -- @joroKr21 it appears this branch has conflicts that need to be resolved. Once those are sorted out I think we can merge this PR

@joroKr21
Copy link
Contributor Author

@alamb do you get this issue locally on main?

error[E0432]: unresolved import `chrono::MappedLocalTime`
  --> datafusion/functions/src/datetime/to_local_time.rs:34:24
   |
34 | use chrono::{DateTime, MappedLocalTime, Offset, TimeDelta, TimeZone, Utc};
   |                        ^^^^^^^^^^^^^^^ no `MappedLocalTime` in the root

@lewiszlw
Copy link
Member

I got this error before, running cargo update fixed it.

@alamb
Copy link
Contributor

alamb commented Jul 16, 2024

I got this error before, running cargo update fixed it.

🤔 looks like we probably need to increase the minimum version of chrono required to one that has MappedLocalTime

@alamb alamb merged commit 0232699 into apache:main Jul 16, 2024
23 checks passed
@alamb
Copy link
Contributor

alamb commented Jul 16, 2024

Thanks again @joroKr21 and @lewiszlw

@joroKr21 joroKr21 deleted the encode-decode-udaf branch July 17, 2024 05:00
xinlifoobar pushed a commit to xinlifoobar/datafusion that referenced this pull request Jul 17, 2024
…#11417)

* Add extension hooks for encoding and decoding UDAFs and UDWFs

* Add tests for encoding and decoding UDAF
xinlifoobar pushed a commit to xinlifoobar/datafusion that referenced this pull request Jul 18, 2024
…#11417)

* Add extension hooks for encoding and decoding UDAFs and UDWFs

* Add tests for encoding and decoding UDAF
wiedld pushed a commit to influxdata/arrow-datafusion that referenced this pull request Jul 31, 2024
…#11417)

* Add extension hooks for encoding and decoding UDAFs and UDWFs

* Add tests for encoding and decoding UDAF
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Handle Serde for Custom AggregateUDFImpl traits
5 participants