Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
`PyLogicalPlan` can currently only serialize or deserialize built-in functions and table providers. It uses `DefaultLogicalExtensionCodec` in this function.
Users would like to be able to serialize and deserialize plans that include custom functions and table providers. One such example is the Iceberg python integration.
This topic was mentioned in the #datafusion-python channel in Discord.
Describe the solution you'd like
Create a logical extension codec that can process user defined functions and table providers. This likely means some work in the upstream repository.
One approach could consist of:
- Add `encode` and `decode` methods to `FFI_TableProvider`, `FFI_ScalarUDF`, and so on.
- Implement `LogicalExtensionCodec` on `ForeignTableProvider`, `ForeignScalarUDF`, and so on, where the only methods implemented are those appropriate to the table provider, UDF, etc. That is, for `ForeignTableProvider` it only supports `try_encode_table_provider` and `try_decode_table_provider`.
- Add a `LogicalExtensionCodec` to `datafusion-python`. For the table providers, iterate through all registered table providers and see if any of them successfully decode. The encode method is more straightforward.
Describe alternatives you've considered
The user could output the logical plan as a SQL command and then pass that along as a string.
Additional context