Serialize user-defined functions and table providers via protobuf #1181

Open
@timsaucer

Description

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

PyLogicalPlan can currently serialize and deserialize only built-in functions and table providers, because it uses DefaultLogicalExtensionCodec.

Users would like to be able to serialize and deserialize plans that include custom functions and table providers. One such example is the Iceberg Python integration.

This topic was mentioned in the #datafusion-python channel in Discord.

Describe the solution you'd like

Create a logical extension codec that can process user-defined functions and table providers. This likely means some work in the upstream repository.

One approach could consist of:

  • Add encode and decode methods to FFI_TableProvider, FFI_ScalarUDF, and so on.
  • Implement LogicalExtensionCodec on ForeignTableProvider, ForeignScalarUDF, and so on, where the only methods implemented are those appropriate to the type (table provider, UDF, etc.). For example, ForeignTableProvider would support only try_encode_table_provider and try_decode_table_provider.
  • Add a LogicalExtensionCodec to datafusion-python. To decode a table provider, iterate through all registered table providers and return the first one that decodes successfully. The encode method is more straightforward, since the provider being encoded is already known.
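
The decode-by-iteration idea in the last step can be sketched in plain Python. This is only an illustration of the control flow, not the datafusion-python or datafusion-ffi API: `CsvProvider`, `RangeProvider`, `RegistryCodec`, the `encode`/`try_decode` method names, and the byte-prefix wire format are all hypothetical stand-ins. It assumes each provider can tell whether a payload belongs to it and returns `None` otherwise.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol

# Hypothetical wire-format tags; a real codec would use protobuf messages.
CSV_TAG = b"csv:"
RANGE_TAG = b"range:"


class TableProvider(Protocol):
    """Hypothetical interface: a provider that can encode and decode itself."""

    def encode(self) -> bytes: ...
    def try_decode(self, buf: bytes) -> Optional["TableProvider"]: ...


@dataclass(frozen=True)
class CsvProvider:
    path: str

    def encode(self) -> bytes:
        return CSV_TAG + self.path.encode("utf-8")

    def try_decode(self, buf: bytes) -> Optional["CsvProvider"]:
        # Return None for payloads this provider type does not own.
        if not buf.startswith(CSV_TAG):
            return None
        return CsvProvider(buf[len(CSV_TAG):].decode("utf-8"))


@dataclass(frozen=True)
class RangeProvider:
    start: int
    stop: int

    def encode(self) -> bytes:
        return RANGE_TAG + f"{self.start},{self.stop}".encode("ascii")

    def try_decode(self, buf: bytes) -> Optional["RangeProvider"]:
        if not buf.startswith(RANGE_TAG):
            return None
        start, stop = buf[len(RANGE_TAG):].decode("ascii").split(",")
        return RangeProvider(int(start), int(stop))


class RegistryCodec:
    """Stand-in for a codec that delegates to the registered providers."""

    def __init__(self) -> None:
        self._providers: Dict[str, TableProvider] = {}

    def register(self, name: str, provider: TableProvider) -> None:
        self._providers[name] = provider

    def try_encode_table_provider(self, provider: TableProvider) -> bytes:
        # Encoding is direct: the provider to encode is already in hand.
        return provider.encode()

    def try_decode_table_provider(self, buf: bytes) -> TableProvider:
        # Decoding has no type information, so ask each registered provider.
        for candidate in self._providers.values():
            decoded = candidate.try_decode(buf)
            if decoded is not None:
                return decoded
        raise ValueError("no registered table provider could decode the payload")


codec = RegistryCodec()
codec.register("files", CsvProvider("data.csv"))
codec.register("numbers", RangeProvider(0, 10))

payload = codec.try_encode_table_provider(RangeProvider(5, 9))
restored = codec.try_decode_table_provider(payload)
```

One consequence of this design is that decoding only works in a session where a provider of the same kind has already been registered, and the cost of decoding grows with the number of registered providers.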

Describe alternatives you've considered

The user could output the logical plan as a SQL command and then pass that along as a string.

Additional context


Labels: enhancement
