feat: Analyse different strategies and add validation for missing fields in parquet data compared to protobuf schema #102

@Meghajit

Description

Extra fields that are present in the parquet data but not in the protobuf schema will be ignored.
However, it is also possible that:

  • some fields in the protobuf schema are missing from the parquet data
  • a field has the same name in both, but a different data type
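
To make the two cases above concrete, here is a minimal sketch of a schema comparison. It is not Dagger code: `SchemaDiff` and `diff_schemas` are hypothetical names, and both schemas are assumed to have already been flattened into plain `{field_name: type_name}` dicts that use a common type vocabulary (mapping real parquet types to protobuf types is a separate problem that this sketch does not address).

```python
from dataclasses import dataclass, field


@dataclass
class SchemaDiff:
    """Result of comparing a protobuf schema against a parquet schema."""
    missing_in_parquet: list = field(default_factory=list)  # in proto, absent in parquet
    type_mismatches: dict = field(default_factory=dict)     # name -> (proto_type, parquet_type)
    extra_in_parquet: list = field(default_factory=list)    # in parquet only; ignored today


def diff_schemas(proto_fields: dict, parquet_fields: dict) -> SchemaDiff:
    """Compare two {field_name: type_name} mappings.

    Assumption: type names are already normalized to one vocabulary,
    so a plain string comparison is enough to detect a mismatch.
    """
    diff = SchemaDiff()
    for name, proto_type in proto_fields.items():
        if name not in parquet_fields:
            diff.missing_in_parquet.append(name)
        elif parquet_fields[name] != proto_type:
            diff.type_mismatches[name] = (proto_type, parquet_fields[name])
    diff.extra_in_parquet = [n for n in parquet_fields if n not in proto_fields]
    return diff


# Hypothetical example schemas, for illustration only.
proto = {"order_id": "string", "amount": "double", "created_at": "int64"}
parquet = {"order_id": "string", "amount": "int32", "debug_tag": "string"}
result = diff_schemas(proto, parquet)
```

With these example inputs, `created_at` is reported as missing, `amount` as a type mismatch, and `debug_tag` as an extra field that would simply be ignored.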

We need to answer, and then solve for, the following questions:

  1. Should the Parquet Data Source set default values for fields that are present in the schema but not found in the parquet file? If yes, what should the default value be?
  2. If defaults should not be set, should the Dagger job fail?
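
The two options in the questions above could be modelled as a configurable policy. The sketch below is only one possible shape, not a proposal for the actual Dagger API: `MissingFieldPolicy`, `MissingFieldError`, and `fill_row` are hypothetical names, rows are assumed to be plain dicts, and the defaults table borrows the proto3 scalar default values (zero, empty string, `False`, empty bytes) as one candidate answer to "what should the default value be".

```python
from enum import Enum


class MissingFieldPolicy(Enum):
    USE_DEFAULT = "use_default"  # option 1: fill in a default value
    FAIL = "fail"                # option 2: fail the job


# proto3 scalar defaults; message, enum and repeated fields are not covered here.
PROTO3_DEFAULTS = {
    "string": "",
    "bytes": b"",
    "bool": False,
    "int32": 0,
    "int64": 0,
    "float": 0.0,
    "double": 0.0,
}


class MissingFieldError(Exception):
    """Raised when a schema field is absent from the data and the policy is FAIL."""


def fill_row(row: dict, proto_fields: dict, policy: MissingFieldPolicy) -> dict:
    """Project a parquet row (as a dict) onto the protobuf schema.

    Fields in the row that are not in the schema are dropped, matching the
    existing behaviour of ignoring extra parquet fields.
    """
    result = {}
    for name, proto_type in proto_fields.items():
        if name in row:
            result[name] = row[name]
        elif policy is MissingFieldPolicy.USE_DEFAULT:
            # Raises KeyError for types outside the scalar table above.
            result[name] = PROTO3_DEFAULTS[proto_type]
        else:
            raise MissingFieldError(f"field '{name}' ({proto_type}) not found in parquet data")
    return result
```

One point the sketch makes visible: with `USE_DEFAULT`, a missing `int64` field becomes `0`, which downstream consumers cannot distinguish from a real zero. That trade-off is part of what the questions above need to settle.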
