Csv to Parquet - Nullable types #314

Open
@MaxZ2033

Hello!

First off, great library and very useful for my use-case of converting csv to parquet files. Thanks for all the work you put into this!

I'm currently facing one problem though and hope you can help me out:

The primitives in my parquet files need to be Nullable. I registered this delegate:

c.MapParquetType = type =>
{
    return type switch
    {
        // Ensures that chars are converted to string before they are passed to Parquet.Net.
        // Parquet.Net does not support values of type 'char'.
        not null when type == typeof(char) => typeof(string),
        // The ParquetWriter cannot handle null values of primitives.
        // Null values get converted to their default value.
        not null when type == typeof(long) => typeof(long?),
        not null when type == typeof(double) => typeof(double?),
        _ => type
    };
};
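To make the intent of this mapping easier to check in isolation, here is a minimal sketch of the same rules as a standalone function. The class and method names (`ParquetTypeMapping`, `Map`) are hypothetical and not part of the library; only `System` types are used.

```csharp
using System;

// Hypothetical helper, for illustration only; it mirrors the delegate body above
// without depending on the library or on Parquet.Net.
public static class ParquetTypeMapping
{
    public static Type Map(Type type)
    {
        // Parquet.Net does not support 'char', so chars are written as strings.
        if (type == typeof(char)) return typeof(string);

        // Primitives are widened to their Nullable<T> form so that nulls
        // are not replaced by default values.
        if (type == typeof(long)) return typeof(long?);
        if (type == typeof(double)) return typeof(double?);

        // Everything else is passed through unchanged.
        return type;
    }
}
```

With this sketch, `ParquetTypeMapping.Map(typeof(long))` returns `typeof(long?)`, and calling `Nullable.GetUnderlyingType` on that result gives back `typeof(long)`, so the underlying primitive remains recoverable after the mapping.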

This works as expected. The problem is that these Nullable types are then mapped to string by the following code:

else if (asRaw && type.IsNullableType())
    return typeof(string);

Is there a specific reason for this choice, or is there another way to enable Nullables for the Parquet writer?

(A small change in the RecordWriter code fixes this issue for me. I'll gladly create a PR if this is considered a bug and not a design choice 😉)

Thanks for your time!
