Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce GenericAppenderFactory and GenericAppenderHelper #1340

Merged
merged 1 commit into from
Aug 20, 2020

Conversation

JingsongLi
Copy link
Contributor

In #1293, we found lot of places use switch case avro orc parquet to create appender, these logicals can be extract to GenericAppenderFactory.

return Avro.write(outputFile)
.schema(schema)
.createWriterFunc(DataWriter::create)
.named(fileFormat.name())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name is usually a table name for the given schema. not fileFormat. we may need to remove this line, or set it with a reasonable table name.

return Parquet.write(outputFile)
.schema(schema)
.createWriterFunc(GenericParquetWriter::buildWriter)
.named(fileFormat.name())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

return ORC.write(outputFile)
.schema(schema)
.createWriterFunc(GenericOrcWriter::buildWriter)
.setAll(config)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To keep the consistent semantics, I suggest to set overwrite=true. Both flink and spark's FileAppenderFactory have enabled the overwrite.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, they do.

.withPath(file.toURI().toString())
.withMetrics(appender.metrics())
.withFormat(format);
if (partition != null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could call the withPartition(partition) directly without this nullable check, because builder will handle this inside it.

@rdblue
Copy link
Contributor

rdblue commented Aug 20, 2020

Thanks @JingsongLi, looks good.

@rdblue rdblue merged commit c66d060 into apache:master Aug 20, 2020
@JingsongLi JingsongLi deleted the appender branch November 5, 2020 09:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants