Description
Hi there,
I'm completely new to Apache Beam, and its programming model is quite surprising. While trying to put together a Parquet writer that reads from Pub/Sub, I can't wrap my head around the following...
I'm taking this template as a base: https://github.com/GoogleCloudPlatform/DataflowTemplates/blob/master/src/main/java/com/google/cloud/teleport/templates/PubsubToAvro.java
It seems that AvroIO supports windowed writes to bucket paths such as gs://my-bucket/YYYY/MM/DD, with placeholders like 'YYYY' filled in automatically at runtime by the AvroIO handler.
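To make concrete what I mean by the placeholders being filled in, here is a minimal, plain-Java sketch of the substitution I'd expect to happen for each window. It is not the template's or AvroIO's actual code: the class name `WindowedPathSketch`, the `resolve` method, and the choice of UTC are my own assumptions for illustration.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical helper, not part of Beam or the Dataflow templates.
class WindowedPathSketch {

    // Replaces the YYYY, MM and DD placeholders in the pattern with the
    // date of the given window timestamp, interpreted in UTC (assumption).
    // Naive on purpose: any literal "MM" or "DD" elsewhere in the path
    // (for example in the bucket name) would be replaced as well.
    static String resolve(String pattern, Instant windowStart) {
        ZonedDateTime t = windowStart.atZone(ZoneOffset.UTC);
        return pattern
            .replace("YYYY", String.format("%04d", t.getYear()))
            .replace("MM", String.format("%02d", t.getMonthValue()))
            .replace("DD", String.format("%02d", t.getDayOfMonth()));
    }

    public static void main(String[] args) {
        Instant windowStart = Instant.parse("2019-03-07T15:04:05Z");
        // prints "gs://my-bucket/2019/03/07/"
        System.out.println(resolve("gs://my-bucket/YYYY/MM/DD/", windowStart));
    }
}
```

The idea would be to compute the output directory like this for each window, whatever the file format of the sink.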
Is there any way to achieve this using ParquetIO? The only Parquet examples I've seen are the following, but none of them write by date...
I tried a first approach and, even though the code compiles and processes one event after another, I can't get it to run locally with the DirectRunner against a bucket. The code I've got so far is the following:
SOLVED