Skip to content

Support encoding with fastavro, fix minor doc bug#419

Closed
marcintustin wants to merge 2 commits into
confluentinc:masterfrom
marcintustin:use-fastavro
Closed

Support encoding with fastavro, fix minor doc bug#419
marcintustin wants to merge 2 commits into
confluentinc:masterfrom
marcintustin:use-fastavro

Conversation

@marcintustin
Copy link
Copy Markdown

This PR adds the ability to encode with fastavro, if it is available. Also, fixes a minor misprecision in the docstring for encode_record_with_schema_id.

@ghost
Copy link
Copy Markdown

It looks like marcintustin hasn't signed our Contributor License Agreement, yet.

The purpose of a CLA is to ensure that the guardian of a project's outputs has the necessary ownership or grants of rights over all contributions to allow them to distribute under the chosen licence.
Wikipedia

You can read and sign our full Contributor License Agreement here.

Once you've signed reply with [clabot:check] to prove it.

Appreciation of efforts,

clabot

@ghost
Copy link
Copy Markdown

Confluent Inc. (@confluentinc) It looks like marcintustin just signed our Contributor License Agreement. 👍

Always at your service,

clabot

@marcintustin
Copy link
Copy Markdown
Author

The failure here appears to be with one of the workers timing out. I can't force a re-run, I'm afraid.

@rnpridgeon
Copy link
Copy Markdown
Contributor

Thanks marcintustin! I'll take a look tomorrow, in the meantime I have forced a restart of the failed worker.

Copy link
Copy Markdown
Contributor

@rnpridgeon Ryan P (rnpridgeon) left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again for this, I have made a few suggestions to help DRY things up a bit but for the most part it looks good.

'''

def _slow_make_datum_writer(self, schema):
return avro.io.DatumWriter(schema)
Copy link
Copy Markdown
Contributor

@rnpridgeon Ryan P (rnpridgeon) Jul 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although a bit ugly it would be nice to share the interface of the writer returned by _fast_make_datum_writer

def _slow_make_datum_writer(self, schema):
   writer = avro.io.DatumWriter(schema)
   return lambda record, fp: writer.write(record, avro.io.BinaryEncoder(fp))


def _fast_make_datum_writer(self, record_schema):
schema = record_schema.to_json()
return lambda fp, record: schemaless_writer(fp, schema, record)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would transpose lambda arguments fp and record to maintain consistency with the existing writer interface.


# cache writer
self.id_to_writers[schema_id] = avro.io.DatumWriter(schema)
if id not in self.id_to_writers:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this method is also responsible for the schema's registration I do not think this check provides us any benefit.

if not schema:
raise serialize_err("Schema does not exist")
self.id_to_writers[schema_id] = avro.io.DatumWriter(schema)
self.id_to_writers[schema_id] = self.make_datum_writer(schema)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would move this logic back into _encode_record_with_schema_id and eliminate the redundant _slow_encode_record_with_schema_id and _fast_encode_record_with_schema_id methods. If we provide the same interface for both writers there is no longer a need to maintain two code paths which is a bit more DRY.

@rnpridgeon
Copy link
Copy Markdown
Contributor

Thanks for the contribution marcintustin , I went ahead and made the requested changes in #492 to try and get this in for the 1.0 release which enters code freeze at EOD.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants