Description
This issue documents the workaround used to make the `avro` package work in Python 3. It might be helpful in the future.
- The `avro` package is an indirect dependency of `cwltool` (it is a direct dependency of the `schema_salad` tool). It ships as two separate versions, one for Python 2 and one for Python 3. The Python 2 version has worked well for us in the past.
- During the Python 3 porting process, we found that `avro` for Python 3 had a few bugs and did not work well with CWL Tool, even after repeated efforts. However, running `avro-python2` through the `2to3` fixer and then using it in a Python 3 runtime worked well with very minimal changes.
- We tried hard to convince the Apache Avro developers to combine the Python 2 and Python 3 packages. PRs: "schema.py: No sys traceback in parse exception" (apache/avro#235), "Use python-modernize to make py code Python 3 compatible" (apache/avro#234), and "AVRO-1788: Implement Python 2 API" (apache/avro#133). Discussion on JIRA: https://issues.apache.org/jira/browse/AVRO-2046. All of this effort was largely met with no response.
- We finally forked `avro`, modified the source a little to release `avro-cwl`, and now use it with Python 3 with the help of the autotranslate module, which basically applies the 2to3 fixers at runtime.
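For reference, the idea of applying the 2to3 fixers at runtime can be sketched with the standard library's `lib2to3` directly. This is only an illustration and not the autotranslate module's actual API: the helper `translate_py2_source` and the sample snippet are hypothetical. Note that `lib2to3` is deprecated in recent Python 3 releases and was removed in Python 3.13, so the sketch guards the import.

```python
import warnings

try:
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # silence lib2to3 deprecation warnings
        from lib2to3.refactor import RefactoringTool, get_fixers_from_package
except ImportError:
    # lib2to3 is missing (removed from the standard library in Python 3.13)
    RefactoringTool = None


def translate_py2_source(source, name="<py2-source>"):
    """Hypothetical helper: return Python 3 source for a Python 2 source string."""
    if RefactoringTool is None:
        raise RuntimeError("lib2to3 is not available on this interpreter")
    if not source.endswith("\n"):
        source += "\n"  # lib2to3 expects the input to end with a newline
    # Load every stock fixer shipped in lib2to3.fixes
    tool = RefactoringTool(get_fixers_from_package("lib2to3.fixes"))
    return str(tool.refactor_string(source, name))


# A tiny Python 2 only snippet: dict.has_key() does not exist in Python 3
PY2_SNIPPET = "d = {'a': 1}\nfound = d.has_key('a')\n"

if RefactoringTool is not None:
    py3_source = translate_py2_source(PY2_SNIPPET)
    namespace = {}
    # Execute the translated source in the running Python 3 interpreter
    exec(compile(py3_source, "<translated>", "exec"), namespace)
```

A real runtime translator would hook this kind of translation into the import machinery so that whole modules are converted as they are loaded; the sketch only shows the translation step on a single string.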
The workaround is a hack that seems to work. There are bound to be regressions with this method. In the future we can try the following options:
- Get this PR merged, so that a single `avro` package supports both Python 2 and 3 out of the box, with no need for fixers.
- Consider merging the above PR into our fork of `avro` and using that solution, should Apache Avro refuse to accept the patch.
All of this was discussed in PR #442, which might be a useful reference.