Skip to content

Conversation

@Rufflewind
Copy link
Contributor

  1. Generator and StringIO are replaced by BytesGenerator and BytesIO. If BytesGenerator doesn't exist (as is the case in Python 2), fall back to Generator.
  2. BytesGenerator is buggy (see #18886 and #19003) and corrupts \r into \n. To work around this, we implement a patched version of BytesGenerator that replaces ._write_lines with just .write.

Note: BytesGenerator was introduced in Python 3.2. This is OK since the library already demands 3.3+.

The test_multipart_media_good_upload has been updated to reflect the change. It is also stricter now, as it matches the entire request body against the expected form.

Fixes #145.


This ensures that body is of type bytes rather than str.

I did some tests to make sure the new code is still sensible: the flattened MIME of a simple upload of a random 1MiB file on Python 3.5.0 after the fix is identical to that on Python 2.7.10 before and after the fix (modulo dict rearrangements and MIME boundaries).

@googlebot
Copy link
Collaborator

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project, in which case you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed, please reply here (e.g. I signed it!) and we'll verify. Thanks.


  • If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
  • If you signed the CLA as a corporation, please let us know the company's name.

@googlebot googlebot added the cla: no This human has *not* signed the Contributor License Agreement. label Oct 13, 2015
@Rufflewind
Copy link
Contributor Author

I signed it!

@googlebot
Copy link
Collaborator

CLAs look good, thanks!

@googlebot googlebot added cla: yes This human has signed the Contributor License Agreement. and removed cla: no This human has *not* signed the Contributor License Agreement. labels Oct 14, 2015
@nathanielmanistaatgoogle
Copy link
Contributor

What hope is there for test coverage for this change? And what's up with the test failures seen for the current draft of your change?

@Rufflewind
Copy link
Contributor Author

The broken test is now fixed. It is also stricter now and should serve as a regression test for this bug.

@nathanielmanistaatgoogle
Copy link
Contributor

This looks correct to me.

@tmatsuo would you mind being a second pair of eyes on this? The try/except import and monkeypatching are giving me the heebie-jeebies.

This comment was marked as spam.

@Rufflewind
Copy link
Contributor Author

tmatsuo: What happens if g.flatten throws an Exception? I assume BytesGenerator.write will still be patched.

Fixed.

-          g.flatten(msgRoot, unixfrom=False)
-          if write_lines:
-            BytesGenerator._write_lines = write_lines
+          try:
+            g.flatten(msgRoot, unixfrom=False)
+          finally:
+            if write_lines:
+              BytesGenerator._write_lines = write_lines

@tmatsuo
Copy link
Contributor

tmatsuo commented Oct 14, 2015

I'm still worrying when the code is used in multi threaded code. I think some threads might pick the patched version as original.

@nathanielmanistaatgoogle
Copy link
Contributor

Let me ask an alarmingly naïve question that reveals how I see the world with precious, innocent, childlike wonder: how has Python bug 18886 stayed open so long if it is so easy to work around?

@Rufflewind
Copy link
Contributor Author

I'm still worrying when the code is used in multi threaded code. I think some threads might pick the patched version as original.

Found a better way to do this that sidesteps the issue :)

Let me ask an alarmingly naïve question that reveals how I see the world with precious, innocent, childlike wonder: how has Python bug 18886 stayed open so long if it is so easy to work around?

There could be a subtlety: some applications may have a preference for \r\n vs \n in the header lines, which slightly complicates the fix on their end. This is does not matter for googleapiclient since the Google API appears to tolerate \n just fine.

The same bug has also been reported as #19003, which includes a patch.

This comment was marked as spam.

@Rufflewind
Copy link
Contributor Author

nathanielmanistaatgoogle: If you feel that 19003 is a duplicate of 18886, put both on this line?

Done.

 1. Generator and StringIO are replaced by BytesGenerator and BytesIO.
    If BytesGenerator doesn't exist (as is the case in Python 2), fall
    back to Generator.

 2. BytesGenerator is buggy [1] [2] and corrupts '\r' into '\n'.  To
    work around this, we implement a patched version of BytesGenerator
    that replaces ._write_lines with just .write.

The test_multipart_media_good_upload has been updated to reflect the
change.  It is also stricter now, as it matches the entire request body
against the expected form.

Note: BytesGenerator was introduced in Python 3.2.  This is OK since the
      library already demands 3.3+.

Fixes #145.

[1]: https://bugs.python.org/issue18886
[2]: https://bugs.python.org/issue19003
@tmatsuo
Copy link
Contributor

tmatsuo commented Oct 15, 2015

Thanks, LGTM

@nathanielmanistaatgoogle
Copy link
Contributor

Thank you for this conversation and for your contribution to the library!

nathanielmanistaatgoogle added a commit that referenced this pull request Oct 15, 2015
Fix non-resumable binary uploads on Python 3.
@nathanielmanistaatgoogle nathanielmanistaatgoogle merged commit eca709b into googleapis:master Oct 15, 2015
@journeytosilius
Copy link

Is the available master the one that installs via pip ? Cause I am having exactly the original issue that lead to this patch !

@nathanielmanistaatgoogle
Copy link
Contributor

@chromafunk: unfortunately no, we're overdue for a release. Perhaps in the next week? Fingers crossed?

@journeytosilius
Copy link

@nathanielmanistaatgoogle I have modified discovery.py with the patch by @Rufflewind myself but the problem persist when using resumable = False, do you want me to help on this ? I can dump output if you want it.

@naoyak
Copy link

naoyak commented Feb 12, 2016

A new release would be terrific!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla: yes This human has signed the Contributor License Agreement.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Failure when uploading binary files on Python 3 when resumable=False

6 participants