Can't pass binary data to command via stdin #325

Closed
@polygon

The following code does not work as expected:

import sh
data = b'124343'
print(sh.cat(_in=data))

I'd expect this to pass the contents of data to cat via stdin and print 124343. Instead, it raises:

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/Users/jan/anaconda/envs/py3k/lib/python3.4/threading.py", line 911, in _bootstrap_inner
    self.run()
  File "/Users/jan/anaconda/envs/py3k/lib/python3.4/threading.py", line 859, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/jan/anaconda/envs/py3k/lib/python3.4/site-packages/sh.py", line 1453, in input_thread
    done = stdin.write()
  File "/Users/jan/anaconda/envs/py3k/lib/python3.4/site-packages/sh.py", line 1799, in write
    self.log.debug("got chunk size %d: %r", len(proc_chunk),
TypeError: object of type 'int' has no len()

This is likely because determine_how_to_read_input(input_obj) does not check for the bytes type and falls back to the default iter_chunk_reader, which iterates over the input one element at a time. Iterating over a str yields one-character str objects, but in Python 3 iterating over bytes yields integers, so the len(proc_chunk) call in the traceback above fails. I think the problem can be fixed by handling bytes explicitly inside that function. I might add a PR for that later.
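To illustrate the difference described above, here is a small standalone sketch that does not use sh at all. It shows what iteration over str and bytes yields in Python 3, followed by one possible way to chunk bytes into bytes slices. The helper iter_byte_chunks and its chunk_size parameter are hypothetical names for illustration, not part of sh, and not necessarily how a fix in sh would look.

```python
# Python 3: iterating over str yields one-character str objects,
# while iterating over bytes yields int values in the range 0-255.
text_items = list("124")
byte_items = list(b"124")

print(text_items)  # ['1', '2', '4']
print(byte_items)  # [49, 50, 52]


def iter_byte_chunks(data, chunk_size=1024):
    """Hypothetical helper (not part of sh): yield bytes slices, never ints."""
    for start in range(0, len(data), chunk_size):
        # Slicing bytes returns bytes, so len() works on every chunk.
        yield data[start:start + chunk_size]


chunks = list(iter_byte_chunks(b"124343", chunk_size=4))
print(chunks)  # [b'1243', b'43']
```

Because each chunk is a bytes object, a call such as len(chunk) succeeds, unlike with the integers produced by plain iteration.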

For now, a usable workaround is to wrap the data in an io.BytesIO buffer so that file_chunk_reader is used instead.

import sh
import io

data = b'124343'
buffer = io.BytesIO(data)
print(sh.cat(_in=buffer))

With this change, the output is 124343 as expected.
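If the same workaround is needed in several places, it could be wrapped in a small helper along these lines. This is only a sketch: as_stdin is a hypothetical name, not something sh provides, and it assumes that anything other than bytes or bytearray can be passed to _in unchanged.

```python
import io


def as_stdin(data):
    """Hypothetical helper: wrap raw bytes in a file-like object.

    Anything that is not bytes or bytearray is returned unchanged.
    """
    if isinstance(data, (bytes, bytearray)):
        return io.BytesIO(data)
    return data


buffer = as_stdin(b'124343')
print(buffer.getvalue())  # b'124343'
```

With sh installed, the call from the workaround above would then read sh.cat(_in=as_stdin(data)).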
