UnicodeDecodeError with html.table_schema = True #16848
Comments
TomAugspurger added the IO JSON (read_json, to_json, json_normalize), Unicode (Unicode strings), and Difficulty Intermediate labels on Jul 7, 2017
The underlying problem is in the call to to_json:
In [4]: pd.Series([b'\x00\x00\x00\x00\x00\x01\x82S']).to_json()
---------------------------------------------------------------------------
UnicodeDecodeError Traceback (most recent call last)
<ipython-input-4-13bdc7ec1cc6> in <module>()
----> 1 pd.Series([b'\x00\x00\x00\x00\x00\x01\x82S']).to_json()
~/Envs/pandas-dev/lib/python3.6/site-packages/pandas/pandas/core/generic.py in to_json(self, path_or_buf, orient, date_format, double_precision, force_ascii, date_unit, default_handler, lines)
1250 force_ascii=force_ascii, date_unit=date_unit,
1251 default_handler=default_handler,
-> 1252 lines=lines)
1253
1254 def to_hdf(self, path_or_buf, key, **kwargs):
~/Envs/pandas-dev/lib/python3.6/site-packages/pandas/pandas/io/json/json.py in to_json(path_or_buf, obj, orient, date_format, double_precision, force_ascii, date_unit, default_handler, lines)
46 obj, orient=orient, date_format=date_format,
47 double_precision=double_precision, ensure_ascii=force_ascii,
---> 48 date_unit=date_unit, default_handler=default_handler).write()
49
50 if lines:
~/Envs/pandas-dev/lib/python3.6/site-packages/pandas/pandas/io/json/json.py in write(self)
90 date_unit=self.date_unit,
91 iso_dates=self.date_format == 'iso',
---> 92 default_handler=self.default_handler
93 )
94
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x82 in position 42: invalid start byte
I'm not sure how tricky it would be to pass through. The standard library doesn't even try to serialize bytes:
import json
In [9]: json.dumps(s[0])
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-9-4b0d6b435871> in <module>()
----> 1 json.dumps(s[0])
/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/__init__.py in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
229 cls is None and indent is None and separators is None and
230 default is None and not sort_keys and not kw):
--> 231 return _default_encoder.encode(obj)
232 if cls is None:
233 cls = JSONEncoder
/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py in encode(self, o)
197 # exceptions aren't as detailed. The list call should be roughly
198 # equivalent to the PySequence_Fast that ''.join() would do.
--> 199 chunks = self.iterencode(o, _one_shot=True)
200 if not isinstance(chunks, (list, tuple)):
201 chunks = list(chunks)
/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py in iterencode(self, o, _one_shot)
255 self.key_separator, self.item_separator, self.sort_keys,
256 self.skipkeys, _one_shot)
--> 257 return _iterencode(o, 0)
258
259 def _make_iterencode(markers, _default, _encoder, _indent, _floatstr,
/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py in default(self, o)
178 """
179 raise TypeError("Object of type '%s' is not JSON serializable" %
--> 180 o.__class__.__name__)
181
182 def encode(self, o):
TypeError: Object of type 'bytes' is not JSON serializable
Either way, we don't want |
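For reference, the standard-library behavior shown above can be worked around with the `default` hook of `json.dumps`, which is called for objects the encoder cannot serialize. The following is only a minimal sketch, not what pandas does: the helper name `encode_bytes` and the choice of base64 as the text representation are assumptions made for illustration.

```python
import base64
import json


def encode_bytes(obj):
    """Hypothetical fallback for json.dumps: emit bytes as a base64 string."""
    if isinstance(obj, (bytes, bytearray)):
        return base64.b64encode(bytes(obj)).decode("ascii")
    # Mirror the standard library's behavior for anything else.
    raise TypeError(
        f"Object of type {type(obj).__name__!r} is not JSON serializable"
    )


raw = b"\x00\x00\x00\x00\x00\x01\x82S"

# Without a default handler this raises TypeError, as in the traceback above.
try:
    json.dumps(raw)
    plain_dumps_failed = False
except TypeError:
    plain_dumps_failed = True

# With the handler, the bytes are passed through as base64 text.
encoded = json.dumps(raw, default=encode_bytes)
decoded = base64.b64decode(json.loads(encoded))
```

Base64 keeps the value lossless, so a consumer that knows the convention can recover the original bytes. Whether an equivalent hook could be wired into the pandas JSON writer is the open question in this comment.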
jbrockmendel added the IO HTML (read_html, to_html, Styler.apply, Styler.applymap) label and removed the Effort Medium label on Oct 16, 2019
I have a DataFrame with a column of binary data which has an object dtype.
If display.html.table_schema is False, then this displays fine, with the display just being the repr of the byte string.
If I set display.html.table_schema = True, then attempting to display the DataFrame throws a UnicodeDecodeError from the JSON conversion.
It would be good if the display also worked when using the table_schema option.
Output of pd.show_versions()
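Until the table_schema path handles bytes, one possible user-side workaround is to replace the byte strings with their repr before display, which matches what the regular display shows. This is a rough sketch under assumptions: the helper name `bytes_to_repr` is hypothetical, and a plain list stands in for the DataFrame column so the example runs without pandas.

```python
import json


def bytes_to_repr(value):
    """Hypothetical helper: return repr() for bytes, leave other values as-is."""
    if isinstance(value, (bytes, bytearray)):
        return repr(bytes(value))
    return value


# A plain list standing in for an object-dtype column with mixed values.
column = [b"\x00\x00\x00\x00\x00\x01\x82S", "plain text", 3]
safe_column = [bytes_to_repr(v) for v in column]

# Every value is now JSON serializable by the standard library.
payload = json.dumps(safe_column)
```

With pandas, the analogous step would presumably be something like `df["blob"].map(bytes_to_repr)` on a copy of the frame before displaying it, where `blob` is a made-up column name. That has not been checked against the table_schema code path here.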