Add support for postgres bytea type #6987

Merged
merged 1 commit into from
Mar 15, 2019

villebro
Member

@villebro villebro commented Mar 6, 2019

Psycopg2 returns a memoryview object for bytea type, which can be read by calling .tobytes(). Fixes #6981
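
A minimal sketch of the behaviour described above, using only the Python standard library. No database is involved: the `memoryview` is built by hand as a stand-in for the value psycopg2 would hand back for a bytea column, and the variable names are illustrative only.

```python
# Stand-in for the object psycopg2 returns for a bytea column (assumption:
# constructed by hand here, no database connection is used).
raw = memoryview(b"hello")

# .tobytes() copies the underlying buffer into an ordinary bytes object.
as_bytes = raw.tobytes()

print(type(raw).__name__)       # memoryview
print(type(as_bytes).__name__)  # bytes
```

Note that the result is still `bytes`, not `str`, so a further step is needed before the value can be placed in a JSON payload.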

Before:

screenshot 2019-03-07 at 7 33 45

screenshot 2019-03-07 at 7 33 57

After

screenshot 2019-03-06 at 21 33 04

screenshot 2019-03-07 at 7 33 10

@codecov-io

Codecov Report

Merging #6987 into master will decrease coverage by <.01%.
The diff coverage is 50%.


@@            Coverage Diff             @@
##           master    #6987      +/-   ##
==========================================
- Coverage   64.38%   64.38%   -0.01%     
==========================================
  Files         421      421              
  Lines       20574    20576       +2     
  Branches     2251     2251              
==========================================
+ Hits        13247    13248       +1     
- Misses       7194     7195       +1     
  Partials      133      133

Impacted Files            Coverage Δ
superset/utils/core.py    88.22% <50%> (-0.14%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update c1ba914...e876db5.

@villebro villebro changed the title from "Add handling for memoryview" to "Add support for postgres bytea type" Mar 8, 2019
@mistercrunch mistercrunch merged commit 5e66008 into apache:master Mar 15, 2019
@mmuru
Contributor

mmuru commented Mar 16, 2019

@villebro: I tried to verify the fix in this PR. Now both preview and running a query in SQL Lab throw the following exception:

2019-03-16 14:25:34,661:ERROR:root:'utf-8' codec can't decode byte 0xac in position 0: invalid start byte
Traceback (most recent call last):
  File "/Users/muru/muru-superset/superset/views/core.py", line 2613, in sql_json
    encoding=None,
  File "/Users/muru/muru-superset/venv367/lib/python3.6/site-packages/simplejson/__init__.py", line 399, in dumps
    **kw).encode(obj)
  File "/Users/muru/muru-superset/venv367/lib/python3.6/site-packages/simplejson/encoder.py", line 296, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/Users/muru/muru-superset/venv367/lib/python3.6/site-packages/simplejson/encoder.py", line 378, in iterencode
    return _iterencode(o, 0)
  File "/Users/muru/muru-superset/superset/utils/core.py", line 378, in pessimistic_json_iso_dttm_ser
    return json_iso_dttm_ser(obj, pessimistic=True)
  File "/Users/muru/muru-superset/superset/utils/core.py", line 360, in json_iso_dttm_ser
    val = base_json_conv(obj)
  File "/Users/muru/muru-superset/superset/utils/core.py", line 344, in base_json_conv
    return str(obj.tobytes(), 'utf8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xac in position 0: invalid start byte
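
A minimal sketch of why this call fails, assuming the column holds arbitrary binary data rather than UTF-8 text. The two-byte payload is made up for illustration; only the `str(obj.tobytes(), 'utf8')` call shape is taken from the traceback.

```python
# Made-up payload: 0xac is a UTF-8 continuation byte, so it cannot start
# a valid UTF-8 sequence.
payload = memoryview(b"\xac\x01")

text = None
message = None
try:
    # Same call shape as the failing line in base_json_conv.
    text = str(payload.tobytes(), "utf8")
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)
```

Any bytea value that is not valid UTF-8 takes this path, so decoding only works for columns that happen to store text.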

@villebro
Member Author

Ok let's reopen #6981 and take another stab at this. Any additional info you can give (postgres version, create table script, sample data that throws the error etc) will help track down the problem.

@mmuru
Contributor

mmuru commented Mar 18, 2019

@villebro:
The data must be in binary format. As I mentioned, this is a decoding issue: the binary data is being decoded as UTF-8.

Here is a test case to reproduce the issue:

create table if not exists test_bytea (
    b_data bytea
);

Please unzip the attachment and load the data using copy:
bytea.dat.zip

copy test_bytea from '/Users/muru/Downloads/bytea.dat' WITH (FORMAT Binary);

Ping me if you need any other information.

@villebro
Member Author

@mmuru I was unable to get preview mode working without explicit handling for memoryview, but making this work seems fairly straightforward. Would this be the expected behaviour?

Screenshot 2019-03-19 at 8 35 23

@mmuru
Contributor

mmuru commented Mar 19, 2019

@villebro: Yes, that's correct. I think that instead of displaying the content of the binary data we should simply show "binary data" or similar text. In the released Superset version 0.28.1, preview mode shows "Unserializable [<class 'memoryview'>]" for a bytea column without any error. The point is that the user should be able to run select * from table (where the table contains a bytea column), especially if the table has a lot of columns.
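
One way the placeholder idea could look, sketched with the standard-library `json` module rather than Superset's own serializer. The function name, the placeholder text, and the sample row are all hypothetical and are not taken from the Superset codebase.

```python
import json

# Hypothetical placeholder text; the wording is an assumption.
BINARY_PLACEHOLDER = "<binary data>"


def json_default_with_binary_placeholder(obj):
    """Fallback for json.dumps: replace binary values with a fixed label."""
    if isinstance(obj, (memoryview, bytes, bytearray)):
        return BINARY_PLACEHOLDER
    raise TypeError(f"Unserializable object of type {type(obj).__name__}")


# Made-up row containing a bytea-like value that is not valid UTF-8.
row = {"id": 1, "b_data": memoryview(b"\xac\x01")}
encoded = json.dumps(row, default=json_default_with_binary_placeholder)
print(encoded)
```

Because the bytes are never decoded, a select * over a table with a bytea column would serialize without raising, at the cost of not showing the binary content.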

@mistercrunch mistercrunch added the 🏷️ bot and 🚢 0.34.0 labels Feb 28, 2024
Successfully merging this pull request may close these issues.

BUG: unable to select column with byteb datatype
5 participants