-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Closed
Labels
api: bigqueryIssues related to the BigQuery API.Issues related to the BigQuery API.
Description
report in googleapis/google-cloud-node#1593 (comment) from a nodejs app, but since my app can also be written in Python I have a test of Python BigQuery API as well, it seems have a same problem:
$ bq.py --format=prettyjson query --nouse_legacy_sql 'SELECT ARRAY[STRUCT("zx83" AS id, 1 AS a, 2 AS b), ("f8f7", 4, 7)] d' |& cat -An
...
2 [$
3 {$
4 "d": [$
5 {$
6 "a": "1", $
7 "b": "2", $
8 "id": "zx83"$
9 }, $
10 {$
11 "a": "4", $
12 "b": "7", $
13 "id": "f8f7"$
14 }$
15 ]$
16 }$
17 ]$
the problems:
- nested value types are not interpreted, the value types of a and b can be inferred as Integers (from bigquery api response, there are schema including nested field's data type), should be converted to python integers; (the way to fix will need somewhat recursively interpret data types)
- keys order are not preserved, someone here might argue following reasons but there are ways workaround:
- JSON object's key order doesn't matter to machine; but when we print a prettyjson format, it's for human, and keys order does matter
- python's default
json.loads
convert a JSON object to python dict which uses key's internal hash code somewhat unpredictable order, way to fix is to load withOrderedDict
, and drop sort_keys parameter when callingjson.dumps
- trailing space in object key value pair lines, this is trivial and some Python JSON API's own problem, there are many careless python doing this including this project; please read below expected output, can be fixed by
json.dumps(..., separators=(',', ': '))
many tools written in Python suffered the same problem, like what I commented in the popular httpie tool: httpie/cli#427 (comment)
HOW to Fix 2 & 3
In [1]: import json
In [2]: from collections import OrderedDict
In [3]: data = json.loads(
'{"d": [ { "id": "zx83", "a": 1, "b": 2}, { "id": "f8f7", "a": 4, "b": 7 } ]}',
object_pairs_hook=OrderedDict)
In [4]: data
Out[4]:
OrderedDict([(u'd',
[OrderedDict([(u'id', u'zx83'), (u'a', 1), (u'b', 2)]),
OrderedDict([(u'id', u'f8f7'), (u'a', 4), (u'b', 7)])])])
In [5]: json.dumps(data, indent=2, separators=(',', ': '))
Out[5]: '{\n "d": [\n {\n "id": "zx83",\n "a": 1,\n "b": 2\n },\n {\n "id": "f8f7",\n "a": 4,\n "b": 7\n }\n ]\n}'
In [6]: print json.dumps(data, indent=2, separators=(',', ': '))
{
"d": [
{
"id": "zx83",
"a": 1,
"b": 2
},
{
"id": "f8f7",
"a": 4,
"b": 7
}
]
}
Expected output:
2 [$
3 {$
4 "d": [$
5 {$
6 "id": "zx83",$
7 "a": 1,$
8 "b": 2$
9 },$
10 {$
11 "id": "f8f7",$
12 "a": 4,$
13 "b": 7$
14 }$
15 ]$
16 }$
17 ]$
Metadata
Metadata
Assignees
Labels
api: bigqueryIssues related to the BigQuery API.Issues related to the BigQuery API.