BigQuery Time Values May Be Incorrect #1125
@arterrey Can you elaborate on where you are "getting values out of BigQuery in seconds"? For example, how do you use this library, and how does the output differ from what you expect?
Also making it require a float (or int) instead of bailing out for NoneType, changing the use of the method in bigquery to not pass potentially null values and to multiply by 1000.0 (converting from millis to micros), and updating the micros -> datetime conversion in datastore to use the updated method.
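For reference, here is a minimal sketch of the behavior that change describes; the helper name and epoch constant are assumptions for illustration, not the library's actual code:

```python
import datetime

_EPOCH = datetime.datetime(1970, 1, 1)

def _datetime_from_millis(value):
    """Hypothetical helper mirroring the change described above:
    require a real number (no NoneType), treat it as milliseconds,
    and multiply by 1000.0 to get microseconds."""
    micros = 1000.0 * float(value)  # millis -> micros
    return _EPOCH + datetime.timedelta(microseconds=micros)
```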
The field value from the web interface is "2012-06-12 08:05:14 UTC". From the pdb session below you can see that the value returns a datetime in 1970 rather than 2012.
That's really great info! Thanks for showing the pdb session. It looks like value == 1339.488314009370. Which request did you make to retrieve this data? Can you provide a code snippet?
I think you mean value == 1339488314.00937. I'm just using:

```python
def fetch_table(data_set, table_name):
    table = data_set.table(table_name)
    assert table.exists()
    table.reload()
    next_page = None
    while True:
        rows, total_count, next_page = table.fetch_data(page_token=next_page)
        for row in rows:
            # Use the schema to convert each row tuple into a dict.
            row_dict = {}
            for i in range(len(table.schema)):
                field = table.schema[i]
                row_dict[field.name] = row[i]
            # PathableDict is defined elsewhere in my code, not in this library.
            yield PathableDict(row_dict)
        if next_page is None:
            break
```
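For context, a hypothetical call site for the generator above; the project, dataset, and table names are made up, and the gcloud-era client API (bigquery.Client, client.dataset()) is assumed:

```python
# Hypothetical usage of fetch_table(); names below are illustrative only.
from gcloud import bigquery

client = bigquery.Client(project='my-project')  # assumed project id
dataset = client.dataset('my_dataset')          # assumed dataset name

for record in fetch_table(dataset, 'my_table'):  # assumed table name
    print(record)
```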
Sorry, I overlooked the actual value:

```
(Pdb) value
u'1.33948831400937E9'
```

It seems that instead of datetime.timedelta(microseconds=1000.0 * float(value)) we should just be using datetime.timedelta(seconds=float(value)). Like you said, the value is seconds, not milliseconds. I couldn't find any evidence of "seconds" in the discovery doc. I'm off to bed for now but will try to throw in some test data (from our system tests) and inspect the output in the APIs Explorer.

Also, FWIW, re: the link in our docstring: really just

```python
rows, total_count, next_page = table.fetch_data(page_token=next_page)
```

was what I was looking for. Thanks!
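To make the off-by-1000 concrete, here is a small worked check using the value from the pdb session above; the expected results are taken from the web-UI value quoted earlier in the thread:

```python
import datetime

epoch = datetime.datetime(1970, 1, 1)
value = u'1.33948831400937E9'  # raw TIMESTAMP cell from the API

# Current behavior: treat the value as milliseconds (x1000.0 to micros).
wrong = epoch + datetime.timedelta(microseconds=1000.0 * float(value))
print(wrong)  # 1970-01-16 12:04:48.314009 -- the bogus 1970 datetime

# Proposed fix: the value is seconds since the epoch.
right = epoch + datetime.timedelta(seconds=float(value))
print(right)  # 2012-06-12 08:05:14.009370 -- matches the web UI
```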
I just created a table with the following schema:

```json
[
  {
    "name": "Timestamp",
    "type": "TIMESTAMP",
    "mode": "REQUIRED"
  },
  {
    "name": "Description",
    "type": "STRING",
    "mode": "REQUIRED"
  },
  {
    "name": "Quantity",
    "type": "INTEGER",
    "mode": "REQUIRED"
  },
  {
    "name": "Unit_Price",
    "type": "FLOAT",
    "mode": "REQUIRED"
  }
]
```

and uploaded the following data:

```
$ cat /tmp/with_timestamp.csv
2015-09-09T17:59:22Z,Testing,1,2.5
```

In the developer console, the data looks like:

[screenshot of the row in the developer console]
In the API explorer, tabledata.list returns:

```json
{
  "kind": "bigquery#tableDataList",
  "etag": "\"m-9bZ2XqlR7n8jGvqgVmLOTsm1s/xddlFZR3Ob1xPfHtKV-u3wV2TLU\"",
  "totalRows": "1",
  "rows": [
    {
      "f": [
        {
          "v": "1.441821562E9"
        },
        {
          "v": "Testing"
        },
        {
          "v": "1"
        },
        {
          "v": "2.5"
        }
      ]
    }
  ]
}
```

The Preparing Data for BigQuery docs don't specify how timestamps are returned, only the formats which can be used to load the data.
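As a quick sanity check (not library code), decoding that first cell as epoch seconds round-trips to the uploaded CSV value:

```python
import datetime

raw = "1.441821562E9"  # the "v" value for the TIMESTAMP column above

# float() handles the scientific notation directly.
ts = datetime.datetime(1970, 1, 1) + datetime.timedelta(seconds=float(raw))
print(ts.isoformat() + 'Z')  # 2015-09-09T17:59:22Z -- the value we loaded
```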
So what shall we do? Also, how does loading handle the timestamp string?
Given that the values are always returned as scientific-formatted floats of seconds (unlike the format they were loaded in), parsing them as seconds seems right. ISO-8601 strings are one of the supported formats for loading.
I mean, what is the codepath?
I see. So the CSV goes directly through GCS, and our code doesn't actually handle the timestamp string?
Yup.
On the _datetime_from_json function, @arterrey commented that the values coming out of BigQuery are in seconds. This contradicts the note in the code that the value is in milliseconds.