Iterating BigQuery query results produces incorrect number of rows #506
Closed
Description
This is in relation to post on Stack Overflow post here.
As per the sample code here, a user should be be able to paginate the results of a BigQuery query using the following code structure:
while (result != null) {
Iterator<List<FieldValue>> iter = result.iterateAll();
while (iter.hasNext()) {
List<FieldValue> row = iter.next();
//do something with row/data
System.out.println(row);
}
result = result.getNextPage();
}
}
However, when the result set is large (in my tests >31,000 rows), more rows are returned/iterated even though calling result.getTotalRows()
returns the correct/expected number of rows.
For example, I have a table with 85,250 rows (9.45 MB). When I query it via the Java API, and use the code from the example above, it actually iterates 160,296 times.
Even if I limit the result set in the query using limit 5000
, and set setPageSize(1000L)
, then it iterates 15,000 times e.g:
QueryRequest queryRequest = QueryRequest
.newBuilder("SELECT * FROM [<my-project-id>:<dataset>.<table_with_85250_rows>] limit 5000")
.setUseLegacySql(true)
.setPageSize(1000L)
.build();
QueryResponse response = bigQuery.query(queryRequest);
QueryResult result = response.getResult();
System.out.println("Total rows: " + result.getTotalRows());
Integer rowNumber = 1;
while(result != null){
Iterator<List<FieldValue>> iter = result.iterateAll();
while(iter.hasNext()){
List<FieldValue> row = iter.next();
System.out.println("Row: " + rowNumber + ", with number of columns: " + row.size());
rowNumber++;
}
result = result.getNextPage();
}
Output:
Total rows: 5000
Row: 1, with number of columns: 11
Row: 2, with number of columns: 11
Row: 3, with number of columns: 11
Row: 4, with number of columns: 11
Row: 5, with number of columns: 11
Row: 6, with number of columns: 11
Row: 7, with number of columns: 11
Row: 8, with number of columns: 11
Row: 9, with number of columns: 11
Row: 10, with number of columns: 11
Row: 11, with number of columns: 11
[.....]
Row: 14997, with number of columns: 11
Row: 14998, with number of columns: 11
Row: 14999, with number of columns: 11
Row: 15000, with number of columns: 11
Metadata
Assignees
Labels
No labels