-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Support read of repeated string fields in BigQuery #68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support read of repeated string fields in BigQuery #68
Conversation
|
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed (or fixed any issues), please reply here with What to do if you already signed the CLAIndividual signers
Corporate signers
ℹ️ Googlers: Go here for more info. |
|
@googlebot I signed it! |
|
CLAs look good, thanks! ℹ️ Googlers: Go here for more info. |
sabhyankar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the PR! Can you take a look at the comments and make the necessary corrections?
| switch (column.getMode()) { | ||
| case "NULLABLE": | ||
| case "REQUIRED": | ||
| String strValue = columnValue.toString(); | ||
| valueBuilder.setStringValue(strValue); | ||
| boolean excludeFromIndexes = strValue.getBytes().length > MAX_STRING_SIZE_BYTES; | ||
| valueBuilder.setExcludeFromIndexes(excludeFromIndexes); | ||
| break; | ||
|
|
||
| case "REPEATED": | ||
|
|
||
| ArrayValue.Builder arrayValueBuilder = ArrayValue.newBuilder(); | ||
| boolean excludeArrayFromIndexes = false; | ||
|
|
||
| @SuppressWarnings("unchecked") | ||
| List<Object> objectList = (List<Object>) columnValue; | ||
| for (Object object: objectList) { | ||
| String arrayStringValue = object.toString(); | ||
| if (arrayStringValue.getBytes().length > MAX_STRING_SIZE_BYTES) { | ||
| excludeArrayFromIndexes = true; | ||
| } | ||
| arrayValueBuilder.addValues(Value.newBuilder().setStringValue(arrayStringValue).build()); | ||
| } | ||
| valueBuilder.setArrayValue(arrayValueBuilder.build()); | ||
| valueBuilder.setExcludeFromIndexes(excludeArrayFromIndexes); | ||
| break; | ||
|
|
||
| } | ||
| break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I am reading this correctly, we are only handling REQUIRED and REPEATED fields for STRING types? Why not extend this to the other supported data types (INT64, FLOAT etc)?
| valueBuilder.setStringValue(strValue); | ||
| boolean excludeFromIndexes = strValue.getBytes().length > MAX_STRING_SIZE_BYTES; | ||
| valueBuilder.setExcludeFromIndexes(excludeFromIndexes); | ||
| switch (column.getMode()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets add unit test coverage as well.
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
* Dml integration (#53) * Added extensive UT Added extensive UT * Cassandra pr bug fixes (#57) * Cassandra Consolidate Unit Test case and Regression testing fixes (#58) * Added Mapping fixes * Added Spoltles fixes * Added Consolidated fixes * Added TODO * Addess Data and Time * Cassandra pr bug fixes (#64) * Handle TypeHandler Parsing issue fixes (#65) Co-authored-by: pawankashyapollion <v-pawan.kumar@ollion.com> * Added Safe handle (#68) * Handle LocalTime For Time Data Type In Cassandra (#69) * Cassandra pr bug fixes (#70) * Handle Timestamp Fixes (#72) * Added Code Combined in a single way * Address The Unwanted Hop * Cassandra pr bug fixes (#75) * Added PR Review Comments * Remove NamesCol Dependecy as spannerTableName is same as In Given Mapping * Added spannerTableId for fetching Mapping * Removed SpannerToID and also Updated Session file with proper structure * Timestamp in milisecond * removed assertNotNull from UT wherever possible * Added Fixes * Added Note Instead of Question * -- review fixes (#78) * Added Bytes to hex to blob conversion * Handling Bytes as Binary encoded As of now * Passing Null Value to Primary Key as well for cassandra * Added UT fixes * Added UT refectoring * Reverse merge confict fixes --------- Co-authored-by: pawankashyapollion <v-pawan.kumar@ollion.com> Co-authored-by: Akash Thawait <aakash@ollion.com>
I want to read a repeated string field from BigQuery, but there is no check for field mode in the
BigQueryConvertersclass.This adds such a check for
STRINGfields only. This should be made more general, though,