Add scripts for remote_inference #4
There are two scripts, depending on whether the input data is in an object table or a native BigQuery table.

## Object table script
The object table script creates a target table to store successful annotations. To do this, it calls the inference in a loop. In the first iteration, a small LIMIT is set on the inference call to quickly create a table with the desired schema. The number of rows to process for each inference call can be modified through the batch_size parameter.
It's not just for annotations, but for the result of the ML operation.
nit: Use "`" to quote `batch_size` to make it clear it's a code parameter.
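The batching behaviour described for the object table script can be illustrated with a small simulation. This is only a sketch in plain Python, not the BigQuery script under review: `run_batched_inference`, `fake_infer`, `first_limit`, and the `gs://example-bucket/...` URIs are hypothetical names chosen for illustration, and it assumes each source row is attempted exactly once.

```python
from typing import Callable, Optional


def run_batched_inference(
    source_rows: list[dict],
    infer: Callable[[dict], Optional[dict]],
    batch_size: int = 3,
    first_limit: int = 1,
) -> tuple[list[dict], list[int]]:
    """Simulate the loop: a small first batch, then batch_size rows per call."""
    if batch_size < 1 or first_limit < 1:
        raise ValueError("batch_size and first_limit must be at least 1")

    target: list[dict] = []       # stands in for the target table
    table_created = False         # becomes True after the first call
    batch_sizes: list[int] = []   # rows sent to each inference call
    position = 0                  # index of the next unprocessed source row

    while position < len(source_rows):
        # The first call uses a small limit so the table exists quickly.
        limit = batch_size if table_created else first_limit
        batch = source_rows[position:position + limit]
        position += len(batch)
        batch_sizes.append(len(batch))

        for row in batch:
            result = infer(row)
            if result is not None:  # keep successful results only
                target.append({**row, **result})
        table_created = True

    return target, batch_sizes


def fake_infer(row: dict) -> Optional[dict]:
    """Hypothetical model call that fails for one specific row."""
    if row["uri"].endswith("/3"):
        return None
    return {"ml_result": "ok"}


rows = [{"uri": f"gs://example-bucket/{i}"} for i in range(7)]
target, batch_sizes = run_batched_inference(rows, fake_infer, batch_size=3)
print(batch_sizes, len(target))
```

With seven source rows and `batch_size=3`, the simulated calls process 1, 3, and 3 rows, and the one failing row is left out of the target.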
This script applies to the following models:
- ML.ANNOTATE_IMAGE
- ML.PROCESS_DOCUMENT
- ML.TRANSCRIBE
Also for `ML.GENERATE_TEXT` with the vision model.
ml_query
DEFAULT
FORMAT(
"SELECT %s, text AS content FROM `%s`", ARRAY_TO_STRING(key_columns, ','), source_table);
For the default, probably simpler to just have it as:
DECLARE ml_query DEFAULT "SELECT *, /* ML operation dependent field */ FROM `" || source_table || "`";
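To compare the two defaults, it may help to see the query text each one produces. The sketch below mirrors the string construction in plain Python rather than BigQuery scripting; the function names and the `my_project.my_dataset.my_table` value are hypothetical, and `key_columns` is assumed to be a list of column-name strings.

```python
def default_query_format(key_columns: list[str], source_table: str) -> str:
    """Mirror of the FORMAT-based default: explicit key columns plus text."""
    return "SELECT %s, text AS content FROM `%s`" % (
        ",".join(key_columns),  # same role as ARRAY_TO_STRING(key_columns, ',')
        source_table,
    )


def default_query_concat(source_table: str) -> str:
    """Mirror of the suggested default built with string concatenation."""
    return "SELECT *, /* ML operation dependent field */ FROM `" + source_table + "`"


# Hypothetical inputs, for illustration only.
table = "my_project.my_dataset.my_table"
print(default_query_format(["id", "lang"], table))
print(default_query_concat(table))
```

The suggested form no longer depends on `key_columns`, at the cost of selecting every column from the source table.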
The README.md includes instructions to run the scripts. There are four scripts in total. Two are for object tables and the other two are for structured tables.