feat: set query label session property in bq session #4314

ncbkr · 2025-05-05T17:17:32Z

we want to use the SQLMesh session properties to attach labels to the BigQuery jobs. Using these labels we want to enable ourselves to

map BigQuery jobs to SQLMesh models
map BigQuery jobs to owners/users of SQLMesh

In the model definition, this would like as follows

session_properties (
    query_label = [
      ('label1', 'value1'),
      ('label2', 'value2'),
      ('label3', 'value3')
    ]
  )

This is currently not possible in an easy way since all jobs are created by the same service account.

The screenshot below displays the information about a BigQuery job created by sqlmesh to backfill a model. In the last line you can see the label my_session_property_key: my_session_property_value. This comes from the session_properties of the sqlmesh model definition. According to the google docs for BigQuery, all jobs that are run within a session inherit the labels set at the beginning of the session using SET @@query_label = <my_labels>

izeigerman · 2025-05-05T17:31:48Z

sqlmesh/core/engine_adapter/bigquery.py

@@ -182,7 +182,32 @@ def query_factory() -> Query:
    def _begin_session(self, properties: SessionProperties) -> None:
        from google.cloud.bigquery import QueryJobConfig

-        job = self.client.query("SELECT 1;", job_config=QueryJobConfig(create_session=True))
+        query_label = []
+        if "query_label" in properties and isinstance(properties["query_label"], exp.Array):


I think this can also be exp.Tuple not just exp.Array. Eg:

query_labels = ['label_a', 'label_b']

vs

query_labels = ('label_a', 'label_b')

Yep, perhaps also Paren as an edge case.

Sorry for not including a usage example in the PR description. I imagined the model definition to look like this

session_properties ( query_label = [ ('key1', 'value1'), ('key2', 'value2'), ('key3', 'value3') ] )

I chose this format based on the google docs:

a key-value pair for the label to assign to all future queries in the session. You can also add multiple key-value pairs, separated by a comma (for example, SET @@query_label = "key1:value1,key2:value2").

Also, this is consistent with the usage of labels in both, the virtual and phyiscal properties.

This format is also why I explicitely check for an array.

What do you think about this format? Would you prefer another one?

e.g. choosing this format

query_labels = ['label_a', 'label_b'] | ('label_a', 'label_b') # array or tuple

label_a and label_b would have to be a string following this pattern: key:value

@georgesittas would this be an edge case you had in mind?

query_labels = ('label_a')

So query_label is not a tuple or array but a single string 'key:vale' wrapped in parentheses?

The format makes sense, but it might also be:

session_properties ( query_label = ( ('key1', 'value1'), ('key2', 'value2'), ('key3', 'value3') ) )

which would make a tuple of tuples rather than array of tuples.
Similarly

session_properties ( query_label = ( ('key1', 'value1') ) )

makes it a Paren of tuples.

izeigerman · 2025-05-05T17:32:47Z

sqlmesh/core/engine_adapter/bigquery.py

+                if (
+                    isinstance(label_tuple, exp.Tuple)
+                    and len(label_tuple.expressions) == 2
+                    and all(isinstance(label, exp.Literal) for label in label_tuple.expressions)


what if it's an identifier and not a string literal eg: query_labels = [ label_a ] and not query_labels = [ 'label_a' ]. Might worth support both approaches.

We could simply require strings; not sure what's the value of supporting identifiers here. Seems like an unnecessary increase in scope?

izeigerman · 2025-05-05T17:33:26Z

sqlmesh/core/engine_adapter/bigquery.py

+                    )
+                else:
+                    raise SQLMeshError(
+                        "Invalid query label format. Expected a tuple of two string literals."


I feel like this is something that should be captured at model load time and not at execution time. This seems too late.

izeigerman · 2025-05-05T17:33:45Z

Thanks for your contribution! Can you please add some tests.

feat: set query label session property in bq session

28ce66a

izeigerman reviewed May 5, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: set query label session property in bq session #4314

feat: set query label session property in bq session #4314

ncbkr commented May 5, 2025 •

edited

Loading

izeigerman May 5, 2025

georgesittas May 5, 2025

ncbkr May 6, 2025 •

edited

Loading

ncbkr May 6, 2025

izeigerman May 6, 2025 •

edited

Loading

izeigerman May 5, 2025

georgesittas May 7, 2025

izeigerman May 5, 2025

izeigerman commented May 5, 2025

feat: set query label session property in bq session #4314

Are you sure you want to change the base?

feat: set query label session property in bq session #4314

Conversation

ncbkr commented May 5, 2025 • edited Loading

izeigerman May 5, 2025

Choose a reason for hiding this comment

georgesittas May 5, 2025

Choose a reason for hiding this comment

ncbkr May 6, 2025 • edited Loading

Choose a reason for hiding this comment

ncbkr May 6, 2025

Choose a reason for hiding this comment

izeigerman May 6, 2025 • edited Loading

Choose a reason for hiding this comment

izeigerman May 5, 2025

Choose a reason for hiding this comment

georgesittas May 7, 2025

Choose a reason for hiding this comment

izeigerman May 5, 2025

Choose a reason for hiding this comment

izeigerman commented May 5, 2025

ncbkr commented May 5, 2025 •

edited

Loading

ncbkr May 6, 2025 •

edited

Loading

izeigerman May 6, 2025 •

edited

Loading