Skip to content

Support nested column descriptions in BigQuery #3854

Open
@tanghyd

Description

I would like to add nested column descriptions to SQLMesh models. If running on BigQuery then the data type would be a RECORD type.

For example I don't just want to document the my_record column, I want to document its nested fields my_record.field_1 and my_record.field_2. All 3 must be described, not just the top level.

This is especially helpful for wide tables that are pre-joined for performance reasons at query time.

I expect the syntax may look like this:

model (
     name my_dataset.my_table,
     description "Contains rows.",
     column_descriptions (
         my_record = "Description of the my_record struct."
         my_record.field_1 = "Description of field 1."
         my_record.field_2 = "Description of field 2."
     )
 );

This functionality is present in dbt but is not available with sqlmesh. See below for how this is documented in a schema.yml file:

      - name: my_record
         description: Description of the my_record struct
         data_type: struct
       - name: my_record.field_1
         description: Description of field 1.
         data_type: string # example
       - name: my_record.field_2
         description: Description of field 2.
         data_type: array<string>  # example

This feature should work regardless if the column is a RECORD (aka STRUCT) or a REPEATED RECORD (aka ARRAY of STRUCTs).

Also see: https://tobiko-data.slack.com/archives/C044BRE5W4S/p1739773563724809?thread_ts=1739773563.724809&cid=C044BRE5W4S

Activity

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Labels

ImprovementImproves existing functionality

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions