Document all columns in mart_ntd & review YAML #3521

Closed

Description

As a data scientist, I want all of our columns documented in dbt so that future maintainers and users of the warehouse understand what each column is and how it should be used. I want to follow the documentation pattern already used for GTFS in our warehouse and apply it to NTD.

Acceptance Criteria:

  • Add documentation for all columns in mart_ntd currently lacking documentation (see query below)
  • Audit associated dbt YAML:
    • In YAML files with more than ~10 models, define common anchors (those used by more than ~3 models) at the very top of the file, as done here: https://github.com/cal-itp/data-infra/blob/main/warehouse/models/mart/gtfs/_mart_gtfs_dims.yml#L3
    • Check that anchors are used appropriately. If a common field has an equivalent description across models, it should use an anchor. If a field with the same name doesn't use the anchor, consider using the anchor and overriding the part that differs, or add a comment explaining why this instance can't use the anchor.
    • Review field documentation and evaluate for completeness/correctness
-- Identify columns missing documentation
SELECT *
FROM `cal-itp-data-infra`.`mart_ntd`.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS
WHERE description IS NULL
ORDER BY table_name
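
The anchor-and-override pattern described in the acceptance criteria could look roughly like the sketch below. This is an illustration only: the holder key `x_common_columns`, the model names, the column name, and the descriptions are all hypothetical and are not taken from the linked file. It relies on standard YAML anchors (`&`), aliases (`*`), and the merge key (`<<`). Check how your dbt version treats an extra top-level key before copying the layout.

```yaml
version: 2

# Hypothetical holder for shared column definitions, placed at the very top
# of the file so every model below can reference them.
x_common_columns:
  - &agency_col
    name: agency
    description: Name of the transit agency reporting to NTD.

models:
  # Hypothetical model that uses the shared definition unchanged.
  - name: example_ntd_model_a
    columns:
      - *agency_col

  # Hypothetical model that reuses the anchor but overrides the description.
  - name: example_ntd_model_b
    columns:
      - <<: *agency_col
        description: Name of the transit agency, as reported in the annual submission.
```

With this layout, a change to the shared description is made once at the top of the file, and any override stays visible next to the model that needs it.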