dev_tables.json [db_id=="dw"] column_names_original & column_types misaligned #1

@orangepips

I am working on converting the dataset into DuckDB format, and so far I have only worked with the "DW" dataset. For DW, the column names and data types appear to be misaligned: the entries in column_names_original no longer line up positionally with the entries in column_types. After reviewing the history, it appears this broke with the latest commit.

A good example is:

dw#sep#ZIP_USA

Current (misaligned):

beaver/dev_tables.json

Lines 12807 to 12830 in aa9b0f9

"dw#sep#ZIP_USA": {
    "db_id": "dw",
    "table_name_original": "ZIP_USA",
    "column_names_original": [
        "ZIP_CODE",
        "ZIP_TYPE",
        "CITY_NAME",
        "CITY_TYPE",
        "COUNTY_NAME",
        "STATE_ABBR",
        "STATE_NAME",
        "WAREHOUSE_LOAD_DATE"
    ],
    "column_types": [
        "VARCHAR2",
        "DATE",
        "VARCHAR2",
        "VARCHAR2",
        "VARCHAR2",
        "VARCHAR2",
        "VARCHAR2",
        "VARCHAR2"
    ]
},

Prior (aligned):

beaver/dev_tables.json

Lines 12807 to 12830 in 53c7ac8

"dw#sep#ZIP_USA": {
    "db_id": "dw",
    "table_name_original": "ZIP_USA",
    "column_names_original": [
        "STATE_NAME",
        "WAREHOUSE_LOAD_DATE",
        "ZIP_CODE",
        "ZIP_TYPE",
        "CITY_NAME",
        "CITY_TYPE",
        "COUNTY_NAME",
        "STATE_ABBR"
    ],
    "column_types": [
        "VARCHAR2",
        "DATE",
        "VARCHAR2",
        "VARCHAR2",
        "VARCHAR2",
        "VARCHAR2",
        "VARCHAR2",
        "VARCHAR2"
    ]
},

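Compare the position of WAREHOUSE_LOAD_DATE with the position of DATE in both Current and Prior. In Prior, both are the second entry of their lists, so they pair up. In Current, WAREHOUSE_LOAD_DATE has moved to the last position while DATE is still second, so DATE now pairs with ZIP_TYPE and WAREHOUSE_LOAD_DATE pairs with VARCHAR2.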
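For reference, here is a rough sketch of how the mismatch can be spotted by pairing the two lists positionally. The function name `suspicious_pairs` and the rule "a name ending in _DATE should have type DATE, and nothing else should" are my own assumptions for illustration; they are a heuristic, not something the dataset documents.

```python
# Entry copied from the "Current" snippet (commit aa9b0f9).
entry = {
    "column_names_original": [
        "ZIP_CODE", "ZIP_TYPE", "CITY_NAME", "CITY_TYPE",
        "COUNTY_NAME", "STATE_ABBR", "STATE_NAME", "WAREHOUSE_LOAD_DATE",
    ],
    "column_types": [
        "VARCHAR2", "DATE", "VARCHAR2", "VARCHAR2",
        "VARCHAR2", "VARCHAR2", "VARCHAR2", "VARCHAR2",
    ],
}


def suspicious_pairs(table_entry):
    """Pair names and types by position and return pairs that look wrong.

    Heuristic (an assumption, not a rule from the dataset): a column is
    expected to have type DATE exactly when its name ends in "_DATE".
    """
    names = table_entry["column_names_original"]
    types = table_entry["column_types"]
    flagged = []
    for name, col_type in zip(names, types):
        name_looks_like_date = name.endswith("_DATE")
        type_is_date = col_type == "DATE"
        if name_looks_like_date != type_is_date:
            flagged.append((name, col_type))
    return flagged


for name, col_type in suspicious_pairs(entry):
    print(f"{name} is paired with {col_type}")
```

On the Current entry this flags ZIP_TYPE (paired with DATE) and WAREHOUSE_LOAD_DATE (paired with VARCHAR2). With the Prior ordering of the names it flags nothing. The same check could be looped over every entry in dev_tables.json, assuming the file's top level is an object keyed like "dw#sep#ZIP_USA" as the snippets suggest.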

I'm guessing this file's generation is automated. Is it possible to regenerate it so that all of the names and data types are aligned?
