Unexpected error message Unknown statement at LexToken( "ARRAY,"STRIP_OUTER_ARRAY") Snowflake reserved keyword #276

@dmaresma

Describe the bug
The following test fails:
```python
from simple_ddl_parser import DDLParser

ddl = """
create external table if not exists TABLE_DATA_SRC.EXT_PAYLOAD_MANIFEST_WEB (
"type" VARCHAR(255) AS (SPLIT_PART(SPLIT_PART(METADATA$FILENAME, '/', 1), '=', 2 )),
"year" VARCHAR(255) AS (SPLIT_PART(SPLIT_PART(METADATA$FILENAME, '/', 2), '=', 2)),
"month" VARCHAR(255) AS (SPLIT_PART(SPLIT_PART(METADATA$FILENAME, '/', 3), '=', 2)),
"day" VARCHAR(255) AS (SPLIT_PART(SPLIT_PART(METADATA$FILENAME, '/', 4), '=', 2)),
"cast_YEAR" VARCHAR(200) AS (GET(VALUE,'c1')::string),
"path" VARCHAR(255) AS (METADATA$FILENAME)
)
partition by ("type", "year", "month", "day", "path")
location=@ADL_Azure_Storage_Account_Container_Name/year=2023/month=08/
auto_refresh=false
pattern='*.csv'
file_format = (TYPE = JSON NULL_IF = () STRIP_OUTER_ARRAY = TRUE )
;
"""
result_ext_table = DDLParser(ddl, normalize_names=True, debug=True).run(
    output_mode="snowflake"
)

expected_ext_table = [
    {
        "alter": {},
        "checks": [],
        "clone": None,
        "columns": [
            {
                "name": "type",
                "type": "VARCHAR",
                "size": 255,
                "references": None,
                "unique": False,
                "nullable": True,
                "default": None,
                "check": None,
                "generated": {
                    "as": "SPLIT_PART(SPLIT_PART(METADATA$FILENAME,'/',1),'=',2)"
                },
            },
            {
                "name": "year",
                "type": "VARCHAR",
                "size": 255,
                "references": None,
                "unique": False,
                "nullable": True,
                "default": None,
                "check": None,
                "generated": {
                    "as": "SPLIT_PART(SPLIT_PART(METADATA$FILENAME,'/',2),'=',2)"
                },
            },
            {
                "name": "month",
                "type": "VARCHAR",
                "size": 255,
                "references": None,
                "unique": False,
                "nullable": True,
                "default": None,
                "check": None,
                "generated": {
                    "as": "SPLIT_PART(SPLIT_PART(METADATA$FILENAME,'/',3),'=',2)"
                },
            },
            {
                "name": "day",
                "type": "VARCHAR",
                "size": 255,
                "references": None,
                "unique": False,
                "nullable": True,
                "default": None,
                "check": None,
                "generated": {
                    "as": "SPLIT_PART(SPLIT_PART(METADATA$FILENAME,'/',4),'=',2)"
                },
            },
            {
                "name": "cast_YEAR",
                "type": "VARCHAR",
                "size": 200,
                "references": None,
                "unique": False,
                "nullable": True,
                "default": None,
                "check": None,
                "generated": {
                    "as": "GET(VALUE,'c1') ::string"
                },
            },
            {
                "name": "path",
                "type": "VARCHAR",
                "size": 255,
                "references": None,
                "unique": False,
                "nullable": True,
                "default": None,
                "check": None,
                "generated": {"as": "METADATA$FILENAME"},
            },
        ],
        "index": [],
        "partition_by": {
            "columns": ["type", "year", "month", "day", "path"],
            "type": None,
        },
        "partitioned_by": [],
        "primary_key": [],
        "primary_key_enforced": None,
        "schema": "TABLE_DATA_SRC",
        "table_name": "EXT_PAYLOAD_MANIFEST_WEB",
        "tablespace": None,
        "external": True,
        "if_not_exists": True,
        "location": "@ADL_Azure_Storage_Account_Container_Name/year=2023/month=08/",
        "table_properties": {
            "auto_refresh": False,
            "pattern": "'*.csv'",
            "file_format": {
                "TYPE": "JSON",
                "NULL_IF": "()",
                "STRIP_OUTER_ARRAY": "TRUE",
            },
        },
    }
]

assert result_ext_table == expected_ext_table
```

To Reproduce
Run the test above.

Expected behavior
The test passes: the `file_format` options, including `STRIP_OUTER_ARRAY`, are parsed into `table_properties`.

Additional context
The `STRIP_OUTER_ARRAY` option in `file_format` triggers the `if "ARRAY" in t.value` branch of `tokens_not_columns_names`, which it should not:

```python
def tokens_not_columns_names(self, t: LexToken) -> LexToken:
    t_tag = self.parse_tags_symbols(t)
    if t_tag:
        return t_tag
    if "ARRAY" in t.value:
        t.type = "ARRAY"
        return t
```

Adding a length check such as `len < 15` could solve the problem.
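To illustrate why the substring check quoted above misfires, here is a minimal standalone sketch. It does not use the parser's lexer: the three `classify_*` functions and the placeholder type `"ID"` are hypothetical names introduced only for this illustration. The whole-word variant is one possible alternative to the length check, not a tested patch for the library.

```python
import re

# Whole-word match: underscores count as word characters, so the "ARRAY"
# inside STRIP_OUTER_ARRAY is not preceded by a word boundary.
ARRAY_WORD = re.compile(r"\bARRAY\b")


def classify_substring(value: str) -> str:
    """Mimic the quoted check: any value containing ARRAY is retyped."""
    return "ARRAY" if "ARRAY" in value else "ID"


def classify_length_guard(value: str) -> str:
    """The workaround suggested in this issue: ignore values of 15+ characters."""
    return "ARRAY" if "ARRAY" in value and len(value) < 15 else "ID"


def classify_whole_word(value: str) -> str:
    """Hypothetical alternative: retype only when ARRAY stands as its own word."""
    return "ARRAY" if ARRAY_WORD.search(value) else "ID"


for value in ("ARRAY", "STRIP_OUTER_ARRAY", "OUTER_ARRAY"):
    print(
        value,
        classify_substring(value),
        classify_length_guard(value),
        classify_whole_word(value),
    )
```

In this sketch the length check fixes `STRIP_OUTER_ARRAY` (17 characters) but would still retype a shorter identifier such as `OUTER_ARRAY`, so a whole-word or exact match may be the more robust option, assuming the lexer's token values look like the strings used here.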
