Skip to content

Invalid JOIN/ON reconstruction #47

@coreybutler

Description

@coreybutler

I'm using this library to parse DDL scripts and validate/document individual statements before running them. This library does a great job of splitting the statements into a slice of Statements. However; since there is no way to access the original statement, we rely on the reconstructed statement (i.e. Statement.String()). I've encountered an issue where the reconstructed JOIN/ON clauses are inaccurate.

The core of the problem (original code vs reconstructed code):

Notice the command sequence in the last three lines: JOIN JOIN ON ON

- FROM worker w
- JOIN event_worker ew ON ew.worker_id = w."id"
- JOIN sys_event se ON se."id" = ew."event"
+ FROM "worker" AS "w"
+ JOIN "event_worker" AS "ew"
+ JOIN "sys_event" AS "se" ON "se"."id" = "ew"."event"
+ ON "ew"."worker_id" = "w"."id"

Here is a snippet of the Go code I'm using (this module aliased as rsql) that stores statements in a slice for later use:

var stmt rsql.Statement
parser := rsql.NewParser(strings.NewReader(s))
statements := make([]rsql.Statement, 0)
ct := 0

for {
  stmt, err = parser.ParseStatement()
  if err != nil {
    if fmt.Sprintf("%s", err) == "EOF" {
      break
    }

    os.WriteFile("debug.sql", []byte(s), os.ModePerm)
    panic(fmt.Errorf("SQL parsing error:%s", err.Error()))
  } else {
    statements = append(statements, stmt)
    ct++
  }
}

Here is one of the source statements that is not being reconstructed properly:

CREATE TRIGGER IF NOT EXISTS tr_worker AFTER INSERT
 ON audit
 FOR EACH ROW
 WHEN "nm" NOT LIKE 'WORKER_%' AND "nm" != 'WEBHOOK_EVENT'
BEGIN
  INSERT INTO audit ("id", "nm", "data", "source_id", "source_table")
  SELECT
      lower(hex(randomblob(16)))
    , 'WORKER_EVENT'
    , json_object(  
        'worker', w."id",
        'data', NEW."data"
      )
    , NEW."id"
    , 'audit'
  FROM worker w
    JOIN event_worker ew ON ew.worker_id = w."id"           --- NOTICE HERE ---
    JOIN sys_event se ON se."id" = ew."event"               --- AND HERE    ---
  WHERE se."id" = NEW."nm"
    AND coalesce(se."expire_dt", datetime(CURRENT_TIMESTAMP, '+1 month')) >= CURRENT_TIMESTAMP
    AND coalesce(se."purge_dt", datetime(CURRENT_TIMESTAMP, '+1 month')) >= CURRENT_TIMESTAMP
    AND CURRENT_TIMESTAMP >= w."valid_start_dt"
    AND CURRENT_TIMESTAMP <= coalesce(w."valid_end_dt", datetime(CURRENT_TIMESTAMP, '+1 month'))
    AND (
      w."dsn_id" IS NULL
      OR coalesce(json_extract(NEW."data", '$.dsn'), '__NONE__') = w."dsn_id" 
    )
  ORDER BY ew."exec_order" ASC; 
END
;

This is how it's being reconstructed (i.e. output of stmt.String(), formatted for readability):

CREATE TRIGGER IF NOT EXISTS "tr_worker"
AFTER INSERT ON "audit"
FOR EACH ROW
WHEN "nm" NOT LIKE 'WORKER_%' AND "nm" != 'WEBHOOK_EVENT'
BEGIN
  INSERT INTO "audit" ("id", "nm", "data", "source_id", "source_table")
  SELECT
    lower(hex(randomblob(16))),
    'WORKER_EVENT',
    json_object(
      'worker', "w"."id",
      'data', "NEW"."data"
    ),
    "NEW"."id",
    'audit'
  FROM "worker" AS "w"
  JOIN "event_worker" AS "ew"                                   --- MAIN ISSUE IS 
  JOIN "sys_event" AS "se" ON "se"."id" = "ew"."event"          --- <-- HERE
  ON "ew"."worker_id" = "w"."id"                                --- <-- AND HERE
  WHERE
    "se"."id" = "NEW"."nm"
    AND coalesce("se"."expire_dt", datetime("CURRENT_TIMESTAMP", '+1 month')) >= "CURRENT_TIMESTAMP"
    AND coalesce("se"."purge_dt", datetime("CURRENT_TIMESTAMP", '+1 month')) >= "CURRENT_TIMESTAMP"
    AND "CURRENT_TIMESTAMP" >= "w"."valid_start_dt"
    AND "CURRENT_TIMESTAMP" <= coalesce("w"."valid_end_dt", datetime("CURRENT_TIMESTAMP", '+1 month'))
    AND (
      "w"."dsn_id" IS NULL
      OR coalesce(json_extract("NEW"."data", '$.dsn'), '__NONE__') = "w"."dsn_id"
    )
  ORDER BY "ew"."exec_order" ASC;
END

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions