Skip to content

Prepared statement optimisation #17497

Open
@systay

Description

When running queries inside prepared statements, the planner doesn't see values that would be available to it during normal, not-prepared, query execution. Example:

select 1 from user, user_extra where user.id = 1 and user_extra.user_id = 1

produces the plan:

{
  "QueryType": "SELECT",
  "Original": "select 1 from user, user_extra where user.id = 1 and user_extra.user_id = 1",
  "Instructions": {
    "OperatorType": "Route",
    "Variant": "EqualUnique",
    "Keyspace": {
      "Name": "user",
      "Sharded": true
    },
    "FieldQuery": "select 1 from `user`, user_extra where 1 != 1",
    "Query": "select 1 from `user`, user_extra where `user`.id = 1 and user_extra.user_id = 1",
    "Table": "`user`, user_extra",
    "Values": [
      "1"
    ],
    "Vindex": "user_index"
  },
  "TablesUsed": [
    "user.user",
    "user.user_extra"
  ]
}

Unfortunately, when running in prepared statement, the planner sees ? instead of literal values, and won't know that the two sharding keys are pointing to the same value:

select 1 from user, user_extra where user.id = ? and user_extra.user_id = ?

produces:

{
  "QueryType": "SELECT",
  "Original": "select 1 from user, user_extra where user.id = ? and user_extra.user_id = ?",
  "Instructions": {
    "OperatorType": "Join",
    "Variant": "Join",
    "JoinColumnIndexes": "L:0",
    "TableName": "`user`_user_extra",
    "Inputs": [
      {
        "OperatorType": "Route",
        "Variant": "EqualUnique",
        "Keyspace": {
          "Name": "user",
          "Sharded": true
        },
        "FieldQuery": "select 1 from `user` where 1 != 1",
        "Query": "select 1 from `user` where `user`.id = :v1",
        "Table": "`user`",
        "Values": [
          ":v1"
        ],
        "Vindex": "user_index"
      },
      {
        "OperatorType": "Route",
        "Variant": "EqualUnique",
        "Keyspace": {
          "Name": "user",
          "Sharded": true
        },
        "FieldQuery": "select 1 from user_extra where 1 != 1",
        "Query": "select 1 from user_extra where user_extra.user_id = :v2",
        "Table": "user_extra",
        "Values": [
          ":v2"
        ],
        "Vindex": "user_index"
      }
    ]
  },
  "TablesUsed": [
    "user.user",
    "user.user_extra"
  ]
}

To address this issue, the planner could defer planning until the first execution of the prepared statement. During the initial execution, actual values for the bind variables (?) would be available, allowing the planner to determine if the sharding keys point to the same value. Using this information, the planner could create a more optimal plan.

Once the plan is generated, it would be cached. To ensure subsequent executions remain valid, the cache would require a mechanism to validate that the bind variables in future executions match the expected pattern. If the validation fails (e.g., bind variables do not follow the required relationship), the planner could either fall back to a generic plan or replan dynamically. This ensures correctness while optimizing performance for typical cases.

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions