Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: force column types in view plans (make view descriptors future-proof) #12611

Open
knz opened this issue Dec 30, 2016 · 9 comments
Open

sql: force column types in view plans (make view descriptors future-proof) #12611

knz opened this issue Dec 30, 2016 · 9 comments
Labels
A-schema-changes A-sql-typing SQLtype inference, typing rules, type compatibility. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. docs-todo S-3-ux-surprise Issue leaves users wondering whether CRDB is behaving properly. Likely to hurt reputation/adoption.
Milestone

Comments

@knz
Copy link
Contributor

knz commented Dec 30, 2016

The problem is as follows; suppose at some time we define a view:

create view v as select somefunc(x+3, y) from ...

then at some later point either of the following happens:

  • we add a new implicit cast rule for literal constants
  • the return type of somefunc changes
  • new overloads are added to somefunc which change the resolution

Then any uses of the view will discover a new type for the columns, effectively changing the semantics of the view. In some cases it could even break (e.g. if the overload resolution for somefunc has become ambiguous)

This is because the code that re-creates the query plan from the view descriptor infers the types using the text of the original view query using the new typing rules.

There are two parts for a full solution to this, but the first will go a long way to make views future-proof:

  1. give the column types stored in the view descriptor as "demanded types" when instantiating the view plan. This will bias type checking strongly in favor of preserving the previous types and catch most problematic cases.
  2. annotate the syntax tree preserved in the view descriptor with the types of every expression node and the specific builtin overload used for each function call, then skip type checking entirely when instantiating the query plan from the view descriptor.

Jira issue: CRDB-6128

@knz knz added this to the 1.0 milestone Dec 30, 2016
@nvanbenschoten
Copy link
Member

I was actually just thinking about this, and I think your two proposed changes sound solid (although I don't think the first one is necessary if we do the second one). They would address your first and third potential issue. However, I don't think they would address the second, which is more tricky and may require function versioning.

IIRC correctly, most SQL databases like Postgres will compile persisted statements to an internal representation before storing. This allows them to avoid these kinds of issues. If we're going to be re-parsing and typing our serialized format, fully type-annotating the expression tree seems like the right approach.

Also, I think we should be doing this in cases like CHECK constraints as well. Anywhere that statements need to be persisted will face these complications.

@knz
Copy link
Contributor Author

knz commented Dec 31, 2016

Yeah perhaps I should mention a few of these ideas in the IR RFC too.

RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Jan 13, 2017
This change addresses a few related problems (described in more detail
in cockroachdb#12532):
 - ambiguous datum types after serialization
 - invalid prefix casts after expression processing (like `INTERVAL INTERVAL `1.5s`').

Changes:

 - a new FmtFlag adds type annotations to datums and placeholders. This ensures
   that expressions that contain `DDInterval`, `DDate` can be formatted and
   re-parsed correctly. This replaces ad-hoc type prepending like
   `INTERVAL '..'` or `DATE '..'`.

 - `CastExpr.TypeCheck` now returns the underlying type-checked expression
   directly when it is a Datum that we know for sore is of identical type.

 - `castPrefix` (renamed to `castPrepend`) now only functions if the
   underlying expression is a string literal (preventing incorrect expressions
   like `DECIMAL '1.0':::string`). `castPrefixParens` is removed (it
   was incorrect).

 - `AnnotateTypeExpr.TypeCheck` now returns the underlying type-checked
   expression rather than changing the `AnnotateTypeExpr` in-place and
   returning it (which could result in double annotations with the new
   `FmtFlag`). `AnnotateTypeExpr` is no longer a `TypedExpr` so it is never part
   of a type-checked expression (which makes sense).

 - type checking tests are extended to verify the serliazed type-checked
   expression, which verifies both serialization and type resolution.

Note that changing the prepend syntax to a `AnnotateTypeExpr` was considered
but that doesn't work for things like `DECIMAL '2.0'`.

Fixes cockroachdb#12532. Resolves some of the concerns in cockroachdb#12611.
RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Jan 14, 2017
This change addresses a few related problems (described in more detail
in cockroachdb#12532):
 - ambiguous datum types after serialization
 - invalid prefix casts after expression processing (like `INTERVAL INTERVAL `1.5s`').

Changes:

 - a new FmtFlag adds type annotations to datums and placeholders. This ensures
   that expressions that contain `DDInterval`, `DDate` can be formatted and
   re-parsed correctly. This replaces ad-hoc type prepending like
   `INTERVAL '..'` or `DATE '..'`.

 - `CastExpr.TypeCheck` now returns the underlying type-checked expression
   directly when it is a Datum that we know for sure is of identical type.

 - `castPrefix` (renamed to `castPrepend`) now only functions if the
   underlying expression is a string literal (preventing incorrect expressions
   like `DECIMAL '1.0':::STRING`). `castPrefixParens` is removed (it
   was incorrect).

 - `AnnotateTypeExpr.TypeCheck` now returns the underlying type-checked
   expression rather than changing the `AnnotateTypeExpr` in-place and
   returning it (which could result in double annotations with the new
   `FmtFlag`). `AnnotateTypeExpr` is no longer a `TypedExpr` so it is never part
   of a type-checked expression (which makes sense).

 - type checking tests are extended to verify the serialized type-checked
   expression, which verifies both serialization and type resolution.

Note that changing the prepend syntax to a `AnnotateTypeExpr` was considered
but that doesn't work for things like `DECIMAL '2.0'`.

Fixes cockroachdb#12532. Helps with cockroachdb#12611.
RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Jan 18, 2017
This change addresses a few related problems (described in more detail
in cockroachdb#12532):
 - ambiguous datum types after serialization
 - invalid prefix casts after expression processing (like `INTERVAL INTERVAL `1.5s`').

Changes:

 - a new FmtFlag adds type annotations to datums and placeholders. This ensures
   that expressions that contain `DDInterval`, `DDate` can be formatted and
   re-parsed correctly. This replaces ad-hoc type prepending like
   `INTERVAL '..'` or `DATE '..'`.

 - `CastExpr.TypeCheck` now returns the underlying type-checked expression
   directly when it is a Datum that we know for sure is of identical type.

 - `castPrefix` (renamed to `castPrepend`) now only functions if the
   underlying expression is a string literal (preventing incorrect expressions
   like `DECIMAL '1.0':::STRING`). `castPrefixParens` is removed (it
   was incorrect).

 - `AnnotateTypeExpr.TypeCheck` now returns the underlying type-checked
   expression rather than changing the `AnnotateTypeExpr` in-place and
   returning it (which could result in double annotations with the new
   `FmtFlag`). `AnnotateTypeExpr` is no longer a `TypedExpr` so it is never part
   of a type-checked expression (which makes sense).

 - type checking tests are extended to verify the serialized type-checked
   expression, which verifies both serialization and type resolution.

Note that changing the prepend syntax to a `AnnotateTypeExpr` was considered
but that doesn't work for things like `DECIMAL '2.0'`.

Fixes cockroachdb#12532. Helps with cockroachdb#12611.
@vivekmenezes vivekmenezes assigned knz and unassigned justinj Apr 18, 2017
@knz
Copy link
Contributor Author

knz commented May 2, 2017

This needs #15388 which I bumped to 1.1, so bumping this one to 1.1 as well.

This needs to be documented as a known limitation though.

@knz knz modified the milestones: 1.1, 1.0 May 2, 2017
@knz knz added the docs-todo label May 2, 2017
@cuongdo
Copy link
Contributor

cuongdo commented Jul 7, 2017

@knz how is this looking for 1.1?

@knz
Copy link
Contributor Author

knz commented Aug 30, 2017

It's queued after #17501 merges. I think once #17501 is in I can cook a small fix.

@jordanlewis
Copy link
Member

Not making it into 1.1.0. Under consideration for inclusion in a subsequent minor release along with several other view bugfix PRs.

@petermattis petermattis modified the milestones: 1.1, 2.0 Jan 19, 2018
@vivekmenezes vivekmenezes modified the milestones: 2.0, 2.1 Jan 22, 2018
@knz knz added A-bulkio-schema-changes A-sql-typing SQLtype inference, typing rules, type compatibility. labels May 9, 2018
@knz knz added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-3-ux-surprise Issue leaves users wondering whether CRDB is behaving properly. Likely to hurt reputation/adoption. labels Jun 5, 2018
@knz knz modified the milestones: 2.1, 2.2 Oct 15, 2018
@knz knz removed their assignment Aug 26, 2019
@jordanlewis
Copy link
Member

@rohany this one too. @RaduBerinde, just noticed you triaged this in the Optimizer board - I don't think you were planning to deal with it soon though, right?

@RaduBerinde
Copy link
Member

No, I had just triaged all opt issues.

@github-actions
Copy link

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it in
10 days to keep the issue queue tidy. Thank you for your contribution
to CockroachDB!

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Oct 10, 2023
@RaduBerinde RaduBerinde reopened this Oct 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-schema-changes A-sql-typing SQLtype inference, typing rules, type compatibility. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. docs-todo S-3-ux-surprise Issue leaves users wondering whether CRDB is behaving properly. Likely to hurt reputation/adoption.
Projects
None yet
Development

No branches or pull requests

8 participants