-
Notifications
You must be signed in to change notification settings - Fork 0
feat: JSON indexing for EQL V2 #263
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft
freshtonic
wants to merge
34
commits into
main
Choose a base branch
from
feat/json-indexing-eql-v2
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Draft
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
790df49
to
f8ab910
Compare
0c01019
to
ee91aaa
Compare
The type system requires associated types to represent EQL terms used only with particular operators and functions. The kinds of encrypted AST literals (`Value` nodes) values that can be communicated as a result of an `eql-mapper` type check will be used by Proxy to inform *how* those terms are to be encrypted. This commit also introduces `EqlTrait`. In `eql-mapper` `EqlTrait` is Rust enum but each variant respresents the name of a notional trait that EQL types can implement. These are the currently defined EQL traits: `Eq`, `Ord`, `TokenMatch`, `JsonLike` & `Contain`. Associated types belong to an `EqlTrait` but are implemented on types that implement that trait (just like in Rust). For example the `EqlTrait::JsonLike` has `Accessor` and `Path` associated types. An `Accessor` is the type of the expression the right hand side of `->` or `->>`. A `Path` is the type of the second argument to `jsonb_query_path`. An associated type is fundamentally related to its parent type in that it shares configuration - but cannot be used in a position where the parent type is mandated. Lastly, `Type::Var` now has "type bounds". Type bounds are represented as an `EqlTraits` value (not the pluralisation) which is just a set of boolean flags: one per `EqlTrait`.
The current implementation of function type checking in `eql-mapper` manually instantiates and unifies types for a handful of specific SQL functions (`MIN`, `MAX`, `COUNT`) and operators (`=`, `<`, `<=`, `>=`, `>`). The list of functions and operators is about to grow significantly and implementing type checking by hand for all of those by manually instantiating and unifying type variables is going to be tedious and error prone. This commit introduces `TypeDecl`, `TypeEnv` and `InstantiatedTypeEnv`. A `TypeDecl` is a type declaration for the `eql-mapper` type system. A `TypeDecl` is purely symbollic and cannot be used by the `Unifier`. What a `TypeDecl` does is achieve a consistent notation for declaring types and their bounds. Multiple type declarations can be put into a `TypeEnv` - a `TypeEnv` is a scope for some related type declarations. Type declarations can reference other type declarations in the same type env via type variables `TVar`s. A `TypeEnv` can be *instantiated*, returning an `InstantiatedTypeEnv`. In an `InstantiatedTypeEnv` all of the `TypeDecls` have been converted into `Arc<Type>` values ready to be used by the `Unifier`. Additionally, the `InstantiatedTypeEnv` allows retrieval of its types via a `TVar` - it maintains the same associated of `TVar` to type as the original `TypeEnv`.
Added crate `eql-mapper-macros`. The crate exposes functions for building `TypeDecl` and `Arc<Type>` values without boilerplate. These macros are great for testing - previously, defining complex types like projections required a lot of code to build the `Type` enums. Now those type definitions are backed up by a DSL.
An associated type represents "deferred" unification. This is a bit more involved than unifying other types. The reason is that the `impl_ty` (parent type) of the associated type can be a type variable bounded by the appropriate `EqlTrait` that declares the associated type. At the time the `Type::Associated(AssociateType)` is instantiated by the `TypeInferencer` we cannot guarantee that the parent type has been resolved - so we can resolve the conrete associated type. So you can think of the associated type as an "obligation" (that's the actual term used in type theory). Whenever unification tries to unify a type with an associated type obligation, it first checks if the parent type of the associated type has been resolved and then resolves the associated type. If the parent type has not been resolved, then the *unresolved* associated type is unified with the other type as a new `Type::Associated`, deferring resolution until later. `Type::Var` now also supports bounds (as an `EqlTraits` struct). Two type variables always unify: even they have different bounds. The result is a new type variable with merged bounds. Whenever a concrete type is unified with a type variable, the effective bounds of the concrete type MUST contain all of the bounds of the type variable (concrete types always have *implied* effective bounds). Thr concrete type is allowed to define additional bounds in excess of those required by the type variable.
Using the macros defined earlier in this PR, define the SQL binary operators and SQL functions that must support EQL types.
- extract "resolve_type" functions into into a ResolveType trait - delete old SQL function and operator macros - some cosmetic renaming
Author: Toby Hede <toby@cipherstash.com>
This enables correct typing of `jsonb_array_elements` and `json_array_elements_text`. `SETOF` is used as the return type for functions that return multiple rows. A `SETOF` type is similar to a projection type except for a few minor differences: - In a projection, columns can be anonymous or aliased and aliases do not even have to be unique. - In a SETOF with multiple columns all columns must have a name and that name must be unique (with the exception a SETOF consisting of a single column does not have to alias the column).
The rename logic now checks if any argument type OR return type is EQL.
`Type::Value` now represents any SQL expression and an artificial dichotomy has been removed which simplifies a bunch of code.
`Projection` is now defined simply as ```rust struct Projection(Vec<ProjectionColumn>); ``` This simplifies type unification because no longer does the `Projection::Empty` case have to be handled as a special case. The projections are now auto-flattened during conversion to the exported `eql_mapper::Type` representation.
Also added a README.
This means cipherstash-proxy & cipherstash-proxy-integration are now sharing the same dependency version.
3e661a1
to
ccdec51
Compare
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Support additional EQL types for representing JSON indexing terms in the
eql-mapper
and presentation to theproxy
as the result of a successful type check.The new JSON indexing types required adding associated type support to the type system and
Unifier
.For more detail, see the commit messages in this PR.
TODO: proxy glue code (partially implemented but the
proxy
package currently does not build).Acknowledgment
By submitting this pull request, I confirm that CipherStash can use, modify, copy, and redistribute this contribution, under the terms of CipherStash's choice.