-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore(engine): Add first stage of physical query planning #16891
Conversation
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
rather than properties Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
so it can be used by the planner Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
The planning of the physical plan happens in two stages: 1. Convert instructions of the logical plan into nodes of the physical plan. Most instructions translate to a single node. 2. Push down filter predicates and limits. Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
616eac9
to
d333d8a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Despite the discussions offline, I think this is quite clean!
BinaryOpEq // Equality comparison (==). | ||
BinaryOpNeq // Inequality comparison (!=). | ||
BinaryOpGt // Greater than comparison (>). | ||
BinaryOpGte // Greater than or equal comparison (>=). | ||
BinaryOpLt // Less than comparison (<). | ||
BinaryOpLte // Less than or equal comparison (<=). | ||
BinaryOpAnd // Logical AND operation (&&). | ||
BinaryOpOr // Logical OR operation (||). | ||
BinaryOpXor // Logical XOR operation (^). | ||
BinaryOpNot // Logicaal NOT operation (!). | ||
BinaryOpAdd // Addition operation (+). | ||
BinaryOpSub // Subtraction operation (-). | ||
BinaryOpMul // Multiplication operation (*). | ||
BinaryOpDiv // Division operation (/). | ||
BinaryOpMod // Modulo operation (%). | ||
BinaryOpMatchStr // String matching comparision (|=). | ||
BinaryOpNotMatchStr // String not-matching comparison (!=). | ||
BinaryOpMatchRe // Regex matching comparison (=~). | ||
BinaryOpNotMatchRe // Regex not-matching comparison (!~). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these be replaced with types.BinOpKind
? Same comment for UnaryOpType
.
for _, inst := range lp.Instructions { | ||
switch inst := inst.(type) { | ||
case *logical.Return: | ||
nodes, err := p.process(inst.Value) | ||
if err != nil { | ||
return err | ||
} | ||
if len(nodes) > 1 { | ||
return errors.New("plan has more than 1 return value") | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this behave properly if the logical plan incorrectly has two return instructions? Anything after the first return should be ignored
// Plan returns the built physical [Plan]. Requires to run [Plan.Build] first, | ||
// otherwise it will return nil. | ||
func (p *Planner) Plan() *Plan { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
api design nit: Should Plan.Build
return the plan so you don't have to have a separate getter?
func (p *Planner) processSelect(_ *logical.Select) error { | ||
return nil | ||
// Convert a stream selector from the [logical.MakeTable] instruction into an [Expression]. | ||
func (p *Planner) convertSelector(inst logical.Value) Expression { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, convertSelector and convertPredicate seem to do the same thing, minus convertPredicate having the unary op.
Should we roll them up into a single function to reduce duplication? That would also allow the caller to say that a unary op is invalid for a selector.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In practice, the UnaryOp in the selector is prevented by the LogQL syntax.
|
||
// Create builds a physical plan from a logical plan using the provided context. | ||
// It returns the built plan or an error if the build process fails. | ||
func Create(plan *logical.Plan, ctx *Context) (*Plan, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
api nit: should Context go before plan? I think idiomatically, parameters are typically ordered from most broad to most specific.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Or maybe we don't need the convenience function at all?)
@@ -0,0 +1,32 @@ | |||
package types |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Go package naming usually discourages naming a package types like this:
Don’t use a single package for all your APIs. Many well-intentioned programmers put all the interfaces exposed by their program into a single package named
api
,types
, orinterfaces
, thinking it makes it easier to find the entry points to their code base. This is a mistake. Such packages suffer from the same problems as those named util or common, growing without bound, providing no guidance to users, accumulating dependencies, and colliding with other imports. Break them up, perhaps using directories to separate public packages from implementation.
I think it's fine for now since it's internal, but we should keep an eye out for opportunities to organize by domain. Some things like ColumnType
might be better for the schema package, but I'm not sure about the other types yet.
catalog Catalog | ||
plan *Plan |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity is the only planning context the catalog and will plan
be reused? I'm asking because all functions could be pure. Only processMakeTable
seems to need the Planner
struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, for now that's the case. The idea with plan
was to re-use it in the second stage of planning for the filter/limit pushdown. That could possibly also be done in pure functions, though.
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
What this PR does / why we need it:
Planning consists of two stages:
Instructions from the logical plan are converted into Nodes in the physical plan.
Most instructions can be translated into a single Node. However,
MakeTable
translates into multipleDataObjScan
nodes.a) Push down the limit of the
Limit
node to theDataObjScan
nodes.b) Push down filter predicates from the
Filter
nodes to theDataObjScan
nodes.This PR implements only the first stage.