generated from amazon-archives/__template_Custom
-
Notifications
You must be signed in to change notification settings - Fork 181
Open
Labels
PPLPiped processing languagePiped processing language
Description
Problem Statement
We can functionally implement a PPL graphLookup command that is semantically equivalent to MongoDB’s $graphLookup pipeline aggregation query, which is to demonstrate multi-hop queries based on a graph model.
Current State
Lack of multi-hop ability in query
Long-Term Goals
- Describe the ideal outcome or future state you aim to achieve.
- A new PPL command to support multi-hop query with high performance.
- What are the primary objectives of this solution should fulfill in the long run?
- Support GraphRag query with PPL
Proposal
PPL syntax:
source = <vertex_index>
| graphlookup
<edge_index>
startWith=<expression>
fromField=<field>
toField=<field>
[maxDepth=<number>]
[match=<expression>]
as <new_field>
Approach
Approach 1
Implement with Calcite Recursive CTE: POC here
┌──────────────┐
│ Graph G=(V,E)│
└──────┬───────┘
│ represented as
▼
┌──────────────┐
│edges(src,dst)│
└──────┬───────┘
│
┌───────────────┼───────────────┐
│ │ │
▼ ▼ ▼
┌─────────┐ ┌───────────┐ ┌───────────┐
│ JOIN │ │ UNION ALL │ │ RECURSIVE │
│=traverse│ │=accumulate│ │= iterate │
│one edge │ │ │ │ │
└─────────┘ └───────────┘ └───────────┘
│ │ │
└───────────────┼───────────────┘
│
▼
┌───────────────┐
│ Graph Search │
│ Result │
│(reachable set)│
└───────────────┘
Approach 2 (perferred)
Provide a native Enumerable operator to implement high performance multi-hop traverse.
Metadata
Metadata
Assignees
Labels
PPLPiped processing languagePiped processing language
Type
Projects
Status
Not Started