@adsharma adsharma commented Oct 14, 2025

More context in this blog post

# Copy https://github.com/adsharma/graph-std/tree/main/karate/karate_csr to ./karate_csr, then try:

$ cat test.cypher
CREATE NODE TABLE Person(id INT64, club STRING, PRIMARY KEY(id)) WITH (storage = './karate_csr/karate_random');
CREATE REL TABLE knows(FROM Person TO Person, weight DOUBLE) WITH (storage = './karate_csr/karate_random');

$ echo "MATCH (u1)-[k]->(u2) RETURN u1.id, u1.club, k.weight, u2.id;" | ./build/debug/tools/shell/lbug -i test.cypher
-- Processing: test.cypher
Opening the database under in-memory mode.
Enter ":help" for usage hints.
┌───────┬─────────┬──────────┬───────┐
│ u1.id │ u1.club │ k.weight │ u2.id │
│ INT64 │ STRING  │ DOUBLE   │ INT64 │
├───────┼─────────┼──────────┼───────┤
│ 333   │ Mr. Hi  │ 1.000000 │ 306   │
│ 333   │ Mr. Hi  │ 1.000000 │ 41    │
│ 333   │ Mr. Hi  │ 1.000000 │ 211   │
│ 333   │ Mr. Hi  │ 1.000000 │ 54    │
│ 333   │ Mr. Hi  │ 1.000000 │ 14    │
│ 302   │ Mr. Hi  │ 1.000000 │ 41    │
│ 268   │ Mr. Hi  │ 1.000000 │ 41    │
│ 306   │ Mr. Hi  │ 1.000000 │ 167   │
│ 75    │ Mr. Hi  │ 1.000000 │ 41    │
│ 75    │ Mr. Hi  │ 1.000000 │ 305   │
│  ·    │   ·     │    ·     │  ·    │
│  ·    │   ·     │    ·     │  ·    │
│  ·    │   ·     │    ·     │  ·    │
│ 167   │ Officer │ 1.000000 │ 41    │
│ 167   │ Officer │ 1.000000 │ 14    │
│ 87    │ Officer │ 1.000000 │ 268   │
│ 87    │ Officer │ 1.000000 │ 75    │
│ 87    │ Officer │ 1.000000 │ 54    │
│ 87    │ Officer │ 1.000000 │ 14    │
│ 87    │ Officer │ 1.000000 │ 161   │
│ 87    │ Officer │ 1.000000 │ 143   │
│ 87    │ Officer │ 1.000000 │ 126   │
│ 87    │ Officer │ 1.000000 │ 150   │
└───────┴─────────┴──────────┴───────┘
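
For readers unfamiliar with the layout, the directory name karate_csr suggests the edges are stored in compressed sparse row (CSR) form. The sketch below is only an illustration of how a query like the MATCH above could enumerate edges from such a layout. The struct, field names (indptr, indices, weights) and the scanEdges function are all hypothetical and are not taken from this PR or from the graph-std files.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical in-memory CSR graph; the real on-disk column names may differ.
struct CsrGraph {
    std::vector<std::int64_t> indptr;   // size numNodes + 1; edge range per source node
    std::vector<std::int64_t> indices;  // destination node offset for each edge
    std::vector<double> weights;        // one weight per edge
};

struct Edge {
    std::int64_t src;
    std::int64_t dst;
    double weight;
};

// Enumerate every (src)-[weight]->(dst) triple by walking each source node's
// slice of the indices/weights arrays.
std::vector<Edge> scanEdges(const CsrGraph& g) {
    assert(!g.indptr.empty());
    assert(g.indices.size() == g.weights.size());
    std::vector<Edge> out;
    out.reserve(g.indices.size());
    const std::size_t numNodes = g.indptr.size() - 1;
    for (std::size_t src = 0; src < numNodes; ++src) {
        const auto begin = static_cast<std::size_t>(g.indptr[src]);
        const auto end = static_cast<std::size_t>(g.indptr[src + 1]);
        for (std::size_t e = begin; e < end; ++e) {
            out.push_back({static_cast<std::int64_t>(src), g.indices[e], g.weights[e]});
        }
    }
    return out;
}
```

With indptr = {0, 2, 3, 3}, node 0 owns edges 0 and 1, node 1 owns edge 2, and node 2 has no outgoing edges. Note that the sketch works on node offsets; mapping offsets to the Person.id values shown in the table would be a separate lookup.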

@adsharma adsharma left a comment

  • parquet_{node,rel}_table.{cpp,h} - these four files contain the bulk of the implementation.

ScanTable::initLocalStateInternal(resultSet, context);
auto nodeIDVector = resultSet->getValueVector(opInfo.nodeIDPos).get();
scanState = std::make_unique<NodeTableScanState>(nodeIDVector, outVectors, nodeIDVector->state);

This is where the Parquet implementation hooks into the query pipeline for node tables.

auto nbrNodeIDVector = outVectors[0];
scanState = std::make_unique<RelTableScanState>(*MemoryManager::Get(*clientContext),
boundNodeIDVector, outVectors, nbrNodeIDVector->state);
// Check if this is a ParquetRelTable and create appropriate scan state

This is where the Parquet implementation hooks into the query pipeline for rel tables.
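
The "check if this is a ParquetRelTable" step in the snippet above could look roughly like the following. This is a minimal sketch of type-based dispatch, not the code in this PR: the class definitions are simplified stand-ins, and ParquetRelTableScanState, kind() and makeScanState are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Simplified stand-ins for the engine's table types (not the real definitions).
struct RelTable {
    virtual ~RelTable() = default;
};
struct ParquetRelTable : RelTable {};

// Default scan state, plus a hypothetical Parquet-specific variant.
struct RelTableScanState {
    virtual ~RelTableScanState() = default;
    virtual std::string kind() const { return "native"; }
};
struct ParquetRelTableScanState : RelTableScanState {
    std::string kind() const override { return "parquet"; }
};

// Choose the scan state from the table's dynamic type: Parquet-backed tables
// get the specialised state, every other table keeps the default one.
std::unique_ptr<RelTableScanState> makeScanState(const RelTable* table) {
    assert(table != nullptr);
    if (dynamic_cast<const ParquetRelTable*>(table) != nullptr) {
        return std::make_unique<ParquetRelTableScanState>();
    }
    return std::make_unique<RelTableScanState>();
}
```

Returning the state through the base-class pointer keeps the rest of the operator unchanged, which is presumably why the hook sits at scan-state creation. A virtual factory method on the table would avoid the dynamic_cast, at the cost of touching the base class.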

@Vasilije1990

Nice one!
We have built our own S3 loader. Is that something we can help with?

@adsharma adsharma mentioned this pull request Nov 1, 2025
@adsharma

@Vasilije1990 sorry, I missed your comment. How is your S3 loader different from kuzu/ladybug's httpfs extension?
