This repository was archived by the owner on Oct 10, 2025. It is now read-only.
Merged
2 changes: 1 addition & 1 deletion astro.config.mjs
@@ -114,7 +114,7 @@ export default defineConfig({
items: [
{ label: 'Overview', link: '/tutorials' },
{ label: 'Cypher', link: '/tutorials/cypher' },
{ label: 'Python', link: '/tutorials#python' },
{ label: 'Python', link: '/tutorials/python' },
{ label: 'Rust', link: '/tutorials/rust' },
]
},
4 changes: 2 additions & 2 deletions src/content/docs/client-apis/cli.mdx
@@ -174,7 +174,7 @@ Consider that you have the following DDL file:

```cypher
// schema.cypher
CREATE NODE TABLE Person (name STRING, age INT64, PRIMARY KEY(name));
CREATE NODE TABLE Person (name STRING PRIMARY KEY, age INT64);
COPY Person FROM 'person.csv';
```

@@ -250,7 +250,7 @@ To change the output mode, use the `:mode` command followed by the desired mode:
```cypher
kuzu> :mode json
mode set as json
kuzu> CREATE NODE TABLE Person (name STRING, age INT64, PRIMARY KEY(name));
kuzu> CREATE NODE TABLE Person (name STRING PRIMARY KEY, age INT64);
[{"result":"Table Person has been created."}]
```

2 changes: 1 addition & 1 deletion src/content/docs/client-apis/index.mdx
@@ -4,7 +4,7 @@ title: "Client APIs"

import { LinkCard, CardGrid} from '@astrojs/starlight/components';

Kuzu is embeddable in a variety of languages via client library APIs. Queries in Kuzu through via
Kuzu is embeddable in a variety of languages via client library APIs. Queries in Kuzu through
its CLI or client APIs are transactional, satisfying serializability, atomicity and durability requirements.

## Command line shell
8 changes: 4 additions & 4 deletions src/content/docs/client-apis/nodejs.mdx
@@ -32,8 +32,8 @@ const kuzu = require("kuzu");
const conn = new kuzu.Connection(db);

// Create the tables
await conn.query("CREATE NODE TABLE User(name STRING, age INT64, PRIMARY KEY (name))");
await conn.query("CREATE NODE TABLE City(name STRING, population INT64, PRIMARY KEY (name))");
await conn.query("CREATE NODE TABLE User(name STRING PRIMARY KEY, age INT64)");
await conn.query("CREATE NODE TABLE City(name STRING PRIMARY KEY, population INT64)");
await conn.query("CREATE REL TABLE Follows(FROM User TO User, since INT64)");
await conn.query("CREATE REL TABLE LivesIn(FROM User TO City)");

@@ -71,8 +71,8 @@ const db = new kuzu.Database("./demo_db");
const conn = new kuzu.Connection(db);

// Create the tables
conn.querySync("CREATE NODE TABLE User(name STRING, age INT64, PRIMARY KEY (name))");
conn.querySync("CREATE NODE TABLE City(name STRING, population INT64, PRIMARY KEY (name))");
conn.querySync("CREATE NODE TABLE User(name STRING PRIMARY KEY, age INT64)");
conn.querySync("CREATE NODE TABLE City(name STRING PRIMARY KEY, population INT64)");
conn.querySync("CREATE REL TABLE Follows(FROM User TO User, since INT64)");
conn.querySync("CREATE REL TABLE LivesIn(FROM User TO City)");

26 changes: 13 additions & 13 deletions src/content/docs/client-apis/python.mdx
@@ -33,8 +33,8 @@ def main() -> None:
conn = kuzu.Connection(db)

# Create schema
conn.execute("CREATE NODE TABLE User(name STRING, age INT64, PRIMARY KEY (name))")
conn.execute("CREATE NODE TABLE City(name STRING, population INT64, PRIMARY KEY (name))")
conn.execute("CREATE NODE TABLE User(name STRING PRIMARY KEY, age INT64)")
conn.execute("CREATE NODE TABLE City(name STRING PRIMARY KEY, population INT64)")
conn.execute("CREATE REL TABLE Follows(FROM User TO User, since INT64)")
conn.execute("CREATE REL TABLE LivesIn(FROM User TO City)")

@@ -71,7 +71,7 @@ import kuzu
shutil.rmtree("test_db", ignore_errors=True)
db = kuzu.Database("test_db")
# Create the async connection
# The undelying connection pool will be automatically created and managed by the async connection
# The underlying connection pool will be automatically created and managed by the async connection
conn = kuzu.AsyncConnection(db, max_concurrent_queries=4)

async def create_tables(conn: kuzu.AsyncConnection) -> None:
@@ -90,7 +90,7 @@ async def copy_data(conn: kuzu.AsyncConnection) -> None:

async def query_1(conn: kuzu.AsyncConnection) -> None:
result = await conn.execute("MATCH (u:User)-[:LivesIn]->(c:City) RETURN u.*")
for row in response:
for row in result:
print(row)

async def main():
@@ -181,7 +181,7 @@ import pandas as pd
db = kuzu.Database("tmp")
conn = kuzu.Connection(db)

conn.execute("CREATE NODE TABLE Person(name STRING, age INT64, PRIMARY KEY (name))")
conn.execute("CREATE NODE TABLE Person(name STRING PRIMARY KEY, age INT64)")
conn.execute("CREATE (a:Person {name: 'Adam', age: 30})")
conn.execute("CREATE (a:Person {name: 'Karissa', age: 40})")
conn.execute("CREATE (a:Person {name: 'Zhang', age: 50})")
@@ -222,7 +222,7 @@ import polars as pl
db = kuzu.Database("tmp")
conn = kuzu.Connection(db)

conn.execute("CREATE NODE TABLE Person(name STRING, age INT64, PRIMARY KEY (name))")
conn.execute("CREATE NODE TABLE Person(name STRING PRIMARY KEY, age INT64)")
conn.execute("CREATE (a:Person {name: 'Adam', age: 30})")
conn.execute("CREATE (a:Person {name: 'Karissa', age: 40})")
conn.execute("CREATE (a:Person {name: 'Zhang', age: 50})")
@@ -275,7 +275,7 @@ import pyarrow as pa
db = kuzu.Database("tmp")
conn = kuzu.Connection(db)

conn.execute("CREATE NODE TABLE Person(name STRING, age INT64, PRIMARY KEY (name))")
conn.execute("CREATE NODE TABLE Person(name STRING PRIMARY KEY, age INT64)")
conn.execute("CREATE (a:Person {name: 'Adam', age: 30})")
conn.execute("CREATE (a:Person {name: 'Karissa', age: 40})")
conn.execute("CREATE (a:Person {name: 'Zhang', age: 50})")
@@ -354,8 +354,8 @@ shape: (3, 2)
│ str ┆ i64 │
╞═════════╪═════╡
│ Adam ┆ 30 │
│ Karissa ┆ 25 │
│ Zhang ┆ 20 │
│ Karissa ┆ 40 │
│ Zhang ┆ 50 │
└─────────┴─────┘
```
</TabItem>
@@ -405,7 +405,7 @@ import pandas as pd
db = kuzu.Database("tmp")
conn = kuzu.Connection(db)

conn.execute("CREATE NODE TABLE Person(name STRING, age INT64, PRIMARY KEY (name))")
conn.execute("CREATE NODE TABLE Person(name STRING PRIMARY KEY, age INT64)")

df = pd.DataFrame({
"name": ["Adam", "Karissa", "Zhang"],
@@ -437,7 +437,7 @@ import polars as pl
db = kuzu.Database("tmp")
conn = kuzu.Connection(db)

conn.execute("CREATE NODE TABLE Person(name STRING, age INT64, PRIMARY KEY (name))")
conn.execute("CREATE NODE TABLE Person(name STRING PRIMARY KEY, age INT64)")

df = pl.DataFrame({
"name": ["Adam", "Karissa", "Zhang"],
@@ -475,7 +475,7 @@ import pyarrow as pa
db = kuzu.Database("tmp")
conn = kuzu.Connection(db)

conn.execute("CREATE NODE TABLE Person(name STRING, age INT64, PRIMARY KEY (name))")
conn.execute("CREATE NODE TABLE Person(name STRING PRIMARY KEY, age INT64)")

tbl = pa.table({
"name": ["Adam", "Karissa", "Zhang"],
@@ -589,7 +589,7 @@ Once the UDF is registered, you can apply it in a Cypher query. First, let's cre

```py
# create a table
conn.execute("CREATE NODE TABLE IF NOT EXISTS Item (id INT64, a INT64, b INT64, c INT64, PRIMARY KEY(id))")
conn.execute("CREATE NODE TABLE IF NOT EXISTS Item (id INT64 PRIMARY KEY, a INT64, b INT64, c INT64)")

# insert some data
conn.execute("CREATE (i:Item {id: 1}) SET i.a = 134, i.b = 123")
50 changes: 25 additions & 25 deletions src/content/docs/concurrency.md
@@ -16,29 +16,29 @@ When operating under on-disk mode, your data and the underlying database files a
local database directory, whereas under [in-memory](/get-started#in-memory-database) mode,
no data is persisted to disk.

Throughout this documentation, lets suppose you open a Kuzu database that's on-disk, and whose files
Throughout this documentation, let's suppose you open a Kuzu database that's on-disk, and whose files
are in a directory named `./kuzu-db-dir`.

## Understand connections

### Database and connection objects
Application processes must connect to a Kuzu database in 2 steps before they can start querying it:
Application processes must connect to a Kuzu database in two steps before they can start querying it:

**Step 1.** Create an instance of a `Database` object `db` and pass it the database directory (`./kuzu-db-dir` in our example below), and
a read-write mode which can be either:
1. `READ_WRITE` (default); or
2. `READ_ONLY`

**Step 2.** Creating a `Connection` object `conn` from the Database object `db`.
**Step 2.** Create a `Connection` object `conn` from the Database object `db`.

- A Connection object that was created using a `READ_WRITE` Database object can execute queries that
do both read (e.g., queries with `MATCH WHERE RETURN` statements) as well as write operations
do both read (e.g., queries with `MATCH WHERE RETURN` statements) as well as write operations
(e.g., queries with `CREATE` or `COPY FROM` statements).
- In contrast, a Connection object that was created using a `READ_ONLY` Database can only execute
queries that do read operations.

Then, using `conn`, one can execute Cypher queries against the database stored under `./kuzu-db-dir`.
Here's a simple example application in Python that demonstrates these 2 steps for creating a `READ_WRITE`
Here's a simple example application in Python that demonstrates these two steps for creating a `READ_WRITE`
database and a connection. The same principles apply to other language APIs as well:

```python
@@ -61,13 +61,13 @@ When working with in-memory databases, there are a few restrictions to keep in m
## Understand concurrency

### Limitations of creating multiple Database objects
Kuzu is an embedded database, i.e., it is a library you embed inside an application process and runs as part
Kuzu is an embedded database, i.e., it is a library you embed inside an application process and run as part
of this application process, instead of a separate process.
You can think of the Database object as the Kuzu database software.
Specifically, the Database object contains
different components of the Kuzu database software, such as its buffer manager, storage manager, transaction manager etc.
different components of the Kuzu database software, such as its buffer manager, storage manager, transaction manager, etc.
Several of the components inside a Database object, such as the buffer manager,
caches parts of the data that is stored on disk. This limits the number of Database objects that can be created
cache parts of the data that are stored on disk. This limits the number of Database objects that can be created
pointing to the same database directory, either in the same process or across multiple processes.

The possible settings are:
@@ -76,16 +76,16 @@ The possible settings are:

:::caution[Note]
The core idea related to concurrency is this: you cannot have a `READ_WRITE` Database object `db1`
and a separate `READ_ONLY` or `READ_WRITE` Database object `db2`, and also concurrently query the same
and a separate `READ_ONLY` or `READ_WRITE` Database object `db2`, and also concurrently query the same
database through connections from both `db1` and `db2`. This is not safe.
:::
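
The rule in the note above can be written down as a small compatibility check. This is a toy model in plain Python, not part of the Kuzu API: the function name `can_open` and the string constants are assumptions made purely for illustration.

```python
# Toy model of the rule above; this is not Kuzu code and the names are illustrative.
READ_WRITE = "READ_WRITE"
READ_ONLY = "READ_ONLY"


def can_open(existing_modes: list[str], requested_mode: str) -> bool:
    """Return True if a new Database object in `requested_mode` can safely
    coexist with the Database objects already open on the same directory."""
    if not existing_modes:
        # The first Database object may use either mode.
        return True
    # Any additional Database object is only safe if every object,
    # including the new one, is READ_ONLY.
    all_modes = existing_modes + [requested_mode]
    return all(mode == READ_ONLY for mode in all_modes)


print(can_open([], READ_WRITE))           # True
print(can_open([READ_ONLY], READ_ONLY))   # True
print(can_open([READ_WRITE], READ_ONLY))  # False
print(can_open([READ_ONLY], READ_WRITE))  # False
```

In practice Kuzu enforces this itself when a Database object is created, so a check like this only serves to make the rule explicit.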

The reason for this limitation is that if a connection `conn1` from `db1` makes a
write operation, say deleting some node record, then the`db1` object is able to ensure
write operation, say deleting some node record, then the `db1` object is able to ensure
that any cached data in `db1` is refreshed and is accurate. However, it cannot notify other Database objects that may exist
about the change. So in our example, `db2`'s cache would no longer represent the true state of the
data on disk that was cached. This can lead to problems if
connections from `db2` try to run queries after db1's modification. Therefore, Kuzu will
connections from `db2` try to run queries after `db1`'s modification. Therefore, Kuzu will
not allow multiple Database objects to be created unless they are all `READ_ONLY`.
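
The stale-cache problem described above can be reproduced with a deliberately simplified simulation. The sketch below assumes a hypothetical `CachingToyDatabase` class and a plain dictionary standing in for the files on disk; it does not reflect how Kuzu's buffer manager is actually implemented.

```python
# Toy simulation of the stale-cache problem; not Kuzu's real buffer manager.
disk = {"node:1": "Alice"}  # stands in for the database files on disk


class CachingToyDatabase:
    """Hypothetical stand-in for a Database object with its own private cache."""

    def __init__(self, disk: dict):
        self.disk = disk
        self.cache = {}

    def read(self, key):
        # Serve from the private cache when possible, like a buffer manager would.
        if key not in self.cache:
            self.cache[key] = self.disk.get(key)
        return self.cache[key]

    def delete(self, key):
        # A write updates the disk and this object's own cache only.
        self.disk.pop(key, None)
        self.cache[key] = None


db1 = CachingToyDatabase(disk)
db2 = CachingToyDatabase(disk)

db2.read("node:1")    # db2 caches the record
db1.delete("node:1")  # db1 deletes it and refreshes its own cache

print(db1.read("node:1"))  # None: db1 sees the deletion
print(db2.read("node:1"))  # Alice: db2 still serves the stale cached value
```

Nothing tells `db2` that its cached entry is out of date, which is the situation Kuzu avoids by refusing to create the second Database object.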

The limitation of having either one `READ_WRITE` Database object or multiple `READ_ONLY` Database objects applies
@@ -94,18 +94,18 @@ creating multiple Database instances within the same process (you should instead
in that process).

However, there are common scenarios when you may want to launch
multiple application processes that connect to the same database directory. Once such scenario
is when developing your workflow in Python using a Jupyter notebook,
multiple application processes that connect to the same database directory. One such scenario
is when developing your workflow in Python using a Jupyter notebook
that connects to `./kuzu-db-dir`. Say you want to also run the Kuzu CLI alongside your Jupyter notebook,
which also connects to the same `./kuzu-db-dir`. When you launch Kuzu CLI and point it to
`./kuzu-db-dir`, Kuzu CLI embeds Kuzu and tries to creates a `READ_WRITE` Database object. So if your notebook process already
`./kuzu-db-dir`, Kuzu CLI embeds Kuzu and tries to create a `READ_WRITE` Database object. So if your notebook process already
has created a Database object, this will fail with an error that looks like this:

```
IO exception: Could not set lock on file : ./kuzu-db-dir/.lock
```

If this happens, would have to shut down your notebook process (or simply restart your Jupyter server),
If this happens, you would have to shut down your notebook process (or simply restart your Jupyter server),
so that its Database object is destroyed, before the CLI can run.

### Create multiple Connections from the same Database object
@@ -114,7 +114,7 @@ Note that the above limitation about creating multiple Database objects does not
multiple Connections from the same `READ_WRITE` Database object and issue concurrent queries. For example,
you can write a program that creates a single `READ_WRITE` Database object `db` that points to `./kuzu-db-dir`.
Then, you can spawn multiple threads
T<sub>1</sub>, ..., T<sub>k</sub>, and each T<sub>i</sub> obtains a connection from `db` and concurrently issue
T<sub>1</sub>, ..., T<sub>k</sub>, and each T<sub>i</sub> obtains a connection from `db` and concurrently issues
read or write queries. This is safe. Every read and write statement in Kuzu is wrapped in a transaction
(either automatically or manually by you). Concurrent transactions that operate on the same database
`./kuzu-db-dir` are safely executed by Kuzu's transaction manager (i.e., the transaction manager inside `db`),
@@ -137,11 +137,11 @@ For simplicity, in the above image queries
from `conn1` and `conn2` are executed sequentially but they could be running concurrently as well.
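
The pattern of one Database object shared by several threads, each with its own connection, can be sketched with a toy model. The classes `SharedToyDatabase` and `ToyConnection` are hypothetical, and the single lock is an assumed stand-in for the transaction manager; Kuzu's actual mechanism is more sophisticated than this.

```python
import threading

# Toy model only: one shared object coordinates all writes, loosely
# mirroring the role of the transaction manager inside a single Database.


class SharedToyDatabase:
    def __init__(self):
        self.counter = 0
        self._txn_lock = threading.Lock()  # assumed stand-in for the transaction manager

    def connect(self) -> "ToyConnection":
        return ToyConnection(self)


class ToyConnection:
    def __init__(self, db: SharedToyDatabase):
        self.db = db

    def increment(self) -> None:
        # Each write runs as its own tiny "transaction" under the shared lock.
        with self.db._txn_lock:
            self.db.counter += 1


def worker(db: SharedToyDatabase, n_writes: int) -> None:
    conn = db.connect()  # every thread obtains its own connection from the same db
    for _ in range(n_writes):
        conn.increment()


db = SharedToyDatabase()
threads = [threading.Thread(target=worker, args=(db, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(db.counter)  # 4000
```

Because every connection goes through the same coordinating object, no update is lost. With two separate Database objects there would be no shared coordinator, which is why that setup is unsafe.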

### Scenario 2: Multiple processes that create `READ_ONLY` databases
In this scenario, you have multiple applications process that embed
In this scenario, you have multiple application processes that embed
Kuzu and create `READ_ONLY` Database objects that open the same database directory `./kuzu-db-dir`.
Each process can create multiple concurrent connections and issue queries.
However, each connection can only execute read-only queries (because the database is opened in `READ_ONLY` mode).
Since the connections and queries are read-only none of the queries can change the actual database files on disk.
Since the connections and queries are read-only, none of the queries can change the actual database files on disk.
Therefore, even though the queries are coming
from connections from different Database objects, this is safe and allowed.

@@ -185,8 +185,8 @@ locking mechanism. However, there is a known issue that Kuzu Explorer is not
able to see the flags put by other processes. The core problem is that Explorer runs as a Docker container
and the flags are not propagated between the host operating system and the Docker environment. We do not currently
have a fix for this (do [contact us](mailto:contact@kuzudb.com) if you know of an easy solution). So if you have a process (or processes) that has
opened a Database directory and yonpmu concurrently start Kuzu Explorer, you should manually ensure that
either: (i) both Explorer and your other process are in `READ_ONLY` mode; or (ii) you shut down your other
opened a Database directory and you concurrently start Kuzu Explorer, you should manually ensure that
either: (i) both Explorer and your other process are in `READ_ONLY` mode; or (ii) you shut down your other
process first before opening Explorer in `READ_WRITE` mode.

## FAQs
@@ -196,12 +196,12 @@ In this section, we address some commonly asked questions related to concurrency
##### Can I embed Kuzu using both `READ_ONLY` and `READ_WRITE` processes in my application?

No, when embedding Kuzu in your application, you cannot have both `READ_WRITE` and `READ_ONLY` database processes
open at any given time (in a safe manner). Technical details for this limitation are described the the sections above.
open at any given time (in a safe manner). Technical details for this limitation are described in the sections above.

In short, the reason for this limitation is that at any given time, a `READ_WRITE` process can make changes
to the disk layout, which may or may not be reflected in the buffer manager of other open `READ_ONLY` connections, and this
can lead to inconsistencies or data corruption. To avoid this issue, the best practice when embedding Kuzu in your
application is to use design patterns as per one of the scenarios shown pictorially, in the sections above.
application is to use design patterns as per one of the scenarios shown pictorially in the sections above.

##### I'm seeing an error related to lock files when running Kuzu in a Jupyter notebook. How can I resolve this?

@@ -212,7 +212,7 @@ open other processes that connect to the same database directory, you may come a
IO exception: Could not set lock on file : ./db_directory/.lock
```

The `.lock` file, as described in earlier sections in this page, is present to protect you from inadvertent
The `.lock` file, as described in earlier sections on this page, is present to protect you from inadvertent
data corruption due to multiple Database instances trying to access the same database directory concurrently.
To resolve this, simply click the `Restart server` button in your Jupyter notebook (or close the Jupyter
notebook entirely). Restarting the Jupyter notebook server (or closing it) will release the `.lock` file
@@ -230,8 +230,8 @@ in mind:
- Whether you will read and write to the database or only read from it

An in-memory database is stored in memory and not on disk. This means that the database is temporary
and the data will be lost when the process that created the database is terminated, so from a working
level perspective, in-memory databases require a `READ_WRITE` process.
and the data will be lost when the process that created the database is terminated, so without a `READ_WRITE` process,
in-memory databases won't have any data to operate on.

See the [getting started](/get-started#in-memory-database) section for more details on how to create
and work with in-memory databases in your client API of choice.
4 changes: 2 additions & 2 deletions src/content/docs/cypher/attach.md
@@ -2,9 +2,9 @@
title: Attach/Detach to External Databases
---
Using the `ATTACH` statement, you can connect to external Kuzu databases as well as several relational DBMSs.
These directories or files of these external databases can be both local or in a remote file system. Here is a simple
These directories or files of external databases can be either local or in a remote file system. Here is a simple
example. Suppose you are in the Kuzu CLI and have opened a database under local directory `/uw`. In the middle of this
session, you want to query another local Kuzu database, say `/work`, which suppose has some `Manager` node table.
session, you want to query another local Kuzu database, say `/work`, which supposedly has some `Manager` node table.
You can attach to the `/work` database and query the `Manager` nodes in it and then detach as follows:

```
4 changes: 2 additions & 2 deletions src/content/docs/cypher/data-definition/alter.md
@@ -54,9 +54,9 @@ ALTER TABLE User DROP age;

## Drop column if exists

If the given column name does not exists in the table, Kuzu throws an exception when you try to drop it.
If the given column name does not exist in the table, Kuzu throws an exception when you try to drop it.
To avoid the exception being raised, use the `IF EXISTS` clause. This tells Kuzu to do nothing when
the given column name does not exists in the table.
the given column name does not exist in the table.

Example:
```sql