Update LDR docs for GA & remove validated #19626

Open. Wants to merge 8 commits into base: main.

@katmayb (Contributor) commented May 21, 2025

Fixes DOC-13440

This PR:

  • Removes the preview label from LDR in v25.2 to bring into GA.
  • Removes validated mode from all versions — for now, removed the mode option and immediate value because this is the default and only option.
  • Adds a description of how unique secondary indexes interact with LDR.

Preview

Navigate to pages from v25.2 Setup tutorial.

netlify bot commented May 21, 2025

Deploy Preview for cockroachdb-interactivetutorials-docs canceled.

Name Link
🔨 Latest commit b433e3d
🔍 Latest deploy log https://app.netlify.com/projects/cockroachdb-interactivetutorials-docs/deploys/6838aadf03591400087d88de

netlify bot commented May 21, 2025

Deploy Preview for cockroachdb-api-docs canceled.

Name Link
🔨 Latest commit b433e3d
🔍 Latest deploy log https://app.netlify.com/projects/cockroachdb-api-docs/deploys/6838aadf3cf42500084e3bfe

netlify bot commented May 21, 2025

Netlify Preview

Name Link
🔨 Latest commit b433e3d
🔍 Latest deploy log https://app.netlify.com/projects/cockroachdb-docs/deploys/6838aadf84f3cf00087a0d76
😎 Deploy Preview https://deploy-preview-19626--cockroachdb-docs.netlify.app

@katmayb katmayb marked this pull request as ready for review May 22, 2025 15:01
@katmayb katmayb requested review from alicia-l2 and msbutler May 22, 2025 15:19
@@ -31,21 +29,16 @@ Conflicts at the KV level are detected when there is either:

### SQL level conflicts

-In `validated` mode, when a conflict cannot apply due to violating [constraints]({% link {{ page.version.version }}/constraints.md %}), for example, a foreign key constraint or schema constraint, it will be retried for up to a minute and then put in the [DLQ](#dead-letter-queue-dlq) if it could not be resolved.
+When a conflict cannot apply due to violating [constraints]({% link {{ page.version.version }}/constraints.md %}), for example, a foreign key constraint or schema constraint, LDR will send the row to the [DLQ](#dead-letter-queue-dlq).

I believe that in `immediate` mode we don't even allow foreign key constraints.

Contributor Author

I removed this reference.


For more details, refer to the LDR [Known limitations]({% link {{ page.version.version }}/logical-data-replication-overview.md %}#known-limitations).

-When you run LDR in [`immediate` mode](#modes), you cannot replicate a table with [foreign key constraints]({% link {{ page.version.version }}/foreign-key.md %}). In [`validated` mode](#modes), foreign key constraints **must** match. All constraints are enforced at the time of SQL/application write.
+#### Unique secondary indexes

looks great -- thank you!


### Dead letter queue (DLQ)

-When the LDR job starts, it will create a DLQ table with each replicating table so that unresolved conflicts can be tracked. The DLQ will contain the writes that LDR cannot apply after the retry period, which could occur if:
+When the LDR job starts, it will create a DLQ table with each replicating table so that unresolved conflicts can be tracked. The DLQ will contain the writes that LDR cannot apply after the retry period of a minute, which could occur if:

I think this list could be updated. I think it could read:

  • Presence of Unique index on destination table (see section below)
  • (only for 24.3) Loss of Quorum of the underlying ranges of the destination table

we also prevent DROP table on LDR tables on all versions given cockroachdb/cockroach#136172

Contributor Author

Is this what you were thinking for the list (24.3) below @msbutler? lmk if I mixed things up at all...

for 24.3:

  • Presence of Unique index on destination table (see section below)
  • Loss of Quorum of the underlying ranges of the destination table

for 25.1/2

  • Presence of Unique index on destination table (see section below)

ah, did not see the updated commit. you can remove "table schemas do not match"

@msbutler msbutler left a comment

reviewed current. looks good!

- The destination table was dropped.
- The destination cluster is unavailable.
- Tables schemas do not match.
- The destination table is unavailable.

@msbutler msbutler May 23, 2025

we can remove the unavailable line and the table schemas line for 25.1 and 25.2

@katmayb katmayb requested a review from msbutler May 23, 2025 17:55
@@ -39,7 +39,6 @@ When the LDR job starts, it will create a DLQ table with each replicating table

- The destination table is unavailable.
- [Loss of quorum]({% link {{ page.version.version }}/architecture/replication-layer.md %}#overview) of the underlying [ranges]({% link {{ page.version.version }}/architecture/reads-and-writes-overview.md %}#range) in the destination table.

oof. sorry i should have caught this earlier. the first 2 bullets are essentially describing the same thing, so you could delete the first one.

@katmayb katmayb requested a review from rmloveland May 23, 2025 18:57
@katmayb katmayb removed the request for review from rmloveland May 27, 2025 15:54
@katmayb katmayb requested a review from alicia-l2 May 28, 2025 18:13
@katmayb (Contributor, Author) commented May 28, 2025

@alicia-l2 @jeffswenson can you check we landed where you wanted on the example?

@alicia-l2 alicia-l2 left a comment

LGTM just a few comments

-When you run LDR in [`immediate` mode](#modes), you cannot replicate a table with [foreign key constraints]({% link {{ page.version.version }}/foreign-key.md %}). In [`validated` mode](#modes), foreign key constraints **must** match. All constraints are enforced at the time of SQL/application write.
+#### Unique secondary indexes

LDR cannot guarantee that the [_dead letter queue_ (DLQ)]({% link {{ page.version.version }}/manage-logical-data-replication.md %}) will remain empty if the destination table has a unique [secondary index]({% link {{ page.version.version }}/schema-design-indexes.md %}). The two clusters in LDR operate independently, so writes to one cluster can conflict with writes to the other.

i wonder if there is a way to rephrase the first sentence? we can't guarantee that the DLQ will remain empty even in situations where the destination table does not have a unique secondary index. maybe we can phrase it from more of a guidance perspective like 'to avoid DLQ entries we recommend no unique secondary indexes on the destination table?'

Contributor Author

I rephrased the first sentence here, ok?

@jeffswenson jeffswenson left a comment

LGTM

@katmayb katmayb requested a review from rmloveland May 30, 2025 12:39
@rmloveland (Contributor) left a comment

LGTM!

INSERT INTO city (100, nyc); -- timestamp 4
~~~

_Timestamp 5:_ Range containing primary key `1` on the destination cluster is unavailable for a few minutes due to a network partition.
Contributor

First mention of range in this section could link to https://www.cockroachlabs.com/docs/v25.2/architecture/glossary.html#range


When the destination table includes unique [secondary indexes]({% link {{ page.version.version }}/schema-design-indexes.md %}), it can cause rows to enter the [_dead letter queue_ (DLQ)]({% link {{ page.version.version }}/manage-logical-data-replication.md %}). The two clusters in LDR operate independently, so writes to one cluster can conflict with writes to the other.

If the application modifies the same row in both clusters, LDR resolves the conflict using _last write wins_ (LWW) conflict resolution. [`UNIQUE` constraints]({% link {{ page.version.version }}/unique.md %}) are validated locally in each cluster, therefore if a replicated write violates a `UNIQUE` constraint on the destination cluster (possibly because a conflicting write was already applied to the row) the replicating row will be applied to the DLQ.
Contributor

i think earlier you wrote that the LWW is based on MVCC timestamp - if you said that here as well a good reference link for MVCC is: https://www.cockroachlabs.com/docs/v25.2/architecture/storage-layer.html#mvcc

we also expose a crdb_internal_mvcc_timestamp but it's not really "officially supported" AFAICT since it uses the crdb_internal_* prefix

INSERT INTO city (100, nyc); -- timestamp 4
~~~

_Timestamp 5:_ Range containing primary key `1` on the destination cluster is unavailable for a few minutes due to a network partition.
Contributor

same comments as above re: links to MVCC, network partition, range glossary definition, etc. (non-blocking suggestions only!)

@@ -87,10 +87,6 @@ The [`VECTOR`]({% link {{ page.version.version }}/vector.md %}) data type stores

[Organizing CockroachDB {{ site.data.products.cloud }} clusters using folders]({% link cockroachcloud/folders.md %}) is in preview. Folders allow you to organize and manage access to your clusters according to your organization's requirements. For example, you can create top-level folders for each business unit in your organization, and within those folders, organize clusters by geographic location and then by level of maturity, such as production, staging, and testing.

### Logical data replication (LDR) for CockroachDB {{ site.data.products.core }}
Contributor

KILL IT WITH FIRE


When the destination table includes unique [secondary indexes]({% link {{ page.version.version }}/schema-design-indexes.md %}), it can cause rows to enter the [_dead letter queue_ (DLQ)]({% link {{ page.version.version }}/manage-logical-data-replication.md %}). The two clusters in LDR operate independently, so writes to one cluster can conflict with writes to the other.

If the application modifies the same row in both clusters, LDR resolves the conflict using _last write wins_ (LWW) conflict resolution. [`UNIQUE` constraints]({% link {{ page.version.version }}/unique.md %}) are validated locally in each cluster, therefore if a replicated write violates a `UNIQUE` constraint on the destination cluster (possibly because a conflicting write was already applied to the row) the replicating row will be applied to the DLQ.
Contributor

same suggestions as above re: sprinkling a few links re: range, MVCC, network partition
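
For readers following the thread, the unique secondary index behavior described in the proposed text can be illustrated with a toy model. This is only a sketch and not CockroachDB's implementation: `Row`, `Cluster`, and `apply_replicated` are hypothetical names, the integer `ts` is an assumed stand-in for an MVCC timestamp, and the retry period before a write reaches the DLQ is omitted.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Row:
    pk: int
    name: str  # column covered by a hypothetical UNIQUE secondary index
    ts: int    # integer stand-in for the write's MVCC timestamp (assumption)


@dataclass
class Cluster:
    rows: dict = field(default_factory=dict)  # pk -> Row
    dlq: list = field(default_factory=list)   # replicated writes that could not be applied

    def apply_replicated(self, incoming: Row) -> str:
        existing = self.rows.get(incoming.pk)
        # Same primary key written on both sides: last write wins by timestamp.
        if existing is not None and existing.ts >= incoming.ts:
            return "skipped"
        # UNIQUE is validated locally: a different row that already holds the
        # same indexed value blocks the replicated write, which goes to the DLQ.
        for other in self.rows.values():
            if other.pk != incoming.pk and other.name == incoming.name:
                self.dlq.append(incoming)
                return "dlq"
        self.rows[incoming.pk] = incoming
        return "applied"


dest = Cluster()
dest.rows[1] = Row(pk=1, name="nyc", ts=2)  # local write on the destination

results = [
    dest.apply_replicated(Row(pk=1, name="boston", ts=1)),  # older than the local row
    dest.apply_replicated(Row(pk=100, name="nyc", ts=4)),   # collides on the unique column
    dest.apply_replicated(Row(pk=1, name="sf", ts=5)),      # newer than the local row
]
print(results, len(dest.dlq))
```

In this model the same-key writes are settled by timestamp alone, while the write for primary key `100` has no row to compete with under last write wins and is rejected only because of the locally validated unique value. That is the case the new section warns about.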

5 participants