This repository was archived by the owner on Dec 13, 2023. It is now read-only.

Commit 4cac4f9

Limitations of Spark speculative execution (#1315)
1 parent bbd54db commit 4cac4f9

1 file changed (+2 -3 lines changed)

drivers/spark-connector-new.md

Lines changed: 2 additions & 3 deletions
@@ -258,8 +258,6 @@ This operation is not atomic, therefore some documents could be successfully wri
 - `ignore`
 - `update` with `keep.null=true`
 
-These configurations are also compatible with speculative execution of tasks.
-
 A failing batch-saving request is retried once for every Coordinator. After that, if still failing, the write task for the related partition is aborted. According to the Spark configuration, the task can be retried and rescheduled on a different executor, if the provided write configuration allows idempotent requests (as described above).
 
 If a task ultimately fails and is aborted, the entire write job will be aborted as well. Depending on the `SaveMode` configuration, the following cleanup operations will be performed:
@@ -280,7 +278,8 @@ When writing to an edge collection (`table.type=edge`), the schema of the Datafr
 - Batch writes are not performed atomically, so sometimes (i.e. in case of `overwrite.mode: conflict`) several documents in the batch may be written and others may return an exception (i.e. due to a conflicting key).
 - Writing records with the `_key` attribute is only allowed on collections sharded by `_key`.
 - In case of the `Append` save mode, failed jobs cannot be rolled back and the underlying data source may require manual cleanup.
-- Speculative execution of tasks would only work for idempotent write configurations. See [Write Resiliency](#write-resiliency) for more details.
+- Speculative execution of tasks only works for idempotent write configurations. See [Write Resiliency](#write-resiliency) for more details.
+- Speculative execution of tasks can cause concurrent writes to the same documents, resulting in write-write conflicts or lock timeouts
 
 ## Mapping Configuration
 
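The two limitations added by this commit can be turned into a small pre-flight check. The sketch below is illustrative only and is not part of the connector: the helper names `is_idempotent_write` and `speculation_warnings` are hypothetical, and the set of idempotent configurations is assumed to be just the entries visible in the diff context (`ignore`, and `update` with `keep.null=true`). The full list in the documentation may contain entries above the hunk that are not shown here.

```python
# Hypothetical helpers, not part of the ArangoDB Spark connector.
# Assumption: only the `overwrite.mode` values visible in the diff context
# are treated as idempotent; the complete list may be longer.

def is_idempotent_write(options):
    """Return True if the write options match an idempotent configuration."""
    mode = options.get("overwrite.mode")
    if mode == "ignore":
        return True
    if mode == "update":
        # Accept both the string "true" and the boolean True.
        return str(options.get("keep.null", "")).lower() == "true"
    return False


def speculation_warnings(options):
    """List the documented limitations that apply if speculation is enabled."""
    warnings = []
    if not is_idempotent_write(options):
        warnings.append(
            "write configuration is not idempotent: "
            "speculative execution only works for idempotent configurations"
        )
    # This limitation applies even when the configuration is idempotent.
    warnings.append(
        "speculative tasks can write the same documents concurrently: "
        "write-write conflicts or lock timeouts are possible"
    )
    return warnings


if __name__ == "__main__":
    for message in speculation_warnings({"overwrite.mode": "conflict"}):
        print("WARNING:", message)
```

A job could run such a check before deciding whether to leave Spark's `spark.speculation` setting enabled for a write stage. Whether to disable it is a judgment call that the changed documentation does not make for the reader.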