@rozza commented Nov 26, 2025

Requires unordered inserts; otherwise this configuration is ignored.

SPARK-451

@rozza requested a review from Copilot November 26, 2025 14:05
@rozza requested a review from a team as a code owner November 26, 2025 14:05
@rozza requested a review from stIncMale November 26, 2025 14:05
Copilot finished reviewing on behalf of rozza November 26, 2025 14:08
Copilot AI left a comment

Pull request overview

This PR adds a new ignoreDuplicatesOnInsert configuration option that lets insert operations ignore duplicate key errors. The option only takes effect for unordered bulk inserts; otherwise it is ignored and a warning is logged.

  • Adds new configuration ignoreDuplicatesOnInsert in WriteConfig with validation logic
  • Implements exception handling in MongoDataWriter to catch and ignore duplicate key errors (error code 11000) when enabled
  • Includes a unit test of the configuration under different settings and an integration test covering both _id-based and unique index-based duplicates
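
The validation summarized in the first bullet could look roughly like the sketch below. It is not the connector's actual code: the class and method names are hypothetical, options are modelled as a plain `Map<String, String>`, and the `ordered` and `operationType` keys and all three default values are assumptions. Only the `ignoreDuplicatesOnInsert` key and the "ignore with a warning" behaviour come from the overview above.

```java
import java.util.Map;
import java.util.logging.Logger;

// Hypothetical stand-in for the validation inside WriteConfig.
final class IgnoreDuplicatesValidation {
    private static final Logger LOGGER =
            Logger.getLogger(IgnoreDuplicatesValidation.class.getName());

    // Returns true only when the option is enabled AND the write is an
    // unordered insert; otherwise the option is ignored and a warning is logged.
    static boolean ignoreDuplicatesOnInsert(Map<String, String> options) {
        // Assumed default: the feature is off unless requested.
        boolean requested = Boolean.parseBoolean(
                options.getOrDefault("ignoreDuplicatesOnInsert", "false"));
        if (!requested) {
            return false;
        }
        // Assumed keys and defaults: ordered=true, operationType=replace.
        boolean ordered = Boolean.parseBoolean(options.getOrDefault("ordered", "true"));
        boolean isInsert =
                "insert".equalsIgnoreCase(options.getOrDefault("operationType", "replace"));
        if (ordered || !isInsert) {
            LOGGER.warning("ignoreDuplicatesOnInsert requires unordered inserts; ignoring it.");
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> options = Map.of(
                "ignoreDuplicatesOnInsert", "true",
                "ordered", "false",
                "operationType", "insert");
        System.out.println(ignoreDuplicatesOnInsert(options));
    }
}
```

Returning `false` rather than throwing mirrors the "ignored with a warning" behaviour: a misconfigured job still runs, but duplicate key errors surface as they did before.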

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/main/java/com/mongodb/spark/sql/connector/config/WriteConfig.java | Adds IGNORE_DUPLICATES_ON_INSERT_CONFIG constant, default value, and ignoreDuplicatesOnInsert() method with validation logic that ensures unordered inserts and INSERT operation type |
| src/main/java/com/mongodb/spark/sql/connector/write/MongoDataWriter.java | Adds exception handling to catch MongoBulkWriteException and ignore it when all errors are duplicate key errors (code 11000) and the feature is enabled |
| src/test/java/com/mongodb/spark/sql/connector/config/MongoConfigTest.java | Adds unit test testWriteConfigIgnoreDuplicatesOnInsert() to verify configuration behavior under different settings |
| src/integrationTest/java/com/mongodb/spark/sql/connector/write/MongoSparkConnectorWriteTest.java | Adds integration test testIgnoreDuplicates() to verify end-to-end functionality with both _id-based and unique index-based duplicates |
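
The exception handling described for MongoDataWriter can be illustrated with a minimal sketch. It assumes nothing about the real implementation: `BulkWriteFailure` is a hypothetical stand-in for the driver's `MongoBulkWriteException` that carries only the per-document error codes (in the driver these would presumably be read from `getWriteErrors()`), and `DuplicateKeyFilter` and its methods are illustrative names.

```java
import java.util.List;

// Hypothetical illustration of "ignore the failure only if every error is a duplicate key".
final class DuplicateKeyFilter {
    // MongoDB server error code for a duplicate key violation.
    static final int DUPLICATE_KEY_ERROR_CODE = 11000;

    // Stand-in for MongoBulkWriteException so the sketch compiles without the driver.
    static final class BulkWriteFailure extends RuntimeException {
        final List<Integer> errorCodes;

        BulkWriteFailure(List<Integer> errorCodes) {
            super("bulk write failed with codes " + errorCodes);
            this.errorCodes = List.copyOf(errorCodes);
        }
    }

    // True when there is at least one write error and all of them are code 11000.
    static boolean isOnlyDuplicateKeyErrors(BulkWriteFailure e) {
        return !e.errorCodes.isEmpty()
                && e.errorCodes.stream().allMatch(code -> code == DUPLICATE_KEY_ERROR_CODE);
    }

    // Runs the bulk insert; swallows the failure only when the feature is
    // enabled and every reported error is a duplicate key error.
    static void write(Runnable bulkInsert, boolean ignoreDuplicatesOnInsert) {
        try {
            bulkInsert.run();
        } catch (BulkWriteFailure e) {
            if (!(ignoreDuplicatesOnInsert && isOnlyDuplicateKeyErrors(e))) {
                throw e;
            }
        }
    }

    public static void main(String[] args) {
        // Simulated insert that hits two duplicate keys; the failure is swallowed.
        write(() -> { throw new BulkWriteFailure(List.of(11000, 11000)); }, true);
        System.out.println("duplicates ignored");
    }
}
```

Any error other than 11000 in the batch causes the whole exception to be rethrown, so genuine failures such as validation errors are never hidden. The check only makes sense for unordered inserts, because an ordered bulk write stops at the first error and would leave the remaining documents unwritten.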

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>