Skip to content

Resolve minor bugs in data loader core #2752

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jun 12, 2025

Conversation

inv-jishnu
Copy link
Contributor

@inv-jishnu inv-jishnu commented Jun 10, 2025

Description

This PR addresses several bugs encountered while testing the CLI application for data import using transactions.

Incorrect success/failure count in transaction import

There was a mismatch in success and failure counts when an import failed within a transaction batch. Although all entries were initially marked as saved, if any transaction failed during the pre-commit check, their statuses were changed to aborted. However, the success/failure counts were calculated before this status update, leading to incorrect reporting.

Updated the logic to determine if any transaction in the batch failed:

  • If so, the full batch size is counted as failures.
  • Otherwise, it is counted as successes.

Missing namespace and table Names in column validation errors

Column validation error messages were showing null for the namespace and table name. This happened because ColumnInfo was being constructed without these values.

Updated ColumnInfo initialization to include namespace and table names. Also added the invalid value in the error message to improve clarity.

Failure and summary logs not generated on import failure

When an import failed during a transaction batch, failure and summary logs were not being created. This was caused by an attempt to update a singleton list used to store transaction batch results, which threw a java.lang.UnsupportedOperationException.

Changed the list structure to a modifiable list, allowing proper updates and enabling correct logging.

Related issues and/or PRs

NA

Changes made

  • Updated the logic to correctly calculate success and failure counts for each data chunk.
  • Added table and namespace name parameters to ensure they are not null and are properly included in error messages.
    • Also updated the error message to include the value that caused the error for better context.
    • Included the import source record in the failure log to provide additional information.
  • Changed the creation of the status object list to use an ArrayList instead of a singleton list, as the list may need to be modified later in case of specific errors.

Checklist

The following is a best-effort checklist. If any items in this checklist are not applicable to this PR or are dependent on other, unmerged PRs, please still mark the checkboxes after you have read and understood each item.

  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes.
  • I have considered whether similar issues could occur in other products, components, or modules if this PR is for bug fixes.
  • Any remaining open issues linked to this PR are documented and up-to-date (Jira, GitHub, etc.).
  • Tests (unit, integration, etc.) have been added for the changes.
  • My changes generate no new warnings.
  • Any dependent changes in other PRs have been merged and published.

Additional notes (optional)

NA

Release notes

Fixes for minor bugs and issues in data loader core

@inv-jishnu inv-jishnu self-assigned this Jun 10, 2025
@inv-jishnu inv-jishnu marked this pull request as draft June 10, 2025 12:12
@ypeckstadt ypeckstadt changed the title Fixes for minor bugs and issues in data loader core Resolve minor bugs in data loader core Jun 11, 2025
@ypeckstadt ypeckstadt marked this pull request as ready for review June 11, 2025 00:37
@ypeckstadt ypeckstadt requested review from Copilot, ypeckstadt, a team, komamitsu, brfrn169, feeblefakie and Torch3333 and removed request for a team June 11, 2025 00:37
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes transactional import counting, enriches column validation errors with detailed context, and ensures failure logs are generated correctly.

  • Adjusted success/failure counters to aggregate per record batch
  • Extended ColumnUtils to include namespace, table, and invalid value in error messages
  • Switched to modifiable lists in ImportTask to prevent UnsupportedOperationException and updated CoreError definitions

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
data-loader/core/src/test/java/com/scalar/db/dataloader/core/util/ColumnUtilsTest.java Updated tests to expect the invalid value in error messages and added a null-result case
data-loader/core/src/main/java/com/scalar/db/dataloader/core/util/ColumnUtils.java Extended methods to accept namespace and table parameters and populate ColumnInfo accordingly
data-loader/core/src/main/java/com/scalar/db/dataloader/core/dataimport/task/ImportTask.java Changed singleton list to ArrayList for targets and included the source record on validation failure
data-loader/core/src/main/java/com/scalar/db/dataloader/core/dataimport/processor/ImportProcessor.java Refactored batch counting logic to correctly sum successes and failures per batch
core/src/main/java/com/scalar/db/common/error/CoreError.java Enhanced error message templates to include the invalid value placeholder
Comments suppressed due to low confidence (3)

data-loader/core/src/main/java/com/scalar/db/dataloader/core/util/ColumnUtils.java:181

  • [nitpick] Rename the parameter 'table' to 'tableName' across ColumnUtils methods to align with the existing convention (columnInfo.getTableName()) and reduce ambiguity.
String table)

data-loader/core/src/main/java/com/scalar/db/dataloader/core/util/ColumnUtils.java:303

  • [nitpick] Similarly, rename the 'table' parameter in getColumnFromSourceRecord to 'tableName' for consistency with other method signatures and builder patterns.
String table)

data-loader/core/src/test/java/com/scalar/db/dataloader/core/util/ColumnUtilsTest.java:224

  • Consider adding assertions in this test to verify that each returned Column includes the correct namespace and tableName metadata when the result is null.
void getColumnsFromResult_withResultNull_withValidData_shouldReturnColumns()

@@ -689,13 +689,13 @@ public enum CoreError implements ScalarDbError {
DATA_LOADER_INVALID_BASE64_ENCODING_FOR_COLUMN_VALUE(
Category.USER_ERROR,
"0149",
"Invalid base64 encoding for blob value for column %s in table %s in namespace %s",
"Invalid base64 encoding for blob value '%s' for column %s in table %s in namespace %s",
Copy link
Preview

Copilot AI Jun 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Add a trailing period to this error message string to match the punctuation style of other CoreError templates (e.g., end with a period).

Suggested change
"Invalid base64 encoding for blob value '%s' for column %s in table %s in namespace %s",
"Invalid base64 encoding for blob value '%s' for column %s in table %s in namespace %s.",

Copilot uses AI. Check for mistakes.

Copy link
Contributor

@komamitsu komamitsu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 👍

Copy link
Collaborator

@brfrn169 brfrn169 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thank you!

Copy link
Contributor

@ypeckstadt ypeckstadt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you.

@inv-jishnu inv-jishnu requested a review from Torch3333 June 12, 2025 03:34
Copy link
Contributor

@Torch3333 Torch3333 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

Copy link
Contributor

@feeblefakie feeblefakie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thank you!

@feeblefakie feeblefakie merged commit 5a45200 into master Jun 12, 2025
55 checks passed
@feeblefakie feeblefakie deleted the feat/data-loader/core-minor-fixes branch June 12, 2025 04:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants