
Conversation

@yruslan
Collaborator

@yruslan yruslan commented Sep 10, 2025

Closes #782

Summary by CodeRabbit

  • New Features
  • Added support for COBOL DISPLAY numeric encoding (sign-separated and overpunch, optional explicit decimal point); a byte-level sketch of both sign styles follows this list.
  • Bug Fixes
    • Excluded REDEFINES fields from record combination to avoid incorrect validation/encoding.
  • Documentation
    • README updated to list DISPLAY as an allowed PIC S9(n) numeric type.
  • Tests
    • Added comprehensive DISPLAY encoding tests, writer output/round-trip checks, null-input cases, and improved byte-array comparisons.
  • Chores
    • Updated Spark and test dependency versions and cleaned build properties.
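
For readers new to DISPLAY numerics, here is a minimal byte-level sketch of the two sign styles, using standard EBCDIC code points (digits 0xF0..0xF9, '+' 0x4E, '-' 0x60). The helper names are illustrative, not the project's actual API:

object DisplaySketch {
  // EBCDIC digits '0'..'9' occupy 0xF0..0xF9.
  private def ebcdicDigit(d: Int): Byte = (0xF0 | d).toByte

  // Sign-separate: an explicit sign byte (leading here) plus zero-padded digits.
  def signSeparateLeading(value: Int, digits: Int): Array[Byte] = {
    val sign: Byte = if (value < 0) 0x60.toByte else 0x4E.toByte
    val padded = math.abs(value).toString.reverse.padTo(digits, '0').reverse
    sign +: padded.map(c => ebcdicDigit(c - '0')).toArray
  }

  // Overpunch: the sign is folded into the zone nibble of the last digit,
  // zone 0xC for positive and 0xD for negative (so 0xC5 reads as "+5").
  def signOverpunched(value: Int, digits: Int): Array[Byte] = {
    val padded = math.abs(value).toString.reverse.padTo(digits, '0').reverse
    val bytes = padded.map(c => ebcdicDigit(c - '0')).toArray
    val zone = if (value < 0) 0xD0 else 0xC0
    bytes(bytes.length - 1) = (zone | (padded.last - '0')).toByte
    bytes
  }
}

For example, signOverpunched(-123, 4) yields 0xF0 0xF1 0xF2 0xD3, while signSeparateLeading(-123, 4) yields 0x60 0xF0 0xF1 0xF2 0xF3.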

@coderabbitai
Contributor

coderabbitai bot commented Sep 10, 2025

Walkthrough

Adds DISPLAY numeric encoding support to the EBCDIC writer via a new DisplayEncoders implementation and EncoderSelector wiring, plus test coverage (unit + writer integration), a REDEFINES filter in BasicRecordCombiner, a test-utils hex-formatting tweak, dependency version bumps, and a README docs update.

Changes

Cohort / File(s): Summary of changes

• Display encoders implementation
  cobol-parser/src/main/scala/.../encoding/DisplayEncoders.scala
  New DisplayEncoders object providing encoding entry points for DISPLAY numerics (sign-separate and sign-overpunched), input validation, scaling/rounding, and EBCDIC byte-population helpers.

• Encoder selection wiring
  cobol-parser/src/main/scala/.../encoding/EncoderSelector.scala
  Adds getDisplayEncoder(...) and routes DISPLAY numeric cases (integral/decimal with no compact usage, i.e. DISPLAY) to the new display encoders, handling the sign position and explicit-decimal-point flags.

• Parser / encoder tests
  cobol-parser/src/test/scala/.../encoding/DisplayEncodersSuite.scala, .../BCDNumberEncodersSuite.scala, .../BinaryEncodersSuite.scala
  New DisplayEncodersSuite; added null-input tests for the BCD and binary encoders; import adjustments. Tests validate various sign formats, scales, overflows, and null handling.

• Writer integration + test helpers
  spark-cobol/src/test/scala/.../writer/FixedLengthEbcdicWriterSuite.scala
  Adds a DISPLAY-focused writer test that writes DataFrame rows with DISPLAY fields and asserts the exact EBCDIC output; enables TextComparisonFixture; introduces an assertArraysEqual helper for byte comparisons. (A copybook sketch follows this table.)

• Record combiner logic
  spark-cobol/src/main/scala/.../writer/BasicRecordCombiner.scala
  Adds an early filter in combine() that drops AST children with REDEFINES before schema validation/encoding.

• Test-utils formatting
  cobol-parser/src/test/scala/.../testutils/ComparisonUtils.scala
  Changes assertion-failure hex formatting to 0x-prefixed uppercase two-digit hex values joined by ", " (presentation-only change).

• Build / dependencies
  pom.xml, project/Dependencies.scala
  Bumps the ScalaTest and Spark versions; removes explicit maven.compiler.source/target; adjusts dependency version variables/formatting.

• Docs
  README.md
  Documentation update in the EBCDIC Writer "Current Limitations" section: PIC S9(n) numeric types now list DISPLAY as allowed.
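
As a concrete reference for the writer-integration cohort above, a copybook exercising the newly supported DISPLAY shapes could look like the snippet below. This is a hedged sketch: the field names are hypothetical and the suite's actual copybook may differ.

val copybookContentsWithDisplayFields: String =
  """      01  RECORD.
    |        05  NUM-OVERPUNCH    PIC S9(4)      DISPLAY.
    |        05  NUM-LEAD-SIGN    PIC S9(4)      SIGN IS LEADING SEPARATE.
    |        05  NUM-TRAIL-SIGN   PIC S9(4)      SIGN IS TRAILING SEPARATE.
    |        05  NUM-EXPLICIT     PIC 9(3).9(2)  DISPLAY.
    |""".stripMargin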

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Writer as EBCDIC Writer
  participant Combiner as BasicRecordCombiner
  participant Selector as EncoderSelector
  participant Display as DisplayEncoders

  Writer->>Combiner: combine(copybookFields)
  note right of Combiner #E8F3F8: New early filter drops fields with REDEFINES
  Combiner-->>Writer: filteredFields

  loop per field
    Writer->>Selector: getEncoder(field metadata)
    alt DISPLAY numeric (compact empty)
      Selector->>Display: encodeDisplayNumber*(value, precision, scale, signPosition, explicitDecimal, isSignSeparate)
      Display-->>Selector: Array[Byte]
      Selector-->>Writer: Encoder wrapping byte output
    else other formats
      Selector-->>Writer: other Encoder
    end
    Writer->>Writer: serialize value using Encoder -> bytes
  end

  Writer-->>User: write part file (bytes)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Pre-merge checks (4 passed, 1 warning)

❌ Failed checks (1 warning)
• Out of Scope Changes Check: ⚠️ Warning
  Explanation: While most additions implement DISPLAY numeric encoding, the PR also includes unrelated updates such as global version bumps in pom.xml and Dependencies.scala, assertion-formatting changes in ComparisonUtils, null-input tests in BCDNumberEncodersSuite and BinaryEncodersSuite, and the BasicRecordCombiner REDEFINES filter, all of which fall outside the linked issue's scope of adding DISPLAY support to the EBCDIC writer.
  Resolution: Move the non-DISPLAY changes (version bumps, test-utility formatting, null-input tests for other encoders, the REDEFINES filter) into separate PRs or issues so that this PR focuses solely on adding DISPLAY format support for the EBCDIC writer.
✅ Passed checks (4 passed)
• Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
• Title Check: ✅ Passed. The title concisely describes the primary change (adding DISPLAY type support to the Spark-COBOL writer), directly matches the linked issue's objective, and avoids unnecessary detail.
• Linked Issues Check: ✅ Passed. The changes introduce and wire the new display numeric encoder (DisplayEncoders and getDisplayEncoder) into both the parser and the Spark-COBOL writer, add end-to-end tests for DISPLAY fields in the writer suite, and update documentation to reflect DISPLAY support, fully satisfying the linked issue's objective.
• Docstring Coverage: ✅ Passed. No functions found in the changes; docstring coverage check skipped.

Poem

I nibble bytes with gentle paws,
Display digits learn the laws.
Overpunch hugs the leading place,
Separate signs find their space.
REDEFINES hop out — hooray! — and bytes fall into place. 🐇✨


📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 13445ff and e4b1633.

📒 Files selected for processing (1)
  • spark-cobol/src/test/scala/za/co/absa/cobrix/spark/cobol/writer/FixedLengthEbcdicWriterSuite.scala (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • spark-cobol/src/test/scala/za/co/absa/cobrix/spark/cobol/writer/FixedLengthEbcdicWriterSuite.scala
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Spark 3.4.4 on Scala 2.12.20
  • GitHub Check: Spark 3.5.5 on Scala 2.13.16
  • GitHub Check: Spark 3.5.5 on Scala 2.12.20
  • GitHub Check: Spark 2.4.8 on Scala 2.11.12
  • GitHub Check: test (2.12.20, 2.12, 3.3.4, 0, 80, 20)

@github-actions

github-actions bot commented Sep 10, 2025

JaCoCo code coverage report - 'cobol-parser'

Overall Project: 91.71% (-0.01%) 🍏
Files changed: 98.37% 🍏

File                    Coverage
DisplayEncoders.scala   98.16% (-1.84%) 🍏
EncoderSelector.scala   90.26% 🍏

@github-actions
Copy link

github-actions bot commented Sep 10, 2025

JaCoCo code coverage report - 'spark-cobol'

File                        Coverage
BasicRecordCombiner.scala   58.95%

Total Project Coverage: 79.79% 🍏

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pom.xml (1)

101-113: Restore missing compiler source/target properties

The Maven Compiler Plugin in pom.xml (lines 271–272) still references ${maven.compiler.source} and ${maven.compiler.target}, which are undefined and will break the build. Re-add them in the <properties> section, for example:

   <properties>
+    <maven.compiler.source>1.8</maven.compiler.source>
+    <maven.compiler.target>1.8</maven.compiler.target>
     <maven.scalatest.version>2.2.0</maven.scalatest.version>
     <!-- … -->
   </properties>
♻️ Duplicate comments (1)
cobol-parser/src/test/scala/za/co/absa/cobrix/cobol/parser/encoding/DisplayEncodersSuite.scala (1)

547-550: Verify the validity of another edge case test combination.

Similar to the previous comment, this test combines scale = 1 with scaleFactor = 2 and explicitDecimalPoint = true. The expected result is all zeros, which suggests this combination is invalid.

🧹 Nitpick comments (9)
README.md (1)

1689-1691: Clarify DISPLAY support modes (overpunched vs separate sign).

Since writer now handles DISPLAY, consider adding a brief note that both sign overpunching and separate leading/trailing sign are supported to set user expectations.

-  - `PIC S9(n)` numeric (integral and decimal) with `DISPLAY`, `COMP`/`COMP-4`/`COMP-5` (big-endian), `COMP-3`, and 
+  - `PIC S9(n)` numeric (integral and decimal) with `DISPLAY` (overpunched sign and leading/trailing separate sign), `COMP`/`COMP-4`/`COMP-5` (big-endian), `COMP-3`, and
spark-cobol/src/main/scala/za/co/absa/cobrix/spark/cobol/writer/BasicRecordCombiner.scala (2)

34-37: Good: early drop of REDEFINES from the writer path.

This aligns with current writer limitations and unblocks copybooks containing REDEFINES.

To simplify types and avoid later casts, filter to primitives here:

-    val copybookFields = ast.children.filter {
-      case f if f.redefines.nonEmpty => false
-      case p: Primitive              => !p.isFiller
-      case g: Group                  => !g.isFiller
-      case _                         => true
-    }
+    val copybookFields = ast.children.collect {
+      case p: Primitive if p.redefines.isEmpty && !p.isFiller => p
+    }

…and adjust downstream uses to work with Seq[Primitive].


43-44: Locale-stable lowercasing.

Use Locale.ROOT to avoid surprises on non-English locales.

-    val sparkFields = df.schema.fields.map(_.name.toLowerCase)
+    val sparkFields = df.schema.fields.map(_.name.toLowerCase(java.util.Locale.ROOT))
spark-cobol/src/test/scala/za/co/absa/cobrix/spark/cobol/writer/FixedLengthEbcdicWriterSuite.scala (2)

262-281: Nit: variable name doesn't match content.

copybookContentsWithBinFields contains DISPLAY definitions. Rename for clarity.

-        val copybookContentsWithBinFields =
+        val copybookContentsWithDisplayFields =

…and update the usage below.


407-415: Reuse shared array comparison to avoid duplication.

You already have ComparisonUtils in cobol-parser tests; if feasible, depend on that test util or move it to a shared test module.

  def assertArraysEqual(actual: Array[Byte], expected: Array[Byte]): Assertion = {
    if (!actual.sameElements(expected)) {
      val actualHex = actual.map(b => f"0x$b%02X").mkString(", ")
      val expectedHex = expected.map(b => f"0x$b%02X").mkString(", ")
      fail(s"Actual:   $actualHex\nExpected: $expectedHex")
    } else {
      succeed
    }
  }

Alternatively, import and use the existing ComparisonUtils.assertArraysEqual if test scope dependency boundaries allow.

cobol-parser/src/main/scala/za/co/absa/cobrix/cobol/parser/encoding/DisplayEncoders.scala (4)

48-54: Prefer toPlainString to avoid scientific notation surprises.

BigDecimal.toString may use exponent form in some edge cases. toPlainString is safer for fixed-field encoding.

-      val bigDecimalValue1 = bigDecimal.toString
+      val bigDecimalValue1 = bigDecimal.toPlainString

Also applies to: 104-110
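
For context, a quick illustration of where the two renderings diverge (plain java.math.BigDecimal; the values are chosen to trigger scientific notation):

val tiny = new java.math.BigDecimal("0.0000001")
tiny.toString       // "1E-7"  (adjusted exponent < -6 switches toString to exponent form)
tiny.toPlainString  // "0.0000001"

val big = new java.math.BigDecimal("1E+3")
big.toString        // "1E+3"  (negative scale also renders with an exponent)
big.toPlainString   // "1000"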


55-59: Consider enforcing precision against digit count, not just outputSize.

Currently only field width is checked. If precision < number of digits (ignoring sign/decimal separator), we should reject/zero. Add a digit-count check after computing bigDecimalValue/intValue.

If desired, I can draft a minimal patch and tests to enforce this.

Also applies to: 111-115, 126-130, 70-76
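
A minimal shape for such a guard (hypothetical names, assuming the normalized numeric string is in scope):

// Sign and separators don't consume PIC digit positions, so count digits only.
val digitCount = normalizedValue.count(_.isDigit)
if (digitCount > precision) {
  // Treat as overflow, mirroring the existing behavior: emit a zero-filled field,
  // e.g. new Array[Byte](outputSize)
}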


137-175: Minor: extract EBCDIC constants to improve readability.

Avoid magic numbers (0x40, 0xF0, 0x4B, 0x6B, 0x60, zone nibbles) with named vals at the top of the object.

I can provide a small refactor if you want this tidied up.

Also applies to: 197-215
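
Something along these lines would do (standard EBCDIC code points; the names are only a suggestion):

private val EbcdicSpace: Byte        = 0x40.toByte // ' '
private val EbcdicZeroDigit: Byte    = 0xF0.toByte // '0'; digits occupy 0xF0..0xF9
private val EbcdicDecimalPoint: Byte = 0x4B.toByte // '.'
private val EbcdicComma: Byte        = 0x6B.toByte // ','
private val EbcdicMinus: Byte        = 0x60.toByte // '-'
private val EbcdicPlus: Byte         = 0x4E.toByte // '+'
private val PositiveZone: Int        = 0xC0        // overpunch zone nibble for '+'
private val NegativeZone: Int        = 0xD0        // overpunch zone nibble for '-'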


40-76: Reduce duplication: factor number-normalization into a helper.

Both encoders duplicate the normalize/scale/length-check logic. A shared private method returning the normalized string (and a boolean overflow flag) would simplify maintenance.

Happy to sketch this helper if you’d like.

Also applies to: 96-133
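
One possible shape for that helper, hedged since the exact scaling/rounding rules live in the current encoders (names and rounding mode are assumptions):

// Returns the normalized digit string, or None on overflow, so both encoders
// share the scale/round/validate steps and differ only in byte layout.
private def normalizeForDisplay(value: java.math.BigDecimal,
                                scale: Int,
                                maxDigits: Int): Option[String] = {
  val scaled = value.setScale(scale, java.math.RoundingMode.HALF_UP)
  val digits = scaled.abs().toPlainString
  if (digits.count(_.isDigit) > maxDigits) None else Some(digits)
}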

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 19f8600 and a328804.

📒 Files selected for processing (9)
  • README.md (1 hunks)
  • cobol-parser/src/main/scala/za/co/absa/cobrix/cobol/parser/encoding/DisplayEncoders.scala (1 hunks)
  • cobol-parser/src/main/scala/za/co/absa/cobrix/cobol/parser/encoding/EncoderSelector.scala (3 hunks)
  • cobol-parser/src/test/scala/za/co/absa/cobrix/cobol/parser/encoding/DisplayEncodersSuite.scala (1 hunks)
  • cobol-parser/src/test/scala/za/co/absa/cobrix/cobol/testutils/ComparisonUtils.scala (1 hunks)
  • pom.xml (2 hunks)
  • project/Dependencies.scala (2 hunks)
  • spark-cobol/src/main/scala/za/co/absa/cobrix/spark/cobol/writer/BasicRecordCombiner.scala (1 hunks)
  • spark-cobol/src/test/scala/za/co/absa/cobrix/spark/cobol/writer/FixedLengthEbcdicWriterSuite.scala (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
cobol-parser/src/main/scala/za/co/absa/cobrix/cobol/parser/encoding/DisplayEncoders.scala (1)
cobol-parser/src/main/scala/za/co/absa/cobrix/cobol/parser/position/Position.scala (1)
  • Right (26-26)
cobol-parser/src/test/scala/za/co/absa/cobrix/cobol/parser/encoding/DisplayEncodersSuite.scala (3)
cobol-parser/src/test/scala/za/co/absa/cobrix/cobol/testutils/ComparisonUtils.scala (2)
  • ComparisonUtils (22-33)
  • assertArraysEqual (23-31)
cobol-parser/src/main/scala/za/co/absa/cobrix/cobol/parser/encoding/DisplayEncoders.scala (3)
  • DisplayEncoders (23-216)
  • encodeDisplayNumberSignOverpunched (81-135)
  • encodeDisplayNumberSignSeparate (24-79)
cobol-parser/src/main/scala/za/co/absa/cobrix/cobol/parser/position/Position.scala (2)
  • Left (23-23)
  • Right (26-26)
cobol-parser/src/main/scala/za/co/absa/cobrix/cobol/parser/encoding/EncoderSelector.scala (2)
cobol-parser/src/main/scala/za/co/absa/cobrix/cobol/parser/decoders/BinaryUtils.scala (3)
  • cobol (165-209)
  • BinaryUtils (25-248)
  • getBytesCount (98-126)
cobol-parser/src/main/scala/za/co/absa/cobrix/cobol/parser/encoding/DisplayEncoders.scala (3)
  • DisplayEncoders (23-216)
  • encodeDisplayNumberSignSeparate (24-79)
  • encodeDisplayNumberSignOverpunched (81-135)
spark-cobol/src/test/scala/za/co/absa/cobrix/spark/cobol/writer/FixedLengthEbcdicWriterSuite.scala (1)
spark-cobol/src/test/scala/za/co/absa/cobrix/spark/cobol/source/fixtures/BinaryFileFixture.scala (1)
  • withTempDirectory (71-78)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Spark 3.4.4 on Scala 2.12.20
  • GitHub Check: Spark 3.5.5 on Scala 2.12.20
  • GitHub Check: Spark 2.4.8 on Scala 2.11.12
  • GitHub Check: Spark 3.5.5 on Scala 2.13.16
  • GitHub Check: test (2.12.20, 2.12, 3.3.4, 0, 80, 20)
🔇 Additional comments (16)
pom.xml (2)

111-113: Align Spark/ScalaTest versions across SBT and Maven.

Spark 3.5.6 and ScalaTest 3.2.19 match project/Dependencies.scala. No action needed beyond the compiler fix.


101-101: Approve ScalaTest Maven Plugin bump

project/Dependencies.scala (3)

26-26: ScalaTest 3.2.19 update is fine.

Matches the Maven property and new tests.


31-31: Spark 3.5.6 fallback for Scala 2.13 is consistent with pom.xml.

Good alignment.


90-90: Formatting-only change acknowledged.

No semantic impact.

spark-cobol/src/test/scala/za/co/absa/cobrix/spark/cobol/writer/FixedLengthEbcdicWriterSuite.scala (3)

21-21: Import of Assertion is correct.

Enables helper to return Assertion cleanly.


283-291: Deterministic write path is OK.

coalesce(1) + orderBy("A") ensures stable byte order for assertions.


305-331: Nice: explicit byte-for-byte assertions for DISPLAY.

Covers signed/unsigned, explicit decimal point, and separate signs.

Consider adding one case for overpunched positive and negative to guard regressions in overpunch tables.

cobol-parser/src/test/scala/za/co/absa/cobrix/cobol/testutils/ComparisonUtils.scala (1)

25-27: LGTM! Improved hex formatting for better readability.

The updated formatting with 0x prefix and uppercase hex values makes the assertion failure messages more professional and easier to parse during debugging.

cobol-parser/src/test/scala/za/co/absa/cobrix/cobol/parser/encoding/DisplayEncodersSuite.scala (3)

19-23: LGTM! Well-structured test suite setup.

The test suite properly imports the necessary dependencies and follows Scala testing best practices using AnyWordSpec.


24-109: Comprehensive test coverage for sign-overpunched integral encoding.

The test cases thoroughly cover various scenarios including positive/negative numbers, signed/unsigned variants, edge cases (too big numbers, negative scale), and error conditions. The expected byte arrays are correctly defined for EBCDIC encoding.


260-263: Document or handle the scaleFactor > 0 && scale > 0 edge case
The code explicitly returns all zeros when both scaleFactor and scale are positive, but there's no Scaladoc or comment explaining that combination. Please either add documentation for this behavior in encodeDisplayNumberSignOverpunched or adjust the test to a fully supported scenario.
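
A possible wording for that note (phrasing is only a suggestion):

/** Encodes a number as a DISPLAY field with an overpunched sign.
  *
  * Note: `scaleFactor > 0` combined with `scale > 0` is not a supported
  * combination; in that case the encoder returns a zero-filled array.
  */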

cobol-parser/src/main/scala/za/co/absa/cobrix/cobol/parser/encoding/EncoderSelector.scala (4)

22-22: LGTM! Clean import addition.

The Position import is properly added to support the new display encoder functionality.


52-55: LGTM! Well-integrated display encoder support.

The new pattern matching cases for integral and decimal display types are cleanly integrated into the existing encoder selection logic.


148-172: Well-structured display encoder implementation.

The getDisplayEncoder method follows the existing pattern of other encoder methods in the file, properly handles input normalization, and correctly delegates to the appropriate DisplayEncoders method based on the sign separation flag.


158-159: No action required: nulls forwarded to DisplayEncoders produce zero-filled arrays by design
Passing null here is consistent—DisplayEncoders return zero-filled arrays for nulls, per Cobrix’s design choice to encode nulls as 0x00 for round-trip fidelity.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (4)
spark-cobol/src/test/scala/za/co/absa/cobrix/spark/cobol/writer/FixedLengthEbcdicWriterSuite.scala (3)

275-284: Rename variable to reflect DISPLAY, not BIN

The copybook here is for DISPLAY fields. Rename for clarity.

-        val copybookContentsWithBinFields =
+        val copybookContentsWithDisplayFields =
@@
-          .option("copybook_contents", copybookContentsWithBinFields)
+          .option("copybook_contents", copybookContentsWithDisplayFields)
@@
-          .option("copybook_contents", copybookContentsWithBinFields)
+          .option("copybook_contents", copybookContentsWithDisplayFields)

Also applies to: 291-292, 342-343


303-307: Close stream on failure (use try/finally or Using)

Avoid potential leaks if readFully throws.

-        val data = fs.open(partFile)
-        val bytes = new Array[Byte](files.head.getLen.toInt)
-        data.readFully(bytes)
-        data.close()
+        val bytes = {
+          val in = fs.open(partFile)
+          try {
+            val arr = new Array[Byte](files.head.getLen.toInt)
+            in.readFully(arr)
+            arr
+          } finally {
+            in.close()
+          }
+        }

454-462: Hex dump prints signed bytes incorrectly; include index/length in failure

For bytes ≥ 0x80, %02X on a signed Byte widens to Int and renders the sign-extended value (e.g. 0xFFFFFFAB instead of 0xAB). Mask with & 0xFF, and surface the first mismatch index and the array lengths.

-  def assertArraysEqual(actual: Array[Byte], expected: Array[Byte]): Assertion = {
-    if (!actual.sameElements(expected)) {
-      val actualHex = actual.map(b => f"0x$b%02X").mkString(", ")
-      val expectedHex = expected.map(b => f"0x$b%02X").mkString(", ")
-      fail(s"Actual:   $actualHex\nExpected: $expectedHex")
-    } else {
-      succeed
-    }
-  }
+  def assertArraysEqual(actual: Array[Byte], expected: Array[Byte]): Assertion = {
+    if (!actual.sameElements(expected)) {
+      val mismatchIdx =
+        actual.zipAll(expected, 0.toByte, 0.toByte).indexWhere { case (a, b) => a != b }
+      val actualHex   = actual.map(b => f"0x${b & 0xFF}%02X").mkString(", ")
+      val expectedHex = expected.map(b => f"0x${b & 0xFF}%02X").mkString(", ")
+      val lenMsg = s"lengths: actual=${actual.length}, expected=${expected.length}"
+      val idxMsg = if (mismatchIdx >= 0) s"first mismatch at index $mismatchIdx" else "length mismatch"
+      fail(s"$idxMsg; $lenMsg\nActual:   $actualHex\nExpected: $expectedHex")
+    } else {
+      succeed
+    }
+  }
cobol-parser/src/test/scala/za/co/absa/cobrix/cobol/parser/encoding/BCDNumberEncodersSuite.scala (1)

20-20: Remove unused import

The Left import from za.co.absa.cobrix.cobol.parser.position.Left is not used in this test file.

-import za.co.absa.cobrix.cobol.parser.position.Left
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 236b555 and 13445ff.

📒 Files selected for processing (4)
  • cobol-parser/src/test/scala/za/co/absa/cobrix/cobol/parser/encoding/BCDNumberEncodersSuite.scala (2 hunks)
  • cobol-parser/src/test/scala/za/co/absa/cobrix/cobol/parser/encoding/BinaryEncodersSuite.scala (1 hunks)
  • cobol-parser/src/test/scala/za/co/absa/cobrix/cobol/parser/encoding/DisplayEncodersSuite.scala (1 hunks)
  • spark-cobol/src/test/scala/za/co/absa/cobrix/spark/cobol/writer/FixedLengthEbcdicWriterSuite.scala (3 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-19T09:12:31.865Z
Learnt from: yruslan
PR: AbsaOSS/cobrix#775
File: spark-cobol/src/test/scala/za/co/absa/cobrix/spark/cobol/writer/FixedLengthEbcdicWriterSuite.scala:0-0
Timestamp: 2025-08-19T09:12:31.865Z
Learning: In Cobrix COBOL writer, null values are encoded as zero bytes (0x00) for all field types, including PIC X fields, rather than using traditional COBOL padding (EBCDIC spaces 0x40 for PIC X). This design decision prioritizes consistency with the reader side, which already treats zero bytes as nulls, ensuring round-trip fidelity between reading and writing operations.

Applied to files:

  • cobol-parser/src/test/scala/za/co/absa/cobrix/cobol/parser/encoding/BinaryEncodersSuite.scala
🧬 Code graph analysis (4)
cobol-parser/src/test/scala/za/co/absa/cobrix/cobol/parser/encoding/BinaryEncodersSuite.scala (2)
cobol-parser/src/main/scala/za/co/absa/cobrix/cobol/parser/encoding/BinaryEncoders.scala (2)
  • BinaryEncoders (21-71)
  • encodeBinaryNumber (22-70)
cobol-parser/src/test/scala/za/co/absa/cobrix/cobol/testutils/ComparisonUtils.scala (1)
  • assertArraysEqual (23-31)
cobol-parser/src/test/scala/za/co/absa/cobrix/cobol/parser/encoding/BCDNumberEncodersSuite.scala (3)
cobol-parser/src/main/scala/za/co/absa/cobrix/cobol/parser/position/Position.scala (1)
  • Left (23-23)
cobol-parser/src/test/scala/za/co/absa/cobrix/cobol/testutils/ComparisonUtils.scala (2)
  • ComparisonUtils (22-33)
  • assertArraysEqual (23-31)
cobol-parser/src/main/scala/za/co/absa/cobrix/cobol/parser/encoding/BCDNumberEncoders.scala (2)
  • BCDNumberEncoders (21-108)
  • encodeBCDNumber (38-106)
cobol-parser/src/test/scala/za/co/absa/cobrix/cobol/parser/encoding/DisplayEncodersSuite.scala (3)
cobol-parser/src/test/scala/za/co/absa/cobrix/cobol/testutils/ComparisonUtils.scala (2)
  • ComparisonUtils (22-33)
  • assertArraysEqual (23-31)
cobol-parser/src/main/scala/za/co/absa/cobrix/cobol/parser/encoding/DisplayEncoders.scala (3)
  • DisplayEncoders (23-212)
  • encodeDisplayNumberSignOverpunched (82-136)
  • encodeDisplayNumberSignSeparate (24-80)
cobol-parser/src/main/scala/za/co/absa/cobrix/cobol/parser/position/Position.scala (2)
  • Left (23-23)
  • Right (26-26)
spark-cobol/src/test/scala/za/co/absa/cobrix/spark/cobol/writer/FixedLengthEbcdicWriterSuite.scala (3)
spark-cobol/src/main/scala/za/co/absa/cobrix/spark/cobol/utils/SparkUtils.scala (2)
  • SparkUtils (37-554)
  • convertDataFrameToPrettyJSON (432-442)
spark-cobol/src/test/scala/za/co/absa/cobrix/spark/cobol/source/fixtures/BinaryFileFixture.scala (1)
  • withTempDirectory (71-78)
spark-cobol/src/test/scala/za/co/absa/cobrix/spark/cobol/source/fixtures/TextComparisonFixture.scala (1)
  • compareText (24-30)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Spark 3.5.5 on Scala 2.12.20
  • GitHub Check: Spark 2.4.8 on Scala 2.11.12
  • GitHub Check: Spark 3.4.4 on Scala 2.12.20
  • GitHub Check: Spark 3.5.5 on Scala 2.13.16
  • GitHub Check: test (2.12.20, 2.12, 3.3.4, 0, 80, 20)
🔇 Additional comments (16)
spark-cobol/src/test/scala/za/co/absa/cobrix/spark/cobol/writer/FixedLengthEbcdicWriterSuite.scala (3)

21-27: LGTM: Test mixins and imports are appropriate

Bringing in Assertion and TextComparisonFixture is correct, and the updated suite signature aligns with the new JSON comparison flow.


309-337: DISPLAY expectations look correct (overpunch + separate signs)

Encodings for S9 overpunch (e.g., C/J/{/}) and separate signs (0x4E '+', 0x60 '-') match EBCDIC conventions, including decimal point 0x4B.
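
For reviewers decoding the expected bytes by eye, the standard zone mapping is: '{' = 0xC0 and 'A'..'I' = 0xC1..0xC9 carry '+', while '}' = 0xD0 and 'J'..'R' = 0xD1..0xD9 carry '-'. As an illustrative helper (not project code):

def expectedOverpunchByte(lastDigit: Int, negative: Boolean): Byte =
  ((if (negative) 0xD0 else 0xC0) | lastDigit).toByte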


339-377: Round-trip JSON check is sound

Ordering by A before write/read ensures determinism for the comparison. Using TextComparisonFixture keeps this robust across newline differences.

cobol-parser/src/test/scala/za/co/absa/cobrix/cobol/parser/encoding/BCDNumberEncodersSuite.scala (2)

26-31: Good test coverage for null input handling

The addition of null input test cases for both integral and decimal numbers ensures robust handling of edge cases, consistent with the broader null-input coverage pattern across encoder test suites.


144-149: LGTM! Consistent null handling for decimal numbers

The null test case for decimal numbers follows the same pattern as integral numbers, maintaining consistency in the test suite.

cobol-parser/src/test/scala/za/co/absa/cobrix/cobol/parser/encoding/BinaryEncodersSuite.scala (1)

24-29: Well-structured null input test

Excellent addition of the null input test case. It follows the established pattern of returning zero bytes for null inputs, which aligns with the Cobrix design decision to encode null values as zero bytes for consistency between reader and writer operations.

cobol-parser/src/test/scala/za/co/absa/cobrix/cobol/parser/encoding/DisplayEncodersSuite.scala (10)

1-22: LGTM! Proper test file structure and imports

The test suite is well-structured with appropriate imports for testing the new DisplayEncoders functionality.


26-31: Good null handling test coverage

The null input test correctly validates that null values return zero-filled byte arrays for the sign overpunched encoding method.


33-38: Correct EBCDIC encoding for positive signed numbers

The test correctly validates the encoding of 12345 with overpunched positive sign (0xC5 = positive sign overpunched on '5').


54-59: Proper unsigned number encoding

The unsigned number test correctly uses signPosition = None and produces standard EBCDIC digits without sign overpunching.


61-66: Correct negative sign overpunching

The test properly validates negative number encoding with 0xD5 (negative sign overpunched on '5').


141-144: Proper explicit decimal point encoding

The test correctly validates that 0x4B represents the EBCDIC decimal point character.


341-346: Correct sign-separate positive encoding

The test properly validates the EBCDIC positive sign character (0x4E) in sign-separate format.


362-367: Well-tested unsigned number in sign-separate format

The unsigned number test correctly uses signPosition = None and produces standard EBCDIC digits without any sign character.


369-374: Proper negative sign-separate encoding

The test correctly validates the EBCDIC negative sign character (0x60) in sign-separate format.


23-641: Comprehensive test coverage for DISPLAY encoders

The test suite provides excellent coverage for both encodeDisplayNumberSignOverpunched and encodeDisplayNumberSignSeparate methods, including:

  • Null input handling
  • Various numeric values (positive, negative, zero, small, large)
  • Sign positions (Left, Right, None)
  • Explicit decimal point handling
  • Scale factors (positive and negative)
  • Edge cases (too big, too precise, incorrect precision)
  • Both integral and decimal numbers

The test expectations align correctly with EBCDIC encoding standards.

@yruslan yruslan merged commit 0731745 into master Sep 10, 2025
6 of 7 checks passed
@yruslan yruslan deleted the feature/782-add-display-type-for-writer branch September 10, 2025 10:58

Development

Successfully merging this pull request may close these issues:

  • Add support for numeric types having DISPLAY format in the EBCDIC writer
