Skip to content

Conversation

@yruslan
Copy link
Collaborator

@yruslan yruslan commented Oct 6, 2025

Closes #786

Summary by CodeRabbit

  • New Features
    • Processing now returns the total number of records processed, enabling programs to capture and use the count directly.
  • Documentation
    • Updated method documentation to clarify that the operation returns the number of records processed.
  • Tests
    • Expanded tests to validate the returned record count, ensuring accuracy without changing existing output behavior.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 6, 2025

Walkthrough

The processor API now returns a Long record count. CobolProcessor.process and StreamProcessor.processStream signatures changed from Unit to Long. StreamProcessor counts processed records and returns the total. Tests were updated to capture and assert the count.

Changes

Cohort / File(s) Summary of changes
Processor API return type
cobol-parser/src/main/scala/.../processor/CobolProcessor.scala
Changed process return type from Unit to Long; added doc noting it returns the number of processed records; minor formatting in pattern match.
Stream processing implementation
cobol-parser/src/main/scala/.../processor/impl/StreamProcessor.scala
Updated processStream to return Long; introduced local counter incremented per record; returns total count; added doc note.
Tests updated for new return
cobol-parser/src/test/scala/.../processor/CobolProcessorBuilderSuite.scala
Captures returned count from process; asserts count equals 4; existing output validations unchanged.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Client
  participant CP as CobolProcessor
  participant SP as StreamProcessor
  participant OS as OutputStream

  Client->>CP: process(inputStream, outputStream)
  CP->>SP: processStream(..., outputStream): Long
  rect rgb(230,245,255)
    note right of SP: Loop per record
    loop For each record
      SP->>SP: extract + transform
      SP->>OS: write(record)
      SP->>SP: recordCount += 1
    end
  end
  SP-->>CP: return recordCount: Long
  CP-->>Client: return recordCount: Long
  note over Client,CP: Caller can use total processed count
Loading

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

I counted bytes with twitching nose,
Four tidy records, neatly rose.
A thump of paws, a cheerful bound—
Now process returns the tally found.
Carrots tallied, outputs neat,
Hop, hop—metrics made complete! 🥕🐇

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title accurately summarizes the main change of modifying the Cobol processor to return the number of records processed, directly reflecting the key objective of the pull request. It is concise, clear, and specific enough that a reviewer scanning the history understands the purpose without extraneous details. Therefore, it meets the guidelines for an effective PR title.
Linked Issues Check ✅ Passed The implementation updates the public API method signatures and internal logic to return a Long count of processed records, adds documentation for the return value, and includes a test assertion verifying the record count. These modifications directly address the linked issue’s requirement to make the record processor return the number of records processed. Therefore, the pull request fulfills the coding objectives of issue #786.
Out of Scope Changes Check ✅ Passed All changes are focused on altering the processor method return types, implementing record counting, and updating tests and documentation to reflect this behavior. There are no modifications unrelated to the objective of returning the record count. Thus, no out-of-scope changes are present.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feature/786-processor-return-record-count

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a58fdc8 and 233c90a.

📒 Files selected for processing (3)
  • cobol-parser/src/main/scala/za/co/absa/cobrix/cobol/processor/CobolProcessor.scala (3 hunks)
  • cobol-parser/src/main/scala/za/co/absa/cobrix/cobol/processor/impl/StreamProcessor.scala (2 hunks)
  • cobol-parser/src/test/scala/za/co/absa/cobrix/cobol/processor/CobolProcessorBuilderSuite.scala (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
cobol-parser/src/test/scala/za/co/absa/cobrix/cobol/processor/CobolProcessorBuilderSuite.scala (1)
cobol-parser/src/main/scala/za/co/absa/cobrix/cobol/processor/CobolProcessor.scala (6)
  • build (53-76)
  • process (58-74)
  • processor (102-104)
  • processor (106-110)
  • processor (112-125)
  • processor (127-127)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Spark 3.4.4 on Scala 2.12.20
  • GitHub Check: Spark 3.5.5 on Scala 2.13.16
  • GitHub Check: Spark 2.4.8 on Scala 2.11.12
  • GitHub Check: Spark 3.5.5 on Scala 2.12.20
  • GitHub Check: test (2.12.20, 2.12, 3.3.4, 0, 80, 20)
🔇 Additional comments (5)
cobol-parser/src/main/scala/za/co/absa/cobrix/cobol/processor/impl/StreamProcessor.scala (2)

37-37: LGTM!

Documentation clearly indicates the return value represents the number of records processed.


44-47: Record counting implementation is correct.

The counter is initialized, incremented for each record, and returned properly. Using Long is appropriate for handling large record counts. The counter is incremented at the start of each iteration, meaning it reflects the number of records attempted to be processed rather than successfully processed—if processRecord throws an exception mid-stream, the count will include that partial attempt before the exception propagates.

Consider clarifying in the documentation whether the count represents records "attempted" or "successfully processed" if this distinction matters for your use case.

Also applies to: 67-67

cobol-parser/src/main/scala/za/co/absa/cobrix/cobol/processor/CobolProcessor.scala (2)

41-41: LGTM!

Documentation clearly describes the return value.


43-45: Breaking API change correctly implemented.

The trait and implementation signatures now return Long, and the return value from StreamProcessor.processStream is properly propagated. This is a breaking change for existing callers, but aligns with the PR objective to return the record count. Existing Scala code that ignores the return value will continue to work without modification.

Verify that downstream consumers of this API have been identified and can handle the signature change if they rely on the previous Unit return type.

Also applies to: 58-74

cobol-parser/src/test/scala/za/co/absa/cobrix/cobol/processor/CobolProcessorBuilderSuite.scala (1)

45-45: Test correctly verifies the record count feature.

The test now captures the return value and asserts that 4 records were processed, which matches the 4-byte input stream with 1-byte records (as defined by PIC X in the copybook). This properly validates the new API contract.

Also applies to: 49-49


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link

github-actions bot commented Oct 6, 2025

JaCoCo code coverage report - 'cobol-parser'

Overall Project 91.71% 🍏
Files changed 100% 🍏

File Coverage
CobolProcessor.scala 89.56% 🍏
StreamProcessor.scala 79.49% 🍏

@github-actions
Copy link

github-actions bot commented Oct 6, 2025

JaCoCo code coverage report - 'spark-cobol'

There is no coverage information present for the Files changed

Total Project Coverage 79.79% 🍏

@yruslan yruslan merged commit 9873a3d into master Oct 6, 2025
7 checks passed
@yruslan yruslan deleted the feature/786-processor-return-record-count branch October 6, 2025 08:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Make record processor return record count processed.

2 participants