@david-yz-liu
Collaborator

Proposed Changes

(Describe your changes here. Also describe the motivation for your changes: what problem do they solve, or how do they improve the application or codebase? If this pull request fixes an open issue, use a keyword to link this pull request to the issue.)

This pull request improves the efficiency of the SplitPdfJob, used to process uploaded PDFs for scanned assessments.

Summary of changes:

  1. With markus-exam-matcher==0.2.0, single-page PDF files are now processed in bulk instead of starting a separate Python subprocess for each page. PDF-to-JPEG conversion now happens on the Python (markus-exam-matcher) side, which requires a new system package, poppler-utils.
  2. Database queries have been refactored to use bulk inserts and upserts.
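
To illustrate the second point, here is a minimal sketch of a bulk upsert, written in Python with the standard-library `sqlite3` module so that it runs on its own. It is not the code from this pull request: the `scanned_pages` table, its columns, and the `upsert_pages` helper are all hypothetical names chosen for the example. The idea is that one multi-row statement replaces a loop of per-row queries. In a Rails codebase, ActiveRecord's `insert_all` and `upsert_all` serve a similar purpose, though this description does not say which calls the job uses.

```python
import sqlite3


def upsert_pages(conn, rows):
    """Insert or update many (exam_id, page_number, status) rows in one statement.

    `scanned_pages` and its columns are hypothetical, for illustration only.
    """
    if not rows:
        return
    # One "(?, ?, ?)" group per row, so the database sees a single INSERT.
    placeholders = ", ".join(["(?, ?, ?)"] * len(rows))
    params = [value for row in rows for value in row]
    conn.execute(
        "INSERT INTO scanned_pages (exam_id, page_number, status) "
        f"VALUES {placeholders} "
        "ON CONFLICT (exam_id, page_number) "
        "DO UPDATE SET status = excluded.status",
        params,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scanned_pages ("
    " exam_id INTEGER NOT NULL,"
    " page_number INTEGER NOT NULL,"
    " status TEXT NOT NULL,"
    " PRIMARY KEY (exam_id, page_number))"
)

# First batch inserts two rows; second batch updates page 2 and adds page 3.
upsert_pages(conn, [(1, 1, "pending"), (1, 2, "pending")])
upsert_pages(conn, [(1, 2, "matched"), (1, 3, "pending")])

rows = conn.execute(
    "SELECT exam_id, page_number, status FROM scanned_pages ORDER BY page_number"
).fetchall()
```

After both calls, the table holds three rows, and page 2 carries the updated status. Each batch costs one round trip to the database regardless of how many pages it contains.
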
Screenshots of your changes (if applicable)
Associated documentation repository pull request (if applicable)

Type of Change

(Write an X or a brief description next to the type or types that best describe your changes.)

Type Applies?
🚨 Breaking change (fix or feature that would cause existing functionality to change)
New feature (non-breaking change that adds functionality)
🐛 Bug fix (non-breaking change that fixes an issue)
🎨 User interface change (change to user interface; provide screenshots)
♻️ Refactoring (internal change to codebase, without changing functionality)
🚦 Test update (change that only adds or modifies tests)
📦 Dependency update (change that updates a dependency)
🔧 Internal (change that only affects developers or continuous integration)

Checklist

(Complete each of the following items for your pull request. Indicate that you have completed an item by changing the [ ] into a [x] in the raw text, or by clicking on the checkbox in the rendered description on GitHub.)

Before opening your pull request:

  • I have performed a self-review of my changes.
    • Check that all changed files included in this pull request are intentional changes.
    • Check that all changes are relevant to the purpose of this pull request, as described above.
  • I have added tests for my changes, if applicable.
    • This is required for all bug fixes and new features.
  • I have updated the project documentation, if applicable.
    • This is required for new features.
  • If this is my first contribution, I have added myself to the list of contributors.

After opening your pull request:

  • I have updated the project Changelog (this is required for all changes).
  • I have verified that the pre-commit.ci checks have passed.
  • I have verified that the CI tests have passed.
  • I have reviewed the test coverage changes reported by Coveralls.
  • I have requested a review from a project maintainer.

Questions and Comments

(Include any questions or comments you have regarding your changes.)

@david-yz-liu force-pushed the improve-split-pdf-job-efficiency branch 2 times, most recently from 960488f to 91e242d, on August 5, 2025 20:44
@coveralls
Collaborator

coveralls commented Aug 5, 2025

Pull Request Test Coverage Report for Build 16765202613

Details

  • 80 of 89 (89.89%) changed or added relevant lines in 5 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.003%) to 91.881%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| --- | --- | --- | --- |
| app/lib/repository.rb | 1 | 2 | 50.0% |
| app/lib/git_repository.rb | 1 | 3 | 33.33% |
| app/jobs/split_pdf_job.rb | 75 | 81 | 92.59% |
Totals Coverage Status
Change from base Build 16681564968: -0.003%
Covered Lines: 42074
Relevant Lines: 45089

💛 - Coveralls

@david-yz-liu force-pushed the improve-split-pdf-job-efficiency branch from 91e242d to 8756b0e on August 6, 2025 00:38
Requires markus_exam_matcher==0.2.0 to be installed, and adds a new
system package dependency on poppler-utils.
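
As a rough illustration of why poppler-utils is needed, the sketch below converts every page of a PDF to JPEG with a single call to `pdftoppm`, the command-line converter that poppler-utils ships. How markus-exam-matcher actually invokes Poppler is not described in this pull request, so the helper names, the 150 DPI default, and the `page` output prefix are assumptions made for the example.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdftoppm_command(pdf_path, output_prefix, dpi=150):
    """Build one pdftoppm invocation that renders all pages of a PDF as JPEGs."""
    return ["pdftoppm", "-jpeg", "-r", str(dpi), str(pdf_path), str(output_prefix)]


def convert_pdf_to_jpegs(pdf_path, output_dir, dpi=150):
    """Convert a whole PDF in one subprocess instead of one process per page.

    Hypothetical helper: not the markus-exam-matcher implementation.
    """
    if shutil.which("pdftoppm") is None:
        raise RuntimeError(
            "pdftoppm not found; install the poppler-utils system package"
        )
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    prefix = output_dir / "page"
    subprocess.run(build_pdftoppm_command(pdf_path, prefix, dpi), check=True)
    # pdftoppm writes files named <prefix>-<page number>.jpg
    return sorted(output_dir.glob("page-*.jpg"))
```

Calling `convert_pdf_to_jpegs("exam.pdf", "out")` would start one process for the entire file, which is the kind of batching the commit message implies. The check with `shutil.which` fails early with a clear message on hosts where the system package is missing.
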
@david-yz-liu force-pushed the improve-split-pdf-job-efficiency branch from 8756b0e to 1c094b1 on August 6, 2025 01:26
@david-yz-liu merged commit 626f879 into MarkUsProject:master on Aug 6, 2025
4 of 6 checks passed
@david-yz-liu deleted the improve-split-pdf-job-efficiency branch on August 6, 2025 23:49