@JSv4
Collaborator

@JSv4 JSv4 commented Jan 20, 2025

Upgrades to pipelines:

  1. Added a post-processor type meant to run at the export step, on the assembled zip. Error handling is simplistic, so this will currently choke on very large corpora.
  2. Added an example post-processor that redacts PDFs based on selected annotations.
  3. Added an input_schema class property to pipeline classes so we can clearly define class-specific inputs and potentially collect them via the GUI.
  4. Implemented input_schema functionality only for post-processors (specifically the PDF redactor) for now, so you can select which annotations to treat as redactions.
  5. Changed to a copyleft license for now, particularly as I'm starting to add functions that are paid features in other tools/products. I want this to be a community asset, but I also don't want to be free R&D. Could be convinced to return to Apache 2.0 with enough community engagement.
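
For readers skimming the description, here is a rough sketch of how a post-processor with an input_schema might look. It is not the code in this PR: the names `BasePostProcessor`, `process_export`, `RedactionManifestPostProcessor`, `redact_labels`, and the shape of `export_data` are all assumptions made for illustration, and the example only records which annotations were selected rather than redacting any PDF.

```python
import io
import json
import zipfile
from abc import ABC, abstractmethod
from typing import Any


class BasePostProcessor(ABC):
    """Hypothetical base class; the real name and signature may differ."""

    # JSON-Schema-style description of the inputs this class accepts,
    # so a GUI could render a form for them (assumed format).
    input_schema: dict[str, Any] = {}

    @abstractmethod
    def process_export(
        self, zip_bytes: bytes, export_data: dict[str, Any], **kwargs: Any
    ) -> tuple[bytes, dict[str, Any]]:
        """Take the assembled export zip and return a modified copy."""


class RedactionManifestPostProcessor(BasePostProcessor):
    """Illustrative only: writes the selected annotations to a manifest."""

    input_schema = {
        "type": "object",
        "properties": {
            "redact_labels": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["redact_labels"],
    }

    def process_export(self, zip_bytes, export_data, **kwargs):
        # Labels chosen by the user, as described by input_schema.
        labels = set(kwargs.get("redact_labels", []))
        selected = [
            a for a in export_data.get("annotations", [])
            if a.get("label") in labels
        ]
        out = io.BytesIO()
        with zipfile.ZipFile(io.BytesIO(zip_bytes)) as src, \
                zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
            # Copy every existing entry, then append the manifest.
            for item in src.infolist():
                dst.writestr(item.filename, src.read(item.filename))
            dst.writestr("redactions.json", json.dumps(selected))
        return out.getvalue(), export_data


def make_zip(files: dict[str, bytes]) -> bytes:
    """Build a small in-memory zip for trying the sketch out."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()
```

Note that this sketch reads the whole zip into memory, which is the kind of simplification that would struggle on very large corpora.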

JSv4 and others added 29 commits January 8, 2025 00:30
Changing to GPLv3
Changing to GPLv3
…k is related to dynamic insertion AND execution of post_processor. Trying to force refresh.
…s as that's not actually what that suite is testing and the only pipeline component we were doing that for was post_processor and encountering some kind of dynamic loading hell. Not worth all of this aggravation.
@JSv4 JSv4 merged commit ed112e6 into main Jan 21, 2025
4 checks passed
@JSv4 JSv4 deleted the JSv4/add-post-processors-to-pipelines branch January 21, 2025 04:43