Add strikethrough support #895
Open
This adds a simple strikethrough LLM processor that detects strikethrough
text.
FYI: I wrote tests, but I was getting errors related to pulling the
datalab-to/pdfs dataset when running the LLM processor suite locally.
Here's a test script with a sample PDF:
strikethrough.pdf.
Rendered PDF: https://gist.github.com/aud/2af33313f945a397b28e7ca728a85d8b