Add TrOCR + VisionEncoderDecoderModel #13874

Merged
merged 38 commits into from
Oct 13, 2021
Commits
26b79a6
First draft
NielsRogge Sep 25, 2021
eebfafa
Update self-attention of RoBERTa as proposition
NielsRogge Sep 29, 2021
adf1cb3
Improve conversion script
NielsRogge Sep 30, 2021
be7ec13
Add TrOCR decoder-only model
NielsRogge Sep 30, 2021
1ec88d5
More improvements
NielsRogge Sep 30, 2021
7ded83b
Make forward pass with pretrained weights work
NielsRogge Sep 30, 2021
9b4189f
More improvements
NielsRogge Sep 30, 2021
9b6f68b
Some more improvements
NielsRogge Sep 30, 2021
1127064
More improvements
NielsRogge Sep 30, 2021
ac5440d
Make conversion work
NielsRogge Oct 3, 2021
6c5d947
Clean up print statements
NielsRogge Oct 4, 2021
b54e32e
Add documentation, processor
NielsRogge Oct 4, 2021
d47b5f1
Add test files
NielsRogge Oct 4, 2021
b1a85a6
Small improvements
NielsRogge Oct 4, 2021
76f3a66
Some more improvements
NielsRogge Oct 4, 2021
1d8ed6b
Make fix-copies, improve docs
NielsRogge Oct 4, 2021
2c4337e
Make all vision encoder decoder model tests pass
NielsRogge Oct 4, 2021
cc4eb2c
Make conversion script support other models
NielsRogge Oct 5, 2021
170f905
Update URL for OCR image
NielsRogge Oct 5, 2021
28bdf18
Update conversion script
NielsRogge Oct 5, 2021
890dd70
Fix style & quality
NielsRogge Oct 5, 2021
15f797d
Add support for the large-printed model
NielsRogge Oct 5, 2021
f490e3a
Fix some issues
NielsRogge Oct 6, 2021
2230eb0
Add print statement for debugging
NielsRogge Oct 6, 2021
f8ad61d
Add print statements for debugging
NielsRogge Oct 6, 2021
e5f6983
Make possible fix for sinusoidal embedding
NielsRogge Oct 6, 2021
643c21d
Further debugging
NielsRogge Oct 6, 2021
b7c5bf8
Potential fix v2
NielsRogge Oct 6, 2021
6c4435d
Add more print statements for debugging
NielsRogge Oct 6, 2021
1a6825f
Add more print statements for debugging
NielsRogge Oct 6, 2021
667b03c
Debug more
NielsRogge Oct 6, 2021
bf49483
Comment out print statements
NielsRogge Oct 6, 2021
f0c8b59
Make conversion of large printed model possible, address review comments
NielsRogge Oct 8, 2021
6f1d7fa
Make it possible to convert the stage1 checkpoints
NielsRogge Oct 8, 2021
c38904b
Clean up code, apply suggestions from code review
NielsRogge Oct 8, 2021
6e6b947
Apply suggestions from code review, use Microsoft models in tests
NielsRogge Oct 11, 2021
b1fedab
Rename encoder_hidden_size to cross_attention_hidden_size
NielsRogge Oct 11, 2021
f3d9e94
Improve docs
NielsRogge Oct 12, 2021
Update URL for OCR image
NielsRogge committed Oct 5, 2021
commit 170f905b5703d04082cee7b265f4766afb47afe4
@@ -114,7 +114,7 @@ def prepare_img(checkpoint_url):
         # url = "https://fki.tic.heia-fr.ch/static/img/a01-122-02.jpg" #
         # url = "https://fki.tic.heia-fr.ch/static/img/a01-122.jpg"
     elif "printed" in checkpoint_url:
-        url = "https://rrc.cvc.uab.es/files/sample21.jpg"
+        url = "https://www.researchgate.net/profile/Dinh-Sang/publication/338099565/figure/fig8/AS:840413229350922@1577381536857/An-receipt-example-in-the-SROIE-2019-dataset_Q640.jpg"
     im = Image.open(requests.get(url, stream=True).raw).convert("RGB")
     return im
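
Because this commit exists only to swap a sample-image URL, a possible hardening of the helper is sketched here. It is not part of the PR. The name `select_sample_url`, the `PRINTED_SAMPLE_URL` constant, and the `timeout` parameter are hypothetical additions for illustration. Only the "printed" branch is covered, since the other branches of `prepare_img` fall outside the hunk shown.

```python
# Sketch only: separates URL selection (testable offline) from the download.
PRINTED_SAMPLE_URL = (
    "https://www.researchgate.net/profile/Dinh-Sang/publication/338099565/"
    "figure/fig8/AS:840413229350922@1577381536857/"
    "An-receipt-example-in-the-SROIE-2019-dataset_Q640.jpg"
)


def select_sample_url(checkpoint_url):
    """Hypothetical helper: pick the sample image URL for a checkpoint."""
    if "printed" in checkpoint_url:
        return PRINTED_SAMPLE_URL
    # The remaining branches are not visible in the hunk, so this sketch
    # refuses to guess at them.
    raise ValueError(f"no sample image known for {checkpoint_url!r}")


def prepare_img(checkpoint_url, timeout=10.0):
    """Download the sample image and return it as an RGB PIL image."""
    # Imported lazily so select_sample_url works without these packages.
    import requests
    from PIL import Image

    url = select_sample_url(checkpoint_url)
    # timeout is an assumed addition; the original call passes none.
    response = requests.get(url, stream=True, timeout=timeout)
    # Fail loudly on an HTTP error instead of handing PIL an error page.
    response.raise_for_status()
    return Image.open(response.raw).convert("RGB")
```

Keeping the URL choice in its own function means a dead or moved sample link can be caught by a check that never touches the network, while `raise_for_status()` makes a broken link surface as an HTTP error rather than an image-decoding failure.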
