-
Notifications
You must be signed in to change notification settings - Fork 378
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DSS-3207: review #179
base: develop
Are you sure you want to change the base?
DSS-3207: review #179
Conversation
…or pdfbox and openpdf
# Conflicts: # dss-pades-pdfbox/src/main/java/eu/europa/esig/dss/pdf/pdfbox/PdfBoxDocumentReader.java # dss-pades-pdfbox/src/main/java/eu/europa/esig/dss/pdf/pdfbox/PdfBoxSignatureService.java # dss-pades-pdfbox/src/test/java/eu/europa/esig/dss/pades/signature/visible/AbstractTestVisualComparator.java # dss-pades-pdfbox/src/test/java/eu/europa/esig/dss/pades/signature/visible/DefaultVsNativeDrawerComparatorTest.java
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @bsanchezb ,
I am very happy of the final result, thank you for taking care of it!
The extension of the PdfMemoryUsageSetting.Mode
is reasonable.
I have left just a couple of trivial observations in the code.
Looking forward to see this improvement into the next release.
Regards,
Andrea
String.format("Encrypted document : %s", e.getMessage())); | ||
MemoryUsageSetting memoryUsageSetting = PdfBoxUtils.getMemoryUsageSetting(pdfMemoryUsageSetting); | ||
try { | ||
if (PdfMemoryUsageSetting.Mode.MEMORY_FULL != pdfMemoryUsageSetting.getMode() && dssDocument instanceof FileDocument) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The MEMORY_FULL could be used with a FileDocument (meaning that willingness is to fully load into memory the file) and I understand that you achieve that also with the stream and RandomAccessReadBuffer later in the code.
I just see it more complex and less linear, why not leave to the library? (that does the same anyway)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because it is not really the same. See the comparison between "Memory Full" and "Memory Buffered" in #177. There is a consistent difference in execution, mainly on validation and LTA-level extension.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I haven't caught that well.
I am surprised, the code is exactly the same for pdfbox:
loadPDF(File file, String password, StreamCacheCreateFunction streamCacheCreateFunction)
--> Loader.loadPDF(file, password, null, null, streamCacheCreateFunction)
--> https://github.com/apache/pdfbox/blob/357a2c14762f3e08f4ef48852d79f541ccaca6af/pdfbox/src/main/java/org/apache/pdfbox/Loader.java#L223
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While it uses the IOUtils.createMemoryOnlyStreamCache(), but creates a RandomAccessReadBufferedFile(file) instead of converting the document to binaries directly. I think it makes the difference. Possible that the file in filesystem is being accessed multiple number of times? Especially provided that there is a huge difference in openPdf implementation, I think it would be beneficial to keep the same logic in both implementations. Moreover, keeping the old behavior we reduce the possibility of regression, in case we miss something or PdfBox unexpectedly changes the handling inside.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point, I agree with you. It's safer to keep the old behaviour and reduce possibility of regressions.
|
||
/** | ||
* Represents the PDF document loading setting. | ||
* NOTE: The setting is applicable only when an instance of {@code eu.europa.esig.dss.model.FileDocument} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not true, the setting is applied also with other types of Document
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right. It is indeed applicable in other cases too (mainly for PdfBox implementation),
// This condition uses a MappedByteBuffer to process the file in memory | ||
return new PdfReader(filenameSource, passwordProtection); | ||
case FILE: | ||
return new PdfReader(new RandomAccessFileOrArray(filenameSource, false, true), passwordProtection); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the RandomAccessFileOrArray
is declared as Closable, neither the openpdf is actually closing it.. instead shall we close it here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I will check if closing it here won't brake the processing. Otherwise, I will do a similar logic with PdfBoxDocumentReader and RandomAccessRead handling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a look in the code of OpenPdf and it looks like the RandomAccessFileOrArray
is actually being closed inside of PdfReader.close()
method. See that RandomAccessFileOrArray
is stored inside PRTokeniser
object (tokens
variable) https://github.com/LibrePDF/OpenPDF/blob/1.3.43/openpdf/src/main/java/com/lowagie/text/pdf/PdfReader.java#L284, which is closed in PdfReader.close()
https://github.com/LibrePDF/OpenPDF/blob/1.3.43/openpdf/src/main/java/com/lowagie/text/pdf/PdfReader.java#L3408 and https://github.com/LibrePDF/OpenPDF/blob/1.3.43/openpdf/src/main/java/com/lowagie/text/pdf/PRTokeniser.java#L182.
So I think in order to not complexify the code, I will keep it as it is.
Moreover, I will go even further and will align the PdfBox implementation accordingly, as RandomAccessReadBuffer
is actually being closed as well inside the PDDocument
. See https://github.com/apache/pdfbox/blob/3.0.3/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/PDDocument.java#L1272.
This branch is a continuation of PR #177 with a PDF memory usage setting introduction.