Closed
Please provide all mandatory information!
Describe the bug (mandatory)
When calling Document.setMetadata, keys that are not in the dictionary are written to the PDF document with the literal string value none. I think they should just be omitted.
To Reproduce (mandatory)
>>> import fitz
>>> doc = fitz.open()
>>> doc.insertPage(-1)
0
>>> doc.PDFTrailer()
'<<\n /Size 3\n /Root 1 0 R\n>>'
>>> doc.save('test.pdf')
>>> doc.PDFTrailer()
'<<\n /Size 3\n /Root 1 0 R\n /ID [ <DD34F5DD4BD314AE963D741F5238F847> <AC1F320E75267644CDBC99CE6D094BFA> ]\n>>'
>>> doc.setMetadata({'title': 'Test'})
>>> doc.save('test.pdf')
>>> doc.PDFTrailer()
'<<\n /Size 3\n /Root 1 0 R\n /ID [ <DD34F5DD4BD314AE963D741F5238F847> <90970EF47F0CD252B8152FCABA5F3E03> ]\n /Info 5 0 R\n>>'
>>> doc.xrefObject(5)
'<<\n /Author (none)\n /CreationDate (none)\n /Creator (none)\n /Keywords (none)\n /ModDate (none)\n /Producer (none)\n /Subject (none)\n /Title (Test)\n>>'
Workaround
I worked around this problem for now with this method (because I had to roll out my code to production yesterday):
import re

# PyMuPDF inserts entries like '/Author (none)' when the author is not provided.
# This is wrong. We'll file a bug, but let's first fix it here.
def fixup_pymupdf_metadata(doc):
    # Access to the Info xref is not in the API, so let's dig for it.
    trailer_lines = doc.PDFTrailer().split('\n')
    for line in trailer_lines:
        if ' /Info ' in line:
            # The line looks like ' /Info 5 0 R'; keep only the xref number.
            s = line.replace(' /Info ', '')
            info_xref = s[:s.find(' ')]
            info_xref = int(info_xref)
            s = doc.xrefObject(info_xref)
            new_s = ''
            for infoline in s.split('\n'):
                # Skip entries whose value is the literal string (none).
                if re.match(r'^.*/[A-Za-z]+ \(none\)$', infoline):
                    continue
                new_s += infoline + '\n'
            doc.updateObject(info_xref, new_s)
            break
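For anyone who wants to check the string handling without a PDF at hand, here is a minimal sketch of the same idea split into two pure-string helpers. The names find_info_xref and strip_none_entries are hypothetical (they are not part of PyMuPDF), and the sample strings are shortened versions of the output shown above. One caveat: a document whose author really is "none" would lose that entry too.

```python
import re
from typing import Optional

# Hypothetical helpers, not part of PyMuPDF; they only operate on strings.
_INFO_REF = re.compile(r'/Info\s+(\d+)\s+\d+\s+R')
_NONE_ENTRY = re.compile(r'^\s*/[A-Za-z]+\s*\(none\)\s*$')


def find_info_xref(trailer: str) -> Optional[int]:
    """Return the xref number of the /Info entry in a trailer string, or None."""
    match = _INFO_REF.search(trailer)
    return int(match.group(1)) if match else None


def strip_none_entries(obj_source: str) -> str:
    """Drop dictionary lines whose value is the literal string (none)."""
    kept = [line for line in obj_source.split('\n')
            if not _NONE_ENTRY.match(line)]
    return '\n'.join(kept)


# Shortened sample strings modelled on the output in this report.
trailer = '<<\n /Size 3\n /Root 1 0 R\n /Info 5 0 R\n>>'
info = '<<\n /Author (none)\n /Creator (none)\n /Title (Test)\n>>'

print(find_info_xref(trailer))
print(strip_none_entries(info))
```

With a real document, the cleaned string would then be written back the same way as in the workaround, using doc.xrefObject and doc.updateObject on the xref returned by find_info_xref.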