PDF output size #848
Comments
@neodynamic Can you see what Skia produces? |
@charlesroddie no, and you? |
You could have a look at setting the quality down: https://docs.microsoft.com/en-us/dotnet/api/skiasharp.skdocumentpdfmetadata |
We've reviewed this matter and concluded that the PDF output file size cannot be reduced, for the following reason: it seems that the Skia (native lib) PDF backend is designed to embed any font file needed to render the text at the target device. That page states the following:
The sample here https://github.com/mono/SkiaSharp/blob/master/samples/Gallery/Shared/Samples/CreatePdfSample.cs will use the default font in the system, which under Windows, it's likely to be If no one here has more comments on this matter, then @mattleibow you can close this issue. |
You could use HarfBuzz's subsetting to reduce the font's size. Then load that font to produce the PDF. Sadly this isn't supported by HarfBuzzSharp. Yet.... |
Yes, that could be the only way to reduce pdf output file size... |
Looking at the Skia code, it seems there are 2 subsetters built in. But these are disabled because we are not building with either icu or harfbuzz/sfntly. However, there is a hook that makes subsetting work, but it is not a "public API". Since it is fairly simple, we might be able to do something. The API hasn't changed much, so it might just be safe to do something. I'll have a look at what we can do. Can't promise anything as I haven't had a look at exactly how the PDF is constructed, but it seems to only write the fonts when the PDF is closed, so we could potentially add an argument there, or in the metadata in the constructor. They actually have an enum there that allows you to pick either harfbuzz or sfntly. It seems not too hard to add one for us, and then we can use any font subsetter. |
Started a thing on the skia bugtracker. I want to do this right: https://bugs.chromium.org/p/skia/issues/detail?id=10491 |
Hi @mattleibow. I've tried to set lower EncodingQuality and RasterDpi and they have no impact on the output file at all. It outputs the same file size and quality. Latest SkiaSharp on Windows 10. |
Any progress on this? Japanese/Chinese fonts are easily 10MB+ (per weight), so this becomes nigh unusable. |
Can't believe this is an issue. You should let the developer choose whether to embed the font file or not. |
@mattleibow I was able to build the Windows libSkiaSharp using Skia's support for Harfbuzz subsetting. It seems to work fine. My test PDF that was over 280 KB went down to less than 10 KB with the subsetting. Other than changing the Skia build switches, the only thing I had to do was edit Skia's Harfbuzz BUILD.gn since the forked version appears to be out of sync with the Harfbuzz commit in the DEPS. Can you think of any reason that this wouldn't be a viable solution? |
any updates regarding the file size/fonts? |
Should I assume this has been abandoned and rebuild my project using another PDF library? |
In case you go for a different library, don't use QuestPDF as it uses SkiaSharp under the hood and suffers from the same big file sizes. |
Thanks for the tip. I don't understand how something like this wouldn't be recognized as a fatal issue.
Like, what. |
Depends on the use case really. If you only generate a single PDF, no one cares about a 2 MB PDF on their PC. But I needed it for a production series with part protocols, where I have a part every 2 seconds, so every 2 seconds I need to save a PDF to a network share to archive the part measurement results. A 2 MB file every 2 seconds costs a hell of a lot of storage, and that's just not going to work. |
Indeed, thousands of small files are processed. 1000 * 2MB (while it is normally more like 159KB - 236KB) really makes a big difference in network traffic, processing time, disk space, etc. It is related to fonts, but should be investigated by Skia... Another reason: most ISP mailboxes/corporate policies still have an email size limit of 10MB or 15MB, meaning 5 attachments vs. 15 - 20.... |
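To put rough numbers on the two comments above, here is a small back-of-the-envelope sketch. The 2 MB size and the 2-second interval come from the thread; the 200 KB figure is an assumption picked from the middle of the 159KB - 236KB range mentioned, and decimal units (1 GB = 1,000,000 KB) are assumed.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_storage_gb(file_size_kb: float, interval_s: float) -> float:
    """Storage consumed per day, in decimal GB, when one file of
    file_size_kb is written every interval_s seconds."""
    files_per_day = SECONDS_PER_DAY / interval_s
    return files_per_day * file_size_kb / 1_000_000  # KB -> GB


# 2 MB PDF every 2 seconds (figures from the thread)
large = daily_storage_gb(file_size_kb=2000, interval_s=2)
# ~200 KB PDF every 2 seconds (assumed "normal" size)
small = daily_storage_gb(file_size_kb=200, interval_s=2)

print(f"2 MB files:   {large:.2f} GB/day")
print(f"200 KB files: {small:.2f} GB/day")
```

Under these assumptions the embedded-font overhead is the difference between roughly 86 GB and roughly 9 GB of archive per day.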
Even though I got the native Skia subsetting working with a custom build, I wasn't happy with it. It's a very naive implementation, and doesn't perform well for larger (like CJK) fonts. It's better than nothing, but wasn't sufficient for my use. I ended up using a two-pass approach by building the font subsets before rendering and then passing those in to SkiaSharp. |
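The comment above does not show its implementation, so the following is only a guess at what the first pass of such a two-pass approach might look like: walk over the text that will be drawn and collect the code points each font actually needs. The font names and text runs are hypothetical, and the second pass (feeding these sets to a font subsetter and loading the result into SkiaSharp) is deliberately left out.

```python
from collections import defaultdict


def collect_codepoints(text_runs):
    """First pass: map each font name to the sorted list of Unicode
    code points used by the text that will be drawn with that font."""
    used = defaultdict(set)
    for font_name, text in text_runs:
        used[font_name].update(ord(ch) for ch in text)
    return {name: sorted(cps) for name, cps in used.items()}


# Hypothetical document content: (font name, text) pairs.
runs = [
    ("NotoSansJP", "請求書 No. 42"),
    ("NotoSansJP", "合計"),
    ("Arial", "Total"),
]

needed = collect_codepoints(runs)
for name, cps in needed.items():
    print(name, len(cps), "code points")
```

Note that a plain code point list is a simplification: a real subsetter also has to keep glyphs that are only reachable through shaping (ligatures, substitutions), which is one reason to hand the set to a dedicated tool rather than trimming the font by hand.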
Is there a way to force Skia to render all text as paths? |
I'm trying to run pdf checker on a file generated with SkiaSharp, with no strings in it.
Optimizing it with 3rd party tools allowed it to go from ~500kb to 100kb. Looks like there is something besides embedded fonts that could be optimized in Skia. My sample is generated with Svg.Skia and the source only consists of vector lines. I've no idea what can be so inefficient there. Trying to mess with SKDocumentPdfMetadata actually results in a bigger file size. I would expect it to be a no-op, but if I supply any RasterDpi value or non-default EncodingQuality value, the file size jumps up another ~400kb. This doesn't make sense. |
Any updates for this? |
@jeffska : would you have a gist or some place where we could take a look at what you put in place to build the font subsets externally ? |
So, How is the progress? |
Wondering the same @TimLee88 |
images in PDFs don't seem to support 1bpp which increases also the pdf size, correct? |
Hi, has anything happened here? We are looking for a solution. PDF files which used to be between 20 and 40 KB are now over 500 KB. So, is someone working on this issue, or will this not be implemented at all? |
Have you inspected the PDF to make sure the SVG isn't just being rasterized? |
we've seen an increase because 1bpp images are not supported. Resulting in larger pdf's (every 1bpp image is converted to 24bpp). The 1bpp pdf's happen when multifunctional devices make scans.. Would like to see some support also for 1bpp... as this makes pdf's much much bigger. (especially when 1bpp glyph bitmaps are used) |
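As a rough illustration of the 1bpp vs. 24bpp point, the sketch below compares uncompressed pixel data sizes for a single scanned page. The page dimensions (A4 at 300 dpi, 2480 x 3508 pixels) are an assumption, and the numbers are raw sizes only; what ends up in the PDF depends on the compression applied.

```python
def raw_image_bytes(width_px: int, height_px: int, bits_per_pixel: int) -> int:
    """Uncompressed size of an image whose rows are padded to whole bytes."""
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px


# Assumed scan size: A4 at 300 dpi.
WIDTH, HEIGHT = 2480, 3508

mono = raw_image_bytes(WIDTH, HEIGHT, 1)   # 1 bit per pixel
rgb = raw_image_bytes(WIDTH, HEIGHT, 24)   # 24 bits per pixel

print(f"1bpp:  {mono:,} bytes")
print(f"24bpp: {rgb:,} bytes ({rgb // mono}x larger)")
```

Before compression, the same page is 24 times larger once every pixel is expanded from 1 bit to 24 bits.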
I don't care so much about performance or lib size and I already use HarfBuzzSharp for measuring text widths. |
Yesterday I chose SkiaSharp for my back-end API server to create small, simple, flowchart-like pages (SVG for in-browser view, and PDF for printing the same thing). My reason for choosing SkiaSharp was that it can do both. Today it took me four hours to establish that I wasn't doing anything wrong: a one-page PDF with a single rectangle and a label in it is half a megabyte, while the same thing is below 1KB in SVG. ...and then I just found out here that in the FIVE years since 2019, when this issue was first mentioned here, nothing happened and no workaround appeared, except that the issue was closed. I guess solving it would not be much more complicated than putting an extra if statement into the code. If this package were not 34MB of code (zipped!) I might even have had a look at whether I could try it ...Btw, is open source dying? |
I'd say that in general, ongoing economic consolidation and things like covid have probably made anything free or Free or "free" less sustainable. That said, this is a reasonably active repo, so I can only conclude that they just don't care about pdf output. |
one of the problems is that these are not supported: #848 (comment) Note: Microsoft is putting again effort on XPS after ±8 years. Might be an option to switch again to XPS files 🙌 |
Running this sample https://github.com/mono/SkiaSharp/blob/master/samples/Gallery/Shared/Samples/CreatePdfSample.cs which creates a simple two pages PDF file, the created file (under Windows) is about 510KB
Is there any compression setting to get the output PDF file size lighter? 510KB for a two pages PDF with a simple text seems to be somehow heavy... Any hints?