README.md
15 additions & 15 deletions
@@ -1,19 +1,19 @@
1
1
# PDF Assembler
2
2
3
-
[](https://www.npmjs.com/package/pdfassembler)[](https://www.npmjs.com/package/pdfassembler)[](https://github.com/dschnelldavis/pdfassembler)
[](https://www.npmjs.com/package/pdfassembler)[](https://www.npmjs.com/package/pdfassembler)[](https://github.com/DevelopingMagic/pdf-assembler)
The missing piece to edit PDF files directly in the browser.
7
7
8
-
PDF Assembler Disassembles PDF files into editable JavaScript objects, then assembles them back into PDF files, ready to save, download, or open.
8
+
PDF Assembler disassembles PDF files into editable JavaScript objects, then assembles them back into PDF files, ready to save, download, or open.
9
9
10
10
## Overview
11
11
12
-
Actually PDF Assembler itself only does one thing — it assembles PDF files (hence the name). However, it uses Mozilla's terrific [pdf.js](https://mozilla.github.io/pdf.js/) library to disassemble PDFs into JavaScript objects. Those objects can then be modified, after which PDF Assembler can re-assemble them back into PDFs, to display, save, or download.
12
+
Actually PDF Assembler itself only does one thing — it assembles PDF files (hence the name). However, it uses Mozilla's terrific [pdf.js](https://mozilla.github.io/pdf.js/) library to disassemble PDFs into editable JavaScript objects, which PDF Assembler can then re-assemble back into PDF files to display, save, or download.
13
13
14
14
### Scope and future development
15
15
16
-
PDF is a complex format (the [ISO standard describing it](https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf) is 756 pages long). So PDF Assembler makes working with PDFs (somewhat) simpler by separating the physical structure of a PDF from its logical structure. In the future, PDF Assembler will likely offer better defaults for generating PDFs, such as cross-reference streams and compressing objects, as well as more options, such as linearizing or encrypting the output PDF. However, editing features—like adding or editing pages, or even centering or wrapping text—are outside the scope of this library.
16
+
PDF is a complex format (the [ISO standard describing it](https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf) is 756 pages long). So PDF Assembler makes working with PDFs (somewhat) simpler by separating the physical structure of a PDF from its logical structure. In the future, PDF Assembler will likely offer better defaults for generating PDFs, such as cross-reference streams and compressed objects, as well as more options, such as linearizing or encrypting the output PDF. However, anything unrelated to the physical structure (like adding or editing pages, or even centering or wrapping text) will need to be done by the calling application or another library.
17
17
18
18
### Alternatives
19
19
@@ -23,7 +23,7 @@ If you want to simplify editing existing PDFs on a server, you can use command l
23
23
24
24
If you want to simplify editing existing PDFs in a browser, I haven't found that library yet. This library helps, but still requires a good understanding of how the logical structure of a PDF works.
25
25
26
-
If you want to learn more about logical structure of PDFs, I recommend O'Reilly's [PDF Explained](http://shop.oreilly.com/product/0636920021483.do). If you use this library, pdf.js and PDF Assembler will take care of reading and writing the raw bytes of the PDF, so you can skip to Chapter 4, "Document Structure":
26
+
To learn more about the logical structure of PDFs, I recommend O'Reilly's [PDF Explained](http://shop.oreilly.com/product/0636920021483.do). If you use this library, pdf.js and PDF Assembler will take care of reading and writing the raw bytes of the PDF, so you can skip to Chapter 4, "Document Structure".
27
27
28
28

29
29
@@ -76,11 +76,11 @@ const helloWorldPdf = {
76
76
}
77
77
```
78
78
79
-
In this object, the main document catalog dictionary is '/Root' (and if there were a document information dictionary, it would be '/Info', because '/Root' and '/Info' are the names used to refer to these objects in the PDF trailer dictionary).
79
+
In this object, the main document catalog dictionary is '/Root'. Optionally, a more complex PDF might also have a document information dictionary, '/Info', as well as many other PDF objects.
80
80
81
-
There are a few small differences from a true PDF structure. For example, streams are _inside_ their dictionary objects in order to keep them together, even though in the final PDF they will be rendered immediately after their dictionaries instead.
81
+
There are a few small differences from a true PDF structure. For example, streams are _inside_ their dictionary objects in order to keep them together, even though in the final PDF they will be rendered immediately after their dictionaries.
82
82
83
-
Also, structure objects do not need to include stream '/Length' or page '/Parent' entries, because those entries will be automatically calculated and added when the PDF is assembled. (Adding them won't hurt anything, but there is no reason to, as they will just be overwritten.)
83
+
Also, structure objects do not need to include stream '/Length' or page '/Parent' entries, because those entries will be automatically added when the PDF is assembled. (Adding them won't hurt anything, but there is no reason to, as they will just be recalculated and overwritten.)
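To make the point about derived entries concrete, here is a minimal sketch of a page whose content stream omits '/Length' and which omits '/Parent'. The `fillDerivedEntries` helper is hypothetical and is not part of PDF Assembler; it only illustrates the idea that such entries can be computed from the structure itself. The `stream` key name and the flat, single-level page tree are assumptions made for this example.

```javascript
// A page that leaves out '/Parent', with a content stream that leaves out
// '/Length'. The 'stream' key name is an assumption for this sketch.
const page = {
  '/Type': '/Page',
  '/MediaBox': [0, 0, 612, 792],
  '/Contents': [{ stream: 'BT /Helv 12 Tf (Hello world!) Tj ET' }],
};
const pages = { '/Type': '/Pages', '/Count': 1, '/Kids': [page] };

// Hypothetical helper (not the library's code): fills in the derived
// entries on a flat page tree, overwriting any hand-written values.
function fillDerivedEntries(pagesNode) {
  for (const kid of pagesNode['/Kids']) {
    // Each page points back at the node that lists it.
    kid['/Parent'] = pagesNode;
    for (const content of kid['/Contents'] || []) {
      // For an ASCII-only stream, the character count equals the byte count.
      content['/Length'] = content.stream.length;
    }
  }
  return pagesNode;
}

fillDerivedEntries(pages);
```

Because the helper always overwrites, a stale '/Length' typed in by hand would simply be replaced, which is the behavior described above.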
84
84
85
85
### Re-using shared dictionary items
86
86
@@ -117,15 +117,15 @@ So, if you're not scared off yet, and still want to use PDF Assembler in your pr
117
117
npm install pdfassembler
118
118
```
119
119
120
-
Next, import pdfassembler in your project, like so:
120
+
Next, import PDF Assembler in your project, like so:
To us PDF Assembler, you must create a new PDFAssembler instance and initialize it, either with your own PDF structure object:
128
+
To use PDF Assembler, you must create a new PDF Assembler instance and initialize it, either with your own PDF structure object:
129
129
```javascript
130
130
// helloWorldPdf = the pdf object defined above
131
131
const newPdf = new PDFAssembler(helloWorldPdf);
@@ -139,7 +139,7 @@ const newPdf = new PDFAssembler(binaryPDF);
139
139
140
140
### Editing the PDF object
141
141
142
-
After you've created a new new PDFAssembler instance, you can request a promise with the PDF structure object, and then edit it.
142
+
After you've created a new PDF Assembler instance, you can request a promise with the PDF structure object, and then edit it.
143
143
(Some of PDF Assembler's actions are asynchronous, so it's necessary to use a promise to make sure the PDF is fully loaded before you edit it.)
144
144
145
145
For example, here is how to edit a PDF to remove all but the first page:
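A minimal sketch of that kind of edit, written against a plain structure object rather than through the library. The `keepFirstPage` helper and the `threePagePdf` sample are hypothetical names for this example. The sketch assumes the '/Root' → '/Pages' → '/Kids' layout with a flat page tree (every kid is a '/Page'). In real use, the same edit would run inside the promise callback that receives the PDF structure object.

```javascript
// Hypothetical helper, not part of PDF Assembler: trims a flat page tree
// down to its first page and keeps '/Count' in sync with '/Kids'.
function keepFirstPage(pdf) {
  const pageTree = pdf['/Root']['/Pages'];
  pageTree['/Kids'] = pageTree['/Kids'].slice(0, 1);
  pageTree['/Count'] = pageTree['/Kids'].length;
  return pdf;
}

// Sample structure object with three empty pages (illustrative only).
const threePagePdf = {
  '/Root': {
    '/Type': '/Catalog',
    '/Pages': {
      '/Type': '/Pages',
      '/Count': 3,
      '/Kids': [
        { '/Type': '/Page', '/MediaBox': [0, 0, 612, 792] },
        { '/Type': '/Page', '/MediaBox': [0, 0, 612, 792] },
        { '/Type': '/Page', '/MediaBox': [0, 0, 612, 792] },
      ],
    },
  },
};

keepFirstPage(threePagePdf);
```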
@@ -153,13 +153,13 @@ newPdf
153
153
154
154
### Problems with outlines and internal references
155
155
156
-
PDF Assembler does a good job managing page contents, and will automatically discard unused contents from deleted pages, while still retaining any contents used on other pages. However, if a PDF contains an outline or internal references that refer to a deleted page, those will cause errors in the assembled PDF file. (The PDF may still open and display, but the PDF reader will probably show an error message.) As a somewhat crude (and hopefully temporary) solution for this, PDF Assembler provides a function for removing all non-printable data from the root catalog, like so:
156
+
PDF Assembler does a good job managing page contents, and will automatically discard unused contents from deleted pages, while still retaining any contents used on other pages. However, if a PDF contains an outline or internal references that refer to a deleted page, those will cause errors in the assembled PDF file. (The PDF may still open and display, but probably with an error message.) As a somewhat crude (and hopefully temporary) solution for this, PDF Assembler provides a function for removing all non-printable data from the root catalog, like so:
157
157
158
158
```javascript
159
159
newPdf.removeRootEntries();
160
160
```
161
161
162
-
The trade-off is that after running removeRootEntries(), your assembled PDF is less likely to have errors, and may also be smaller in size, but will also not have any outline or other non-printing information available in the original PDF.
162
+
The trade-off is that after running removeRootEntries(), your assembled PDF is less likely to have errors (and may also be smaller in size), but it will no longer have an outline or any other non-printing information from the original PDF.
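To illustrate the trade-off, here is a rough sketch of what stripping non-printing entries from the root catalog could look like on a plain structure object. This is not the implementation of removeRootEntries(). The `stripNonPrintingRootEntries` helper is hypothetical, and its keep-list ('/Type' and '/Pages') is an assumption about which entries are needed to print pages.

```javascript
// Hypothetical helper, not the library's code: deletes every root catalog
// entry that is not on the keep-list. The default keep-list is an assumption.
function stripNonPrintingRootEntries(pdf, keep = ['/Type', '/Pages']) {
  const root = pdf['/Root'];
  for (const key of Object.keys(root)) {
    if (!keep.includes(key)) {
      delete root[key];
    }
  }
  return pdf;
}

// Sample catalog with an outline and page labels (illustrative only).
const pdfWithOutline = {
  '/Root': {
    '/Type': '/Catalog',
    '/Pages': { '/Type': '/Pages', '/Count': 0, '/Kids': [] },
    '/Outlines': { '/Type': '/Outlines', '/Count': 0 },
    '/PageLabels': { '/Nums': [] },
  },
};

stripNonPrintingRootEntries(pdfWithOutline);
```

After the call, only '/Type' and '/Pages' remain in the catalog, so the outline can no longer point at a deleted page, but it is also no longer available to readers of the PDF.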
163
163
164
164
### Assembling a new PDF file from the PDF structure object
165
165
@@ -178,7 +178,7 @@ newPdf
178
178
179
179
### PDF Assembler options
180
180
181
-
PDF Assembler has a few options that will change its behavior. All options can be set any time after you have created a new PDFAssembler instance and before you have assembled your final pdf, like so:
181
+
PDF Assembler has a few additional options that will change its behavior, primarily for debugging. After you have created a PDF Assembler instance, you can set these options like so: