refactor generate, support for chrome-headless
gutschilla committed Apr 16, 2019
1 parent f212193 commit a08ddc2
Showing 5 changed files with 1,136 additions and 70 deletions.
1 change: 1 addition & 0 deletions .gitignore
Expand Up @@ -4,3 +4,4 @@ erl_crash.dump
*.ez
/doc
.elixir_ls
/node_modules
77 changes: 58 additions & 19 deletions README.md
Expand Up @@ -4,28 +4,32 @@ A wrapper for wkhtmltopdf (HTML to PDF) and PDFTK (adds in encryption) for use
in Elixir projects. If available, it will use xvfb-run (x virtual frame buffer)
to use wkhtmltopdf on systems that have no X installed, e.g. a server.

# New in 0.4.0 - remove misc_random, require Elixir v1.1

- 0.4.0
- Got rid of misc_random dependency. This was here to manage between
depreciated random functions in Erlang. We go ahead using plain
`Enum.random/1` instead, implementing our own
`PdfGenerator.Random.string/1` function. This also removes a common
pitfall when drafting a release with distillery.
* Thanks to [Hugo Maia Vieira](https://github.com/hugomaiavieira) for this
contribution!
* Since `Enum.random/1` is only available since September 2015 (three
years ago) I am OK with raising the minimum Elixir version to v1.1 –
Since this may break projects still running on Elixir v1.0
**I bumped the version to 0.4.0***.
# New in 0.5.0 - farewell Porcelain, hello chrome-headless

- 0.5.0
- **Got rid of Porcelain** dependency as it interferes with many builds; plain
`System.cmd/3` is used instead. Please note that, according to the documentation
(https://hexdocs.pm/elixir/System.html#cmd/3), ports will be closed, but in
case wkhtmltopdf somehow hangs, nothing takes care of terminating it.
- Refactored some sections
- **Support URLs** instead of just plain HTML
- **Support for chrome-headless** for (at least for me) faster and nicer renderings.
- Since this is hopefully helpful, I raised the version to 0.5.0 even though
the API stays consistent

For a proper changelog, see [CHANGES](CHANGES.md)
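
The note above about a hanging `wkhtmltopdf` can be worked around on the caller's side. The following is only a minimal sketch, not part of this library: `CmdTimeoutSketch` and `run/3` are hypothetical names, and it assumes that giving up after a timeout is acceptable for your use case.

```elixir
defmodule CmdTimeoutSketch do
  # Hypothetical helper, NOT part of pdf_generator.
  # Runs an external command inside a task and stops waiting after `timeout_ms`.
  def run(executable, args, timeout_ms) do
    task = Task.async(fn -> System.cmd(executable, args, stderr_to_stdout: true) end)

    # Wait for the result; if none arrives in time, kill the task.
    case Task.yield(task, timeout_ms) || Task.shutdown(task, :brutal_kill) do
      {:ok, {output, exit_code}} -> {:ok, output, exit_code}
      # Only reachable when the caller traps exits; otherwise a crashing
      # task takes the (linked) caller down with it.
      {:exit, reason} -> {:error, reason}
      nil -> {:error, :timeout}
    end
  end
end
```

Killing the task closes the port, but as the note above says, that alone may not terminate the external program, so treat this as a way to unblock the caller rather than as guaranteed cleanup.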

# System prerequisites
# System prerequisites (either wkhtmltopdf or nodejs)

Download wkhtmltopdf and place it in your $PATH. Current binaries can be found
here: http://wkhtmltopdf.org/downloads.html

**OR**

Run `npm install`. This requires [nodejs](https://nodejs.org), of course. This
will install a recent chromium and chromedriver to run Chrome in headless mode
and use this browser and its API to print PDFs.

_(optional)_ To use wkhtmltopdf on systems without an X window server installed,
please install `xvfb-run` from your repository (on Debian/Ubuntu: `sudo apt-get
install xvfb`).
Expand Down Expand Up @@ -54,7 +58,7 @@ Add this to your dependencies in your mix.exs:
defp deps do
[
# ... whatever else
{ :pdf_generator, ">=0.4.0" }, # <-- and this
{ :pdf_generator, ">=0.5.0" }, # <-- and this
]
end

Expand All @@ -72,6 +76,22 @@ html = "<html><body><p>Hi there!</p></body></html>"
filename = PdfGenerator.generate! html
```

Or, pass a URL:

```
url = "http://google.com"
{ :ok, filename } = PdfGenerator.generate {:url, url}, page_size: "A5"
...
```

Or, use chrome-headless

```
url = "http://google.com"
{ :ok, filename } = PdfGenerator.generate {:url, url}, page_size: "A5", generator: :chrome
...
```
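
The two calls above pass a `{:url, url}` tuple where the earlier example passed a plain HTML string. As a rough illustration of how such tagged input can be told apart with pattern matching, here is a small sketch; `ContentSketch` and `source/2` are hypothetical names chosen for this example and are not part of the library's API.

```elixir
defmodule ContentSketch do
  # Hypothetical helper, NOT part of pdf_generator.
  # Returns the source argument a renderer would be handed:
  # the URL itself, or the path of the temporary HTML file.
  def source({:url, url}, _html_path) when is_binary(url), do: url
  def source({:html, _html}, html_path), do: html_path
  def source(html, html_path) when is_binary(html), do: html_path
end
```

For example, `ContentSketch.source({:url, "http://example.com"}, "/tmp/a.html")` yields the URL, while a plain HTML string yields the file path.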

Or use the bang-methods:

```
Expand All @@ -90,8 +110,18 @@ config :pdf_generator,
wkhtml_path: "/usr/bin/wkhtmltopdf", # <-- this program actually does the heavy lifting
pdftk_path: "/usr/bin/pdftk" # <-- only needed for PDF encryption
```

or, if you prefer chrome-headless:

```
config :pdf_generator,
use_chrome: true, # <-- will be default by 0.6.0
pdftk_path: "/usr/bin/pdftk" # <-- only needed for PDF encryption
```

## Running headless (server-mode)
## Running wkhtml headless (server-mode)

This section only applies to `wkhtmltopdf` users.

If you want to run `wkhtmltopdf` with an unpatched version of webkit that requires
an X Window server, but your server (or Mac) does not have one installed,
Expand Down Expand Up @@ -120,14 +150,23 @@ config :pdf_generator,

## More options

- `page_size`: defaults to `A4`, see `wkhtmltopdf` for more options
- `filename` - filename for the output pdf file (without .pdf extension, defaults to a random string)

- `page_size`:
* defaults to `A4`, see `wkhtmltopdf` for more options
* A4 will be translated to `--paper-height 11` and `--paper-width 8.5` when
chrome-headless is used

- `open_password`: requires `pdftk`, set password to encrypt PDFs with

- `edit_password`: requires `pdftk`, set password for edit permissions on PDF

- `shell_params`: pass custom parameters to `wkhtmltopdf`. **CAUTION: BEWARE OF SHELL INJECTIONS!**

- `command_prefix`: prefix `wkhtmltopdf` with some command or a command with options
(e.g. `xvfb-run -a`, `sudo` ..)

- `delete_temporary`: immediately remove temp files after generation
- `filename` - filename for the output pdf file (without .pdf extension, defaults to a random string)
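
To make the interplay of defaults, caller options and the `page_size` translation more concrete, here is a minimal sketch. `OptionsSketch` is a hypothetical module written for this example; the default values and the A4/A5 dimensions are copied from this commit, everything else is an assumption.

```elixir
defmodule OptionsSketch do
  # Hypothetical module, NOT part of pdf_generator.
  # Defaults as named in this commit.
  def defaults, do: [generator: :wkhtmltopdf, page_size: "A4"]

  # Caller options win, because Keyword.merge/2 prefers values from its second argument.
  def resolve(opts \\ []), do: Keyword.merge(defaults(), opts)

  # Explicit width/height take precedence over a named page size;
  # anything unknown falls back to the A4 values used in this commit.
  def dimensions(opts) do
    case Map.new(opts) do
      %{page_width: width, page_height: height} -> {width, height}
      %{page_size: "A5"} -> {"4.25", "5.5"}
      _other -> {"8.50", "11.0"}
    end
  end
end
```

Usage: `OptionsSketch.resolve(page_size: "A5") |> OptionsSketch.dimensions()` gives the A5 pair, while calling `resolve/0` without options falls back to the A4 pair.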

## Heroku Setup

Expand Down
153 changes: 102 additions & 51 deletions lib/pdf_generator.ex
Expand Up @@ -35,9 +35,8 @@ defmodule PdfGenerator do
# System requirements
- wkhtmltopdf
- wkhtmltopdf or chrome-headless
- pdftk (optional, for encrypted PDFs)
- goon (optional, for Porcelain shalle wrapper)
Precompiled **wkhtmltopdf** binaries can be obtained here:
http://wkhtmltopdf.org/downloads.html
Expand Down Expand Up @@ -74,6 +73,8 @@ defmodule PdfGenerator do
Supervisor.start_link(children, opts)
end

def defaults(), do: [generator: :wkhtmltopdf, page_size: "A4"]

# return file name of generated pdf

@doc """
Expand Down Expand Up @@ -105,45 +106,105 @@ defmodule PdfGenerator do
filename: "my_awesome_pdf"
)
"""
def generate( html ) do
generate html, page_size: "A4"
end

def generate( html, options ) do
generate(:wkhtmltopdf, html, options )
@type url :: binary()
@type html :: binary()
@type pdf_file_path :: binary()
@type content :: html | {:html, html} | {:url, url}
@type reason :: atom() | {atom(), any}
@type opts :: keyword()
@type path :: binary()
@type html_path :: path
@type pdf_path :: path
@type generator :: :wkhtmltopdf | :chrome

@spec generate(content, opts) :: {:ok, pdf_file_path} | {:error, reason}
def generate(content, opts \\ []) do

options = Keyword.merge(defaults(), opts)

generator = options[:generator]

open_password = options[:open_password]
edit_password = options[:edit_password]
delete_temp = options[:delete_temporary]

with {html_file, pdf_file} <- make_file_paths(options),
:ok <- maybe_write_html(content, html_file),
{executable, arguments} <- make_command(generator, options, content, {html_file, pdf_file}),
{console_stderr, exit_code} <- System.cmd(executable, arguments, stderr_to_stdout: true), # unfortunately wkhtmltopdf returns 0 on errors as well :-/
{:result_ok, true} <- {:result_ok, result_ok(generator, console_stderr, exit_code)}, # so we inspect stderr instead
{:rm, :ok} <- {:rm, maybe_delete_temp(delete_temp, html_file)},
{:ok, encrypted_pdf} <- maybe_encrypt_pdf(pdf_file, open_password, edit_password) do
{:ok, encrypted_pdf}
else
{:error, reason} -> {:error, reason}
reason -> {:error, reason}
end
end

def generate(:wkhtmltopdf, html, options ) do
wkhtml_path = PdfGenerator.PathAgent.get.wkhtml_path
filebase = generate_filebase(options[:filename])
html_file = filebase <> ".html"
pdf_file = filebase <> ".pdf"
File.write html_file, html
@spec maybe_write_html(content, path()) :: :ok | {:error, reason}
def maybe_write_html({:url, _url}, _html_file_path), do: :ok
def maybe_write_html({:html, html}, html_file_path), do: File.write(html_file_path, html)
def maybe_write_html(html, html_file_path) when is_binary(html), do: maybe_write_html({:html, html}, html_file_path)

shell_params = [
"--page-size", Keyword.get( options, :page_size ) || "A4",
Keyword.get( options, :shell_params ) || [] # will be flattened
]
@spec make_file_paths(keyword()) :: {html_path, pdf_path}
def make_file_paths(options) do
filebase = options[:filename] |> generate_filebase()
{filebase <> ".html", filebase <> ".pdf"}
end

executable = wkhtml_path
arguments = List.flatten([shell_params, html_file, pdf_file])
command_prefix = get_command_prefix(options)
def make_dimensions(options) when is_list(options) do
options |> Enum.into(%{}) |> dimensions_for()
end

open_password = Keyword.get options, :open_password
edit_password = Keyword.get(options, :edit_password)
delete_temp = Keyword.get(options, :delete_temporary)
@doc ~s"""
Returns `{width, height}` tuple for page sizes either as given or for A4 and
A5. Defaults to A4 sizes.
"""
def dimensions_for(%{page_width: width, page_height: height}), do: {width, height}
def dimensions_for(%{page_size: "A4"}), do: {"8.50", "11.0"}
def dimensions_for(%{page_size: "A5"}), do: {"4.25", "5.5"}
def dimensions_for(_map), do: dimensions_for(%{page_size: "A4"})

@spec make_command(generator, opts, content, {html_path, pdf_path}) :: {path, list()}
def make_command(:chrome, options, content, {html_path, pdf_path}) do
executable = System.find_executable("chrome-headless-render-pdf")
{width, height} = make_dimensions(options)
more_params = options[:shell_params] || []
source =
case content do
{:url, url} -> url
_html -> "file://" <> html_path
end
arguments = [
"--url", source,
"--pdf", pdf_path,
"--paper-width", width,
"--paper-height", height,
] ++ more_params
{executable, arguments}
end

with {executable, arguments} <- make_command_tuple(command_prefix, executable, arguments),
{console_stderr, _code} <- System.cmd(executable, arguments, stderr_to_stdout: true), # unfortunately wkhtmltopdf returns 0 on errors as well :-/
{:stderr_good, {true, _x}} <- {:stderr_good, result_ok(:wkhtmltopdf, console_stderr)}, # so we inspect stderr instead
{:rm, :ok} <- {:rm, maybe_delete_temp(delete_temp, html_file)},
{:ok, pdf} <- maybe_encrypt_pdf(pdf_file, open_password, edit_password) do
{:ok, pdf}
else
{:error, reason} -> {:error, reason}
reason -> {:error, reason}
def make_command(:wkhtmltopdf, options, content, {html_path, pdf_path}) do
executable = PdfGenerator.PathAgent.get.wkhtml_path
source =
case content do
{:url, url} -> url
_html -> html_path
end
more_params = options[:shell_params] || []
arguments = [
"--page-size", options[:page_size] || "A4",
source, pdf_path
] ++ more_params

# for wkhtmltopdf we support prefixes like ["xvfb-run", "-a"] to precede the actual command
case get_command_prefix(options) do
nil -> {executable, arguments}
[prefix | prefix_args] -> {prefix, prefix_args ++ [executable] ++ arguments}
prefix -> {prefix, [executable | arguments]}
end

end

defp maybe_delete_temp(true, file), do: File.rm(file)
Expand All @@ -158,30 +219,20 @@ defmodule PdfGenerator do
{:ok, pdf_file}
end

def result_ok(:wkhtmltopdf, string) do
{String.match?(string, ~r/Done/ms), string}
end

def get_command_prefix(options) do
Keyword.get( options, :command_prefix ) || Application.get_env( :pdf_generator, :command_prefix )
end
defp result_ok(:chrome, _string, 0), do: true
defp result_ok(:chrome, _string, _exit_code), do: false
defp result_ok(:wkhtmltopdf, string, _exit_code), do: String.match?(string, ~r/Done/ms)

def make_command_tuple(_command_prefix = nil, wkhtml_executable, arguments) do
{ wkhtml_executable, arguments }
end
def make_command_tuple([command_prefix | args], wkhtml_executable, arguments) do
{ command_prefix, args ++ [wkhtml_executable] ++ arguments }
end
def make_command_tuple(command_prefix, wkhtml_executable, arguments) do
{ command_prefix, [wkhtml_executable] ++ arguments }
defp get_command_prefix(options) do
options[:command_prefix] || Application.get_env(:pdf_generator, :command_prefix)
end

defp generate_filebase(nil), do: generate_filebase(PdfGenerator.Random.string())
defp generate_filebase(filename), do: Path.join(System.tmp_dir, filename)

def encrypt_pdf( pdf_input_path, user_pw, owner_pw ) do
pdftk_path = PdfGenerator.PathAgent.get.pdftk_path
pdf_output_file = Path.join System.tmp_dir, PdfGenerator.Random.string() <> ".pdf"
def encrypt_pdf(pdf_input_path, user_pw, owner_pw ) do
pdftk_path = PdfGenerator.PathAgent.get.pdftk_path
pdf_output_file = Path.join System.tmp_dir, PdfGenerator.Random.string() <> ".pdf"

pdftk_args = [
pdf_input_path,
Expand Down