Pandoc: a Tool I Use and Like

David Eisinger
Dev Team Meeting, 5/24/2022


Pandoc: a Tool I Use and Like

What is Pandoc? From the project website:

If you need to convert files from one markup format into another, pandoc is your swiss-army knife.


Pandoc: a Tool I Use and Like

  • I spend a lot of time writing
    • and I love Vim, Markdown, and the command line
    • (and avoid browser-based WYSIWYG editors when I can)
  • But Pandoc has a ton of utility outside of writing
    • Really, anywhere you need to move between different text-based formats, Pandoc can probably help

Markdown ➞ Craft Blog Post

I do all my blog writing in Vim/Markdown. When I'm ready to publish, I convert to HTML with Pandoc and paste the result into a Craft text block.

This gets me a few things I really like:

Pandoc uses Pandoc Markdown by default, an "extended and slightly revised version" of the original syntax.


Markdown ➞ Craft Blog Post

Example post:

# Article Title

Here's an article -- it has "smart punctuation" and footnotes[^1].

[^1]: So fancy.

Convert to HTML:

cat examples/blog_post.md | pandoc -t html
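
Since the HTML ends up pasted into Craft anyway, the convert and copy steps can be folded together. This is a minimal sketch rather than part of the original workflow: it assumes macOS's pbcopy is available, and the function name md_to_clipboard is made up for illustration.

```shell
# Hypothetical helper: convert a Markdown file to an HTML fragment
# and place the result on the macOS clipboard.
md_to_clipboard() {
  # -f markdown selects Pandoc Markdown explicitly (it is also the default).
  # Assumption: pbcopy exists on this machine (macOS only).
  pandoc -f markdown -t html "$1" | pbcopy
}
```

Calling md_to_clipboard examples/blog_post.md leaves the HTML on the clipboard, ready to paste.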

Markdown ➞ Rich Text (Basecamp)

  • I also sometimes find myself writing long Basecamp posts
  • Basecamp 3 has a fine WYSIWYG editor (🪦 Textile)
    • but again, I'd rather be in Vim
  • Pasting HTML into Basecamp doesn't work
    • but you can convert MD to HTML
    • open in browser
    • then copy/paste

Markdown ➞ Rich Text (Basecamp)

Example post:

* **Team Member #1** is doing great.
* **Team Member #2** is also doing great.
* **Team Member #3** is similarly great.

***

The dev team is great.

Generate HTML preview:

cat examples/team_update.md \
  | pandoc -t html \
  > /tmp/output.html \
  && open /tmp/output.html \
  && read -n 1 \
  && rm /tmp/output.html

HTML ➞ Text

  • A client app receives news articles (in HTML)
    • and emails them out (as HTML and plain text)
  • The incoming articles often lack linebreaks
  • The output of Rails' strip_tags has no linebreaks either
  • The result is unreadable
  • Pandoc can convert from HTML to plain text
  • And has nice Ruby bindings

HTML ➞ Text

Example article:

cat examples/article.html

Result of strip_tags:

cat examples/article.html \
  | ruby -e "require 'action_controller';
  puts ActionController::Base.helpers.strip_tags(STDIN.read)"

With Pandoc:

cat examples/article.html | pandoc -f html -t plain

HTML ➞ Text

In the app:

def formatted_plaintext
  @formatted_plaintext ||= PandocRuby.html(full_text).to_plain
end

HTML Element ➞ Text

  • Working on Thrillr
  • Needed a list of all TLDs available in AWS Route 53
  • Options were available in a <select> in the AWS console
  • You'll never guess what I did
    • (unless you guessed "use Pandoc")

HTML Element ➞ Text

  • Right click the select element, then click "Inspect"
  • Find the <select> in the DOM view that pops up
  • Right click it, then go to "Copy", then "Inner HTML"
  • You'll now have all of the <option> elements on your clipboard
pbpaste | pandoc -t plain --wrap none | sed 's/00/00\n/g'

(This worked without all the sed stuff originally, but the dropdown got fancy in the interim.)


Preview Mermaid/Markdown (--standalone)

  • Andrew and I were creating sequence diagrams with Mermaid
  • GitHub and GitLab both support Mermaid natively
    • But we wanted to be able to quickly iterate on the diagrams
  • We devised a simple build chain
    • Watch for changes to a Markdown file
    • Convert the Mermaid blocks to SVG
    • Use Pandoc to take the resulting document and convert it to a styled HTML page using the --standalone option
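
The build chain above could be sketched roughly as follows. This is an illustrative sketch, not the actual script. It assumes the Mermaid CLI (mmdc) is installed and that it accepts a Markdown file as input, writing a copy in which the Mermaid blocks are replaced by generated SVG images. The function names, output file names, and one-second polling interval are all placeholders.

```shell
# Render one Markdown file: Mermaid blocks -> SVG, then Markdown -> standalone HTML.
build() {
  src=$1
  out_dir=$2
  mkdir -p "$out_dir" || return 1
  # Assumption: mmdc takes Markdown input and writes a Markdown copy whose
  # Mermaid blocks are replaced by references to generated SVG files.
  mmdc -i "$src" -o "$out_dir/rendered.md" || return 1
  # --standalone wraps the output in a complete HTML document
  # instead of emitting a bare fragment.
  pandoc --standalone -f markdown -t html \
    -o "$out_dir/index.html" "$out_dir/rendered.md"
}

# Poll the source file's checksum and rebuild whenever it changes.
watch_and_build() {
  src=$1
  out_dir=$2
  last=""
  while :; do
    current=$(cksum < "$src")
    if [ "$current" != "$last" ]; then
      build "$src" "$out_dir" && echo "rebuilt $out_dir/index.html"
      last=$current
    fi
    sleep 1
  done
}
```

Running watch_and_build diagrams.md build (both names hypothetical) and keeping build/index.html open in a browser gives a quick edit-and-reload loop. Polling a checksum keeps the sketch dependency-free; a file-watching tool would do the same job with less delay.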

Generate a PDF

  • Pandoc also includes several ways to create PDF documents
  • The simplest (IMO) is to install wkhtmltopdf
    • then instruct Pandoc to convert its input to HTML
    • but use .pdf in the output filename
printf "# Hello\n\nIs it me you're looking for?\n" \
  | pandoc -t html -o hello.pdf
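
The same conversion can be wrapped in a small function that names the PDF engine explicitly rather than relying on Pandoc's default for HTML input. A minimal sketch, assuming wkhtmltopdf is on the PATH; the function name md_to_pdf is hypothetical.

```shell
# Hypothetical helper: Markdown file -> PDF by way of HTML.
md_to_pdf() {
  # -t html makes pandoc build the PDF from HTML;
  # --pdf-engine names wkhtmltopdf explicitly instead of relying on the default.
  pandoc -f markdown -t html --pdf-engine=wkhtmltopdf -o "$2" "$1"
}
```

For example, md_to_pdf notes.md notes.pdf (file names made up) writes the PDF next to the source file.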

Closing Thoughts

  • Pandoc is incredibly powerful
    • I've really only scratched the surface here
    • Look at the man page for a sense of everything it can do
  • Pandoc is written in Haskell
    • The source is pretty fun to look through if you're a certain kind of person

Closing Thoughts

So install Pandoc with your package manager of choice and give it a shot. I think you'll find it unexpectedly useful.

Now I will take questions.

https://github.com/dce/pandoc-talk
