Skip to content

Latest commit

 

History

History
299 lines (219 loc) · 12.6 KB

README.md

File metadata and controls

299 lines (219 loc) · 12.6 KB

prerendercloud-server

Github Actions CI

Dockerhub: https://hub.docker.com/r/prerendercloud/webserver

This package is a Node.js pushstate http server powered by Headless-Render-API.com (formerly named prerender.cloud from 2016 - 2022). It's also the actual server hosting the headless-render-api.com domain (via fly.io).

Use it for server-side rendering (also known as pre-rendering or dynamic rendering) your single-page JavaScript application (React, Angular, Ember, Preact, Vue, etc.)

# simplest possible example if your built SPA is in a dir named `dist`
npm install -g prerendercloud-server
PORT=9000 prerendercloud-server dist

# now serving your JavaScript single-page app at localhost:9000
# if running from your dev machine (localhost) start a reverse tunnel to get a public IP:
# e.g. 1 if you have ngrok.com account: `ngrok http 9000`
# e.g. 2 if you have a VPS, something like: `ssh user@www.myremotehost.com -R 9000:localhost:9000`

Designed to be an all-in-one hosting + server-side rendering solution for single-page JavaScript apps needing pre-rendering or a generic solution to server-side rendering. Run it from Node.js or as a Docker container. Based off the official Headless-Render-API.com middleware

Requirements

  • index.html at the root of the deployed project
  • pushstate URLs
  • React, Preact, Angular, Ember, Vue, or any SPA framework that rewrites a container DOM element (Angular users must use templates)

Notes on caching and pre-rendering lifecyle

By default, this package has no "API request caching" enabled (it does have etags for static files). This means 100% of requests will be forwarded and processed by Headless-Render-API.com's API (service.headless-render-api.com). This is the ideal configuration while you're getting things working, but not for production.

Once your app is pre-rendering as you expect, and you're ready to "go to production", use the --enable-middleware-cache option. This is an in-memory cache of the responses from requests made to the service.headless-render-api.com API. Note, there is also a "server cache" available from Headless-Render-API.com but that is disabled here as a best practice (caching locally via middleware cache is free to you, but using the Headless-Render-API.com server cache costs money).

Simply restart and/or deploy this process to clear that in-memory cache.

Pages are pre-rendered "on-demand", also known as "lazy loading". So if you visit /docs, that request will block until the pre-render is complete. If --enable-middleware-cache is set, then subsequent requests to /docs will come from your local cache (until the process is rebooted or the cache expires).

If you'd like to restrict pre-rendered content to "bots only", use the --bots-only config. See the list of bots here.

If you'd like to restrict which pages are valid for pre-rendering, see the _whitelist.js config below. If you have a busy site, this is an important feature to enable to prevent abusive bots from spamming random URLs that may not actually exist causing needless requests to be made to service.headless-render-api.com.

Read all documentation here: https://headless-render-api.com/docs and read more about the config options here: https://github.com/sanfrancesco/prerendercloud-nodejs

Plain old Node.js examples

npm install -g prerendercloud-server

now navigate to your project directory (unless you're using S3, in which case it doesn't matter)

usage: prerendercloud-server [options] [LocalPath or S3Uri]

# start the server in the current directory
prerendercloud-server

# start the server for the dist directory
prerendercloud-server dist

# start the server for the dist directory and run it on PORT 9000
PORT=9000 prerendercloud-server dist

# start the server for the dist directory and run it on PORT 9000 and use the local cache
# (the cache won't expire until you terminate this node instance)
PORT=9000 prerendercloud-server dist --enable-middleware-cache

# start the server in the current directory with your API token
# from https://headless-render-api.com to avoid rate limits
PRERENDER_TOKEN=my-secret-token prerendercloud-server

Plain old Node.js local filesystem example

PRERENDER_TOKEN="my-secret-token" \
prerendercloud-server . \
--enable-middleware-cache \
--disable-ajax-preload \
--disable-ajax-bypass \
--bots-only

Plain old Node.js S3 proxy example

AWS_ACCESS_KEY="my-aws-key" \
-e AWS_SECRET_KEY="my-aws-secret" \
-e PRERENDER_TOKEN="my-secret-token" \
prerendercloud-server \
s3://my-s3-bucket \
--enable-middleware-cache \
--disable-ajax-preload \
--disable-ajax-bypass \
--bots-only

Fly.io example

Fly.io is a modern app deployment platform that can run Dockerfiles and is a painless way to run prerendercloud-server.

See fly-io in the examples directory

Docker local filesystem example

Mount the directory from your laptop/server into the Docker container at path /wwwroot

This example assumes you're serving the dist directory from the directory you're launching your Docker container.

docker run \
  --rm \
  --name=prerendercloud-webserver \
  -e PRERENDER_TOKEN="my-secret-token" \
  -e DEBUG=prerendercloud \
  -p 9000:9000 \
  -v $(pwd)/dist:/wwwroot \
  prerendercloud/webserver \
  --enable-middleware-cache \
  --disable-ajax-preload \
  --disable-ajax-bypass \
  --bots-only

Docker S3 proxy example

Note: the S3 proxy feeature does not cache data from S3 in the container, although it respects etags (if the client/browser sends if-none-match, and S3 returns 304 not modified, then the proxy returns 304 not modified). This means that this container does not need to be restarted when updating content on S3.

docker run \
  --rm \
  --name=prerendercloud-webserver \
  -e AWS_ACCESS_KEY="my-aws-key" \
  -e AWS_SECRET_KEY="my-aws-secret" \
  -e PRERENDER_TOKEN="my-secret-token" \
  -e DEBUG=prerendercloud \
  -p 9000:9000 \
  prerendercloud/webserver \
  s3://my-s3-bucket \
  --enable-middleware-cache \
  --disable-ajax-preload \
  --disable-ajax-bypass \
  --bots-only

Environment variables

  • PORT - default 9000
  • PRERENDER_TOKEN - need this avoid rate limiting, get an API token from https://headless-render-api.com/
  • MIDDLEWARE_CACHE_MAX_MEGABYTES - used with --enable-middleware-cache, default is 500
  • CANONICAL_HOST - if exists, requests made to the server from a non-matching host header will redirect to canonical.
    • most common use case: configure your DNS to point apex and www to example.com, set CANONICAL_HOST=example.com, and requests to www.example.com will redirect to apex
    • override the header used to detect host with HOST_HEADER (defaults to host, if on AWS behind ALB, set HOST_HEADER=x-forwarded-proto)
  • CRAWL_HOST - if using --crawl-whitelist-on-boot, e.g. CRAWL_HOST=example.com (no protocol, no slashes)
    • use with DISABLE_PRERENDERING_FOR_SECONDS to give your process enough time to boot and go live (35s is a safe/common value)
  • DISABLE_PRERENDERING_FOR_SECONDS is null by default, but if you're running in production with zero-downtime deploys (Kubernetes, Nomad, Fly.io, AWS Beanstalk, etc.), then set this to avoid initial requests hitting the old version of your app during the deploy
  • AWS_ACCESS_KEY and AWS_SECRET_KEY if using s3 proxy

Options

Read more about these options here: https://github.com/sanfrancesco/prerendercloud-nodejs

  • --help
  • --debug
    • verbose debugging
  • --enable-middleware-cache
    • a local in-memory cache that does not expire (reboot to clear cache) to avoid hitting service.headless-render-api.com on every request
  • --meta-only
    • when you only want to pre-render the <head /> (useful if all you care about is open graph and meta tags)
  • --bots-only
  • --ignore-all-query-params
  • --remove-trailing-slash
  • --disable-ajax-preload
  • --disable-ajax-bypass
  • --disable-head-dedupe
  • --remove-script-tags
  • --wait-extra-long
    • if the pre-rendering process finished too early
  • --follow-redirects
  • --bubble-up-5xx-errors
  • --throttle-on-fail
  • --crawl-whitelist-on-boot
    • requires a _whitelist.js file to exist in wwwroot and CRAWL_HOST env var to be set, e.g. CRAWL_HOST=example.com (no protocol, no slashes)
    • use this with --enable-middleware-cache so visitors don't have to wait for the lazily loaded pre-rendering to finish

The _whitelist.js file

This project will parse an actual JavaScript file (not JSON) with the filename _whitelist.js if in the wwwroot (same place as your index.html).

This file configures the whitelistPaths option of the underlying prerendercloud-nodejs middleware. It reduces your potential billing/costs by preventing bots or bad actors from scraping random URLs.

Example _whitelist.js file:

// strings or regexes
// if this file doesn't exist or the array is empty, any requested path will be pre-rendered
module.exports = ["/", "/docs", /\/users\/\d{1,6}\/profile\/?$/];

The _redirects file

Similar to Netlify's _redirects file, this project will parse a _redirects file in the wwwroot (same place as your index.html).

Why use this? For redirects, rewrites. This includes avoiding CORS configuration by redirecting a same origin path to a remote API. In other words, some additional control over routing logic.

  • A plain text file in the root of your deploy with the file name _redirects for controlling routing logic
  • Each line is 3 fields separated by any amount of white space:
/source/path /destination/path statusCode
  • The /sourcePath and /destinationPath must start with /
  • The status code field is optional and if not specified, defaults to 301.
  • Using 200 as a status code is a "rewrite" (or proxy), the user will not see the final/true destination.
  • Comments start with #
  • White space around or between lines is ignored so use it for readability.
  • html file extension is optional

This rule is already included by default since this project is for single-page apps. Shown here only as an example of what it would look like if not already included.

/* /index.html 200

200 rewrite/proxy splat (wildcard)

  • (for avoiding CORS config on your server)
  • (wildcards (asterisks) can only be at the end of a sourcePath and if used, the destinationPath must have :splat at the end)
/api/v1/* http://example.com/api/v1/:splat 200

301 redirect /documentation to /docs

/documentation /docs

302 redirect /documentation to /docs

(same as above, but use 302 instead of the default of 301)

/documentation /docs 302

200 rewrite/proxy /documentation to /docs

/documentation /docs 200

The _headers file

Similar to Netlify's _headers file, this project will parse a _headers file in the wwwroot (same place as your index.html).

For setting cache-control headers (i.e. long max-age or immutable), overriding content-type that is normally detected from the filename. In other words, explicit control over the headers returned.

/path/must/start/with-leading-slash-optional-trailing-star*
  header-key-then-colon: header-value

Give everything in the fonts dir a long/permanent cache

/fonts/*
  Cache-Control: public, max-age=365000000, immutable

Give a content-type to a path without an extension

/apple-app-site-association
  content-type: application/json