Carbone PDF Render Lambda (POC)

Renders a DOCX template to PDF using Carbone and the Shelf LibreOffice Lambda Layer.

New: Multi‑Template & Marker Discovery

The service now supports multiple .docx templates placed in the build templates directory. A new endpoint GET /templates returns a JSON list of available templates and their discovered markers (Carbone placeholders like {d.fullName} → marker fullName).

The interactive form (GET /) loads the template list, lets you switch between them, and dynamically filters the editable field list to just the markers present in the selected template. Any marker that has no example value is auto‑initialised with a placeholder value of (( markerName )) so you can see it clearly in the UI.
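A minimal sketch of marker discovery and placeholder initialisation. It assumes the template text has already been read out of the .docx (unzipping and XML handling are not shown). The names extractMarkers and withPlaceholders and the regular expression are illustrative assumptions, not the project's actual code.

```typescript
// Collects unique, sorted marker names from Carbone-style placeholders
// such as {d.fullName}. The pattern is an assumption for illustration.
function extractMarkers(text: string): string[] {
  const pattern = /\{d\.([A-Za-z_][A-Za-z0-9_]*)/g;
  const found = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    found.add(match[1]);
  }
  return Array.from(found).sort();
}

// Fills every marker that has no example value with "(( markerName ))".
function withPlaceholders(
  markers: string[],
  examples: Record<string, string>,
): Record<string, string> {
  const values: Record<string, string> = {};
  for (const marker of markers) {
    values[marker] = examples[marker] ?? `(( ${marker} ))`;
  }
  return values;
}

const sampleText = "Dear {d.fullName}, {d.firstName} {d.fullName} at {d.address_line_1}";
const sampleMarkers = extractMarkers(sampleText);
const sampleValues = withPlaceholders(sampleMarkers, { fullName: "Alice Example" });
console.log(sampleMarkers, sampleValues);
```

Real templates can split a placeholder across several XML runs, so a production version would likely need to normalise the document text before matching.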

When you submit a render request, the chosen template name is sent as template (as a query parameter, form field, or JSON body property). If it is omitted, the first template in alphabetical order is used.
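The selection rule can be sketched as below. The resolveTemplate function, the TemplateRequest shape, the precedence between the three sources (query, then form, then JSON), and the error thrown for an unknown name are all assumptions made for illustration; only the alphabetical fallback comes from the description above.

```typescript
// Hypothetical, already-parsed view of the three places a template name may arrive.
interface TemplateRequest {
  query?: Record<string, string | undefined>;
  form?: Record<string, string | undefined>;
  json?: { template?: string };
}

function resolveTemplate(available: string[], req: TemplateRequest): string {
  if (available.length === 0) {
    throw new Error("no templates available");
  }
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...available].sort();
  // Assumed precedence: query parameter, then form field, then JSON body.
  const requested = req.query?.template ?? req.form?.template ?? req.json?.template;
  if (requested === undefined || requested === "") {
    return sorted[0]; // fallback: first template alphabetically
  }
  if (sorted.indexOf(requested) === -1) {
    throw new Error(`unknown template: ${requested}`);
  }
  return requested;
}

const names = ["welcome-letter", "letter-template-nhs-notify_"];
console.log(resolveTemplate(names, {}));
console.log(resolveTemplate(names, { json: { template: "welcome-letter" } }));
```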

Example /templates response:

{
  "templates": [
    {
      "name": "letter-template-nhs-notify_",
      "file": "letter-template-nhs-notify_.docx",
      "size": 54321,
      "markers": ["fullName", "firstName", "address_line_1"]
    }
  ]
}

Add additional templates by placing more .docx files into src/modules/templates/ (or the equivalent built output directory used during packaging). Rebuild / redeploy and they will appear automatically.

Features

Node.js 20.x Lambda (x86_64)
LibreOffice provided by external layer: arn:aws:lambda:eu-west-2:764866452798:layer:libreoffice-brotli:1
Carbone rendering of a bundled DOCX template (templates/letter-template-nhs-notify_.docx)
Outputs PDF (base64) via Lambda URL (proxy style response)
Structured JSON logging
Warm-up initialization outside the handler to minimise cold-start render time
LibreOffice archive extraction only once per cold start (cached in container /tmp)
Local invocation helper with a minimal placeholder PDF (when SKIP_CONVERT=1)

LibreOffice Layer Handling

The LibreOffice Lambda layer ships a compressed archive at /opt/lo.tar.br (or /opt/lo.tar.gz). On a cold start the function:

Detects whether LibreOffice is already extracted under /tmp/libreoffice/instdir/program.
If not, reads and decompresses the archive (Brotli or Gzip) into /tmp/libreoffice (<=512 MB ephemeral storage).
Adds the discovered instdir/program path to PATH so soffice.bin is invokable by Carbone's convert step.
Logs extraction duration; subsequent warm invocations skip this step (fast path).
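The cold-start steps above can be sketched roughly as follows. This is not the project's implementation: ensureLibreOffice and its result shape are hypothetical, and the extract callback stands in for the real Brotli/Gzip decompression and untar step, which is not shown. The demo at the bottom uses a temporary directory and a fake extractor instead of /opt and /tmp/libreoffice.

```typescript
import { existsSync, mkdirSync, mkdtempSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import * as path from "node:path";

// Hypothetical callback: decompress and untar `archive` into `dest`.
type Extract = (archive: string, dest: string) => void;

interface EnsureResult {
  extracted: boolean; // false on the warm fast path
  durationMs: number;
  programDir: string;
}

function ensureLibreOffice(
  root: string,
  archives: string[],
  extract: Extract,
  env: Record<string, string | undefined> = process.env,
): EnsureResult {
  const programDir = path.join(root, "instdir", "program");
  const started = Date.now();
  let extracted = false;

  // Step 1 and 2: extract only when the program directory is missing.
  if (!existsSync(programDir)) {
    const archive = archives.find((candidate) => existsSync(candidate));
    if (!archive) {
      throw new Error("LibreOffice archive not found");
    }
    extract(archive, root);
    extracted = true;
  }

  // Step 3: put the program directory on PATH exactly once.
  const parts = (env.PATH ?? "").split(path.delimiter).filter(Boolean);
  if (parts.indexOf(programDir) === -1) {
    env.PATH = [programDir, ...parts].join(path.delimiter);
  }

  // Step 4: the caller can log durationMs and the extracted flag.
  return { extracted, durationMs: Date.now() - started, programDir };
}

// Demo with a fake extractor that only creates the expected directory tree.
const demoDir = mkdtempSync(path.join(tmpdir(), "lo-demo-"));
const demoArchive = path.join(demoDir, "lo.tar.br");
writeFileSync(demoArchive, "");
const demoEnv: Record<string, string | undefined> = { PATH: "/usr/bin" };
const fakeExtract: Extract = (_archive, dest) => {
  mkdirSync(path.join(dest, "instdir", "program"), { recursive: true });
};
const demoRoot = path.join(demoDir, "libreoffice");
const cold = ensureLibreOffice(demoRoot, [demoArchive], fakeExtract, demoEnv);
const warm = ensureLibreOffice(demoRoot, [demoArchive], fakeExtract, demoEnv);
console.log(JSON.stringify({ coldExtracted: cold.extracted, warmExtracted: warm.extracted }));
```

In the Lambda, the equivalent call would use /tmp/libreoffice as the root and /opt/lo.tar.br and /opt/lo.tar.gz as the archive candidates.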

Local development placeholder mode (SKIP_CONVERT=1) skips the extraction entirely and returns a tiny static PDF to allow rapid iteration without the layer or native binary.

Project Structure

src/                # Lambda handler (index.ts) + modules + utils
scripts/            # build, package, local-invoke scripts
infra/              # Terraform configuration
templates/          # DOCX template included in deployment
package/            # Build output (not committed)
lambda.zip          # Deployment artifact generated by scripts/package.mjs

Prerequisites

Node.js 20.x
npm
Terraform >= 1.5
AWS credentials with permission to create IAM roles, Lambda functions, and Lambda URLs

Install Dependencies

npm install

Build & Package (locally)

npm run package   # runs clean + build + zip creation
ls -lh lambda.zip

Local Test (without LibreOffice)

This uses a tiny inline PDF generator (not real conversion) to validate the flow.

npm run build
node scripts/local-invoke.mjs '{"data":{"exampleName":"Alice"}}'
open local-output.pdf # macOS only

Deploy with Terraform

cd infra
terraform init
terraform apply -auto-approve

Outputs:

lambda_function_url – Invoke with curl.

Invoke Deployed Lambda

LAMBDA_URL="<paste output>"
curl -s -X POST "$LAMBDA_URL" \
  -H 'content-type: application/json' \
  -d '{"data":{"firstName":"Alice","score":42}}' \
  -o output.pdf
open output.pdf # macOS only

Handler Contract

Request (the Lambda Function URL invokes the handler with a standard proxy-style event; the body is):

{ "data": { "firstName": "Alice" } }

Response (success): HTTP 200, Content-Type: application/pdf, base64 body. Errors: JSON {"message":"..."} with 400 or 500.
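As an illustration of this contract, the two response shapes might be built like this. The helper names pdfResponse and errorResponse are hypothetical; the field names follow the usual proxy-style Lambda response (statusCode, headers, body, isBase64Encoded).

```typescript
interface ProxyResponse {
  statusCode: number;
  headers: Record<string, string>;
  body: string;
  isBase64Encoded: boolean;
}

// Success: the PDF bytes are returned base64-encoded.
function pdfResponse(pdf: Buffer): ProxyResponse {
  return {
    statusCode: 200,
    headers: { "content-type": "application/pdf" },
    body: pdf.toString("base64"),
    isBase64Encoded: true,
  };
}

// Failure: a small JSON document with a message, using 400 or 500.
function errorResponse(statusCode: 400 | 500, message: string): ProxyResponse {
  return {
    statusCode,
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ message }),
    isBase64Encoded: false,
  };
}

const okExample = pdfResponse(Buffer.from("%PDF-1.4"));
const badExample = errorResponse(400, "Invalid JSON body");
console.log(okExample.statusCode, badExample.body);
```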

Input Form (GET)

A GET request to the Lambda URL returns an HTML page (not JSON) with:

Current status flags (LibreOffice extracted, template present)
A textarea form pre-populated with sample JSON
A POST target that submits as application/x-www-form-urlencoded using dataJson field

Open directly in a browser:

open "$LAMBDA_URL"  # or visit in browser

Fetch raw HTML:

curl -s "$LAMBDA_URL" | head -n 20

Submitting the form opens the rendered PDF in a new tab (inline).
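On the receiving side, reading the dataJson field from the form post could look like the sketch below. The function name dataJsonFromForm is hypothetical, and handling of a base64-encoded event body is an assumption about how the handler treats the isBase64Encoded flag.

```typescript
// Extracts the raw JSON string from an application/x-www-form-urlencoded body.
// Returns undefined when the dataJson field is absent.
function dataJsonFromForm(body: string, isBase64Encoded = false): string | undefined {
  const raw = isBase64Encoded ? Buffer.from(body, "base64").toString("utf8") : body;
  const value = new URLSearchParams(raw).get("dataJson");
  return value === null ? undefined : value;
}

// Build a form body the same way a browser would encode it.
const formPayload = JSON.stringify({ data: { fromForm: true, value: 42 } });
const formBody = new URLSearchParams({ dataJson: formPayload }).toString();
const decodedPlain = dataJsonFromForm(formBody);
const decodedBase64 = dataJsonFromForm(Buffer.from(formBody).toString("base64"), true);
console.log(decodedPlain, decodedBase64);
```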

Default Data Fallback

POST requests with ANY of the following are treated as a request to render the template with default mock data:

Empty body (zero-length)
Whitespace-only body
Body that parses to JSON without a data property (e.g. {})

In these cases a default structure like:

{
  "example": "default-render",
  "generatedAt": "2025-10-08T08:00:00.000Z"
}

is passed to Carbone. Logs include defaultUsed: true for observability.
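A rough sketch of this fallback logic, under a few assumptions: parseRenderBody and its result type are hypothetical names, a data value that is not a plain object is treated like a missing one, and invalid JSON maps to a 400 as in the local testing shortcuts.

```typescript
type ParseResult =
  | { ok: true; data: Record<string, unknown>; defaultUsed: boolean }
  | { ok: false; statusCode: 400; message: string };

// Default mock data, shaped like the example above.
function defaultData(now: Date = new Date()): Record<string, unknown> {
  return { example: "default-render", generatedAt: now.toISOString() };
}

function parseRenderBody(body: string | null | undefined): ParseResult {
  // Empty or whitespace-only body: render with default data.
  if (body == null || body.trim() === "") {
    return { ok: true, data: defaultData(), defaultUsed: true };
  }
  let parsed: unknown;
  try {
    parsed = JSON.parse(body);
  } catch {
    return { ok: false, statusCode: 400, message: "Invalid JSON body" };
  }
  if (typeof parsed === "object" && parsed !== null && "data" in parsed) {
    const data = (parsed as { data: unknown }).data;
    // Assumption: only a plain object counts as custom data.
    if (typeof data === "object" && data !== null && !Array.isArray(data)) {
      return { ok: true, data: data as Record<string, unknown>, defaultUsed: false };
    }
  }
  // Valid JSON without a usable data property: default data again.
  return { ok: true, data: defaultData(), defaultUsed: true };
}

const emptyResult = parseRenderBody("   ");
const customResult = parseRenderBody('{"data":{"patientName":"Jane Doe","score":98}}');
const invalidResult = parseRenderBody("{invalid");
console.log(JSON.stringify({ emptyResult, customResult, invalidResult }));
```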

To force custom data, send a JSON body containing a data object:

curl -s -X POST "$LAMBDA_URL" \
  -H 'content-type: application/json' \
  -d '{"data":{"patientName":"Jane Doe","score":98}}' \
  -o output.pdf

Local Testing Shortcuts

HTML input form page (writes local-health.html):

npm run build
node scripts/local-invoke.mjs --get
open local-health.html  # macOS

Form POST simulation (x-www-form-urlencoded):

node scripts/local-invoke.mjs --form '{"data":{"fromForm":true,"value":42}}'
open local-output.pdf

Empty POST (default data):

node scripts/local-invoke.mjs

Explicit empty JSON (still default data):

node scripts/local-invoke.mjs '{}'

GET input form (HTML):

node scripts/local-invoke.mjs --get

Invalid JSON (expect 400):

node scripts/local-invoke.mjs '{invalid'

Environment / Performance

Memory: 2048 MB (per Carbone guidance for parallelism and speed)
Timeout: 30s (increase for large templates or complex formatting)
Ephemeral storage: default (increase if larger intermediate files appear)

Notes / Trade-offs

Node modules installed production-only during build (no dev dependencies) for smaller artifact.
Carbone version pinned via semver range ^3.5.6 (latest available as of scaffold).
No authentication on the Lambda URL (public). Add IAM authentication (the AWS_IAM auth type) or custom auth before production use.
Originally a single fixed template; template selection via the template request parameter is now supported (see Multi‑Template & Marker Discovery).

Extending

Add API Gateway HTTP API if needing custom domains / auth.
Add CloudWatch log metrics (parse JSON logs for latency & failures).
Add unit tests (e.g., using Vitest or Jest) for request parsing and error paths. A Jest setup with marker extraction tests has already been added (see Tests (Marker Extraction)).
Implement template caching / compiled template strategy if Carbone supports it to reduce repeated parsing overhead.

Clean Up

cd infra
terraform destroy -auto-approve

Security Considerations

Ensure input data is validated if later exposing publicly.
Sanitize or restrict dynamic content to avoid injection in documents.

License

POC - internal use. Review Carbone and LibreOffice licensing for distribution compliance.

Tests (Marker Extraction)

Jest-based tests cover template marker extraction logic.

Run:

npm test

What they check:

getTemplatesDir returns an absolute path.
listTemplates returns structured objects with sorted, unique markers.
(Conditional) At least one template produces a non-empty marker list.
Cache stability via ensureTemplateInfo (same size -> same markers result).
Missing template path throws.

If no .docx exists in the runtime templates directory, the “non‑empty marker” and cache tests are skipped with a logged warning, so CI can still pass without committing binary templates.

Using AWS SSO (aws_profile)

If you use AWS SSO (IAM Identity Center) with a profile (e.g. nhs-notify-poc):

Log in via AWS SSO first:
```
aws sso login --profile nhs-notify-poc
```
Either export the profile environment variable (works without changing Terraform vars):
```
export AWS_PROFILE=nhs-notify-poc
```
Then run:
```
cd infra
terraform apply -auto-approve
```

Or explicitly set the Terraform variable we added:

cd infra
terraform apply -var="aws_profile=nhs-notify-poc" -auto-approve

Troubleshooting "No valid credential sources found":

Ensure you ran aws sso login recently (tokens expire, usually after 8 to 12 hours).
Confirm AWS CLI v2 is installed: aws --version.
Check profile config in ~/.aws/config has sso_session, sso_account_id, sso_role_name, and region.
You can set AWS_SDK_LOAD_CONFIG=1 to force full shared config loading:
```
export AWS_SDK_LOAD_CONFIG=1
```

Run a quick permission test:

aws sts get-caller-identity --profile nhs-notify-poc

If the above works, Terraform should also succeed with the same profile.
