Skip to content

feat: use jwks to fetch public keys #113

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 38 commits into
base: develop
Choose a base branch
from
Draft

feat: use jwks to fetch public keys #113

wants to merge 38 commits into from

Conversation

littlemight
Copy link
Collaborator

@littlemight littlemight commented Feb 14, 2025

Problem

Tech debt, public key is hardcoded in the SDK. If our signing keys are compromised, rotation would be such a big hassle. Modifying the SDK to fetch the keys from an publicly accessible JWKS endpoint instead.

Closes FRM-876
Hopefully closes #53, #74, #75, #84.

Reference

RFC for JWKS standard
RFC for representing Ed25519 keys
These are not hard guides to follow, as we can freely define our own JWKS schema that's convenient for us. It's just a general guard rails and starting point.

Solution

All existing public keys are hosted on a public jwks endpoint. During SDK init it will fetch and cache, and re-fetched whenever cache goes stale. This should also help reducing manual intervention during key rotation.

Features:

Exit criteria should be

  • [~] Sane migration guide
  • Keys from JWKS will take precedence over hardcoded configs
  • On failure, hardcoded keys from file/env will be used as fallback
  • Each of the modules should have their public key refetched instead of using the previously hardcoded value during module init
  • During rotation, and array of keys instead will be used. SDK will then try till it finds a key that works.
  • Alternatively, we identify via key id (kid)

Improvements:

  • Configurable timeout when fetching JWKS
  • Configurable cache duration
  • A sane upper bound for cache duration to prevent stale keys living too long?
    I'm not considering node-cache for caching, as it is unmaintained.

Bug Fixes:

  • Details ...

Before & After Screenshots

BEFORE:
[insert screenshot here]

AFTER:
[insert screenshot here]

Tests

What tests should be run to confirm functionality?
Unit tests... which I haven't written 🤡
Unit and integration tests.

Deploy Notes

Notes regarding deployment of the contained body of work. These should note any
new dependencies, new scripts, etc.

New environment variables:

  • env var : env var details

New scripts:

  • script : script details

New dependencies:

  • axios-retry: lightweight lib to configure axios client for plentiful retry options like exp backoff

New dev dependencies:

  • nock: mocking JWKS server in integration tests

@littlemight littlemight changed the title feat: use jwks as public key feat: use jwks to fetch public keys Feb 14, 2025
Copy link

linear bot commented Feb 14, 2025

@littlemight littlemight self-assigned this Feb 14, 2025
@littlemight
Copy link
Collaborator Author

littlemight commented Feb 15, 2025

braindump;
There's two ways we could populate value of the keys

  1. at bootstrap, fetch from jwks then just pass it as static value
  2. pass the public key getters to the modules (Crypto, Webhook, Verification)

1 favors simplicity, but in the case of key rotation, a redeployment need to happen.
2 is robust, as in keys are fetched lazily and stale keys eventually get overriden in the case of key rotation.

however, due to the nature of fetching the keys (async requests), a bunch of fns that weren't async are now async. changing the function signatures a bit and thus, a bit more work for people bumping the sdk to adjust to this version.

I think 2 should be the play, given the robustness and zero-downtime nature during rotation.

package.json Outdated
@@ -1,6 +1,6 @@
{
"name": "@opengovsg/formsg-sdk",
"version": "0.12.0",
"version": "0.14.0-alpha.0",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/nit because we're changing the function definition (from sync to async), I think this warrants a bump on the major version to 1.0.0 (can declare that as 1.0.0-alpha.0 or 1.0.0-rc.0).

We'll need a changelog with migration notes as well, this can be done later! 🙏

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

definitely!
migration notes as in smth like MIGRATION.md, form guide, or both?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MIGRATION.md like storybook or better

src/crypto.ts Outdated
@@ -87,7 +91,15 @@ export default class Crypto extends CryptoBase {
}

if (verifiedContent) {
if (!this.signingPublicKey) {
if (!this.getSigningPublicKey) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be done on the constructor instead? The crypto class should not be instantiated if there's no function provided. This allows us to throw the error earlier rather than later.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was doing exactly this in the earlier commits! but then it broke this test, where it allows class instantiation without a public key but "disallows" certain function being called (....by throwing an error).

I assume it's a tradeoff made at the time? So to keep behaviour backward compatible I decided not too mess around with that area of the code.

But I feel this is somewhat a confusing pattern, as every potential downstream method of the class have to be wary of that property.

Anyways since we're jumping to a major version already (from 0.13.0 -> 1.0), we can consider changing this particular behaviour and adjusting that test :p

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I feel that it doesn't really make sense to allow incomplete instantiation resulting in downstream function having to cleanup.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

paiseh just got time to dedicate wrapping up work on this PR.

@KenLSM just had a revelation why it's coded this way - SDK is used in 3 places:

  1. Form frontend: admin page, for decrypting responses. Doesn't need public signing key.
  2. Form backend: for crafting signatures andallthat. Doesn't need public signing key.
  3. Webhook consumer: for verifying said signatures. Need public signing key.

So the same class is conditionally instantiated for 2 different usecases, which in nature requires different prerequisite.

Let me think the best way to approach this, if the pursuit of cleanliness ends up introducing a fat refactor on the main form codebase then I would defer the change and take baby steps instead.

@@ -47,7 +51,11 @@ export default class Verification {
)
}

if (!this.verificationPublicKey) {
if (!this.getVerificationPublicKey) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same suggestion as Crypto class, can this be checked in the constructor?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +50 to +65
(req, res, next) => {
try {
const signature = req.get('X-FormSG-Signature')
if (!signature) {
console.error('No signature found in headers')
return res.status(401).send({ message: 'Signature missing' })
}

formsg.webhooks.authenticate(signature, webhookUrl)
console.log('Webhook authenticated successfully')
return next()
} catch (e) {
console.error('Authentication failed:', e)
return res.status(401).send({ message: 'Unauthorized' })
}
},

Check failure

Code scanning / CodeQL

Missing rate limiting High

This route handler performs
authorization
, but is not rate-limited.

Copilot Autofix

AI 4 months ago

To fix the problem, we need to introduce rate limiting to the Express application. The best way to do this is by using the express-rate-limit package, which allows us to easily set up rate limiting middleware. We will configure the rate limiter to allow a maximum of 100 requests per 15 minutes and apply it to all routes in the application.

  1. Install the express-rate-limit package.
  2. Import the express-rate-limit package in the examples/webhook-server.ts file.
  3. Set up the rate limiter with the desired configuration.
  4. Apply the rate limiter middleware to the Express application.
Suggested changeset 2
examples/webhook-server.ts

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/examples/webhook-server.ts b/examples/webhook-server.ts
--- a/examples/webhook-server.ts
+++ b/examples/webhook-server.ts
@@ -3,2 +3,3 @@
 import formSgSDK from '../src'
+import rateLimit from 'express-rate-limit'
 
@@ -9,2 +10,11 @@
 
+// Set up rate limiter: maximum of 100 requests per 15 minutes
+const limiter = rateLimit({
+  windowMs: 15 * 60 * 1000, // 15 minutes
+  max: 100, // max 100 requests per windowMs
+})
+
+// Apply rate limiter to all requests
+app.use(limiter)
+
 // Get the form secret key from environment variables
EOF
@@ -3,2 +3,3 @@
import formSgSDK from '../src'
import rateLimit from 'express-rate-limit'

@@ -9,2 +10,11 @@

// Set up rate limiter: maximum of 100 requests per 15 minutes
const limiter = rateLimit({
windowMs: 15 * 60 * 1000, // 15 minutes
max: 100, // max 100 requests per windowMs
})

// Apply rate limiter to all requests
app.use(limiter)

// Get the form secret key from environment variables
examples/package.json
Outside changed files

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/examples/package.json b/examples/package.json
--- a/examples/package.json
+++ b/examples/package.json
@@ -12,3 +12,4 @@
     "express": "^4.18.2",
-    "dotenv": "^16.3.1"
+    "dotenv": "^16.3.1",
+    "express-rate-limit": "^7.5.0"
   },
EOF
@@ -12,3 +12,4 @@
"express": "^4.18.2",
"dotenv": "^16.3.1"
"dotenv": "^16.3.1",
"express-rate-limit": "^7.5.0"
},
This fix introduces these dependencies
Package Version Security advisories
express-rate-limit (npm) 7.5.0 None
Copilot is powered by AI and may make mistakes. Always verify output.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sdk should add fallback public keys
2 participants