@dominiquekleeven
Collaborator

@dominiquekleeven dominiquekleeven commented May 14, 2025

Closes #3

Adds a custom middleware that protects all non-public endpoints via Keycloak.
keycloak/middleware.py

It uses the JWKS endpoint exposed by Keycloak for verifying the token.
More details: https://www.keycloak.org/securing-apps/oidc-layers — It is generally recommended to use the JWKS endpoint if the token contains all necessary details (e.g., claims and user information). Additionally, introspection requires more manual setup, as the service then requires its own client_id and client_secret.

Authentication Flow:

  • Request must contain a valid bearer token in the Authorization header
  • Token is validated against the Keycloak JWKS endpoint specified by ML_OR_KEYCLOAK_URL
  • Validation process includes:
    • Verifying token format before processing.
    • Extracting and validating the Key ID (kid) from the token header.
    • Checking whether the token issuer is known by the middleware.
    • Constructing the JWKS URL based on the issuer.
    • Verifying the token signature using the public key retrieved from the JWKS endpoint.
    • Validating the expiration of the token.
    • Validating whether the audience claims match.
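The steps that run before signature verification can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the code in this PR: `peek_unverified`, `jwks_url_for` and the example issuer are hypothetical names. The values it reads are untrusted until the signature, expiry and audience have been verified, which the middleware leaves to PyJWT.

```python
import base64
import json
from typing import Set, Tuple


def _b64url_json(segment: str) -> dict:
    """Decode one base64url-encoded JWT segment into a dict."""
    padded = segment + "=" * (-len(segment) % 4)  # restore stripped padding
    data = json.loads(base64.urlsafe_b64decode(padded))
    if not isinstance(data, dict):
        raise ValueError("JWT segment is not a JSON object")
    return data


def peek_unverified(token: str) -> Tuple[str, str]:
    """Return (kid, issuer) from a token WITHOUT verifying anything."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("token is not a three-part JWT")
    kid = _b64url_json(parts[0]).get("kid")
    issuer = _b64url_json(parts[1]).get("iss")
    if not kid or not issuer:
        raise ValueError("token is missing kid or iss")
    return kid, issuer


def jwks_url_for(issuer: str, valid_issuers: Set[str]) -> str:
    """Build the Keycloak JWKS URL, but only for an allow-listed issuer."""
    if issuer not in valid_issuers:
        raise PermissionError("issuer is not known to the middleware")
    return f"{issuer.rstrip('/')}/protocol/openid-connect/certs"
```

Checking the issuer against an allow-list before building the URL means the service never fetches keys from a location chosen by the token itself.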

Authorization:

  • When the request is authenticated, the middleware injects the authenticated user context into the request state so that it can be accessed by the route handlers.
  • Route handlers can use decorators such as @allowed_roles(...) to enforce allowed roles, and @realm_accessible to check for realm access.
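A role-checking decorator of this kind could look roughly like the sketch below. It is illustrative only: the `request.state.user.roles` layout, the `Forbidden` exception (a stand-in for an HTTP 403 response), the role name and the "at least one role matches" rule are all assumptions, not the implementation in this PR.

```python
from functools import wraps
from types import SimpleNamespace


class Forbidden(Exception):
    """Stand-in for an HTTP 403 response (hypothetical)."""


def allowed_roles(*roles: str):
    """Reject the call unless the user holds at least one of `roles`."""
    required = set(roles)

    def decorator(handler):
        @wraps(handler)
        def wrapper(request, *args, **kwargs):
            # The middleware is assumed to have stored the user on request.state.
            user = getattr(request.state, "user", None)
            if user is None or not required & set(user.roles):
                raise Forbidden(f"requires one of: {sorted(required)}")
            return handler(request, *args, **kwargs)

        return wrapper

    return decorator


@allowed_roles("write:admin")
def delete_config(request, config_id: str) -> str:
    return f"deleted {config_id}"


# Usage with a fake request object carrying the injected user context.
admin_request = SimpleNamespace(
    state=SimpleNamespace(user=SimpleNamespace(roles=["write:admin"]))
)
print(delete_config(admin_request, "c1"))
```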

Security Features:

  • JWKS responses are cached for 30 seconds to reduce load on the Keycloak server
  • Pattern validation/URL validation to prevent security issues such as path traversal
  • Proper error handling with appropriate HTTP status codes
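The 30-second JWKS caching can be pictured as a small time-to-live cache. The sketch below is an assumption-laden illustration rather than the PR's code: `JwksCache` is a hypothetical name, and the fetch function and clock are injected so the example runs without a Keycloak server.

```python
import time
from typing import Callable, Dict, Tuple


class JwksCache:
    """Cache JWKS documents per URL for a fixed time-to-live."""

    def __init__(
        self,
        fetch: Callable[[str], dict],
        ttl_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, jwks_url: str) -> dict:
        now = self._clock()
        cached = self._entries.get(jwks_url)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # still fresh, no request to Keycloak
        jwks = self._fetch(jwks_url)  # expired or missing: fetch again
        self._entries[jwks_url] = (now, jwks)
        return jwks
```

A short TTL keeps the load on Keycloak low while still picking up rotated signing keys within half a minute.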

Configuration:

  • Routes can be excluded from authentication via configuration
  • SSL verification can be configured via an environment variable
  • Valid issuers can be configured as a static list or passed via a function that retrieves them dynamically
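As a rough sketch of how such configuration might be consumed: the helper names below are hypothetical, and treating excluded routes as path prefixes is an assumption rather than confirmed behaviour of the middleware.

```python
from typing import Callable, Iterable, Sequence, Set, Union

# Valid issuers are either a static list or a function returning them.
IssuerSource = Union[Sequence[str], Callable[[], Iterable[str]]]


def resolve_issuers(source: IssuerSource) -> Set[str]:
    """Return the current set of valid issuers from either kind of source."""
    values = source() if callable(source) else source
    return {issuer.rstrip("/") for issuer in values}


def is_excluded(path: str, excluded_prefixes: Sequence[str]) -> bool:
    """True if `path` equals an excluded route or lies beneath it."""
    return any(
        path == prefix or path.startswith(prefix.rstrip("/") + "/")
        for prefix in excluded_prefixes
    )
```

Passing a function lets the issuer list follow changes at runtime, while a static list keeps simple deployments free of extra calls.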

Front-end Integration via keycloak-js:

Changes made were fairly simple:

  • Prefers the SSO mechanism, redirecting to login if not authenticated
  • Context-aware token handling:
    • In embedded context (iframe): lets the parent window handle token refresh
    • In standalone context: handles token refresh internally
  • Full tenancy support through the realm query parameter (defaults to the master realm)
  • Additional change due to Keycloak integration: when no realm is specified in the route path, it navigates to the model configs for the authenticated realm

Other changes

  • Requests necessary for UX purposes directly call the OpenRemote manager API via the manager.api.rest functionality provided by or-core
  • OpenRemoteClient now takes a realm property to use as its default, rather than always defaulting to master.
  • Changes to avoid hitting the datapoint request limit: requests are split up into multiple chunks if possible, and the training data set is now specified via an ISO8601 string, resulting in a moving training period
  • Removed async from endpoints, as recommended by FastAPI when the underlying calls are all synchronous. This lets FastAPI run the requests in parallel in a thread pool rather than blocking the async event loop.
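The chunking and the moving training period can be sketched as follows. This is illustrative only: `moving_window` and `split_window` are hypothetical names, the chunk size is arbitrary, and parsing the ISO 8601 string into a `timedelta` is left out.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterator, Optional, Tuple


def moving_window(
    period: timedelta, now: Optional[datetime] = None
) -> Tuple[datetime, datetime]:
    """Return (start, end) for a training period that always ends at `now`."""
    end = now or datetime.now(timezone.utc)
    return end - period, end


def split_window(
    start: datetime, end: datetime, max_span: timedelta
) -> Iterator[Tuple[datetime, datetime]]:
    """Yield contiguous (from, to) chunks no longer than `max_span`."""
    if max_span <= timedelta(0):
        raise ValueError("max_span must be positive")
    cursor = start
    while cursor < end:
        chunk_end = min(cursor + max_span, end)
        yield cursor, chunk_end
        cursor = chunk_end


# Usage: a 90-day moving period fetched in chunks of at most 30 days.
start, end = moving_window(timedelta(days=90))
for chunk_from, chunk_to in split_window(start, end, timedelta(days=30)):
    pass  # one datapoint request per chunk would go here
```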

For testing, I am currently disabling the Keycloak middleware conditionally, as there is no easy way of mocking middlewares in FastAPI and I didn't want to make this too complex.

For testing the middleware, I focused on pattern validation and things that were simple to mock without overly complicating it.


Visualization of the Keycloak middleware token validation:
[Image: Untitled-2025-01-28-1455(1)]

The middleware does not inherently perform any authorization checks. The route handler is responsible for performing them (e.g. is the realm accessible by the user, does the user have the appropriate resource roles). These checks can be triggered easily with the decorators provided by the Keycloak middleware (@allowed_roles(...), @realm_accessible).

Member

@richturner richturner left a comment

Nice work! Looks good. Maybe bring it more in line with how JAX-RS works in Java, where any endpoint that doesn't have a RolesAllowed annotation is implicitly unsecured, so that might remove the need for the exclusion list?

I'd be cautious of writing your own OIDC/OAuth2 middleware, as it is such a sensitive piece of code and it is easy to overlook subtle issues. Battle-tested, well-maintained implementations would be preferable, with a possible wrapper to support multi-tenancy. Would something like pyoidc be an option?

@dominiquekleeven
Collaborator Author

dominiquekleeven commented May 14, 2025

Nice work! Looks good. Maybe bring it more in line with how JAX-RS works in Java, where any endpoint that doesn't have a RolesAllowed annotation is implicitly unsecured, so that might remove the need for the exclusion list?

I'd be cautious of writing your own OIDC/OAuth2 middleware, as it is such a sensitive piece of code and it is easy to overlook subtle issues. Battle-tested, well-maintained implementations would be preferable, with a possible wrapper to support multi-tenancy. Would something like pyoidc be an option?

I'll check whether it's straightforward to do something similar to JAX-RS. Currently this implementation is based on fastapi-keycloak-middleware, which also uses a list of excluded paths. It might make sense to do this as part of #32, in which I wanted to add more granular control over permissions/roles, but that requires a bit more research/discussion on how we want to handle service permissions/roles.


I took a look at pyoidc before for token verification, but it seems overly complicated for what we want to achieve here. It's a full OpenID Connect implementation, whereas we only need to make sure we can decode the token using the public key, which also validates details such as the issuer and audience.

The actual validation is done in this section, where the JWT decoding is done via PyJWT:
https://github.com/openremote/service-ml-forecast/blob/6c122de4c8db4b6670c5ba40420751de1166eb65/src/service_ml_forecast/middlewares/keycloak_middleware.py#L215C1-L221C10

If the token cannot be decoded/verified using the public key, it will throw an exception. It doesn't deal with sessions, token creation or such; it just validates using the public key from the configured Keycloak instance.

Most of the code here just handles the expected exceptions thrown by PyJWT, plus some basic pattern matching to extract the token and some URL validation.

@dominiquekleeven
Collaborator Author

dominiquekleeven commented May 14, 2025

I cleaned up the middleware code a bit and removed some excessive pre-checking; these cases would've been caught by the decode call anyway.

It now also constructs a list of valid issuers based on the enabled realms retrieved from the OpenRemote Manager API. This removed the need for a lot of the URL validation/pattern checking that I did to ensure the issuer was a valid and safe URL. Constructing a list of 'allowed' issuers is also generally considered a best practice.
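A minimal sketch of that idea, assuming Keycloak's usual `<base>/realms/<realm>` issuer format. `build_valid_issuers` and the example URLs are hypothetical, and the realm names would come from the Manager API in practice.

```python
from typing import Iterable, Set


def build_valid_issuers(keycloak_url: str, realm_names: Iterable[str]) -> Set[str]:
    """Derive the allowed issuer URLs from the names of the enabled realms."""
    base = keycloak_url.rstrip("/")
    return {f"{base}/realms/{name}" for name in realm_names}


valid = build_valid_issuers("https://example.org/auth/", ["master", "smartcity"])

# An exact membership test replaces pattern-based URL validation.
print("https://example.org/auth/realms/master" in valid)
print("https://example.org/auth/realms/../admin" in valid)
```

Because the token's issuer has to match one of these strings exactly, a crafted issuer is rejected before any URL is built from it.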

@dominiquekleeven
Collaborator Author

I also updated test3 so that the ML Forecast Service it is currently running uses this branch.

https://test3.openremote.app/manager/#/services (Embedded UI)

richturner previously approved these changes Jun 5, 2025
@dominiquekleeven
Collaborator Author

dominiquekleeven commented Jul 28, 2025

@richturner

  • Is it only if the backend service user is a super admin that the service supports multitenancy?

I haven't tested exactly what happens if the configured service user isn't a super admin. I'll try this out and report back. It might need some changes to accommodate that.

It should technically still function, but I am unsure whether any of the endpoints I used require the user to be a super admin.

  • For the frontend it's not exactly clear to me how the keycloak instance gets the token from the outer app and how it gets the refreshed tokens, is it getting them from cookies?

I am not entirely sure, but from what I can tell it is cookie based, so it's safe to assume they have to share a domain for the SSO mechanism to work properly. keycloak-js does a lot of things for you in this scenario.

…an in the async event loop

It is because the underlying calls are synchronous by nature. In FastAPI, endpoints marked with async are always run in the event loop, even when they block. When an endpoint is not async, FastAPI runs its requests in threads instead.
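The effect can be reproduced with plain asyncio. This is only an analogy: `asyncio.to_thread` stands in for the thread pool FastAPI uses for non-async endpoints, and the handler names are hypothetical.

```python
import asyncio
import time


def blocking_call(seconds: float) -> None:
    """Stand-in for a synchronous call, e.g. a blocking HTTP request."""
    time.sleep(seconds)


async def handler_in_event_loop(seconds: float) -> None:
    # Like an `async def` endpoint making a blocking call:
    # nothing else can run on the event loop until it returns.
    blocking_call(seconds)


async def handler_in_thread(seconds: float) -> None:
    # Like a plain `def` endpoint: the blocking call runs in a worker thread.
    await asyncio.to_thread(blocking_call, seconds)


async def timed(handler, requests: int = 3, seconds: float = 0.2) -> float:
    """Run `requests` concurrent handler calls and return the elapsed time."""
    start = time.perf_counter()
    await asyncio.gather(*(handler(seconds) for _ in range(requests)))
    return time.perf_counter() - start


if __name__ == "__main__":
    print("event loop:", asyncio.run(timed(handler_in_event_loop)))
    print("threads:   ", asyncio.run(timed(handler_in_thread)))
```

With three simulated requests, the event-loop variant takes roughly the sum of the individual calls, while the threaded variant takes roughly as long as a single call.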
@dominiquekleeven
Collaborator Author

I made some changes to the OpenRemoteClient last week to allow specifying the default realm (it used to default to 'master'). You can now use a service user that isn't a super admin, making it more flexible and also allowing single-tenant use.

@dominiquekleeven
Collaborator Author

@richturner I've addressed your review comment + we've discussed the roles/permission topic on Monday.

I've re-requested a review, and hope that it's ready to be merged to main :)

@richturner richturner merged commit dd9f984 into main Sep 5, 2025
2 checks passed
@richturner richturner deleted the feature/keycloak-middleware-clean branch September 5, 2025 08:18