Add redirections and URL rewrite features #1161

Open
@Wonshtrum

Description

Currently, Sozu cannot rewrite URLs, and its only redirection capabilities are generating a 401 Unauthorized on routes detached from a cluster and a 301 Moved Permanently to redirect all the HTTP traffic of a cluster to HTTPS.
I think redirection and URL rewriting should be orthogonal features that can be composed. Rewriting a URL should be expressed in the same way whether the result is passed to the origin or used as a redirection target. Likewise, a redirection should be expressed in the same way whether the URL is rewritten or not.

URL Rewriting

Sozu already has "variable" domain and path matching. Domain can be:

  • an exact hostname (e.g. foo.com)
  • a wildcard domain (e.g. *.foo.com)
  • a collection of regexes where each regex has to span an entire subdomain (e.g. /cdn[0-9]/.foo./.*/.com)

And a path can be:

  • an exact path (e.g. /api)
  • a prefix (e.g. /client/)
  • a regex (e.g. /client/id_[0-9]*/.*)

How to reuse this system for rewriting

When a URL matches a frontend, it would collect capture groups for the domain and the path separately. Regardless of the type of domain and path, the first group in each capture list is the complete domain and path respectively (akin to regex implicit group 0). The following groups would depend on the specific type:

  • wildcard domains create a second group for the subdomain matched by the wildcard
  • regex domains append their capture groups in order (the domain is treated as a single regex, not a collection)
  • prefix paths create a second group for the matched suffix
  • regex paths append their capture groups
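The two non-regex cases can be sketched with the standard library alone. This is only an illustration of the group layout described above: the function names `wildcard_groups` and `prefix_groups` are hypothetical, and the assumption that a wildcard matches exactly one subdomain label is mine, not a statement about how Sozu's matcher works.

```rust
/// Hypothetical helper: groups for a wildcard domain such as "*.foo.com".
/// Group 0 is the complete hostname, group 1 is the part matched by `*`.
/// Assumption: the wildcard covers exactly one non-empty subdomain label.
fn wildcard_groups<'a>(pattern: &str, host: &'a str) -> Option<Vec<&'a str>> {
    // "*.foo.com" -> ".foo.com"
    let suffix = pattern.strip_prefix('*')?;
    let sub = host.strip_suffix(suffix)?;
    if sub.is_empty() || sub.contains('.') {
        return None;
    }
    Some(vec![host, sub])
}

/// Hypothetical helper: groups for a prefix path such as "/client/".
/// Group 0 is the complete path, group 1 is the suffix after the prefix.
fn prefix_groups<'a>(prefix: &str, path: &'a str) -> Option<Vec<&'a str>> {
    let suffix = path.strip_prefix(prefix)?;
    Some(vec![path, suffix])
}
```

With these helpers, `wildcard_groups("*.foo.com", "api.foo.com")` yields the groups `["api.foo.com", "api"]`, and `prefix_groups("/client/", "/client/42/x")` yields `["/client/42/x", "42/x"]`.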

From those groups, a new URL could be written using a simple template system with two variable arrays, HOST and PATH:

https://client_$PATH[1].bar.$HOST[2].com/$PATH[2]?cdn=$HOST[1]
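A minimal sketch of how such a template could be expanded, assuming the capture groups have already been collected into two slices. The function name `expand_template` and the choice to return `None` on a malformed or out-of-range placeholder are assumptions made for illustration; the proposal does not specify error handling.

```rust
/// Hypothetical helper: expands `$HOST[n]` and `$PATH[n]` placeholders.
/// Returns `None` if a placeholder is unterminated, has a non-numeric
/// index, or refers to a group that does not exist.
fn expand_template(template: &str, host: &[&str], path: &[&str]) -> Option<String> {
    let mut out = String::with_capacity(template.len());
    let mut rest = template;
    while let Some(pos) = rest.find('$') {
        // Copy the literal text that precedes the '$'.
        out.push_str(&rest[..pos]);
        let after = &rest[pos + 1..];
        let (groups, after) = if let Some(a) = after.strip_prefix("HOST[") {
            (host, a)
        } else if let Some(a) = after.strip_prefix("PATH[") {
            (path, a)
        } else {
            // Not a known variable: keep the '$' as a literal character.
            out.push('$');
            rest = after;
            continue;
        };
        let end = after.find(']')?;
        let index: usize = after[..end].parse().ok()?;
        out.push_str(groups.get(index)?);
        rest = &after[end + 1..];
    }
    out.push_str(rest);
    Some(out)
}
```

For example, with hand-written groups `host = ["cdn3.foo.baz.com", "cdn3", "baz"]` and `path = ["/client/id_42/profile.jpg", "42", "profile.jpg"]`, the template above expands to `https://client_42.bar.baz.com/profile.jpg?cdn=cdn3`.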

Redirection

Like URL rewriting, I think redirection should be expressed at the frontend level. At the very least, the redirection system should be able to mark a frontend as "Permanently Moved", generating a 301 unconditionally. "Temporarily Moved" (generating a 302) could be added. 401 Unauthorized could also be specified on the frontends of named clusters. Expressing these is trivial; conditional redirection is less so. I don't like the idea of creating a full DSL to express complex routing decisions. But it should at least be possible to express the need to redirect to HTTPS if the request is HTTP (since Sozu already has this feature).
For forwarded rewritten URLs, the X-Forwarded-Port header is already added. The X-Forwarded-Host header could also be added with the original hostname. Should this behavior be optional? Should it override an already existing X-Forwarded-Host?

Composition

In addition to the hostname and path, the method, scheme, and port could theoretically be rewritten too. I don't like the idea of rewriting the method: I don't see a good use for it on forwarded requests, and it can't be expressed in a redirection. Rewriting the scheme is useful for redirecting HTTP requests to HTTPS, but it only makes sense for redirections, since forwarded requests are always in HTTP. Finally, rewriting the port could also be useful for redirections, and can be expressed for forwarded requests as well (even if I don't see a good use case for this).
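For the redirection case, composing the rewritten parts comes down to assembling a Location value from a scheme, a host, an optional port, and a path. The following is a rough sketch under that reading; `build_location` is a hypothetical name, and it deliberately does not try to elide default ports.

```rust
/// Hypothetical helper: assembles a redirect target from its parts.
/// `path` is expected to start with '/' and may carry a query string.
fn build_location(scheme: &str, host: &str, port: Option<u16>, path: &str) -> String {
    match port {
        Some(p) => format!("{scheme}://{host}:{p}{path}"),
        None => format!("{scheme}://{host}{path}"),
    }
}
```

Calling `build_location("https", "client_42.bar.baz.com", Some(8443), "/profile.jpg?cdn=03")` produces `https://client_42.bar.baz.com:8443/profile.jpg?cdn=03`.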

Proposal

I propose to add to frontends the following options:

  • redirect: FORWARD, PERMANENT, TEMPORARY, FORCE_HTTPS (default to FORWARD)
  • redirect_scheme: USE_SAME, USE_HTTP, USE_HTTPS (only valid if redirect is PERMANENT or TEMPORARY, default to USE_SAME)
  • rewrite_host: Option<String>
  • rewrite_path: Option<String>
  • rewrite_port: Option<u16>

The URL rewriting is split into host, path, and port. This enforces the fact that the scheme cannot be rewritten directly. The scheme can be rewritten only for redirections, either conditionally using FORCE_HTTPS or unconditionally using redirect_scheme.
The only conditional redirection is FORCE_HTTPS.
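As a sketch, these options and the constraint on redirect_scheme could be modelled as below. The type and method names are hypothetical and do not come from the Sozu codebase. The choice of 301 for FORCE_HTTPS is an assumption based on the existing HTTP to HTTPS redirection mentioned at the top of this issue.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum Redirect {
    #[default]
    Forward,
    Permanent,
    Temporary,
    ForceHttps,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum RedirectScheme {
    #[default]
    UseSame,
    UseHttp,
    UseHttps,
}

#[derive(Debug, Default)]
struct FrontendOptions {
    redirect: Redirect,
    redirect_scheme: RedirectScheme,
    rewrite_host: Option<String>,
    rewrite_path: Option<String>,
    rewrite_port: Option<u16>,
}

impl FrontendOptions {
    /// redirect_scheme is only valid with PERMANENT or TEMPORARY.
    fn validate(&self) -> Result<(), String> {
        let explicit = matches!(self.redirect, Redirect::Permanent | Redirect::Temporary);
        if self.redirect_scheme != RedirectScheme::UseSame && !explicit {
            return Err("redirect_scheme requires redirect = PERMANENT or TEMPORARY".into());
        }
        Ok(())
    }

    /// Status code of the redirection, or `None` when the request is forwarded.
    /// Assumption: FORCE_HTTPS answers plain HTTP requests with a 301.
    fn redirect_status(&self, request_is_https: bool) -> Option<u16> {
        match self.redirect {
            Redirect::Forward => None,
            Redirect::Permanent => Some(301),
            Redirect::Temporary => Some(302),
            Redirect::ForceHttps if request_is_https => None,
            Redirect::ForceHttps => Some(301),
        }
    }
}
```

Keeping the scheme out of the rewrite fields and behind its own enum means an invalid combination, such as USE_HTTPS on a forwarded frontend, can be rejected when the configuration is loaded rather than at request time.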

Extension

Sozu may soon gain authentication capabilities. I expect this feature to be orthogonal to rewriting and mutually exclusive with redirection. Failing authentication would return a 403 (and failing to provide authentication would return a 401). Rewriting would only occur on successful authentication, after which the request would be forwarded to the origin.

Limitations

Conditional redirection is limited by design, but it may be something we want to develop?
Sozu doesn't distinguish between path and query parameters. If a user wishes to rewrite, add, or remove one, regexes might not be powerful enough to do so (the regex crate we use doesn't implement look-around and backreferences).
Matching, collecting groups, and rewriting the URLs may slow down the frontend lookup, especially if regexes are overused. I don't think this is avoidable, but to mitigate it the matching will most likely only use regex find (which is faster than a full capture), then perform a full capture on the single matching frontend, and only if this frontend requires rewriting. Additionally, the regex crate we use ensures that the worst-case complexity of find/capture is O(M×N) (with M the length of the haystack and N the length of the regex).

Example

[clusters.MyCluster]
protocol = "http"
frontends = [
    {
        address = "0.0.0.0:8080",
        hostname = "/cdn([0-9]+)/.foo./(.*)/.com",
        path = "/client/id_([0-9]*)/(.*)",
        path_type = "REGEX",
        redirect = "PERMANENT",
        redirect_scheme = "USE_HTTPS",
        rewrite_host = "client_$PATH[1].bar.$HOST[2].com",
        rewrite_path = "/$PATH[2]?cdn=$HOST[1]",
        rewrite_port = 8443,
    }
]

$ curl -v http://cdn03.foo.baz.com:8080/client/id_42/profile.jpg
< HTTP/1.1 301 Moved Permanently
< Location: https://client_42.bar.baz.com:8443/profile.jpg?cdn=03
< Connection: close
< Content-Length: 0
< Sozu-Id: 01JETWM78KFAYZS5V9JANA0FNB