Proposal: Sec-Site should capture information about the requester of a resource #700

Closed
arturjanc opened this issue Apr 14, 2018 · 20 comments
Labels
addition/proposal (New features or enhancements), needs concrete proposal (Moving the issue forward requires someone to figure out a detailed plan)

Comments

@arturjanc

This is a follow-up to #687, specifically: #687 (comment)

Background

To protect against cross-origin information leaks, exploitable by embedding a victim application's resources in an attacker-controlled document and inferring information about their contents, it would be useful to have a universal way to provide the application with information about the sender of the request with a request header. This would enable server-side decisions about whether to respond with sensitive resources depending on the context in which they are embedded, and could provide a robust way to protect against attacks such as CSRF, XS-Search, cross-origin timings and Spectre-like bugs.

This is conceptually similar to the Origin header, but would need to be present on all resource requests (possibly only to origins which opt in) -- otherwise an attacker could embed the resource in a way which doesn't cause the header to be sent, without giving the server the opportunity to reject the request. Servers which want to add protection against cross-origin information leaks would inspect the value of the Sec-Site header and could reject requests from origins they do not trust.

Details

From an adopter's point of view, the most useful variant would be identifying the source of the request with its origin, e.g.

Sec-Site: foo.example.org

In #687 (comment) @annevk mentioned some sensible concerns about sending the full origin value that I hope he can elaborate on here :) I expect that we can (and should!) make this feature fully compatible with the Referrer Policy by doing the following:

  • If the request includes a Referer header that contains the origin, send the exact origin value.
  • If the request wouldn't otherwise have any identifying information about its origin, send one of same-origin, same-site or cross-site (spelling TBD).
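The two rules above could be sketched roughly as follows. This is only an illustration, not part of the proposal: the function name `sec_site_value` and its parameters are assumptions, and the caller is assumed to already know both the coarse relationship and whether the Referer would reveal the origin after Referrer Policy is applied.

```python
from typing import Optional

# Candidate coarse values; the thread notes the spelling is still TBD.
COARSE_VALUES = ("same-origin", "same-site", "cross-site")


def sec_site_value(referer_origin: Optional[str], relationship: str) -> str:
    """Pick a header value following the two rules above.

    referer_origin: the origin the Referer header would reveal for this
        request, or None if nothing identifying would be sent
        (hypothetical input).
    relationship: coarse relationship between requester and target.
    """
    if relationship not in COARSE_VALUES:
        raise ValueError(f"unknown relationship: {relationship!r}")
    if referer_origin is not None:
        # Rule 1: the Referer already reveals the origin, so send it exactly.
        return referer_origin
    # Rule 2: nothing identifying would be sent, so fall back to the coarse value.
    return relationship
```

With a Referrer Policy of no-referrer, `referer_origin` would be None and the server would only see the coarse value.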

This way most developers could write code such as:

if request.headers.get("Sec-Site") not in ["foo.example.org", "same-site"]: # reject

In applications which don't have a Referrer Policy of no-referrer this would be even simpler because the developer could just rely on a URL whitelist.

I'd be a little wary of going with just the coarse-grained values because many applications have resources which are legitimately requested cross-origin, which wouldn't allow the developer to protect them. WDYT @annevk, would that be reasonable?

Questions

This is obviously just a sketch of the idea; if we want to pursue it more, we should probably think about the following issues:

  • Should it be a per-origin opt-in (via Origin Policy) or enabled by default?
  • Does it need a new header, or is there a benefit to overloading Origin instead? (FWIW my guess is that would be ugly.)
  • Can we include some more metadata about the request, for example:
    • Whether this request is a navigation, rather than a subresource load.
    • The type of the resource that the request is going to be used as, e.g. "image" vs. "script"
  • Naming: particularly if we add metadata, the header would no longer just identify the requesting site so we might need to rephrase it.

@mikewest certainly has some thoughts about this as well.

@annevk added the addition/proposal and needs implementer interest labels on Apr 15, 2018
@annevk
Member

annevk commented Apr 15, 2018

Making it compatible with Referrer Policy could work, but I think that would disadvantage UAs who want a non-default policy there. E.g., if in the future they decide they should reveal less, it's highly likely they'd break sites that come to rely on this header returning a certain value.

@arturjanc
Author

That's a good point; though I think this would only happen if a UA adopted a default RP of no-referrer, which would likely break other existing things (any other default RP should be okay here).

The way to tackle it probably depends on whether we think it would be the responsibility of the UA or site authors to prevent applications using Sec-Site from breaking:

  • If we think the no-referrer default is the right model in the long term, we could ask developers with resources loaded cross-origin to configure requesting origins with Referrer-Policy: strict-origin or directly add the referrerpolicy attribute to any authenticated resource requests in JS libraries. Then, whenever UAs stop sending the Referrer, it would be a no-op for applications which use Sec-Site and have a custom RP.
  • If the no-referrer default is more of an optional feature that will not be broadly deployed by UAs -- and hence we can't get developers to make their sites compatible with it -- we could prevent breakage by having the UA not send the Sec-Site header in such cases. For example, in a hypothetical private browsing mode which removes the Referrer completely, the UA could perhaps opt out of the security protections of Sec-Site under the (somewhat optimistic) assumption that users are less likely to have long-lived authenticated sessions which benefit from cross-site infoleak protections.

I don't really like the explicit security vs. privacy tradeoff of the second option, but depending on how we see the Referrer-disabling mode being used, maybe it could be worth considering.

@mikewest
Member

I think this is a good direction to explore, allowing a developer to make granular decisions about ACLs for particular resources/requests based on their initiator. I think I agree with the advantages @arturjanc posits, and I also agree with his analysis of the Referrer Policy integration. I'd go further than @arturjanc, actually, and suggest that same-origin/same-site/cross-site is coarse enough that we'd be able to justify sending it even in the presence of no-referrer, which would mitigate the risk of "breaking" sites that relied on a requester revealing something about itself in order to service a given request.

Before diving too deeply into details (and I'd add redirect behavior to the list of @arturjanc's questions), I'd be curious about the rest of @whatwg/security's opinion on the proposal more generally. It seems pretty reasonable to me, and worth the time to sketch out.

@annevk
Member

annevk commented Apr 16, 2018

@mikewest I think it would only mitigate the risk if those were the only values transmitted.

@mikewest
Member

I think it would only mitigate the risk if those were the only values transmitted.

Well, let's start with that as a baseline: could we agree that sending the three-value enum would be fine?

I believe there's some real value in more granularity above and beyond that enum for services that wish to expose data to some subset of cross-origin entities, but not all cross-origin entities. For example, mail.google.com might trust accounts.google.com, but not docs.google.com; google.de might trust accounts.google.com, but not evil.com. Neither same-site nor cross-site would be granular enough to create those ACLs.

Perhaps we could send both? That is, we might send Sec-Site: same-site, https://docs.google.com and Sec-Site: cross-site, https://evil.com? Developers could be encouraged to check the low-granularity bit that they know will always be present, and look to the origin when included to increase the check's robustness?
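As a rough sketch of how a server might consume such a two-part value: the parser below assumes the comma-separated shape from the examples above, and the names `SecSite`, `parse_sec_site`, `is_allowed` and the `TRUSTED_ORIGINS` set are hypothetical. It checks the coarse value first and only consults the origin when one was sent.

```python
from typing import NamedTuple, Optional


class SecSite(NamedTuple):
    relationship: str      # "same-origin", "same-site" or "cross-site"
    origin: Optional[str]  # None when no origin was included


def parse_sec_site(value: str) -> SecSite:
    """Parse 'cross-site, https://evil.com' or a bare 'cross-site'."""
    parts = [p.strip() for p in value.split(",", 1)]
    origin = parts[1] if len(parts) == 2 and parts[1] else None
    return SecSite(parts[0], origin)


# Hypothetical ACL for mail.google.com, following the example above.
TRUSTED_ORIGINS = {"https://accounts.google.com"}


def is_allowed(value: str) -> bool:
    parsed = parse_sec_site(value)
    if parsed.relationship == "same-origin":
        # The coarse value alone is enough to allow the request.
        return True
    # Same-site and cross-site requests need an explicitly trusted origin.
    return parsed.origin in TRUSTED_ORIGINS
```

Under this policy a request carrying `same-site, https://docs.google.com` is rejected while `same-site, https://accounts.google.com` is served, which is the distinction the coarse values alone cannot express.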

(As an aside: is this a practical concern, or a theoretical concern? That is, is Mozilla pondering killing referer (or revisiting @briansmith's https://briansmith.org/referrer-01)? That would be interesting!)

@annevk
Member

annevk commented Apr 16, 2018

Theoretical at this point, but it seems good to allow for it. Exposing both bits (perhaps space-separated to make it different from multiple headers) seems like an interesting approach. That at least doesn't result in unexpected values on the server side. A cross-origin redirect or sandbox would make the second bit null.

@mikewest
Member

Including both values (separated by whatever characters you like!) seems valuable, independent of the referrer policy concerns, as same-origin, etc. is likely to be simpler for developers running in development environments to deal with. Simpler than hard-coding both development server names and prod server names, in any event.

@johnwilander

I believe there's some real value in more granularity above and beyond that enum for services that wish to expose data to some subset of cross-origin entities, but not all cross-origin entities. For example, mail.google.com might trust accounts.google.com, but not docs.google.com; google.de might trust accounts.google.com, but not evil.com. Neither same-site nor cross-site would be granular enough to create those ACLs.

I thought we agreed that same-site would mean same eTLD+1 for these purposes. Not true? I pointed this out regarding SameSite cookies in #687 (comment) but then the thread seemed to go on to say that same-site should be same eTLD+1.

@johnwilander

Should it be a per-origin opt-in (via Origin Policy) or enabled by default?

It looks like Origin Policy is stateful cross-site. True? From the spec:
"The Sec-Origin-Policy HTTP request header field is sent with navigational HTTP requests in order to advertise support generally for the origin policy manifest mechanism defined in this document, and to inform the server which version of its origin policy is cached locally."

If so, it won't fly for us for anti tracking reasons. Maybe partitioning would make sense. Were there any thoughts on that?

@annevk
Member

annevk commented Apr 16, 2018

@johnwilander same-site does mean eTLD+1, but cross-origin != cross-site. What @mikewest was trying to demonstrate is that sometimes having the actual origin is useful to make decisions, even if it's same-site or cross-site. As for Origin Policy, I think folks had thoughts on removing the statefulness somehow, but no progress has been made recently. The draft as it stands today is known not to work for Safari.
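To make the cross-origin vs. cross-site distinction concrete, here is a minimal sketch. It assumes absolute http(s) URLs and uses a toy stand-in for the Public Suffix List (`PUBLIC_SUFFIXES` holds a few illustrative entries only); the helper names are hypothetical, and scheme is ignored for the same-site comparison as a simplifying assumption.

```python
from urllib.parse import urlsplit

# Toy stand-in for the Public Suffix List; a real implementation would
# need the full list. These entries are illustrative only.
PUBLIC_SUFFIXES = {"com", "org", "de", "co.uk"}
DEFAULT_PORTS = {"http": 80, "https": 443}


def registrable_domain(host: str) -> str:
    """Return eTLD+1 for host, using the toy suffix set above."""
    labels = host.lower().split(".")
    # Walk from the longest candidate suffix to the shortest.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in PUBLIC_SUFFIXES:
            # i == 0 means the host is itself a public suffix.
            return ".".join(labels[max(i - 1, 0):])
    return host.lower()  # unknown suffix: fall back to the whole host


def origin_tuple(url: str):
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def classify(requester: str, target: str) -> str:
    """Classify a request as same-origin, same-site or cross-site."""
    if origin_tuple(requester) == origin_tuple(target):
        return "same-origin"
    req_host = urlsplit(requester).hostname
    tgt_host = urlsplit(target).hostname
    if registrable_domain(req_host) == registrable_domain(tgt_host):
        return "same-site"  # cross-origin, but not cross-site
    return "cross-site"
```

Here docs.google.com requesting from mail.google.com is cross-origin yet same-site, so a server that wants to tell the two apart still needs the actual origin.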

@johnwilander

johnwilander commented Apr 16, 2018

@johnwilander same-site does mean eTLD+1, but cross-origin != cross-site. What @mikewest was trying to demonstrate is that sometimes having the actual origin is useful to make decisions, even if it's same-site or cross-site.

Ah, got it. I misread his comment. He does imply that same-site is eTLD+1.

As for Origin Policy, I think folks had thoughts on removing the statefulness somehow, but no progress has been made recently. The draft as it stands today is known not to work for Safari.

OK. Thx.

@arturjanc
Author

It's not hugely important, but if we can make it work without gating on Origin Policy I would also prefer that. I mentioned this possibility because an origin-level opt-in would address past concerns about request size (#687 (comment)), but it would be easier for developers if we could send this information by default.

@dveditz
Member

dveditz commented Apr 16, 2018

Is there any room in this proposal for including the type of request (corresponding to the "AS script" etc. in other specs)? If you've got a document URL with params and it's being requested as an IMG then it's probably an attack of some kind.

@arturjanc
Author

Yes, this would certainly be useful -- @mikewest also mentioned this and I attempted to capture something similar as one of the open questions above. I imagine that developers could refactor application code which sets the Content-Type of responses to inspect the request type and make sure they match (possibly starting by logging unexpected values to detect accidental mismatches).
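A minimal sketch of that refactoring, under several assumptions: the request type arrives as a plain string (how it would be transmitted is not settled in this thread), the `EXPECTED_TYPES` mapping is a hypothetical starting point, and the check starts in log-only mode so accidental mismatches surface before anything is blocked.

```python
import logging
from typing import Optional

logger = logging.getLogger("request_type_check")

# Hypothetical mapping from request type to acceptable Content-Type prefixes.
EXPECTED_TYPES = {
    "image": ("image/",),
    "script": ("application/javascript", "text/javascript"),
    "style": ("text/css",),
    "document": ("text/html",),
}


def type_matches(request_type: Optional[str], content_type: str) -> bool:
    """True if the response Content-Type fits the declared request type."""
    expected = EXPECTED_TYPES.get(request_type)
    if expected is None:
        return True  # unknown or absent request type: nothing to compare
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type.startswith(expected)


def check_response(request_type: Optional[str], content_type: str,
                   enforce: bool = False) -> bool:
    """Return True if the response should be served."""
    if type_matches(request_type, content_type):
        return True
    logger.warning("request type %r got Content-Type %r",
                   request_type, content_type)
    # In log-only mode a mismatch is recorded but still served.
    return not enforce
```

An HTML document requested as an image is logged in either mode and refused only once `enforce=True` is switched on.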

@mikewest
Member

Is there any room in this proposal for including the type of request (corresponding to the "AS script" etc. in other specs)?

I know I've talked to @arturjanc about this, and I do support it. I don't think I've written that down anywhere, though, so, there you are. :) Encoding the initiator and destination of the request in a way the server can access would be really interesting, and I can see real use cases for it from a security perspective.

I think origin manifests are a bit off topic, but:

As for Origin Policy, I think folks had thoughts on removing the statefulness somehow, but no progress has been made recently. The draft as it stands today is known not to work for Safari.

I don't think there's any tweaking around the edges that we can do to make origin manifests not represent state in third-party contexts. Regardless of explicit advertisement of the manifest version in HTTP request headers, the mechanism will certainly support some features that will create web-visible state for a given origin: that's the whole point of the feature. :) As a silly example, consider a manifest that sets a script-src https://1.example.com as a baseline for an origin, and a page that attempts to load https://1.example.com/js and https://2.example.com/js. If a user wishes to separate their first-/third-party state, browsers will need to separate the origin manifests as well.

if we can make it work without gating on Origin Policy I would also prefer that.

I don't see this as at all related to origin manifests, except insofar as origin manifests might be a reasonable configuration mechanism if we decide that this should be opt-in. I'm not sure the size overhead is enough to care about, but it's a debate worth having.

@arturjanc
Author

One valid question that came up recently in a related discussion is why developers cannot use the Referer header as a source of information about the requesting origin, as opposed to the origin being provided in the Sec-Site header.

The main reason for this is that developers cannot count on the Referer always being present in legitimate requests: it is stripped on HTTPS->HTTP transitions, on navigations performed in a new window (about:blank + navigation via JS -- used intentionally by many applications to protect users from URLs leaking to third parties), etc. This means that, in order to not break existing users, applications must be willing to accept requests without a Referer; as a result, an attacker could remove the referrer from their requests and the server would have to process them as usual, removing the protections we're hoping to get from Sec-Site.
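The weakness can be shown with a short sketch of a Referer-based check. The `TRUSTED_HOSTS` set and the function name are hypothetical; the point is only that the branch for a missing header has to allow the request.

```python
from typing import Mapping
from urllib.parse import urlsplit

TRUSTED_HOSTS = {"foo.example.org"}  # hypothetical allowlist


def referer_check(headers: Mapping[str, str]) -> bool:
    """Return True if the request should be processed."""
    referer = headers.get("Referer")
    if referer is None:
        # Legitimate requests may arrive without a Referer, so the
        # server cannot reject here without breaking existing users.
        return True
    return urlsplit(referer).hostname in TRUSTED_HOSTS
```

A request with an untrusted Referer is rejected, but the same attacker passes the check simply by suppressing the header.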

(Some applications could certainly do without the origin at all, and reject all cross-origin requests for authenticated resources, but this runs into the problem of insufficient granularity that Mike mentioned in #700 (comment))

@annevk added the needs concrete proposal label and removed the needs implementer interest label on Apr 19, 2018
@arturjanc
Author

Yesterday at WebAppSec we had a chance to briefly talk about the aspect of this proposal which would expose the actual origin of the sender of a resource request, as opposed to a more coarse-grained value (same-site, etc). I'll attempt to summarize @TanviHacks's concerns here because it's a sticking point that's worth discussing; if we can't agree, I think we may just want to go with the coarse-grained values for now. But I hope we can agree! :)

The main concern is that we'd be building a "new Referrer" which would reveal the originator of the request, but in a new place. Given that the Referer header often carries sensitive information, there is an ecosystem of tools and features that attempt to protect users from Referrer leakage. For example, Firefox has fine-grained Referrer configuration options and there are extensions which strip the header. If we now expose similar information elsewhere, then such tools would either be less effective or they will have to be modified to also handle the new header -- in which case they might strip the origin information from Sec-Site similarly to how they remove the Referer, so sites that rely on this value to allow cross-origin requests from trusted origins would break. @TanviHacks, is this a reasonable summary?

(Narrowly, on the last point about possible site breakage: I think we could handle this case by making sure to remove the entire Sec-Site header instead of just stripping the origin value, in which case the server would respond to the request as usual, as outlined in #700 (comment))
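For a privacy tool, that suggestion might look like the sketch below. The function name is hypothetical and headers are modeled as a plain dict; the tool drops Sec-Site entirely instead of rewriting its value, so the server sees the same thing it would see from a browser that never sends the header.

```python
from typing import Dict

# Headers that reveal the requester; matched case-insensitively.
STRIPPED_HEADERS = {"referer", "sec-site"}


def strip_requester_info(headers: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of headers without Referer and without Sec-Site."""
    return {name: value for name, value in headers.items()
            if name.lower() not in STRIPPED_HEADERS}
```

The original mapping is left untouched, and unrelated headers pass through unchanged.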

When it comes to the general worry about disclosing the requester's origin in Sec-Site, I think this may be qualitatively different from the issues we have with Referer leakage, for a non-obvious reason. Basically, Referer is sent on both navigations and resource requests; I expect the bulk of the privacy concerns are related to leaking its value on cross-origin navigations, rather than when fetching resources, which is the case that Sec-Site focuses on. My guess is that preventing disclosure of the document origin on resource requests, which we care about here, is not the primary motivation of existing Referer protections.

The main reason I think so is that for cross-origin requests in "cors" mode we're already exposing the requesting origin to allow the server to make authorization decisions -- AFAIK the Origin header isn't stripped by most Referer protections and their users would not consider this to be a failing of such tools. Another thing to keep in mind is that in many cases resources are implicitly trusted by applications that load them and usually get much more power than being able to know the requesting origin, e.g. scripts and stylesheets get almost free rein over the DOM. There are certainly cases where developers don't fully trust the resources they embed and do not want to disclose their origin on such requests, but they can do so via Referrer Policy, which Sec-Site would respect.

So, from the application developer's point of view, Sec-Site would not send any more information than what is currently present in resource requests. From the point of view of a privacy-conscious user of Referrer-stripping software, the type of information included in Sec-Site (i.e. the origin) is already present in CORS requests and is fairly coarse-grained. From the point of view of a developer of privacy tools (e.g. the hypothetical Referer-less private browsing mode), any potential site breakage can be avoided by removing the header completely, since servers will have to accept all requests without it for compatibility with older browsers.

If, despite the things mentioned above, exposing the requesting origin in Sec-Site is still a concern, perhaps it may be workable (though harder for developers) to do it on an opt-in basis. For example:

  • We could send it only on requests originating from pages which set a Referrer Policy, treating it as an explicit signal that the developer has thought about what is appropriate to expose in the Referer (and if the RP is no-referrer, omit the origin). Developers with resources requested across origins would first add RP to make sure that their cross-origin requests all come from cooperating sites.
  • We could build a completely separate switch which developers would use to the same effect as above. It could be more configurable, e.g. it could only send the origin if the destination matches a whitelist, but it would likely be substantially more work to implement and adopt.

The reason I'm pushing on this aspect (sorry for being a pest!) is that having reliable origin information in resource requests, particularly by default, would make it much easier to deploy protections from cross-origin leaks in any application that isn't fully self-contained in a single origin, or has endpoints that may legitimately be requested cross-origin. The application could first gather origin information from existing requests, build a whitelist of origins which currently load data from it, and then lock down processing of sensitive responses to requests from the whitelisted origins. If a significant number of requests only carried the cross-origin designation without any identifying information, it would be difficult for developers of this class of applications to enforce meaningful restrictions on cross-origin resource loads, which is the main impetus behind this change, and one of the more promising approaches to tackle Spectre and other information leaks.
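The rollout described here (observe, build the list, then enforce) could be sketched as follows. The function names and the `min_requests` threshold are assumptions, as is the choice to reject requests that carry no origin once enforcement is on; a real deployment would likely have a human review the generated list first.

```python
from collections import Counter
from typing import Iterable, Optional, Set


def build_allowlist(observed_origins: Iterable[Optional[str]],
                    min_requests: int = 1) -> Set[str]:
    """Derive an allowlist from origins seen on existing requests.

    None entries stand for requests that carried no origin information.
    """
    counts = Counter(o for o in observed_origins if o is not None)
    return {origin for origin, n in counts.items() if n >= min_requests}


def allow_request(origin: Optional[str], allowlist: Set[str]) -> bool:
    """Enforcement step: serve sensitive responses only to listed origins."""
    # Assumption: a request without origin information is rejected here.
    return origin is not None and origin in allowlist
```

If many legitimate requests showed up as None during the observation phase, the resulting list would be too sparse to enforce, which is the difficulty the paragraph above points at.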

@bzbarsky

Just one note: subframe loads do not have a "cors" mode right now, and are conceptually a lot more like navigations (I mean, they are navigations) than like subresource requests in a lot of ways. Though they are subresource requests in other ways... In any case, this part would need some thinking about.

@mikewest
Member

After talking with @arturjanc and a few other folks at Google, I've tried to condense this discussion down into a short explainer that punts on the origin question (we'll do some research on the side and come back to it), and adds @dveditz's initiator/destination suggestion, as well as other random bits and pieces that Google's security team could use to address various forms of cross-site leakage. I'd appreciate feedback on https://github.com/mikewest/sec-metadata.

@annevk
Member

annevk commented Jan 30, 2021

Let's fold this into #885.

@annevk closed this as completed on Jan 30, 2021

6 participants