Security bootstrapping during discovery #135

Closed
benfrancis opened this issue Mar 12, 2021 · 19 comments · Fixed by #313

Comments

@benfrancis
Member

Problem

There are a couple of sentences in the WoT Discovery spec which talk about only allowing authenticated entities to access TDs and TDDs.

WoT Discovery has to allow authenticated and authorized entities (and only those entities) to find WoT Thing Descriptions satisfying a set of criteria, such as being near a certain location, or having certain semantics, or containing certain interactions.

The server SHOULD serve the requests after performing necessary authentication and authorization.

Currently security metadata is provided inside a Thing Description (for both things and directories), which causes a paradox:

  1. A consumer needs to parse the security metadata in the Thing Description in order to know how to authenticate with the producer.
  2. A consumer cannot access the security metadata in the Thing Description unless it is authenticated.

This is a problem with Thing Descriptions in general which is highlighted particularly well by the Directory Service API.

Solution

I propose a two step process by which a consumer can authenticate with the producer of a Thing Description (for both things and directories):

  1. The first time a client fetches a Thing Description from the server, the server responds with a partial TD which contains only security metadata (and possibly the thing ID)
  2. The client then performs the authentication steps necessary for (one of) the security scheme(s) described in the partial TD, before re-fetching the TD URL. Providing the authenticated user is authorised to access the TD, the server responds with the full TD.

First fetch:

{
    "@context": "https://www.w3.org/2019/wot/td/v1",
    "id": "urn:dev:ops:32473-WoTLamp-1234",
    "securityDefinitions": {
        "basic_sc": {"scheme": "basic", "in":"header"}
    },
    "security": ["basic_sc"]
}

Second fetch, with basic authentication provided in header:

{
    "@context": "https://www.w3.org/2019/wot/td/v1",
    "id": "urn:dev:ops:32473-WoTLamp-1234",
    "title": "MyLampThing",
    "securityDefinitions": {
        "basic_sc": {"scheme": "basic", "in":"header"}
    },
    "security": ["basic_sc"],
    "properties": {
        "status" : {
            "type": "string",
            "forms": [{"href": "https://mylamp.example.com/status"}]
        }
    },
    "actions": {
        "toggle" : {
            "forms": [{"href": "https://mylamp.example.com/toggle"}]
        }
    },
    "events":{
        "overheating":{
            "data": {"type": "string"},
            "forms": [{
                "href": "https://mylamp.example.com/oh",
                "subprotocol": "longpoll"
            }]
        }
    }
}
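
To make the two views concrete, here is a minimal server-side sketch in Python. It assumes the server already knows whether the request was authenticated and authorised; `td_view` and `PUBLIC_KEYS` are hypothetical names made up for illustration, and the set of keys treated as public is my assumption, not something the TD specification defines.

```python
# Keys assumed to be safe to reveal before authentication (an assumption
# based on the proposal above, not something the TD specification defines).
PUBLIC_KEYS = ("@context", "id", "securityDefinitions", "security")


def td_view(full_td: dict, authenticated: bool) -> dict:
    """Return the full TD to authenticated clients, else a security-only view."""
    if authenticated:
        return full_td
    # Keep only the members a client needs in order to learn how to authenticate.
    return {key: full_td[key] for key in PUBLIC_KEYS if key in full_td}


lamp_td = {
    "@context": "https://www.w3.org/2019/wot/td/v1",
    "id": "urn:dev:ops:32473-WoTLamp-1234",
    "title": "MyLampThing",
    "securityDefinitions": {"basic_sc": {"scheme": "basic", "in": "header"}},
    "security": ["basic_sc"],
    "properties": {"status": {"type": "string"}},
}

first_fetch = td_view(lamp_td, authenticated=False)   # partial TD
second_fetch = td_view(lamp_td, authenticated=True)   # full TD
```

The same URL serves both views, so a device only needs to expose one resource.
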
@farshidtz
Member

farshidtz commented Mar 12, 2021 via email

@benfrancis
Member Author

@farshidtz The problem with that approach is that a Thing Description isn't just an API description, it may contain private (meta)data which should not be made public.

This isn't just a problem for Thing Descriptions of a directory, it is a problem for all Thing Descriptions. Making all Thing Descriptions public risks revealing a great deal of sensitive information, for example in the context of the smart home. Providing public metadata about all the devices in a smart home, potentially including model numbers and even locations within the home, is like a shopping list for a burglar or cyber criminal and may reveal a great deal of information about the occupier of the building.

On the WebThings project we would like to expose a Thing Description for WebThings Gateway which exposes a directory but also exposes other functions like power cycling the gateway and putting its radios in pairing mode. Even the title and description of the gateway may reveal information the home owner may prefer to keep private.

@mmccool
Contributor

mmccool commented Mar 16, 2021

You raise a good point. We probably need to define some special cases.

For directories, for example, we could say in advance they always use OAuth2 (for example). Then once you have access to the directory, you can fetch the TDs it contains with that authentication, and there is no paradox. (If we want more flexibility than that, we can do the same as what I propose below for self-description, which in summary is to limit security mechanisms to those in which the protocol itself provides enough information on what security is expected)

The tricky case is self-description. We could define a default security mechanism just for accessing the TD on a Thing, then the TD itself could define other mechanisms for accessing other resources. Unfortunately it's less clear in this case what the default should be. If there was one clear default that worked in all cases we would not need the security data in the TDs. However, we can cut down the options a lot since a lot of the weirder/more complicated options in there are to support brownfield devices or custom mechanisms thought up by CSPs or device manufacturers. Brownfield devices would not be supporting self-description (they would be described "on the outside", by other agents).

Note that in some sense though the security metadata is redundant and is there mostly as documentation about what a developer can expect when they see a live Thing. This is because all the protocols that I know of also indicate what security data is needed. When you do a GET in HTTP on a resource protected by a password (for instance), the server will respond with a request for the password as part of the protocol.

However, we should at LEAST limit the options to those for which the protocol provides the necessary information. For example, HTTP password, OAuth2, and so forth. We should disallow things like "body" and "apikey" since the protocol itself does not provide enough information about these. I'm also thinking mostly HTTP for self-description, and maybe we could limit self-description to that protocol (although we might want to look at CoAP too, with similar constraints; but I don't think we want to support self-description in MQTT, at least not in this round).

@mmccool
Contributor

mmccool commented Mar 17, 2021

So my comment above can be summarized as "limit security schemes for fetching the TD in self-description to those that can be specified in the protocol". Specifically, this includes "basic", "digest", "bearer", "oauth", and actually "nosec".

For directories, one simple solution is to say the TD for the directory can always be accessed using "nosec". However, other resources might require additional, more stringent security, but this will be specified in the TD returned by the directory service.
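
A rough sketch of what such a restriction could look like as a check, in Python. It assumes the scheme guarding the TD is declared the same way as a TD's top-level `security` member, which is itself an open question. The names `PROTOCOL_NEGOTIABLE`, `active_schemes` and `usable_for_td_fetch` are hypothetical, and the set uses the TD vocabulary's spelling "oauth2" where the list above says "oauth".

```python
# Schemes assumed to be negotiable by the protocol itself. The list above
# writes "oauth"; the TD vocabulary spells that scheme "oauth2".
PROTOCOL_NEGOTIABLE = {"basic", "digest", "bearer", "oauth2", "nosec"}


def active_schemes(td: dict) -> set:
    """Collect the scheme names activated by the top-level "security" member."""
    names = td.get("security", [])
    if isinstance(names, str):  # "security" may be a single name or a list
        names = [names]
    definitions = td.get("securityDefinitions", {})
    return {definitions[name]["scheme"] for name in names}


def usable_for_td_fetch(td: dict) -> bool:
    """True if every active scheme is one the protocol can negotiate."""
    return active_schemes(td) <= PROTOCOL_NEGOTIABLE
```

With this check a TD secured by "basic" passes, while one secured by "apikey" is rejected because the protocol gives a client no hint about where the key goes.
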

@farshidtz
Member

farshidtz commented Mar 17, 2021

It may be my problem, but I always assumed the security definitions inside a TD are for the interaction affordances, not for the TD itself. If the TD needs to be protected, then the security definitions must be defined elsewhere, perhaps another TD if we must (similar to discovery's Link Description type), telling how to interact with the private TD.

These are from the TD spec:

Set of security definition names, chosen from those defined in securityDefinitions. These must all be satisfied for access to resources.

Security configuration in the TD is mandatory. At least one security definition MUST be activated through the security member at the Thing level (i.e., in the TD root object). This configuration can be seen as the default security mechanism required to interact with the Thing.

We are now claiming that the security definitions set in the TD are also about accessing the TD itself. This really makes no sense. Is it specified anywhere?

@mmccool
Contributor

mmccool commented Mar 17, 2021

Nope, that's another problem, the security needed to fetch a TD is not specified in the TD itself. So we can't extract just a security scheme from the TD, since it isn't given. It's meta-metadata.

Anyhow, I added this to the agenda to discuss today... BTW this issue is titled with a proposed solution but the problem can be better termed "Security Bootstrapping", and I can think of three possible solutions (none of which are perfect...), including the two-phase approach, but also a fixed default and depending on protocol negotiation (my comments above).

@mmccool
Contributor

mmccool commented Mar 17, 2021

We COULD use the default (top-level) security definition as the security needed to access the TD. That addresses one problem.

@benfrancis
Member Author

benfrancis commented Mar 17, 2021

@mmccool wrote:

For directories, for example, we could say in advance they always use OAuth2 (for example). Then once you have access to the directory, you can fetch the TDs it contains with that authentication, and there is no paradox.

Surely just knowing that a directory uses OAuth2 is not enough information on its own. How would a client find the token server and authorization server for example?

The tricky case is self-description.

I'm not sure why it makes any difference whether a Thing is serving its own Thing Description. Fundamentally in HTTP there's a client (WoT consumer) and a server (WoT producer). Either the Thing Description is public or it requires authentication. If it requires authentication then the client needs some way to know how to authenticate.

@farshidtz wrote:

It may be my problem, but I always assumed the security definitions inside a TD are for the interaction affordances, not for the TD itself. If the TD needs to be protected, then the security definitions must be defined elsewhere, perhaps another TD if we must (similar to discovery's Link Description type), telling how to interact with the private TD.

OK, that's similar to what I'm describing but with two separate TDs rather than two different views of the same TD depending on whether the client provides authentication credentials in the request.

These are from the TD spec:
...
We are now claiming that the security definitions set in the TD are also about accessing the TD itself. This really makes no sense. Is it specified anywhere?

OK, my reading of the draft discovery specification was that directories would be authenticated, but perhaps I misunderstood. I agree that for Thing Descriptions in general it is assumed that the TD is public, but that's the problem I'm trying to solve.

Let me provide a concrete example to help explain the issue we have with WebThings. Here is a hypothetical TD for the gateway (which acts as a directory, but also has other interaction affordances and semi-private metadata).

{
  "@context": [
      "http://www.w3.org/ns/td",
      "https://w3c.github.io/wot-discovery/context/discovery-context.jsonld",
      "https://webthings.io/schemas"
  ],
  "@type": "DirectoryDescription",
  "title": "Smith Family WebThings Gateway",
  "version": {
    "instance": "42",
    "softwareVersion": "WebThings Gateway 1.0",
    "hardwareVersion": "WebThings Hub 1.0"
  },
  "securityDefinitions": {
    "oauth2_code": {
      "scheme": "oauth2",
      "flow": "code",
      "authorization": "https://auth.example.com/authorization",
      "token": "https://auth.example.com/token",
      "scopes": [
        "write",
        "read",
        "search"
      ]
    }
  },
  "security": "oauth2_code",
  "properties": {
    "things": {...}
  },
  "actions": {
    "createThing": {...},
    "updateThing": {...},
    "deleteThing": {...},
    "reboot": {...},
    "shutDown": {...},
    "pair": {...},
    "unpair": {...}
  },
  "events": {
    "thingCreated": {...},
    "thingUpdated": {...},
    "thingDeleted": {...}
  }
}

In the case of the gateway, the whole API is authenticated using JSON Web Tokens in the HTTP Authorization header (which are issued to the user once they enter a username and password) and supports OAuth2 for third party applications to authenticate with the API. The same token is used for both the REST API and WebSocket API and is valid for all devices the authenticated user has access to (accessing the Thing Descriptions of all the devices requires authentication using the same token).

We would ideally rather this top level Thing Description not be public because it may contain metadata which would reveal information about the household and the smart home hub they are using. Therefore it would be authenticated using the same JWT.

In order to authenticate with the gateway's web server, my understanding is that an API client needs to know the authorization and token server, but that information is locked up inside this top level TD.

@farshidtz So in order to make this top level TD authenticated, are you saying that there would need to be another TD which acts as a Link Description which only provides security metadata and a link to the gateway's TD?

Separately the WebThings Framework provides WoT servers (producers) for individual devices which broadcast their TD URLs using mDNS. Are you suggesting that if we didn't want their full TDs to be public that each device would need two TDs: A Link Description with just security metadata and a link to a separate Thing Description with the full metadata?

@benfrancis changed the title from "Define a two stage authentication process" to "Security bootstrapping during discovery" on Mar 17, 2021

@mjkoster

Given that we consider it to be a bootstrapping issue for clients to get the authentication instructions, it sounds like the expected case is that a client will not know how to authenticate, and need to get some sort of instructions. If so, we may consider a normal TD fetch where there is a special TD that contains the authentication instructions, and the client is aware of this up front. I think we called this Ben's proposal. Why would it be better to use an error signal and require the client to have another method to obtain the authentication instructions, if it is the expected behavior? The introduction phase could offer this standalone authentication TD, not as an alias to the TD the client really wants, but a separate control document.

@relu91
Member

relu91 commented Mar 17, 2021

I think this might be related to the fact that we don't really have a standard way to retrieve a TD. I was wondering if Discovery would try to have a description of how to do that.

Describing error types and standard parameters could be the first step forward. If we're going to stick with the current approach, in which the "real" discovery should always start from a URL, we need additional information to actually retrieve the TD.

See also #90.

@benfrancis
Member Author

benfrancis commented Mar 17, 2021

@mjkoster wrote:

Given that we consider it to be a bootstrapping issue for clients to get the authentication instructions, it sounds like the expected case is that a client will not know how to authenticate, and need to get some sort of instructions. If so, we may consider a normal TD fetch where there is a special TD that contains the authentication instructions, and the client is aware of this up front. I think we called this Ben's proposal. Why would it be better to use an error signal and require the client to have another method to obtain the authentication instructions, if it is the expected behavior?

I don't really mind whether the initial fetch responds with a success or error response, but 401 Unauthorized (which really means "unauthenticated") might be a good fit here as compared to 403 Forbidden.

401 is "similar to 403, but in this case, authentication is possible."

The introduction phase could offer this standalone authentication TD, not as an alias to the TD the client really wants, but a separate control document.

This is kind of what @farshidtz was suggesting with a Link Description, but it means that every device needs to serve two different resources instead of one, which really doesn't seem necessary when there's an obvious alternative.

Edit: Note that for some security schemes, it's even standard for a 401 Unauthorized response to include a WWW-Authenticate header which tells the client what method it should use to authenticate.

@benfrancis
Member Author

benfrancis commented Sep 10, 2021

Has there been any progress on this?

  1. How should a discovered Web Thing or Directory respond if a Consumer tries to fetch a TD or TDD but isn't authenticated to access it?
  2. What should a Consumer do if its request to access a TD or TDD receives such a response?

One solution might be:

  1. The Producer responds with a 401 Unauthorized response, with authentication instructions in a WWW-Authenticate header
  2. The Consumer then attempts to authenticate using the specified mechanism before re-fetching the TD/TDD

That's obviously HTTP specific, but that would seem appropriate at least for the HTTP binding provided for the Directory Service API.
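
The Consumer side of those two steps could be sketched roughly as follows in Python. This is only an illustration under simplifying assumptions: the header carries a single challenge, `headers` is a plain dict keyed by the exact name "WWW-Authenticate", and `parse_challenge` and `next_step` are hypothetical helpers rather than part of any WoT library.

```python
import re

# Matches auth-params such as realm="example" or error=invalid_token.
_PARAM = re.compile(r'(\w+)\s*=\s*(?:"([^"]*)"|([^\s,]+))')


def parse_challenge(header_value: str) -> tuple:
    """Split a single-challenge WWW-Authenticate value into (scheme, params)."""
    scheme, _, rest = header_value.strip().partition(" ")
    params = {}
    for name, quoted, bare in _PARAM.findall(rest):
        params[name.lower()] = quoted or bare
    return scheme.lower(), params


def next_step(status: int, headers: dict) -> str:
    """Decide what a Consumer does after trying to fetch a TD or TDD."""
    if status != 401:
        return "use-response"
    challenge = headers.get("WWW-Authenticate")
    if challenge is None:
        return "give-up"  # nothing tells the Consumer how to authenticate
    scheme, _params = parse_challenge(challenge)
    return f"authenticate-with-{scheme}"
```

After authenticating with the indicated scheme, the Consumer would repeat the original request for the TD/TDD with its credentials attached.
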

@farshidtz
Member

One solution might be:

  1. The Producer responds with a 401 Unauthorized response, with authentication instructions in a WWW-Authenticate header
  2. The Consumer then attempts to authenticate using the specified mechanism before re-fetching the TD/TDD

Recommending that for all HTTP interactions is a good idea.

This is actually the mandatory approach for OAuth2: https://datatracker.ietf.org/doc/html/rfc6750#section-3

@benfrancis
Member Author

I've added the "Resolve prior to WD update" label here because it seems to me that the discovery process will be useless in many cases without this feature. It's certainly a blocker for implementing WoT Discovery for WebThings Gateway where all Thing Descriptions require authentication for privacy reasons.

Please feel free to remove the label if it's inappropriate.

@benfrancis
Member Author

Digging into this a little deeper...

The WWW-Authenticate header is defined in section 4.1 of RFC7235 which says:

"A server generating a 401 Unauthorized response MUST send a WWW-Authenticate header field containing at least one challenge."

The WoT Discovery specification lists 401 Unauthorized as an error response for both the self-description and directory exploration mechanisms, so presumably this header MUST be sent as part of those responses.

Section 5.1 references the IANA Hypertext Transfer Protocol (HTTP) Authentication Scheme Registry for a list of authentication schemes which can be used in the challenges sent in a WWW-Authenticate header. That registry includes a list which overlaps with but doesn't completely match the list of security schemes in the WoT Thing Description specification:

  • Basic
  • Bearer
  • Digest
  • HOBA
  • Mutual
  • Negotiate
  • OAuth
  • SCRAM-SHA-1
  • SCRAM-SHA-256
  • vapid

That's not necessarily a problem since this mechanism is for authenticating Thing Descriptions themselves rather than the interfaces they describe, however...

To use WebThings Gateway as an example, a request to access a Thing Description needs to have an Authorization header containing a JWT as a Bearer token. This token can be requested using OAuth2. The authentication is the same as that required for API endpoints, as described in the example Thing Description security definition below:

{
	"oauth2_sc": {
		"scheme": "oauth2",
		"flow": "code",
		"authorization": "https://plugfest.webthings.io/oauth/authorize",
		"token": "https://plugfest.webthings.io/oauth/token",
		"scopes": [
			"/things/virtual-things-10:readwrite",
			"/things/virtual-things-10",
			"/things:readwrite",
			"/things"
		]
	}
}

In implementing section 7.2.2.1.2 Retrieval of the WoT Discovery specification my understanding is therefore that an unauthenticated request should receive a 401 Unauthorized response which MUST contain a WWW-Authenticate header which describes this challenge. But it's not clear to me exactly how the above challenge should be described.

There is an OAuth authentication scheme in the IANA registry, but not an OAuth2 scheme. Does that matter? Or should I be using the Bearer scheme?

More generally, is the WWW-Authenticate header enough on its own to provide sufficient metadata for the security bootstrapping feature described here, or do we need to provide additional metadata in the body of a 401 response, following the Thing Description format as suggested in my original post above?

Either way I think additional guidance inside the WoT Discovery specification would be useful here.
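
For illustration only, if the Bearer scheme turned out to be the right choice (which is exactly the open question), a challenge could be derived from the security definition along these lines in Python. The `realm` and `scope` attributes are the ones RFC 6750 section 3 defines for Bearer challenges; `bearer_challenge` and the realm value are hypothetical.

```python
def bearer_challenge(realm: str, security_definition: dict) -> str:
    """Format a Bearer challenge (RFC 6750 section 3) from a TD oauth2 definition."""
    parts = [f'realm="{realm}"']
    scopes = security_definition.get("scopes", [])
    if scopes:
        # RFC 6750 defines "scope" as a space-delimited list.
        parts.append('scope="{}"'.format(" ".join(scopes)))
    return "Bearer " + ", ".join(parts)


oauth2_sc = {
    "scheme": "oauth2",
    "flow": "code",
    "authorization": "https://plugfest.webthings.io/oauth/authorize",
    "token": "https://plugfest.webthings.io/oauth/token",
    "scopes": ["/things:readwrite", "/things"],
}

# Hypothetical realm name, chosen only for this example.
header_value = bearer_challenge("webthings-gateway", oauth2_sc)
```

Note that the `authorization` and `token` URLs have nowhere to go in this header, because RFC 6750 defines no attributes for them. That is the gap the question above is about.
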

Note that RFC7235 does also say that:

A server MAY generate a WWW-Authenticate header field in other response messages to indicate that supplying credentials (or different credentials) might affect the response.

I don't know whether that helps.

I'd really appreciate your input on this @farshidtz and @mmccool because I'm reaching the limits of my understanding of these security schemes, but without this feature WoT Discovery is largely useless for WebThings Gateway because a "Discoverer" will never be able to gain access to Thing Descriptions.

@benfrancis
Member Author

Since #311, the latest draft of the specification says that a 401 response is "often accompanied by a WWW-Authenticate header", but it doesn't define what values can be expected by a client. Without further specification, that presumably implies that servers may use any present and future authentication scheme from the IANA Hypertext Transfer Protocol (HTTP) Authentication Scheme Registry, and therefore clients must support all present and future schemes.

If the goal is for any client to be able to consume any directory, then I suggest the specification probably needs to be more prescriptive and say that a 401 response must always contain a WWW-Authenticate header, and to constrain the values of that header to a finite list of authentication schemes.

That way it's clear what authentication schemes a directory server may use, and therefore which authentication schemes a client must support in order to access any directory.

This also assumes that the WWW-Authenticate header alone provides enough information for a client to authenticate with a directory without the need for any additional out-of-band information (except that provided as part of the authentication process like a user filling out a login form). Hopefully that is the case, but I don't know enough about each of the authentication schemes to be sure.

@mmccool
Contributor

mmccool commented May 16, 2022

OK, time to resolve this. The current text is just a placeholder, but as I see it we have two choices:

  1. Use 401 + WWW-Authenticate. But then we have to decide what "schemes" to allow, and probably should limit ourselves to the IANA-registered ones. As noted above, these overlap with the ones allowed in TDs but include ones like HOBA, Mutual, and VAPID that we don't support in TDs, and there are schemes in TDs that don't map to specific WWW-Authenticate schemes.
  2. Use 401 + partial TD response (just the security scheme part of the TD). I suspect we can't do this, technically, and certainly it would not be understood by browsers. We could use a different error code, but my concern is that we would basically be defining a "security mechanism" for negotiation which we said we would not do.

So I think we should go with 1. Allowable schemes then would be Basic, Bearer, Digest, and OAuth (which I believe also covers OAuth2), and "Nosec" (only in the case of no PII; in this case there would be no 401 error). I looked into "Mutual", covered by RFC8120, but it's marked as "experimental", and doesn't really map onto any TD scheme. Basically though it is a fancy password scheme. HOBA (RFC7486) looks like a means to implement PSK but is also marked "experimental", and may only work on systems with public URLs. The IANA entry for "Negotiate" says it violates some principles of HTTP authentication (which I think option 2 above also violates).

My inclination is to use a 401 response with one of Basic, Bearer, Digest, or OAuth to perform security bootstrapping. Note that these all also pretty much require TLS to be effective. The S&P considerations only have a "SHOULD" for TLS on local networks in particular but also say that TDs should generally be considered PII and so nosec is disallowed. This means the spec might require a password for directory access even on a LAN without TLS.
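
As a sketch of how a client might apply such a finite list, the Python snippet below maps the scheme token from a challenge onto a TD scheme name and rejects everything else. The table is an assumption for illustration: in particular the "OAuth" to "oauth2" entry only reflects the belief stated above and is not verified, and `CHALLENGE_TO_TD_SCHEME` and `td_scheme_for` are hypothetical names.

```python
# Assumed mapping from WWW-Authenticate scheme tokens to TD scheme names.
# The "oauth" -> "oauth2" entry reflects the belief stated above; it is unverified.
CHALLENGE_TO_TD_SCHEME = {
    "basic": "basic",
    "bearer": "bearer",
    "digest": "digest",
    "oauth": "oauth2",
}


def td_scheme_for(challenge_scheme: str) -> str:
    """Map a challenge scheme onto a TD scheme, rejecting unsupported ones."""
    key = challenge_scheme.lower()  # HTTP auth scheme names are case-insensitive
    if key not in CHALLENGE_TO_TD_SCHEME:
        raise ValueError(f"unsupported bootstrapping scheme: {challenge_scheme}")
    return CHALLENGE_TO_TD_SCHEME[key]
```

A challenge using HOBA, Mutual or Negotiate would raise here, which matches the intent of keeping those schemes out of bootstrapping.
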

PS: There is some general confusion around the terms "authentication" and "authorization". HTTP sends back a "WWW-Authenticate" header but expects the client's reply to carry an "Authorization" header. Oh, well... TLS also does "authentication" (and even mutual authentication if set up that way...) but browsers don't generally support mTLS (due to it being a privacy risk) and there's the LAN issue. So I don't think we can depend on mTLS alone to control access.

@mmccool
Contributor

mmccool commented May 16, 2022

I also agree with Ben that we should REQUIRE the response to have a WWW-Authenticate header, as well as limiting the allowable schemes to the ones mentioned. Anyway, let me draft a PR, then we can discuss this all in the context of a concrete proposal...

@mmccool
Contributor

mmccool commented May 16, 2022

PR created: #313. Please review. I will keep it in draft status until I see reviews from people who have actively contributed to this issue.
