Update handling of user-agent data inferral #4641

Open
@mydea

Related to getsentry/sentry-docs#13205

We infer the following data from the sent user-agent in Relay (as far as I can tell):

  • Browser Context
  • Browser Tags
  • Device Context
  • Device Tags
  • (Client) OS Context

As of now, the user-agent is always sent in the event.request object, like this:

"request": {
  "url": "https://docs.sentry.io/product/issues/issue-details/performance-issues/n-one-queries/",
  "headers": [
    [
      "Referer",
      "https://www.google.com/"
    ],
    [
      "User-Agent",
      "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36"
    ]
  ]
},

This makes sense for backend/API events, where the request object describes the incoming request and the user-agent really is a header. It is not correct for browser events: when an error happens in a browser SDK, there is no user-agent header. In that case we are misusing the event.request object to carry the user agent, so that we can infer the browser/OS/device data that we do also want for browser SDKs.

For context, sending this via event.request is especially odd because it is then shown in the Sentry UI as follows:

[Image]

This is misleading for a browser error; it only really makes sense for backend environments.

Because of this, we want to stop sending the user agent in the request context here, and instead send it in a different way that makes more sense for a browser SDK.

(Side note: Other things we use from event.request right now will also need to be moved, see getsentry/sentry-docs#13203, but this is a bit easier to reason about)

For the user-agent specifically, we thus propose allowing it to be sent in the browser context:

{
  "contexts": {
    "browser": { "user_agent": "Mozilla/....." }
  }
}

If this is present, and no other data exists in the sent browser context, then it should be prioritized as the source for user-agent data inference:

  1. If contexts.browser.user_agent exists, use this
  2. Else, if request.headers['user-agent'] exists, use this
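The lookup order above could be sketched roughly as follows. This is only an illustration on plain dictionaries, not Relay's actual implementation; the function name `pick_user_agent` is hypothetical, and the header match is assumed to be case-insensitive because the example payload spells the header "User-Agent".

```python
from typing import Any, Dict, Optional


def pick_user_agent(event: Dict[str, Any]) -> Optional[str]:
    """Return the user-agent string to infer from, or None if absent.

    Hypothetical helper illustrating the proposed priority order.
    """
    # 1. Prefer contexts.browser.user_agent when it is present.
    browser = (event.get("contexts") or {}).get("browser") or {}
    user_agent = browser.get("user_agent")
    if user_agent:
        return user_agent

    # 2. Otherwise fall back to the User-Agent entry in request.headers,
    #    which the event payload stores as [name, value] pairs.
    headers = (event.get("request") or {}).get("headers") or []
    for name, value in headers:
        if name.lower() == "user-agent":
            return value

    return None
```

With this ordering, an event that carries both values resolves to the one in the browser context, and events from SDKs that only send the header keep working unchanged.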

Browser context data inferred in Relay will be merged with any existing browser context sent in the event, with the inferred data taking lower priority in the merge.

{
  "contexts": {
    "browser": {
      "user_agent": "Mozilla/.....",
      "name": "Chrome",
      "version": "101.0.0"
    }
  }
}

If (theoretically) a name or version already exists in the browser context, it should not be overwritten.
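A minimal sketch of that merge rule, assuming both contexts are flat dictionaries. The name `merge_browser_context` is hypothetical, and treating a `None` value as "not sent" is an assumption of this sketch rather than something the proposal specifies.

```python
from typing import Any, Dict


def merge_browser_context(
    sent: Dict[str, Any], inferred: Dict[str, Any]
) -> Dict[str, Any]:
    """Merge inferred browser data into the sent context.

    Values sent in the event win; inferred values only fill in gaps.
    """
    # Start from the inferred (lower-priority) data ...
    merged = dict(inferred)
    # ... then let every value the event actually sent override it.
    # Assumption: a key whose value is None counts as missing.
    merged.update({key: value for key, value in sent.items() if value is not None})
    return merged
```

For example, merging a sent context of `{"user_agent": "Mozilla/.....", "name": "Custom"}` with inferred data `{"name": "Chrome", "version": "101.0.0"}` keeps the name `"Custom"` and only adds the inferred version.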

Why not just use the user agent from the HTTP request?

While this would work in most cases, it does not work with tunneled requests: they may lose the user-agent unless the tunnel implementation specifically forwards it, which we do not currently suggest users do.
