Workflow 2.0 #34327

Closed
@dcramer

Description

Our core workflow has begun to suffer over the years - from growth of technology, scale of data, and general lack of attention. Recognizing this, we had the following conversation this week internally, and are setting out to tackle this concern. That core workflow includes several components that are loosely coupled together:

  • the release metadata which allows us to identify the version of code an event is present in
  • the new alert (and other variations) approach to notifications
  • the new deployment notification - which requires a good amount of effort to achieve correctness
  • the resolved in release and ignore issue actions and the general triage flow
  • the regression notification which is key to understanding if an issue remains unresolved

These all string together to create a workflow that is tightly coupled to how I - and I believe most developers - think about shipping code. We dry run a bunch of changes, pray our tests are accurate enough, ship the code, and inevitably find a problem. The faster Sentry can connect those dots with accurate diagnostics, the better the outcomes for our customers.

The challenge here is that there's a fundamental gap in the workflow in how we notify about issues. We rely on "New Issue" to be timely and contextual. That is, we duct-taped a solution that assumed most of the time new issues happened with new code. That's not always true, and technologies like JavaScript have decreased the signal-to-noise ratio over the years. So let's tackle that problem. There are a few key things we should look at as part of this:

  1. What does a timely notification look like connected to the release lifecycle? That should focus on how we help identify truly "new issues caused by code changes".
  2. How do those notifications change (or become additive) with things like code push or feature flags, where we're making behavioral changes but not shipping a new SHA?
  3. How can we improve things like "Ignore Issue" to be more functional and usable? We shouldn't rely on custom alerts or custom saved searches for such common concerns.
  4. "Resolved in Next Release" can be problematic in many environments, but we now have SemVer support. Can we leverage that better?
  5. Where is fingerprinting breaking down? Are there platforms where our native heuristics are not effective and need to be fixed?
  6. How does this apply to performance? To CSP? To other "problems" (à la "issues")? Issues needs to be a platform to enable the workflow.
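
To make point 4 concrete, here is a minimal sketch of how SemVer ordering could back a "resolved in release" check. It is an illustration only, not how Sentry implements it: the function names `parse_semver` and `is_regression` are hypothetical, and it assumes plain `MAJOR.MINOR.PATCH` version strings with no pre-release or build metadata.

```python
from typing import Tuple


def parse_semver(version: str) -> Tuple[int, int, int]:
    """Parse a plain 'MAJOR.MINOR.PATCH' string into a tuple of ints.

    Assumption: no pre-release or build metadata (e.g. '-rc.1', '+build5').
    """
    major, minor, patch = version.split(".")
    return int(major), int(minor), int(patch)


def is_regression(resolved_in: str, event_release: str) -> bool:
    """Return True if an event arrives from a release at or after the
    release in which the issue was marked resolved (hypothetical rule)."""
    return parse_semver(event_release) >= parse_semver(resolved_in)


# An event from an older release is not a regression of a later fix.
print(is_regression("2.10.0", "2.9.5"))
# An event from the fixed release, or a later one, is.
print(is_regression("2.10.0", "2.10.1"))
```

Comparing parsed integer tuples rather than raw strings matters here: as strings, "2.9.5" sorts after "2.10.0", which would flag events from older releases as regressions.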

Most importantly, with all this in mind, how do we resolve the workflow without requiring customers to configure anything? We've - for better, or more likely worse - taken the approach over the years of putting this problem into our customers' hands by giving them complicated solutions to create alerts and complicated solutions to improve fingerprinting, and we've stopped trying to build a first-class curated solution to the problem. This is our chance to correct that.

For the community, if you have opinions, what are they? What can we do better here? What works well? What is completely terrible? We already have some solid foundations, but if you've got an opinion, let's hear it!
