title | weight | description |
---|---|---|
Issue Triage Guidelines | 10 | These guidelines serve as a primary document for triaging incoming issues to Kubernetes. SIGs and projects are encouraged to use this guidance as a starting point, and customize to address specific triaging needs. |
- Scope
- What Is Triaging?
- Why Is Triaging Beneficial?
- How to Triage: A Step-by-Step Flow
- Step One: Review Newly Created Open Issues
- Step Two: Triage Issues by Type
- Step Three: Define Priority
- Step Four: Find and Set the Right SIG(s) to Own an Issue
- Step Five: Follow Up
- Footnotes
These guidelines serve as a primary document for triaging incoming issues to Kubernetes. SIGs and projects are encouraged to use this guidance as a starting point, and customize to address specific triaging needs.
Note: These guidelines only apply to the Kubernetes repository. Usage for other Kubernetes-related GitHub repositories is TBD.
Issue triage is a process by which a SIG intakes and reviews new GitHub issues and requests, and organizes them to be actioned, either by its own members or by other SIGs. Triaging involves categorizing issues and pull requests based on factors such as priority/urgency, the SIG or SIGs responsible for handling the issue or pull request, and the issue kind (bug, feature, etc.).
Triage can happen asynchronously and continuously, or in regularly scheduled meetings. Several Kubernetes SIGs and projects have adopted their own approaches to triaging.
SIGs who triage regularly say it:
- speeds up issue management
- keeps contributors engaged by shortening response times
- prevents work from lingering endlessly
- replaces "special requests" and one-offs with a neutral process that acts like a boundary
- leads to greater transparency, interesting discussions, and more collaborative, informed decision-making
- helps build prioritization, negotiation, and decision-making skills, which are critical to most tech roles
- reinforces SIG community and culture
People who enjoy product management and iterating on processes tend to enjoy triaging because it empowers their SIGs to maintain a steady, continuous flow of work that is assessed and prioritized based on feedback and value.
This section walks you through a standard triaging process, first covering tools and tips.
These are tools that your SIG can use to make the process simpler, faster, and more efficient.
Opening new issues and leaving comments on other people's issues are possible for all contributors. However, permission to assign specific labels (e.g. `triage`), change milestones, or close other contributors' issues is only granted to the author of an issue, assignees, and organization members. For this reason, we use a bot to manage labelling and triaging. For a full list of commands and permissions, see the Prow command reference page.
Gubernator offers a dashboard that tells you which pull requests are waiting for your feedback and which PRs are waiting for the contributor to respond. Please note that Gubernator only shows pull requests. You will not see which issues are assigned to you.
Triage Party is a tool for triaging incoming GitHub issues for large open-source projects, built with the GitHub API. Made public in April 2020, it facilitates "massively multi-player GitHub triage" and reduces contributor response latency.
Some of its features:
- Queries across multiple repositories
- Queries that are not possible on GitHub:
  - conversation direction (`tag: recv`, `tag: send`)
  - duration (`updated: +30d`)
  - regexp (`label: priority/.*`)
  - reactions (`reactions: >=5`)
  - comment popularity (`comments-per-month: >0.9`)
- Multiplayer mode: for simultaneous group triage of a pool of issues
- Button to open issue groups as browser tabs (pop-ups must be allowed)
- "Shift-Reload" for live data pull
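To make the query features above more concrete, here is a minimal Python sketch of three such filters applied to an in-memory issue record. It is not Triage Party's implementation: the `Issue` class and all function names are hypothetical, and reading `updated: +30d` as "last updated more than 30 days ago" is an assumption.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Issue:
    """Hypothetical, simplified issue record (not a GitHub API type)."""
    labels: list
    reactions: int
    updated_at: datetime


def older_than(issue, days, now):
    """Assumed reading of `updated: +30d`: last update is more than `days` ago."""
    return now - issue.updated_at > timedelta(days=days)


def has_label_matching(issue, pattern):
    """Regexp filter in the spirit of `label: priority/.*`."""
    return any(re.fullmatch(pattern, label) for label in issue.labels)


def min_reactions(issue, count):
    """Threshold filter in the spirit of `reactions: >=5`."""
    return issue.reactions >= count
```

Filters like these can be combined with `all(...)` to select the issues a triage session should look at first.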
GitHub offers project boards, set up like kanban boards, to help teams organize and track their workflow. The Release Team has come to depend on their project board for planning new Kubernetes releases; they also use it as an archive to show the work Done for past releases. Other SIGs also use project boards.
We encourage more SIGs to use project boards to enhance visibility and tracking. If you'd like some help getting started, visit GitHub's documentation or reach out to SIG Contributor Experience.
The CNCF has created a suite of Grafana dashboards and charts for collecting metrics related to all the CNCF projects. The Kubernetes dashboard can be used to help SIGs view real-time metrics on many aspects of their workflow, including:
- Issue Velocity: How quickly issues are resolved
- PR Velocity: Including PR workload per SIG, PR time to approve and merge, and other data
Several SIGs consistently meet weekly or monthly to triage issues. Here are some details about their processes:
The api-machinery SIG has found that triage meetings offer valuable opportunities for newcomers to listen, learn, and start contributing. api-machinery holds triage meetings every Tuesday and Thursday and archives recordings via their YouTube playlist; here is an example.
In a typical triage meeting, api-machinery members sort through every issue that they haven't triaged since the previous meeting, using a simple query and issue # to track Open PRs and Issues. They usually then:
- read through the comments and the code briefly to understand what the issue is about.
- determine by consensus if it belongs to the api-machinery SIG or not. If not, remove the `sig/api-machinery` label.
- label other SIGs, if appropriate
- discuss briefly the technical implications
- assign people with expertise in the domain to review, comment, reject, etc.
api-machinery has found that consistently meeting on a regular, fixed schedule is key to the success of a triaging effort. More frequent, small meetings are better than infrequent, large meetings, they've found. A few other pointers:
- We try to balance the load, and ask people if they are okay taking on an issue before assigning it to them
- We skip issues that are closed
- We also skip cherrypicks, because we consider that the code change was reviewed in the original PR
- We ensure participation from the entire SIG and support company diversity.
- We use this opportunity to mark issues as "help wanted" or "good first issue"
The SIG has developed a triaging page detailing their process, including the Milestones stage. Here is a March 2020 presentation delivered to the SIG chairs and leads group on their process.
Kubernetes issues are listed here. New, untriaged issues come without labels attached. SIG leads should identify at least one SIG member to serve as a first point of contact for new issues.
Labels are the primary tools for triaging. Here's a comprehensive label list.
GitHub allows you to filter issues and pull requests by type, which helps you discover items in need of triaging. This table includes some predetermined searches for convenience:
Search | What it sorts |
---|---|
created-asc | Untriaged issues by age |
needs-sig | Issues that need to be assigned to a SIG |
is:open is:issue | Newest incoming issues |
comments-desc | Busiest untriaged issues, sorted by # of comments |
comments-asc | Issues that need more attention, based on # of comments |
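As a rough illustration, searches like the ones in this table can be expressed as GitHub search queries and turned into shareable URLs. The sketch below is an assumption-laden example: the exact query strings behind the original links are not shown here, so treating "untriaged" as `no:label` is a guess, and `SEARCHES` and `search_url` are hypothetical names.

```python
from urllib.parse import urlencode

BASE_URL = "https://github.com/kubernetes/kubernetes/issues"

# Assumed query strings; "untriaged" is approximated with `no:label`.
SEARCHES = {
    "created-asc": "is:open is:issue no:label sort:created-asc",
    "needs-sig": "is:open is:issue label:needs-sig",
    "newest": "is:open is:issue",
    "comments-desc": "is:open is:issue no:label sort:comments-desc",
    "comments-asc": "is:open is:issue sort:comments-asc",
}


def search_url(name):
    """Build an issue-search URL for one of the named searches."""
    return f"{BASE_URL}?{urlencode({'q': SEARCHES[name]})}"


if __name__ == "__main__":
    for name in SEARCHES:
        print(name, search_url(name))
```

Bookmarking the generated URLs gives each triager the same starting view.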
We suggest preparing for triage by filtering for the oldest unlabelled issues and/or pull requests first.
Use these labels to find open issues that can be quickly closed. A triage engineer can add the appropriate labels.
Depending on your permissions, either close or comment on any issues that are identified as support requests, duplicates, or not-reproducible bugs, or that lack enough information from the reporter.
Some people mistakenly use GitHub issues to file support requests. Usually they're asking for help configuring some aspect of Kubernetes. To handle such an issue, direct the author to use our support request channels. Then apply the `kind/support` label, which is directed to our support structures, and apply the `close` label.
Please find more detailed information about Support Requests in the Footnotes section.
Either close or comment on it.
- The `triage/needs-information` label indicates an issue needs more information in order to work on it; comment on or close it.
First, validate if the problem is a bug by trying to reproduce it.
If you can reproduce it:
- Define its priority
- Do a quick duplicate search to see if the issue has been reported already. If a duplicate is found, let the issue reporter know, reference the original issue, and close the duplicate.
If you can't reproduce it:
- Contact the issue reporter with your findings
- Close the issue if both the parties agree that it could not be reproduced.
If you need more information to further work on the issue:
- Let the reporter know by adding an issue comment, followed by the `triage/needs-information` label.
In all cases, if you do not get a response within 20 days, close the issue with an appropriate comment. If you have permission to close someone else's issue, first `/assign` the issue to yourself, then `/close` it. If you do not, please leave a comment describing your findings.
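The bug-handling flow above can be summarized as a small decision function. This is only a sketch of the prose rules, not an official tool; the function name, its parameters, and the returned strings are all hypothetical.

```python
RESPONSE_WINDOW_DAYS = 20  # from the guideline: close after 20 days without a response


def next_bug_step(reproducible, duplicate_of=None, needs_more_info=False,
                  days_waiting_on_reporter=0, reporter_agrees=False):
    """Suggest the next triage action for a bug report (hypothetical helper)."""
    if days_waiting_on_reporter >= RESPONSE_WINDOW_DAYS:
        return "close with an explanatory comment"
    if needs_more_info:
        return "comment and request more information"
    if reproducible:
        if duplicate_of is not None:
            return f"reference #{duplicate_of} and close as duplicate"
        return "define priority"
    if reporter_agrees:
        return "close as not reproducible"
    return "share findings with the reporter"
```

For example, `next_bug_step(reproducible=True, duplicate_of=1234)` suggests closing the issue as a duplicate of the hypothetical issue number 1234.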
To identify issues that are specifically groomed for new contributors, we use the help wanted and good first issue labels. To use these labels:
- Review our specific guidelines for how to use them.
- If the issue satisfies these guidelines, you can add the `help wanted` label with the `/help` command and the `good first issue` label with the `/good-first-issue` command. Please note that adding the `good first issue` label will also automatically add the `help wanted` label.
- If an issue has these labels but does not satisfy the guidelines, please ask for more details to be added to the issue or remove the labels using the `/remove-help` or `/remove-good-first-issue` commands.
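A short sketch of how a triager might decide which of these commands to post, given an issue's current labels. The helper name and its boolean parameters are hypothetical; only the four slash commands and the two label names come from the guidelines above.

```python
def suggest_commands(labels, meets_help_wanted, meets_good_first_issue):
    """Return the bot commands to comment, one per line (hypothetical helper)."""
    has_help = "help wanted" in labels
    has_gfi = "good first issue" in labels
    commands = []
    if meets_good_first_issue and not has_gfi:
        # Per the guidelines, this also adds the `help wanted` label.
        commands.append("/good-first-issue")
    elif meets_help_wanted and not has_help:
        commands.append("/help")
    if has_gfi and not meets_good_first_issue:
        commands.append("/remove-good-first-issue")
    if has_help and not meets_help_wanted:
        commands.append("/remove-help")
    return commands
```

An issue that already carries the right labels yields an empty list, so nothing needs to be posted.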
Usually the `kind` label is applied by the person submitting the issue. Issues that feature the wrong `kind` (for example, support requests labelled as bugs) can be corrected by someone triaging; double-checking is a good approach. Our issue templates aim to steer people to the right kind.
We use GitHub labels for prioritization. If an issue lacks a `priority` label, this means it has not been reviewed and prioritized yet.
We aim for consistency across the entire project. However, if you notice an issue that you believe to be incorrectly prioritized, please leave a comment offering your counter-proposal and we will evaluate it.
Priority label | What it means | Examples |
---|---|---|
priority/critical-urgent | Team leaders are responsible for making sure that these issues (in their area) are being actively worked on; i.e., drop what you're doing. Stuff is burning. These should be fixed before the next release. | user-visible bugs in core features; broken builds or tests; critical security issues |
priority/important-soon | Must be staffed and worked on either currently or very soon—ideally in time for the next release. Important, but wouldn't block a release. | [XXXX] |
priority/important-longterm | Important over the long term, but may not be currently staffed and/or may require multiple releases to complete. Wouldn't block a release. | [XXXX] |
priority/backlog | General agreement that this is a nice-to-have, but no one's available to work on it anytime soon. Community contributions would be most welcome in the meantime, though it might take a while to get them reviewed if reviewers are fully occupied with higher-priority issues—for example, immediately before a release. | [XXXX] |
priority/awaiting-more-evidence | Possibly useful, but not yet enough support to actually get it done. | Mostly placeholders for potentially good ideas, so that they don't get completely forgotten, and can be referenced or deduped every time they come up |
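When working through a queue, it can help to rank issues by these labels and to spot the ones that still lack a priority. The following is a minimal sketch; using the table's row order as the ranking is an assumption, and the function names are hypothetical.

```python
# Row order of the table above, most urgent first (assumed ranking).
PRIORITY_ORDER = [
    "priority/critical-urgent",
    "priority/important-soon",
    "priority/important-longterm",
    "priority/backlog",
    "priority/awaiting-more-evidence",
]


def priority_rank(labels):
    """Return the rank of the most urgent priority label, or None if there is none."""
    ranks = [PRIORITY_ORDER.index(label) for label in labels if label in PRIORITY_ORDER]
    return min(ranks) if ranks else None


def needs_prioritization(labels):
    """An issue without a priority label has not been reviewed and prioritized yet."""
    return priority_rank(labels) is None
```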
Components are divided among Special Interest Groups (SIGs). The bot assists in finding a proper SIG to own an issue.
- For example, typing `/sig network` in a comment should add the `sig/network` label.
- Multiword SIGs use dashes: for example, `/sig cluster-lifecycle`.
- Keep in mind that these commands must be on their own lines, and at the front of the comment.
- If you are not sure about who should own an issue, defer to the SIG label only.
- If you feel an issue should warrant a notification, ping a team with an @ mention, in this format: `@kubernetes/sig-<group-name>-<group-suffix>`.
  - Here, the `<group-suffix>` can be one of `bugs`, `feature-requests`, `pr-reviews`, `test-failures`, `proposals`. For example, `@kubernetes/sig-cluster-lifecycle-bugs, can you have a look at this?`
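The naming rules above are easy to get wrong by hand, so here is a small sketch that builds the `/sig` command and the team mention from a SIG name. The helper names are hypothetical; the suffix list and the formats are taken from the guidelines above.

```python
VALID_SUFFIXES = {"bugs", "feature-requests", "pr-reviews", "test-failures", "proposals"}


def sig_slug(name):
    """Lowercase a SIG name and join multiple words with dashes."""
    return "-".join(name.lower().split())


def sig_command(name):
    """Build the comment command, which must sit on its own line."""
    return f"/sig {sig_slug(name)}"


def team_mention(name, suffix):
    """Build an @ mention in the form @kubernetes/sig-<group-name>-<group-suffix>."""
    if suffix not in VALID_SUFFIXES:
        raise ValueError(f"unknown group suffix: {suffix}")
    return f"@kubernetes/sig-{sig_slug(name)}-{suffix}"
```

For instance, `team_mention("Cluster Lifecycle", "bugs")` produces `@kubernetes/sig-cluster-lifecycle-bugs`.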
If you think you can fix the issue, assign it to yourself with just the `/assign` command. If you cannot self-assign for permissions-related reasons, leave a comment that you'd like to claim it and work on creating a PR.
If you see an issue that is owned by a developer but no PR has been created within 30 days, a triage engineer should contact the issue owner and ask them to either create a PR or release ownership.
If you find an issue with a SIG label assigned, but there's no evidence of movement or discussion within 30 days, then gently poke the SIG about this pending issue. Also, consider attending one of their meetings to bring up the issue, if you feel this is appropriate.
When this happens, the `fejta-bot` adds the `lifecycle/stale` label to that issue. You can block the bot by using the `/lifecycle frozen` command preemptively, or remove the label with the `/remove-lifecycle stale` command. The `fejta-bot` adds comments in the issue that include additional details. If you take neither step, the issue will eventually be auto-closed.
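The two 30-day follow-up rules described earlier in this step can be checked mechanically. The sketch below is illustrative only: the function, its parameters, and the returned strings are hypothetical, and it does not model the stale-label bot.

```python
from datetime import date, timedelta

FOLLOW_UP_AFTER = timedelta(days=30)  # threshold from the follow-up guidelines


def follow_up_action(today, assigned_on=None, has_pr=False,
                     sig_labeled=False, last_activity=None):
    """Suggest a follow-up for an issue, or None if none is due (hypothetical helper)."""
    if assigned_on is not None and not has_pr and today - assigned_on >= FOLLOW_UP_AFTER:
        return "ask the owner to create a PR or release ownership"
    if sig_labeled and last_activity is not None and today - last_activity >= FOLLOW_UP_AFTER:
        return "gently ping the SIG"
    return None
```

Running a check like this against the issues your SIG owns gives a short list to raise at the next triage meeting.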
Support requests should be directed to our community support channels.
If you see support questions on kubernetes-dev@googlegroups.com or issues asking for support, try to redirect them to Discuss. Example response:
Please re-post your question to our [Discussion Forums](https://discuss.kubernetes.io).
We are trying to consolidate the channels to which questions for help/support
are posted so that we can improve our efficiency in responding to your requests,
and to make it easier for you to find answers to frequently asked questions and
how to address common use cases.
We regularly see messages posted in multiple forums, with the full response
thread only in one place or, worse, spread across multiple forums. Also, the
large volume of support issues on GitHub is making it difficult for us to use
issues to identify real bugs.
Members of the Kubernetes community use Discussion Forums to field
support requests. Before posting a new question, please search these for answers
to similar questions, and also familiarize yourself with:
* [user documentation](https://kubernetes.io/docs/home/)
* [troubleshooting guide](https://kubernetes.io/docs/tasks/debug-application-cluster/troubleshooting/)
Again, thanks for using Kubernetes.
The Kubernetes Team