Description
As the Collector moves toward the 1.0 GA milestone, the Technical Committee (TC) recommends conducting a reliability review of the Collector.
Motivation
- The TC is formally accountable for the quality of the software produced by the OpenTelemetry project. Similar to the due diligence the TC conducted for the 1.0 milestones of the language SIGs, this reliability review is a way for the TC to comprehensively review the Collector’s architecture and its expected behavior in production under stress conditions, and to provide feedback to the maintainers.
- OTel is applying for graduation at CNCF. Part of the application process is a similar due diligence by the CNCF TOC, so this internal review will better prepare the project for graduation.
Process
A reliability review is a process commonly used at big tech companies for new systems and major milestones. It involves the following steps:
- The organization prepares a questionnaire template used for such reviews. The TC recommends this template.
- The Collector maintainers fill out the questionnaire async. “Not possible” or “not implemented” are acceptable answers; the objective is to have an honest reflection of the current state.
- The TC members review the questionnaire async, asking for clarifications. The objective is to ensure the specific concern of each question is discussed by the maintainers.
- The TC and the maintainers meet for a sync discussion.
Expected Outcomes
- The final report is published as part of the GA readiness documentation, which informs users of the Collector’s expected behavior in production.
- Potentially a set of documentation tasks (maybe creating playbooks)
- Potentially re-prioritizing some components/capabilities not otherwise in the 1.0 scope