Quality Metrics & Health Signals
What pass rate, flaky rate, and other metrics mean, and how to interpret them for release decisions.
Certyn tracks several metrics that help you understand the health of your test suite and the risk profile of your environments.
Key metrics
| Metric | What it measures | Target | Red flag |
|---|---|---|---|
| Pass rate | % of test executions that passed | > 95% | < 90% |
| Flaky rate | % of tests with inconsistent results | < 5% | > 10% |
| Open bug count | Number of unresolved issues | Trending down | Trending up |
| Needs Review count | Items awaiting human decision | Near zero | Growing backlog |
| Mean time to detect | How quickly failures are caught | < 1 hour | > 4 hours |
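As a rough illustration of how the numeric targets and red flags in the table could be applied, the sketch below classifies a metric value. The `THRESHOLDS` dictionary, the `metric_status` function, and the "watch" label for values that fall between target and red flag are assumptions made for this example; they are not part of Certyn.

```python
# Thresholds copied from the table above; the structure itself is hypothetical.
THRESHOLDS = {
    "pass_rate": {"target": 95.0, "red_flag": 90.0, "higher_is_better": True},
    "flaky_rate": {"target": 5.0, "red_flag": 10.0, "higher_is_better": False},
    "mean_time_to_detect_hours": {"target": 1.0, "red_flag": 4.0, "higher_is_better": False},
}


def metric_status(name: str, value: float) -> str:
    """Return 'on target', 'red flag', or 'watch' (an assumed in-between label)."""
    t = THRESHOLDS[name]
    if t["higher_is_better"]:
        if value > t["target"]:
            return "on target"
        if value < t["red_flag"]:
            return "red flag"
    else:
        if value < t["target"]:
            return "on target"
        if value > t["red_flag"]:
            return "red flag"
    # The table does not name the band between target and red flag.
    return "watch"


if __name__ == "__main__":
    for metric, value in [("pass_rate", 92.0), ("flaky_rate", 12.0)]:
        print(metric, value, metric_status(metric, value))
```

Open bug count and Needs Review count are left out because the table describes them as trends rather than fixed thresholds.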
Pass rate alone does not tell the full story
A 95% pass rate with 3 critical-severity failures is riskier than a 90% pass rate whose failures are all cosmetic. Always check failure severity alongside pass rate.
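One way to picture this is a release check that looks at severity before it looks at pass rate. The sketch below is a minimal example under stated assumptions: the `RunSummary` fields, the `release_risk` function, and the "high"/"medium"/"low" labels are hypothetical, and the rule that any critical failure means high risk is an illustrative policy, not Certyn behavior.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    passed: int
    failed: int
    critical_failures: int  # hypothetical: failures tagged critical severity


def pass_rate(summary: RunSummary) -> float:
    """Percentage of executions that passed (0.0 when nothing ran)."""
    total = summary.passed + summary.failed
    return 100.0 * summary.passed / total if total else 0.0


def release_risk(summary: RunSummary) -> str:
    """Check severity first, then fall back to the pass-rate thresholds."""
    if summary.critical_failures > 0:
        return "high"  # assumed policy: a critical failure outweighs pass rate
    rate = pass_rate(summary)
    if rate < 90.0:
        return "high"
    if rate > 95.0:
        return "low"
    return "medium"


if __name__ == "__main__":
    print(release_risk(RunSummary(passed=95, failed=5, critical_failures=3)))
    print(release_risk(RunSummary(passed=90, failed=10, critical_failures=0)))
```

With these rules, the 95% run with critical failures is rated riskier than the 90% run with only non-critical failures, which matches the comparison above.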
Understanding flakiness
A test is flaky when it produces different results across runs without any product changes. Certyn measures flakiness using a flip-rate algorithm:
- A flip is when a test switches between pass and fail on consecutive runs
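- Flakiness score = number of flips / (total runs - 1), expressed as a percentage. The denominator is the number of consecutive-run pairs, so the score is the share of pairs in which the result changed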
- Tests with a flakiness score above 20% (and at least 5 runs) are candidates for quarantine
Flaky tests erode confidence in your results. A flaky failure looks the same as a real failure, making it harder to spot actual bugs.
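The flip-rate calculation described above can be sketched in a few lines. This is a minimal illustration of the stated formula, not Certyn's implementation; the function names and the representation of a run history as a list of booleans (`True` for pass) are assumptions.

```python
from typing import Sequence


def flakiness_score(results: Sequence[bool]) -> float:
    """Flips divided by (total runs - 1), as a percentage.

    `results` is a run history in chronological order, True meaning pass.
    Returns 0.0 when there are fewer than two runs (no pairs to compare).
    """
    if len(results) < 2:
        return 0.0
    flips = sum(1 for prev, cur in zip(results, results[1:]) if prev != cur)
    return 100.0 * flips / (len(results) - 1)


def is_quarantine_candidate(
    results: Sequence[bool], threshold: float = 20.0, min_runs: int = 5
) -> bool:
    """Score above 20% with at least 5 runs, per the rule above."""
    return len(results) >= min_runs and flakiness_score(results) > threshold


if __name__ == "__main__":
    history = [True, True, False, True, True, True]
    print(flakiness_score(history))          # 2 flips over 5 pairs
    print(is_quarantine_candidate(history))
```

A test that fails on every run has no flips and scores 0%, so this measure separates consistent failures from flaky ones.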
Risk signals to watch
These patterns indicate growing risk:
| Signal | What it means | Recommended action |
|---|---|---|
| Declining pass rate (week over week) | New failures are appearing | Investigate recent failures and check for regressions |
| Growing open-bug count | Issues are being created faster than they are resolved | Prioritize triage and consider blocking releases |
| Rising flaky rate | Test suite reliability is degrading | Quarantine the worst offenders and investigate root causes |
| Stale Needs Review items | Human decisions are backlogged | Clear the review queue to prevent drift |
| Blocked runs | Tests cannot execute | Check environment health and data dependencies |
Reading pass-rate trends
| Pattern | Interpretation |
|---|---|
| Stable (< 5% change week over week) | Suite is healthy; changes are not introducing regressions |
| Declining (> 5% drop week over week) | Regressions or environment instability; investigate |
| Volatile (frequent swings) | Likely flakiness; check flaky rate and quarantine unstable tests |
| Improving (> 5% rise week over week) | Fixes are landing and being verified |
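The patterns in this table can be approximated with a small classifier over weekly pass rates. The sketch below rests on several assumptions: the 5% figure is read as percentage points, "frequent swings" is read as at least two direction reversals among changes larger than the threshold, and `classify_trend` and its parameters are hypothetical names.

```python
from typing import Sequence


def classify_trend(
    weekly_rates: Sequence[float], threshold: float = 5.0, min_swings: int = 2
) -> str:
    """Classify a chronological series of weekly pass rates (in percent)."""
    # Week-over-week change in percentage points (an assumed reading of "5%").
    deltas = [b - a for a, b in zip(weekly_rates, weekly_rates[1:])]
    if not deltas:
        return "insufficient data"

    # Count reversals of direction among the large changes only.
    big = [d for d in deltas if abs(d) > threshold]
    swings = sum(1 for a, b in zip(big, big[1:]) if (a > 0) != (b > 0))
    if swings >= min_swings:
        return "volatile"

    latest = deltas[-1]
    if latest < -threshold:
        return "declining"
    if latest > threshold:
        return "improving"
    return "stable"


if __name__ == "__main__":
    print(classify_trend([96.0, 95.0, 97.0, 96.0]))
    print(classify_trend([95.0, 85.0, 96.0, 84.0, 95.0]))
```

Volatility is checked first so that a series that swings back and forth is not reported as improving or declining based on its most recent week alone.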
Environment health
Each environment gets a daily health summary based on that day's executions. A healthy environment has:
- recent executions completing successfully
- pass rate above your project threshold
- no new critical bugs
- no stale blocked runs
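
The four conditions above could be checked against a daily summary roughly as follows. The `DailySummary` fields, the `health_issues` function, and the default threshold of 95% are hypothetical; in particular, "completing successfully" is interpreted here as every started execution having finished, which is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DailySummary:
    # All field names are hypothetical, chosen for this example.
    executions_total: int
    executions_completed: int
    pass_rate: float          # percent, for that day's executions
    new_critical_bugs: int
    stale_blocked_runs: int


def health_issues(summary: DailySummary, pass_rate_threshold: float = 95.0) -> List[str]:
    """Return the list of failed health conditions; empty means healthy."""
    issues = []
    if summary.executions_total == 0:
        issues.append("no recent executions")
    elif summary.executions_completed < summary.executions_total:
        issues.append("some executions did not complete")
    if summary.pass_rate <= pass_rate_threshold:
        issues.append("pass rate not above project threshold")
    if summary.new_critical_bugs > 0:
        issues.append("new critical bugs")
    if summary.stale_blocked_runs > 0:
        issues.append("stale blocked runs")
    return issues


if __name__ == "__main__":
    today = DailySummary(10, 10, 97.0, 0, 0)
    print(health_issues(today) or "healthy")
```

Returning the list of failed conditions, rather than a single boolean, makes it clear which condition to investigate first.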
