How do I calculate burn rate from Prometheus metrics?

Burn rate = error_rate / error_budget. In Prometheus, for a 99.9% SLO over a 1-hour window: (sum(rate(http_requests_total{status=~'5..'}[1h])) / sum(rate(http_requests_total[1h]))) / (1 - 0.999). Replace 0.999 with your SLO target. Use recording rules so the ratio is precomputed and can be reused across multiple windows.
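To get the same expression for several windows without copy-paste mistakes, it can be generated. The following is a minimal sketch, not an official Prometheus tool: the helper name `burn_rate_expr` and the window list are illustrative assumptions, and the metric name and `status` label are taken from the example above.

```python
def burn_rate_expr(slo: float, window: str, metric: str = "http_requests_total") -> str:
    """Build a PromQL burn-rate expression for one lookback window.

    Assumes errors are requests whose `status` label matches 5xx,
    as in the example expression above.
    """
    errors = f"sum(rate({metric}{{status=~'5..'}}[{window}]))"
    total = f"sum(rate({metric}[{window}]))"
    return f"({errors} / {total}) / (1 - {slo})"


# Hypothetical set of windows; pick the ones your alerting rules use.
WINDOWS = ["5m", "30m", "1h", "6h"]

expressions = {window: burn_rate_expr(0.999, window) for window in WINDOWS}

for window, expr in expressions.items():
    print(f"{window}: {expr}")
```

Each generated string could then be pasted into a recording rule, one rule per window.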

SRE Burn Rate Calculator

Calculate how fast your service burns its error budget against SLO targets. Essential for SRE multi-window, multi-burn-rate alerting strategies.

Last updated: July 18, 2026

Calculate Error Budget Burn Rates for SLO-Based Alerting

Burn rate is how fast your service is consuming its error budget relative to your SLO. Our SRE Burn Rate Calculator helps you understand consumption rates, configure multi-window alerting thresholds, and determine appropriate response actions based on Google's Site Reliability Engineering best practices.

What is Burn Rate in SRE?

Burn rate measures how quickly you're consuming your error budget relative to your Service Level Objective (SLO). A burn rate of 1 means you'll exactly exhaust your budget at the end of the SLO period. A burn rate of 2 means you're consuming budget twice as fast and will exhaust it in half the time. Higher burn rates indicate more urgent reliability issues requiring immediate attention. Google SRE recommends multi-window, multi-burn-rate alerting to balance detection speed with alert precision.

Burn Rate Formula
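The formula follows from the definition above. The symbol names here are chosen for illustration, and the time-to-exhaustion form assumes a full budget at the start of the period and a constant burn rate:

```latex
\[
\text{burn rate} \;=\; \frac{\text{observed error rate}}{\text{error budget}}
\;=\; \frac{\text{observed error rate}}{1 - \text{SLO}},
\qquad
\text{time to exhaustion} \;=\; \frac{\text{SLO period}}{\text{burn rate}}
\]
```

For example, a 0.5% error rate against a 99.9% SLO gives 0.005 / 0.001 = 5x. Over a 30-day (720-hour) period, the budget would last 720 / 5 = 144 hours at that pace.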

How to Use This Calculator

Common Use Cases

Configuring Prometheus/Alertmanager Rules

Use the calculator to determine burn rate thresholds for your alerting rules. For a 99.9% SLO, the error budget is 0.001, so alert when rate(errors[1h])/rate(total[1h]) exceeds 14.4 * 0.001 = 0.0144 (critical) and when rate(errors[6h])/rate(total[6h]) exceeds 6 * 0.001 = 0.006 (high), then route each alert by severity.
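The threshold arithmetic can be checked with a few lines of Python. This is a sketch only: the function name and the severity labels are illustrative, not part of Prometheus or Alertmanager.

```python
def error_ratio_threshold(slo: float, burn_rate: float) -> float:
    """Error ratio at which the given burn rate is reached for this SLO."""
    return burn_rate * (1.0 - slo)


SLO = 0.999
# Burn-rate multipliers from the text above; severity names are assumptions.
BURN_RATES = {"critical": 14.4, "high": 6.0}

thresholds = {
    severity: error_ratio_threshold(SLO, rate)
    for severity, rate in BURN_RATES.items()
}

for severity, threshold in thresholds.items():
    print(f"{severity}: alert when error ratio > {threshold:.4f}")
```

Changing `SLO` to 0.99 scales both thresholds by ten, which is why the multipliers rather than the raw ratios are the values worth remembering.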

Incident Impact Assessment

During an incident, calculate the burn rate to understand urgency. A 15% error rate on a 99.9% SLO (0.1% allowed) means a 150x burn rate. That is extremely critical: at that pace a full 30-day budget is exhausted in 720 / 150 = 4.8 hours. This justifies an all-hands response.
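A quick way to run this assessment during an incident is sketched below. It assumes a 30-day (720-hour) SLO period and a full budget at the start. With part of the budget already spent, the remaining time is proportionally shorter. The function names are illustrative.

```python
def burn_rate(error_rate: float, slo: float) -> float:
    """Observed error rate divided by the allowed error rate (1 - SLO)."""
    return error_rate / (1.0 - slo)


def hours_to_exhaustion(rate: float, period_hours: float = 720.0) -> float:
    """Hours until a full budget is gone at a constant burn rate.

    period_hours defaults to 30 days, which is an assumption.
    """
    return period_hours / rate


incident_rate = burn_rate(error_rate=0.15, slo=0.999)
print(f"burn rate: {incident_rate:.0f}x")
print(f"full budget lasts: {hours_to_exhaustion(incident_rate):.1f} hours")
```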

Post-Incident Analysis

After an incident, calculate how much budget was consumed. If a 2-hour incident had a 10x burn rate, it consumed 10 × (2/720) = 2.8% of the budget for a 30-day (720-hour) period. Use this to decide if reliability work should take priority.
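The same calculation as a small helper, assuming a 30-day period and a roughly constant burn rate for the duration of the incident. The function name is illustrative.

```python
def budget_consumed(burn_rate: float, incident_hours: float,
                    period_hours: float = 720.0) -> float:
    """Fraction of the period's error budget used by one incident."""
    return burn_rate * incident_hours / period_hours


fraction = budget_consumed(burn_rate=10.0, incident_hours=2.0)
print(f"budget consumed: {fraction:.1%}")
```

Summing this value over all incidents in the period gives a rough view of where the budget went.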

SLO Tuning and Validation

Test whether your SLO is appropriate by analyzing historical burn rates. If you're consistently at 0.5x burn rate (50% budget remaining), your SLO may be too conservative. If regularly exceeding 1x, consider loosening the SLO or investing in reliability.

Why Burn Rate Matters for SRE

Better Alert Precision

Simple threshold alerts fire on any SLO violation, even brief ones. Burn rate alerts combine error rate with duration, ensuring alerts correspond to significant budget consumption. A 14.4x burn rate over 1 hour means 2% budget consumed—worthy of a page. A momentary spike that self-corrects doesn't alert unnecessarily.

Faster Detection Time

By using multiple windows (1 hour, 6 hours, 3 days), burn rate alerting catches both fast and slow burns. A 100% outage trips the 14.4x burn rate alert within minutes, while gradual degradation is caught by the slower windows before the budget is completely exhausted.
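Detection time can be estimated with a simple model. The sketch below is idealized: it assumes constant traffic and a window with no errors before the incident, and it ignores scrape intervals, rule evaluation intervals, and any `for` duration on the alert, all of which make real detection slower. The function name is illustrative.

```python
def minutes_to_fire(slo: float, error_ratio: float,
                    window_minutes: float, burn_threshold: float) -> float:
    """Minutes of sustained errors before the window average crosses the threshold.

    After t minutes at `error_ratio`, an otherwise clean window averages
    error_ratio * t / window_minutes, so the crossing time is
    threshold * window_minutes / error_ratio.
    """
    threshold_ratio = burn_threshold * (1.0 - slo)
    if error_ratio <= threshold_ratio:
        return float("inf")  # the window average never exceeds the threshold
    return threshold_ratio / error_ratio * window_minutes


# Total outage against a 99.9% SLO, 1-hour window, 14.4x threshold.
print(minutes_to_fire(slo=0.999, error_ratio=1.0,
                      window_minutes=60, burn_threshold=14.4))
```

Under these assumptions the stricter the SLO, the sooner a total outage crosses the threshold, because the allowed error ratio is smaller.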

Appropriate Severity Routing

Different burn rates warrant different responses. Fast burns (14.4x) should page immediately. Medium burns (6x) may page or create urgent tickets. Slow burns (1x-3x) create tickets for next-day investigation. This prevents alert fatigue while ensuring issues are addressed appropriately.

Quick Alert Reset

Multi-window alerting uses short windows (5 min, 30 min) alongside long windows. When the issue resolves, the short window clears quickly, resetting the alert. This prevents alerts from firing for hours after an incident is resolved, reducing confusion during and after incidents.

Frequently Asked Questions

Thresholds commonly derived from Google's SRE guidance, assuming a 30-day SLO period: a 14.4x burn rate (2% of budget in 1 hour) should page immediately. A 6x burn rate (5% of budget in 6 hours) should also page. A 3x burn rate (10% of budget in 24 hours) can be a ticket. A 1x burn rate (budget depleting exactly on schedule) is a low-priority ticket. Adjust based on your on-call capacity and SLO criticality.
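These multipliers are not arbitrary: each one is the burn rate that spends a chosen fraction of the budget within a chosen window. A minimal sketch, assuming a 30-day (720-hour) period, with an illustrative function name:

```python
def burn_rate_for_budget(budget_fraction: float, window_hours: float,
                         period_hours: float = 720.0) -> float:
    """Burn rate that consumes `budget_fraction` of the budget in `window_hours`."""
    return budget_fraction * period_hours / window_hours


# (fraction of budget, window in hours) pairs from the text above.
tiers = [(0.02, 1.0), (0.05, 6.0), (0.10, 24.0)]

for fraction, window in tiers:
    rate = burn_rate_for_budget(fraction, window)
    print(f"{fraction:.0%} of budget in {window:g}h -> {rate:.1f}x")
```

With a different SLO period, pass `period_hours` explicitly. The multipliers change, so 14.4x and 6x should not be copied unchanged to a 7-day or 90-day SLO.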

Single-window burn rate alerts have poor reset time. A 1-hour window continues alerting for an hour after the incident resolves. Multi-window alerting adds a short window (e.g., 5 minutes) that must also exceed threshold. This ensures alerts reset quickly when the issue is resolved while maintaining detection accuracy.
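The multi-window condition is a logical AND of two burn rates. The sketch below uses hypothetical burn-rate readings to show why the alert resets quickly. It models only the decision logic, not how Prometheus evaluates rules.

```python
def should_alert(long_window_burn: float, short_window_burn: float,
                 threshold: float) -> bool:
    """Fire only while both the long and the short window exceed the threshold."""
    return long_window_burn > threshold and short_window_burn > threshold


THRESHOLD = 14.4

# Hypothetical readings during an incident: both windows are elevated.
during_incident = should_alert(long_window_burn=20.0, short_window_burn=25.0,
                               threshold=THRESHOLD)

# Hypothetical readings shortly after recovery: the 1-hour window still
# carries the incident, but the 5-minute window has already dropped.
after_recovery = should_alert(long_window_burn=16.0, short_window_burn=0.2,
                              threshold=THRESHOLD)

print(during_incident, after_recovery)
```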

Error rate is absolute (e.g., 0.5% of requests fail). Burn rate is relative to your SLO (e.g., 0.5% error rate with 0.1% allowed = 5x burn rate). Burn rate normalizes across different SLOs—a 5x burn rate is equally urgent whether your SLO is 99% or 99.99%.

A burn rate below 1x means you're consuming budget slower than allowed—your service is more reliable than required. At 0.5x, you'll have 50% budget remaining at period end. This is healthy! Consider using spare budget for faster deployments or riskier experiments. If consistently very low, your SLO may be too conservative.

Request-based burn rate (error rate as % of requests) is more common and easier to measure for APIs. Time-based (% of time unavailable) works better for binary availability. Most services use request-based because partial degradation is meaningful—50% errors is different from 100% outage.
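The difference shows up clearly on partial degradation. The sketch below uses made-up per-minute samples and one possible definition of a "down" minute (failure ratio at or above a cutoff). Both the data and that definition are assumptions for illustration.

```python
# Hypothetical per-minute samples: (total_requests, failed_requests).
samples = [(1000, 0), (1000, 500), (1000, 500), (1000, 0)]


def request_based_error_rate(samples: list[tuple[int, int]]) -> float:
    """Failed requests as a fraction of all requests."""
    total = sum(t for t, _ in samples)
    failed = sum(f for _, f in samples)
    return failed / total if total else 0.0


def time_based_unavailability(samples: list[tuple[int, int]],
                              down_cutoff: float = 1.0) -> float:
    """Fraction of minutes counted as down.

    A minute is 'down' when its failure ratio >= down_cutoff; the default
    of 1.0 treats only a total outage as downtime (an assumption).
    """
    if not samples:
        return 0.0
    down = sum(1 for t, f in samples if t and f / t >= down_cutoff)
    return down / len(samples)


print(request_based_error_rate(samples))   # 2 of 4 minutes at 50% errors
print(time_based_unavailability(samples))  # no minute is a total outage
```

With these samples the request-based measure reports a 25% error rate while the binary time-based measure reports no downtime at all, which is the gap the answer above describes.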
