Skip to content

Progressive Feature Rollout

⏱ 35 minutes intermediate

Deploying a new feature to 100% of users at once is a bet that nothing will go wrong. When something does — a performance regression, an edge case crash, a UX confusion — every user is affected simultaneously. Rollback means a full redeployment under pressure.

Progressive rollout replaces this all-or-nothing approach with controlled exposure. Feature flags gate access to the new feature. You start with 1% of traffic, monitor key metrics, then increase the percentage at each stage. If metrics degrade at any stage, you reduce the flag percentage without deploying code.

┌────────────────────────────────────────────────┐
│ Feature Experimentation SDK │
│ Application checks flag for each request │
└───────────────────────┬────────────────────────┘
┌─────────────┼─────────────┐
▼ ▼ ▼
┌────────────┐ ┌──────────┐ ┌──────────┐
│ Stage 1 │ │ Stage 2 │ │ Stage 3 │
│ 1% users │ │ 10% │ │ 50% │
│ internal │ │ canary │ │ broad │──► 100%
└──────┬─────┘ └────┬─────┘ └────┬─────┘
│ │ │
▼ ▼ ▼
┌────────────────────────────────────────┐
│ Monitoring Gates │
│ Error rate < 0.1% │
│ P95 latency < 200ms │
│ No critical alerts │
└────────────────────────────────────────┘

In Feature Experimentation, create a flag for your new feature. Define variables for any configuration values the feature needs (API endpoints, display thresholds, copy strings). This lets you adjust behavior without code changes.

Set the default state to off. No user sees the feature until you explicitly enable it.

Wrap the new feature code path behind the flag check. Users who evaluate to “on” see the new feature. Users who evaluate to “off” see the existing behavior. Both code paths must be production-ready.

Ensure the flag check is fast and handles SDK failures gracefully — if the SDK cannot reach the server, the default (off) state should apply.

StageAudienceTrafficDurationGate criteria
InternalEmployee email domain100% of employees2 daysNo critical bugs reported
CanaryRandom 1%1% of all users3 daysError rate < 0.1%, latency < 200ms
LimitedRandom 10%10% of all users5 daysSame metrics + no support tickets
BroadRandom 50%50% of all users3 daysConversion rate neutral or positive
FullEveryone100%PermanentRemove flag in next release

At each stage, check your monitoring dashboard for:

  • Error rates — any increase in exceptions related to the feature area
  • Latency — P95 response time for endpoints the feature touches
  • Business metrics — conversion rate, engagement, revenue per session
  • Support volume — new tickets mentioning the feature area

Only advance to the next stage when all gate criteria pass for the full stage duration.

If metrics breach thresholds at any stage, reduce the flag percentage to 0%. This immediately stops exposing users to the new feature without deploying code. Investigate the issue, fix it, redeploy, and restart the rollout from Stage 1.

Use progressive rollout for any user-facing feature change that could impact performance, conversion, or user experience. Skip it for backend refactors with no user-visible effect, or for urgent hotfixes where the risk of not deploying outweighs the risk of deploying broadly.