What I Check Before Shipping Revenue-Critical Frontend Features

The difference between a feature that works in staging and one that's production-ready is a checklist I learned the hard way.

When you're shipping checkout flows, payment processing, or anything that touches money, "it works on my machine" isn't enough. A single edge case can cost six figures in refunds and support burden.

Context: The Real Problem

Most production incidents I've debugged weren't caused by complex bugs. They were caused by unhandled states nobody thought to test:

User closes tab mid-payment
Network drops after request sent but before response
User refreshes page during processing
Backend returns a success status code, but the payload carries an error
Session expires during multi-step flow

These scenarios don't show up in happy-path testing. They show up at 2 AM when users are calling support.

Why This Matters in Production

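In a system processing $10M in transactions per month:

1% failure rate = $100K in failed or refunded payments
5-minute outage = roughly $1.2K in lost revenue at average volume ($10M over ~43,200 minutes is about $230 per minute), and more at peak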
One duplicate charge = trending on Twitter

The frontend isn't just UI. It's the last line of defense before user money moves.

My Pre-Ship Checklist

1. Edge Case Mapping

Before writing a single line of code, I enumerate every possible state:

// For a payment flow, map ALL states
type PaymentState = 
  | { status: 'idle' }
  | { status: 'validating'; field: string }
  | { status: 'submitting' }
  | { status: 'processing'; transactionId: string }
  | { status: 'succeeded'; orderId: string }
  | { status: 'failed'; reason: string; retryable: boolean }
  | { status: 'timeout'; transactionId: string }
  | { status: 'unknown'; transactionId: string }

I verify:

UI handles every state explicitly
No undefined states exist
State transitions are valid
Loading states prevent duplicate submissions

Red flags:

Boolean flags instead of enums
Implicit state in multiple variables
No handling for "unknown" state
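
One way to make these checks mechanical is to let the compiler enforce them. The sketch below reuses the PaymentState union from above. The helper names (assertNever, statusMessage, canTransition, canSubmit), the message strings, and the transition table are illustrative assumptions, not a prescribed design.

```typescript
type PaymentState =
  | { status: 'idle' }
  | { status: 'validating'; field: string }
  | { status: 'submitting' }
  | { status: 'processing'; transactionId: string }
  | { status: 'succeeded'; orderId: string }
  | { status: 'failed'; reason: string; retryable: boolean }
  | { status: 'timeout'; transactionId: string }
  | { status: 'unknown'; transactionId: string }

type Status = PaymentState['status']

// Compile-time guard: adding a status without handling it becomes a type error.
function assertNever(value: never): never {
  throw new Error(`Unhandled state: ${JSON.stringify(value)}`)
}

// Every state gets an explicit message; there is no catch-all branch to hide behind.
function statusMessage(state: PaymentState): string {
  switch (state.status) {
    case 'idle': return 'Enter your payment details'
    case 'validating': return `Checking ${state.field}`
    case 'submitting': return 'Sending payment'
    case 'processing': return 'Processing payment'
    case 'succeeded': return `Order ${state.orderId} confirmed`
    case 'failed': return state.retryable ? `${state.reason}. Please try again.` : state.reason
    case 'timeout': return 'This is taking longer than expected'
    case 'unknown': return 'We are confirming your payment. Please do not resubmit.'
    default: return assertNever(state)
  }
}

// Hypothetical transition table: which statuses may follow which.
const allowedTransitions: Record<Status, readonly Status[]> = {
  idle: ['validating', 'submitting'],
  validating: ['idle', 'submitting'],
  submitting: ['processing', 'failed'],
  processing: ['succeeded', 'failed', 'timeout', 'unknown'],
  succeeded: [],
  failed: ['idle'],
  timeout: ['succeeded', 'failed', 'unknown'],
  unknown: ['succeeded', 'failed'],
}

function canTransition(from: Status, to: Status): boolean {
  return allowedTransitions[from].indexOf(to) !== -1
}

// The submit button is only live in states where a new submission is safe.
function canSubmit(state: PaymentState): boolean {
  return state.status === 'idle' || (state.status === 'failed' && state.retryable)
}
```

Because canSubmit is derived from the single state value, "loading states prevent duplicate submissions" stops being a convention and becomes a property of the type.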

2. Failure State Enumeration

I categorize every failure mode:

Network failures:

Request never sent (offline)
Request sent, no response (timeout)
Response received, can't parse (malformed)
Response indicates retry (503, 429)

Business logic failures:

Insufficient funds
Invalid payment method
Duplicate transaction detected
Fraud check failed

System failures:

Session expired
Rate limited
Service degraded
Maintenance mode

For each failure, I define:

User-facing message
Recovery action (retry, contact support, try different method)
Analytics event
Support ticket data
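
A catalog like this can live in code so that no failure ships without a message and a recovery action. This is a minimal sketch: the FailureKind names, the message copy, the extra recovery actions (log_in_again, check_status), the analytics event naming, and the status-code mapping are all assumptions for illustration.

```typescript
type FailureKind =
  | 'offline' | 'timeout' | 'malformed_response' | 'retry_later'
  | 'insufficient_funds' | 'invalid_payment_method' | 'duplicate_transaction' | 'fraud_check_failed'
  | 'session_expired' | 'rate_limited' | 'service_degraded' | 'maintenance'

type RecoveryAction = 'retry' | 'contact_support' | 'try_different_method' | 'log_in_again' | 'check_status'

interface FailurePolicy {
  userMessage: string
  recovery: RecoveryAction
}

// Record<FailureKind, ...> forces an entry for every kind: a missing one is a compile error.
const failureCatalog: Record<FailureKind, FailurePolicy> = {
  offline: { userMessage: 'You appear to be offline. Nothing was sent.', recovery: 'retry' },
  timeout: { userMessage: 'We could not confirm your payment yet. Please do not pay again.', recovery: 'check_status' },
  malformed_response: { userMessage: 'We could not read the payment result.', recovery: 'check_status' },
  retry_later: { userMessage: 'The payment service is busy. Please try again shortly.', recovery: 'retry' },
  insufficient_funds: { userMessage: 'Your payment was declined for insufficient funds.', recovery: 'try_different_method' },
  invalid_payment_method: { userMessage: 'This payment method could not be used.', recovery: 'try_different_method' },
  duplicate_transaction: { userMessage: 'This looks like a payment you already made.', recovery: 'check_status' },
  fraud_check_failed: { userMessage: 'We could not process this payment.', recovery: 'contact_support' },
  session_expired: { userMessage: 'Your session expired. Please log in to continue.', recovery: 'log_in_again' },
  rate_limited: { userMessage: 'Too many attempts. Please wait a moment.', recovery: 'retry' },
  service_degraded: { userMessage: 'Payments are slower than usual right now.', recovery: 'retry' },
  maintenance: { userMessage: 'Payments are temporarily unavailable.', recovery: 'retry' },
}

// Assumed mapping from HTTP status to a failure kind; null means "not a known failure status".
function classifyHttpStatus(status: number): FailureKind | null {
  if (status === 401) return 'session_expired'
  if (status === 429) return 'rate_limited'
  if (status === 503) return 'retry_later'
  return null
}

// Bundle everything defined per failure: message, recovery, analytics event, support data.
function describeFailure(kind: FailureKind, transactionId?: string) {
  const policy = failureCatalog[kind]
  return {
    userMessage: policy.userMessage,
    recovery: policy.recovery,
    analyticsEvent: `payment_failure_${kind}`,
    supportTicket: { kind, transactionId: transactionId ?? null },
  }
}
```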

3. Monitoring Hooks

Before shipping, I instrument everything:

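import { useEffect } from 'react'
// useAnalytics, processPayment, and PaymentData are app-specific; import them from your own modules.

// Assumed shape of the extra fields on a payment error
type PaymentError = Error & { type?: string; retryable?: boolean }

function PaymentFlow() {
  const { track } = useAnalytics()

  // Fire once on mount so every view of the flow is counted
  useEffect(() => {
    track('payment_flow_viewed', {
      flowVersion: 'v2.1',
      timestamp: Date.now(),
    })
  }, [])

  const handleSubmit = async (data: PaymentData) => {
    const startTime = Date.now()
    track('payment_submit_attempted', { amount: data.amount })

    try {
      const result = await processPayment(data)

      track('payment_submit_succeeded', {
        orderId: result.orderId,
        duration: Date.now() - startTime,
      })

      return result
    } catch (error) {
      // In strict TypeScript the caught value is `unknown`, so narrow it before reading fields
      const e: PaymentError = error instanceof Error ? error : new Error(String(error))
      track('payment_submit_failed', {
        errorType: e.type ?? 'unknown',
        errorMessage: e.message,
        duration: Date.now() - startTime,
        retryable: e.retryable ?? false,
      })

      throw error
    }
  }

  // ...
}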

I track:

Every state transition
Time spent in each state
Errors with full context
User actions (clicks, inputs, navigation)
Performance metrics (render time, API latency)
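
Tracking every transition and the time spent in each state is easier when one helper owns both. A minimal sketch, assuming a track function with the same (event, props) shape as above; the event name payment_state_changed and the injectable clock are hypothetical choices made so the helper is testable.

```typescript
type TrackFn = (event: string, props: Record<string, unknown>) => void

// Returns a function that records each transition along with how long
// the flow spent in the state it is leaving.
function createTransitionTracker(
  track: TrackFn,
  now: () => number = () => Date.now(),
  initialState: string = 'idle',
) {
  let current = initialState
  let enteredAt = now()

  return function transition(next: string): void {
    const at = now()
    track('payment_state_changed', {
      from: current,
      to: next,
      msInPreviousState: at - enteredAt,
    })
    current = next
    enteredAt = at
  }
}
```

Calling transition('submitting') and later transition('processing') then yields one event per state change, each carrying the duration of the state just left.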

4. Feature Flags with Circuit Breakers

All revenue-critical features ship behind flags:

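// Call both hooks unconditionally, before any early return (Rules of Hooks)
const { isEnabled, config } = useFeatureFlag('new-payment-flow')
const { isHealthy } = useCircuitBreaker('payment-service')

if (!isEnabled) {
  return <LegacyPaymentFlow />
}

// Flag is on, but the circuit breaker reports the payment service as unhealthy
if (!isHealthy) {
  return <PaymentDegradedState />
}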

This allows:

Instant rollback without deploy
Gradual rollout (1% → 10% → 50% → 100%)
A/B testing with revenue metrics
Emergency kill switch
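
The useCircuitBreaker hook above is not shown, so here is one way the logic underneath such a hook could work. This is a sketch of the common closed / open / half-open pattern, not the actual implementation; the class name, the threshold and cooldown parameters, and the injectable clock are assumptions.

```typescript
type BreakerState = 'closed' | 'open' | 'half-open'

class CircuitBreaker {
  private failures = 0
  private openedAt: number | null = null

  constructor(
    private readonly failureThreshold: number,
    private readonly cooldownMs: number,
    private readonly now: () => number = () => Date.now(),
  ) {}

  getState(): BreakerState {
    if (this.openedAt === null) return 'closed'
    // After the cooldown, allow a trial request through (half-open).
    return this.now() - this.openedAt >= this.cooldownMs ? 'half-open' : 'open'
  }

  // What a hook like useCircuitBreaker could expose to the UI.
  isHealthy(): boolean {
    return this.getState() !== 'open'
  }

  recordSuccess(): void {
    this.failures = 0
    this.openedAt = null
  }

  recordFailure(): void {
    this.failures += 1
    // A failed trial request, or too many consecutive failures, (re)opens the breaker.
    if (this.getState() === 'half-open' || this.failures >= this.failureThreshold) {
      this.openedAt = this.now()
    }
  }
}
```

For example, new CircuitBreaker(3, 30000) would report unhealthy after three consecutive failures and allow a trial request 30 seconds later.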

5. Rollback Strategy

Before shipping, I document:

Rollback triggers:

Error rate > 5% for 5 minutes
P95 latency > 3 seconds
Conversion rate drops > 10%
Support tickets spike > 3x baseline
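
Writing the triggers as a pure function keeps the rollback decision unambiguous during an incident. A sketch under stated assumptions: metrics arrive as one sample per minute, "conversion rate drops > 10%" means a relative drop against baseline, and the type and field names are hypothetical.

```typescript
interface MinuteSample {
  errorRate: number       // fraction of failed requests in that minute, 0..1
  p95LatencyMs: number
  conversionRate: number  // 0..1
  supportTickets: number  // tickets opened in that minute
}

interface Baseline {
  conversionRate: number
  supportTicketsPerMinute: number
}

// Returns the list of triggers that fired; an empty list means "do not roll back".
function rollbackReasons(samples: readonly MinuteSample[], baseline: Baseline): string[] {
  const reasons: string[] = []
  const latest = samples[samples.length - 1]
  if (!latest) return reasons

  // Sustained condition: all of the last five one-minute samples must exceed 5%.
  const lastFive = samples.slice(-5)
  if (lastFive.length === 5 && lastFive.every(s => s.errorRate > 0.05)) {
    reasons.push('error rate above 5% for 5 minutes')
  }
  if (latest.p95LatencyMs > 3000) {
    reasons.push('P95 latency above 3 seconds')
  }
  if (latest.conversionRate < baseline.conversionRate * 0.9) {
    reasons.push('conversion rate dropped more than 10%')
  }
  if (latest.supportTickets > baseline.supportTicketsPerMinute * 3) {
    reasons.push('support tickets above 3x baseline')
  }
  return reasons
}
```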

Rollback process:

Toggle feature flag off (instant)
Verify metrics return to baseline
Communicate to stakeholders
Debug in staging

Data implications:

Can users resume interrupted flows?
Do we need to migrate incomplete transactions?
How do we handle users mid-flow during rollback?

6. Load Testing Revenue Paths

I test under realistic load:

# Simulate 1000 concurrent checkout attempts
k6 run --vus 1000 --duration 30s checkout-load-test.js

I verify:

No race conditions under concurrency
Request deduplication works
Rate limiting behaves correctly
Database locks don't deadlock
Error messages stay meaningful under load

Red flags:

Performance degrades non-linearly
Error rates increase with load
Timeouts occur intermittently
Memory usage grows unbounded
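
For the "request deduplication works" check, it helps to be concrete about what is being verified. Below is a minimal client-side sketch in which concurrent calls with the same key share one in-flight promise. The class name and key format are hypothetical, and this only covers a single tab; it complements backend idempotency rather than replacing it.

```typescript
class InFlightDeduper<T> {
  private readonly inFlight = new Map<string, Promise<T>>()

  // Not declared async on purpose: callers with the same key get the very same promise.
  run(key: string, request: () => Promise<T>): Promise<T> {
    const existing = this.inFlight.get(key)
    if (existing) return existing

    const promise = request()
    // Forget the key once the request settles, whether it succeeded or failed.
    const cleanup = () => { this.inFlight.delete(key) }
    promise.then(cleanup, cleanup)

    this.inFlight.set(key, promise)
    return promise
  }
}
```

A double click that calls run('checkout:cart-42', submit) twice results in one network request, and both callers observe the same outcome.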

7. Idempotency Verification

For any mutation, I verify:

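// Every mutation needs an idempotency key that survives retries and refreshes
const IDEMPOTENCY_STORAGE_KEY = 'order-idempotency-key'

function getOrderIdempotencyKey(): string {
  // Reuse the stored key so a retry or refresh sends the same key as the first attempt
  let key = sessionStorage.getItem(IDEMPOTENCY_STORAGE_KEY)
  if (!key) {
    key = crypto.randomUUID()
    sessionStorage.setItem(IDEMPOTENCY_STORAGE_KEY, key)
  }
  return key
}

async function submitOrder(orderData: OrderData) {
  const response = await api.post('/orders', orderData, {
    headers: {
      'Idempotency-Key': getOrderIdempotencyKey(),
    },
  })

  // Order confirmed: clear the key so the next order gets a fresh one
  sessionStorage.removeItem(IDEMPOTENCY_STORAGE_KEY)
  return response
}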

I test:

Duplicate submission with same key = same result
Network retry doesn't create duplicates
Browser refresh doesn't duplicate order
Back button doesn't duplicate action
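
To exercise these cases without a real backend, a small in-memory stand-in can model the contract under test. This is a hypothetical test double, not how any particular payment backend behaves: it assumes the server replays the stored result whenever it sees a key it has already processed.

```typescript
interface OrderResult {
  orderId: string
}

// Hypothetical in-memory stand-in for an idempotent POST /orders endpoint.
function createFakeOrderApi() {
  const resultsByKey = new Map<string, OrderResult>()
  let ordersCreated = 0

  return {
    submit(idempotencyKey: string): OrderResult {
      const previous = resultsByKey.get(idempotencyKey)
      if (previous) return previous // replay: same key, same result, no new order

      ordersCreated += 1
      const result: OrderResult = { orderId: `order-${ordersCreated}` }
      resultsByKey.set(idempotencyKey, result)
      return result
    },
    ordersCreated(): number {
      return ordersCreated
    },
  }
}
```

A refresh or retry test then reduces to submitting twice with the stored key and asserting that ordersCreated() is still 1.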

8. Session & Auth Edge Cases

I verify:

Session expiry mid-flow redirects to login and preserves flow state
Login redirect returns to correct flow step
Token refresh happens transparently
CSRF tokens regenerate correctly
Multi-tab behavior is safe
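
Preserving state across a login redirect can be sketched as a one-shot snapshot. The assumptions here: the store is something with the sessionStorage interface (injected so it can be faked in tests), the snapshot fields and storage key are hypothetical, and only non-sensitive data is saved.

```typescript
interface KeyValueStore {
  getItem(key: string): string | null
  setItem(key: string, value: string): void
  removeItem(key: string): void
}

interface FlowSnapshot {
  step: string
  cartId: string
  savedAt: number
}

const SNAPSHOT_KEY = 'checkout-flow-snapshot' // hypothetical storage key

// Call right before redirecting to login.
function saveFlowSnapshot(store: KeyValueStore, step: string, cartId: string, now: number): void {
  // Only non-sensitive data: never persist card numbers or CVCs.
  const snapshot: FlowSnapshot = { step, cartId, savedAt: now }
  store.setItem(SNAPSHOT_KEY, JSON.stringify(snapshot))
}

// Call after login returns. Yields null if there is nothing valid to resume.
function restoreFlowSnapshot(store: KeyValueStore, now: number, maxAgeMs: number): FlowSnapshot | null {
  const raw = store.getItem(SNAPSHOT_KEY)
  store.removeItem(SNAPSHOT_KEY) // one-shot: a stale snapshot must not resurface later
  if (raw === null) return null
  try {
    const parsed = JSON.parse(raw) as Partial<FlowSnapshot>
    if (
      typeof parsed.step !== 'string' ||
      typeof parsed.cartId !== 'string' ||
      typeof parsed.savedAt !== 'number'
    ) {
      return null
    }
    if (now - parsed.savedAt > maxAgeMs) return null // too old to trust
    return { step: parsed.step, cartId: parsed.cartId, savedAt: parsed.savedAt }
  } catch {
    return null // corrupted or non-JSON value
  }
}
```

In the browser, window.sessionStorage satisfies KeyValueStore, and because it is scoped per tab, two tabs cannot overwrite each other's snapshot.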

9. Browser Compatibility Matrix

For revenue-critical paths, I test:

Chrome (current, current-1)
Safari (current, iOS Safari)
Firefox (current)
Edge (current)

Specifically testing:

Form validation
Payment input masking
Autofill behavior
Back/forward navigation
Refresh mid-flow

10. Observability Verification

Before shipping, I verify I can answer:

How many users are in this flow right now?
What's the current success rate?
What's the P95 latency?
What's the most common error?
Can I filter by user ID to debug specific issues?
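
A quick way to confirm the instrumentation can answer these questions is to compute the answers from a sample of raw events. The sketch below assumes a hypothetical event shape with one event per completed attempt, and uses the nearest-rank definition of P95; a real metrics backend may define percentiles differently.

```typescript
interface FlowEvent {
  userId: string
  outcome: 'succeeded' | 'failed'
  durationMs: number
  errorType?: string
}

function summarize(events: readonly FlowEvent[]) {
  const successes = events.filter(e => e.outcome === 'succeeded').length
  const successRate = events.length === 0 ? null : successes / events.length

  // Nearest-rank P95: the smallest value with at least 95% of samples at or below it.
  const sorted = events.map(e => e.durationMs).sort((a, b) => a - b)
  const rank = Math.ceil((95 * sorted.length) / 100)
  const p95Ms = sorted.length === 0 ? null : sorted[rank - 1]

  // Count failures by type, then pick the most frequent one.
  const errorCounts: Record<string, number> = {}
  for (const e of events) {
    if (e.outcome === 'failed') {
      const type = e.errorType ?? 'unknown'
      errorCounts[type] = (errorCounts[type] ?? 0) + 1
    }
  }
  let mostCommonError: string | null = null
  let highest = 0
  for (const type of Object.keys(errorCounts)) {
    const count = errorCounts[type] ?? 0
    if (count > highest) {
      highest = count
      mostCommonError = type
    }
  }

  return { successRate, p95Ms, mostCommonError }
}

// Debugging a specific report: narrow the same events to one user.
function eventsForUser(events: readonly FlowEvent[], userId: string): FlowEvent[] {
  return events.filter(e => e.userId === userId)
}
```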

Tradeoffs

Comprehensive testing:

✅ Prevents costly incidents
❌ Slower shipping velocity
❌ Requires more upfront planning

Heavy instrumentation:

✅ Fast debugging in production
❌ Performance overhead
❌ Analytics data costs

Feature flags:

✅ Safe rollout & instant rollback
❌ Code complexity
❌ Technical debt if not cleaned up

Pessimistic UI (show success only after the server confirms it):

✅ No success message for a payment that later fails
❌ Feels slower to users
❌ Requires user education

What I'd Do Differently Today

If starting fresh:

Build monitoring first. Ship instrumentation before features. You can't debug what you can't see.
Write failure scenarios as tests. Use tools like Playwright to test network failures, timeouts, and race conditions.
Default to pessimistic UI. Optimistic updates are for social features, not payment flows.
Document state machines visually. A diagram prevents bugs better than code review.
Practice rollbacks. Do it in staging monthly so you're confident in production.

The goal isn't zero bugs. The goal is zero costly bugs. Every check on this list prevents a production incident that wakes someone up at 2 AM.

The difference between a senior engineer and a staff engineer is knowing which corners you can cut and which ones will cost six figures.

Revenue-critical features? Don't cut corners.