Building Scalable APIs: Lessons from Production at Scale

Published on February 2, 2026 • 12 min read • Engineering & Architecture

There's a massive gap between building an API that works and building one that scales. After three years maintaining APIs that handle 50M+ requests per day, I've collected hard-won lessons that I wish someone had told me on day one.

Context: This post focuses on REST APIs serving web and mobile clients. The principles apply broadly, but specific implementations may vary for GraphQL, gRPC, or real-time systems.

The Database Will Be Your Bottleneck

Not "might be." Will be. Most performance issues I've debugged ultimately traced back to database queries. The good news? Most are preventable with proper design upfront.

N+1 Queries: The Silent Killer

You've probably seen this pattern:

// Get all users
const users = await db.query('SELECT * FROM users LIMIT 10');

// For each user, get their posts (N queries!)
for (const user of users) {
    user.posts = await db.query('SELECT * FROM posts WHERE user_id = ?', [user.id]);
}

That's 1 query + 10 queries = 11 database round trips. At 5ms per query, you've blown 55ms before you've even started processing. Under load, this becomes seconds.

The fix: Eager loading with joins or batching.

// Single query with JOIN
const results = await db.query(`
    SELECT users.*, posts.*
    FROM users
    LEFT JOIN posts ON posts.user_id = users.id
    WHERE users.id IN (...)
`);
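
If a JOIN returns too much duplicated data (every post row repeats its user's columns), batching is the other option: fetch the parent rows, then fetch all children in one IN query and group them in memory. A rough sketch, assuming a driver such as mysql2 that expands an array bound to an IN (?) placeholder:

// Two round trips total, regardless of how many users you load
const users = await db.query('SELECT * FROM users LIMIT 10');
const userIds = users.map(u => u.id);

// One batched query for every user's posts
const posts = await db.query('SELECT * FROM posts WHERE user_id IN (?)', [userIds]);

// Group posts by user in memory
for (const user of users) {
    user.posts = posts.filter(p => p.user_id === user.id);
}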

Index Everything You Query On

This seems obvious, but I've seen production APIs doing full table scans on millions of rows because someone forgot to add an index. Monitor your slow query logs religiously.

Query Pattern | Index Required | Why
WHERE user_id = ? | Index on user_id | Direct lookup
WHERE status = ? ORDER BY created_at | Composite index on (status, created_at) | Filter + sort in one pass
WHERE email LIKE 'john%' | Index on email | Prefix search (not %john%!)
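
For the composite case, the index has to exist before the optimizer can use it. A hypothetical migration (the posts table and its columns are just illustrative, matching the query patterns above):

// Composite index so "WHERE status = ? ORDER BY created_at" filters and sorts in one pass
await db.query(
    'CREATE INDEX idx_posts_status_created_at ON posts (status, created_at)'
);

// Verify the query actually uses it before trusting it in production
await db.query('EXPLAIN SELECT * FROM posts WHERE status = ? ORDER BY created_at', ['published']);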

Caching: Your Best Friend and Worst Enemy

Caching is how you go from surviving to thriving. But poorly implemented caching causes more bugs than slow queries ever did.

Cache Invalidation: The Hard Parts

Rule #1: If you can't invalidate it correctly, don't cache it. Stale cache is worse than no cache.

My caching strategy hierarchy:

  1. Short TTLs for everything (30-60 seconds) — your safety net
  2. Tagged invalidation — clear related keys when data changes (see the sketch after the cache-aside example below)
  3. Cache-aside pattern — app controls both read and write
  4. Monitoring cache hit rates — know when it's helping

// Cache-aside pattern
async function getUser(userId) {
    const cacheKey = `user:${userId}`;
    
    // Try cache first
    const cached = await redis.get(cacheKey);
    if (cached) return JSON.parse(cached);
    
    // Cache miss - hit the database (db.query returns rows, so take the first one)
    const rows = await db.query('SELECT * FROM users WHERE id = ?', [userId]);
    const user = rows[0] || null;
    
    // Store in cache (60s TTL), but only when the user actually exists
    if (user) {
        await redis.setex(cacheKey, 60, JSON.stringify(user));
    }
    
    return user;
}
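
The cache-aside sketch covers reads. For tagged invalidation (item 2 in the list above), one rough approach, assuming the same ioredis-style client as above and illustrative key names, is to record every cache key under a tag in a Redis set so related keys can be deleted together when the underlying data changes:

// Write-through helper: cache the value and remember its key under a tag
async function cacheWithTag(tag, key, ttlSeconds, value) {
    await redis.setex(key, ttlSeconds, JSON.stringify(value));
    await redis.sadd(`tag:${tag}`, key);
}

// On data change, drop every key registered under the tag
async function invalidateTag(tag) {
    const keys = await redis.smembers(`tag:${tag}`);
    if (keys.length > 0) {
        await redis.del(...keys);
    }
    await redis.del(`tag:${tag}`);
}

// e.g. after updating user 42: await invalidateTag('user:42');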

Rate Limiting: Protect Yourself from Yourself

Rate limiting isn't just about protecting against bad actors. It's about preventing one client (or one buggy deploy) from taking down your entire service.

Multi-Tier Limits

Apply limits at more than one tier (per user or API key, per IP, and per endpoint), and return proper headers so clients can self-regulate:

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 87
X-RateLimit-Reset: 1707542400
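
As a rough sketch of where those headers come from, here is a fixed-window limiter written as Express-style middleware, reusing the Redis client from earlier. The limit, window, key format, and req.userId (assumed to be set by your auth layer) are illustrative, not a production-exact setup:

const LIMIT = 100;          // requests allowed per window
const WINDOW_SECONDS = 60;  // window length

async function rateLimit(req, res, next) {
    // One counter per client per window (falls back to IP when there's no authenticated user)
    const windowId = Math.floor(Date.now() / 1000 / WINDOW_SECONDS);
    const key = `ratelimit:${req.userId || req.ip}:${windowId}`;

    const count = await redis.incr(key);
    if (count === 1) {
        await redis.expire(key, WINDOW_SECONDS);  // let the counter die with the window
    }

    res.set('X-RateLimit-Limit', String(LIMIT));
    res.set('X-RateLimit-Remaining', String(Math.max(0, LIMIT - count)));
    res.set('X-RateLimit-Reset', String((windowId + 1) * WINDOW_SECONDS));

    if (count > LIMIT) {
        return res.status(429).json({ error: 'Rate limit exceeded' });
    }
    next();
}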

Monitoring That Actually Matters

Don't just measure uptime. Measure experience.

The Four Golden Signals

  1. Latency: How long does it take? (P50, P95, P99)
  2. Traffic: How many requests per second?
  3. Errors: What's your error rate?
  4. Saturation: How full are your resources?
Pro tip: Alert on P95 latency, not averages. Averages hide the pain of your worst users.
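
A sketch of how the latency signal might be recorded, assuming Express and the prom-client library (metric name, labels, and buckets are illustrative):

const client = require('prom-client');

// Histogram buckets let Prometheus-style tooling compute P50/P95/P99 later
const httpDuration = new client.Histogram({
    name: 'http_request_duration_seconds',
    help: 'HTTP request latency in seconds',
    labelNames: ['method', 'route', 'status'],
    buckets: [0.01, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5],
});

app.use((req, res, next) => {
    const start = process.hrtime.bigint();
    res.on('finish', () => {
        const seconds = Number(process.hrtime.bigint() - start) / 1e9;
        httpDuration.observe(
            { method: req.method, route: req.route ? req.route.path : req.path, status: res.statusCode },
            seconds
        );
    });
    next();
});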

API Versioning: Plan for Change

You will need to make breaking changes. The only question is whether you planned for it.

My preferred approach: URL versioning with a deprecation timeline.
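
A minimal sketch of what that looks like with Express routers (v1Router, v2Router, and the sunset date are hypothetical; the Sunset header is the RFC 8594 way to announce a shutdown date):

// New clients get /v2; /v1 keeps working but advertises its end-of-life date
app.use('/v2', v2Router);

app.use('/v1', (req, res, next) => {
    res.set('Sunset', 'Wed, 01 Jul 2026 00:00:00 GMT');  // hypothetical deprecation deadline
    res.set('Link', '</v2>; rel="successor-version"');   // point clients at the replacement
    next();
}, v1Router);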

Testing at Scale

Load testing isn't optional. You need to know your breaking point before your users find it.

What to Test

Test sustained load at your expected peak, sudden traffic spikes, and behavior past the breaking point, not just a happy-path trickle. Tools like k6, Gatling, or Apache JMeter can simulate realistic load patterns. Run them regularly, not just before launch.
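
For example, a k6 script is plain JavaScript; the target URL, stages, and thresholds below are placeholders to adapt to your own traffic profile:

import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
    stages: [
        { duration: '2m', target: 100 },  // ramp up to 100 virtual users
        { duration: '5m', target: 100 },  // hold steady-state load
        { duration: '2m', target: 0 },    // ramp back down
    ],
    thresholds: {
        http_req_duration: ['p(95)<500'], // fail the run if P95 latency exceeds 500ms
        http_req_failed: ['rate<0.01'],   // ...or if more than 1% of requests error
    },
};

export default function () {
    const res = http.get('https://api.example.com/v1/users/123'); // placeholder endpoint
    check(res, { 'status is 200': (r) => r.status === 200 });
    sleep(1);
}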

The Checklist

Before you ship your next API endpoint:

  1. Queries checked for N+1 patterns, with indexes behind every filter and sort
  2. Caching has a clear invalidation path and a short-TTL safety net
  3. Rate limits applied, with X-RateLimit headers returned
  4. Latency (P95/P99), traffic, errors, and saturation on a dashboard with alerts
  5. A versioning and deprecation plan for future breaking changes
  6. Load tested against realistic traffic, not just a smoke test

Final Thoughts

Building scalable APIs is as much about discipline as it is about technology. The patterns above won't solve every problem, but they'll prevent most of the common ones.

The best advice I can give: measure everything, assume nothing, and learn from production. Your monitoring dashboard will teach you more than any blog post ever could.

Matt Forbush
Engineering Lead • Infrastructure & Performance