How I scaled a fitness platform to 60,000 users — solo

The database indexes, Redis caching, and N+1 fixes that took Buffdudes from slow to stable at 60K users.

By Hamza Ghouri

When I started on Buffdudes, it was a small workout-tracking app. By the time it crossed 60,000 users, I was still the only engineer on the backend: a Node.js, TypeScript, and Express service running against PostgreSQL on AWS. Scaling alone forces a particular discipline: you can't throw people at a problem, so every fix has to be the right one. Here is what actually moved the needle.

The 4-second query

The first real fire was the workout history screen. Every time a user opened their log, we queried the user_workouts table filtered by user and ordered by date. At a few thousand rows that was instant. At tens of millions of rows it was a sequential scan, and the endpoint crept up to around 4 seconds under load. People noticed.

The fix was unglamorous and enormously effective: a composite index matching the exact shape of the query, with the user identifier first (the equality filter) and the sort column second. With the index in place, PostgreSQL could seek straight to a user's rows and walk them already ordered, instead of scanning the whole table and sorting the matches. The same query dropped from ~4s to ~200ms. No new servers, no caching layer, just letting the database do what it's good at once it had the right index.
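
As a rough sketch of what such an index and its matching query could look like: the post names only the user_workouts table, so the column names (user_id, performed_at), the index name, and the helper function below are all assumptions made for illustration.

```typescript
// Hypothetical names: only the user_workouts table comes from the article.
// The equality column (user_id) is listed first and the sort column second,
// so one index can both locate a user's rows and return them in order.
// Note: CREATE INDEX CONCURRENTLY cannot run inside a transaction block.
const createWorkoutHistoryIndex = `
  CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_user_workouts_user_date
    ON user_workouts (user_id, performed_at DESC);
`;

interface SqlQuery {
  text: string;
  values: unknown[];
}

// Builds the parameterized history query whose shape the index matches:
// filter on user_id, order by performed_at, and cap the page size.
function workoutHistoryQuery(userId: number, limit = 50): SqlQuery {
  return {
    text: `
      SELECT *
        FROM user_workouts
       WHERE user_id = $1
       ORDER BY performed_at DESC
       LIMIT $2
    `,
    values: [userId, limit],
  };
}
```

Running the same SELECT under EXPLAIN ANALYZE before and after creating the index is the quickest way to confirm the plan has moved from a sequential scan plus sort to an index scan.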

The lesson I keep relearning: before you reach for infrastructure, read the query plan. A single well-shaped index often beats a week of architecture.

Leaderboards and Redis

Leaderboards are deceptively expensive. They aggregate across the whole user base, they're read constantly, and they barely change second to second. Recomputing them on every request was pure waste.

So leaderboard queries went behind Redis. The expensive aggregation runs on a schedule, the result is cached, and the hot read path just serves from memory. Users get an instant leaderboard, and PostgreSQL stops doing the same heavy aggregation thousands of times a minute. Caching the things that are read far more often than they change is one of the highest-leverage moves available, and a leaderboard is the textbook case.
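
A minimal sketch of that split between a scheduled refresh and a cache-only read path. The store interface, key name, TTL, and function names are assumptions for illustration, not the production code; the interface stands in for whichever Redis client is in use.

```typescript
// Minimal stand-in for the two Redis operations used here (GET, and SET with
// an expiry). A real Redis client would be adapted to this shape.
interface LeaderboardStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

interface LeaderboardEntry {
  userId: number;
  score: number;
}

const LEADERBOARD_KEY = "leaderboard:global"; // hypothetical key name
const LEADERBOARD_TTL_SECONDS = 600; // hypothetical safety-net expiry

// Called by the scheduler: run the heavy aggregation once, cache the result.
async function refreshLeaderboard(
  store: LeaderboardStore,
  computeLeaderboard: () => Promise<LeaderboardEntry[]>,
): Promise<void> {
  const entries = await computeLeaderboard();
  await store.set(LEADERBOARD_KEY, JSON.stringify(entries), LEADERBOARD_TTL_SECONDS);
}

// Called on the hot read path: serve from the cache and never hit PostgreSQL.
// Returns null when the cache is cold so the caller can decide how to degrade.
async function readLeaderboard(store: LeaderboardStore): Promise<LeaderboardEntry[] | null> {
  const cached = await store.get(LEADERBOARD_KEY);
  return cached === null ? null : (JSON.parse(cached) as LeaderboardEntry[]);
}

// In-memory store for local experiments and tests (it ignores the TTL).
function createMemoryStore(): LeaderboardStore {
  const data = new Map<string, string>();
  return {
    async get(key) {
      return data.get(key) ?? null;
    },
    async set(key, value) {
      data.set(key, value);
    },
  };
}
```

The point of the design is that request volume no longer determines how often the aggregation runs; only the schedule does. The TTL is there so a stalled refresh job eventually produces a visible cold cache rather than serving stale rankings forever.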

The N+1 in the social feed

The social feed had a classic N+1 problem. Loading a feed of posts issued one query for the posts, then — for each post — additional queries for its author, comments, and likes. One feed render could fan out into dozens of round trips to the database. Latency scaled with the number of items on screen, which is exactly backwards from what you want.

Collapsing those into a small number of batched queries (load the posts, then load all the related authors and engagement counts together) cut the round trips dramatically and made feed latency flat and predictable regardless of feed length. N+1 bugs hide easily because each individual query is fast; it's the multiplication that kills you.
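
Here is one way the batched version could look, as a sketch under assumptions: the table and column names (posts, users, comments, likes), the RunQuery helper, and the row shapes are hypothetical, since the post does not show its schema.

```typescript
// Hypothetical query helper; with a PostgreSQL driver this would wrap the
// pool's query call and return the result rows.
type RunQuery = <T>(sql: string, params: unknown[]) => Promise<T[]>;

interface FeedPostRow { id: number; author_id: number; body: string }
interface FeedAuthorRow { id: number; name: string }
interface FeedCountRow { post_id: number; comments: number; likes: number }

interface FeedItem {
  id: number;
  body: string;
  author: FeedAuthorRow | undefined;
  comments: number;
  likes: number;
}

// Loads a feed page in three queries total, however many posts it holds.
async function loadFeed(query: RunQuery, limit: number): Promise<FeedItem[]> {
  const posts = await query<FeedPostRow>(
    "SELECT id, author_id, body FROM posts ORDER BY created_at DESC LIMIT $1",
    [limit],
  );
  if (posts.length === 0) return [];

  const postIds = posts.map((p) => p.id);
  const authorIds = Array.from(new Set(posts.map((p) => p.author_id)));

  // One query per kind of related data, each keyed by the whole set of ids.
  const [authors, counts] = await Promise.all([
    query<FeedAuthorRow>("SELECT id, name FROM users WHERE id = ANY($1)", [authorIds]),
    query<FeedCountRow>(
      `SELECT p.id AS post_id,
              (SELECT count(*)::int FROM comments c WHERE c.post_id = p.id) AS comments,
              (SELECT count(*)::int FROM likes l WHERE l.post_id = p.id) AS likes
         FROM posts p
        WHERE p.id = ANY($1)`,
      [postIds],
    ),
  ]);

  // Stitch the results together in memory using lookup maps.
  const authorById = new Map(authors.map((a) => [a.id, a] as const));
  const countsByPost = new Map(counts.map((c) => [c.post_id, c] as const));
  return posts.map((p) => ({
    id: p.id,
    body: p.body,
    author: authorById.get(p.author_id),
    comments: countsByPost.get(p.id)?.comments ?? 0,
    likes: countsByPost.get(p.id)?.likes ?? 0,
  }));
}
```

The query count is fixed at three whether the page holds 5 posts or 50, which is what keeps the number of round trips from growing with the feed.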

Cron jobs that overlapped themselves

Finally, the background work. We had heavy recurring jobs running on cron. As data grew, some runs started taking longer than the interval between them — so a new run would kick off while the previous one was still going, and they'd step on each other. I migrated that work off raw cron into proper background services with controlled concurrency, which eliminated the overlap and made the heavy jobs safe to run at scale.
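
The post does not say how the background services enforce their concurrency limit, so the following is only an illustration of the core idea: a guard that refuses to start a run while the previous one is still in flight. The function name preventOverlap and the job name in the usage note are hypothetical.

```typescript
// Wraps a job so that a new run is skipped while the previous run is still
// in flight. This guards a single process only; with several instances, a
// shared lock (for example a PostgreSQL advisory lock) would be needed.
function preventOverlap(
  jobName: string,
  job: () => Promise<void>,
): () => Promise<"ran" | "skipped"> {
  let running = false;
  return async () => {
    if (running) {
      console.warn(`${jobName}: previous run still in progress, skipping`);
      return "skipped";
    }
    running = true;
    try {
      await job();
      return "ran";
    } finally {
      running = false; // released even if the job throws
    }
  };
}
```

A scheduler can then fire the wrapped function on a fixed interval, for example a wrapped "rebuild-stats" job every five minutes. A run that overshoots its interval causes the next tick to be skipped instead of two runs stepping on each other.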

What held it together

None of this is exotic. The whole toolkit was a composite index, a Redis cache, a batched query, and disciplined background jobs, wrapped in CI/CD through GitHub Actions deploying to AWS Elastic Beanstalk with RDS, behind Nginx. Scaling to 60K users solo wasn't about clever tricks. It was about measuring the real bottleneck and fixing that one thing, over and over.
