Scaling Rails Series

April 22, 2025

Rails makes it easy to get started with development. You don't even have to set up a database, because Rails comes configured for SQLite. Install Rails and you can start developing.

The same goes for deploying to production. You need to change your database, but other than that Rails comes with sane defaults. You don't really need to know what RAILS_MAX_THREADS and WEB_CONCURRENCY are. However, as your application starts getting more traffic, you will want to scale it.
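As a rough illustration of what those two settings control, here is a minimal sketch. It assumes RAILS_MAX_THREADS is the number of threads per Puma process and WEB_CONCURRENCY is the number of Puma worker processes, which is how they are commonly wired into config/puma.rb. The helper name `puma_capacity` and the fallback values are hypothetical and chosen only for illustration; the defaults generated by your Rails version may differ.

```ruby
# Hypothetical helper, not part of Rails or Puma.
# It reads the two settings and reports how many requests
# the web tier could be working on at the same moment.
def puma_capacity(env = ENV)
  # Threads per Puma process (fallback of 3 is an assumption).
  threads = Integer(env.fetch("RAILS_MAX_THREADS", "3"))

  # Puma worker processes (fallback of 1 is an assumption).
  # A value of 0 means Puma runs as a single process, so count it as 1.
  workers = [Integer(env.fetch("WEB_CONCURRENCY", "1")), 1].max

  {
    workers: workers,
    threads_per_worker: threads,
    max_in_flight_requests: workers * threads
  }
end

# Example with explicit values instead of the real environment.
puts puma_capacity("RAILS_MAX_THREADS" => "5", "WEB_CONCURRENCY" => "2")
```

With two processes and five threads each, the sketch reports room for ten in-flight requests. How many of those actually run in parallel depends on the GVL, which Part 1 covers.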

Over the last 13 years of consultancy at BigBinary we have seen all types of applications.

We have seen Rails applications that are IO heavy, such as ones that scrape websites. There are applications that are heavy on background jobs. Then there are flash sale sites where there is no traffic one minute and tons of traffic the next. Then there are ticketing sites.

Each application has its own challenges.

If an application is not properly tuned, we can run into all kinds of issues. We can run out of memory or database connections. In some cases we have seen Sidekiq jobs fail because of a wrong configuration while new jobs kept getting enqueued, which made the situation worse.
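To make the database connection problem concrete, here is a back-of-the-envelope sketch. It assumes every web thread and every Sidekiq thread can hold at most one database connection at a time, and it ignores anything that opens extra connections (for example load_async or additional databases). The method names and the numbers are hypothetical; Part 4 looks at the real calculation in detail.

```ruby
# Hypothetical estimate of the worst-case number of database
# connections, assuming one connection per thread.
def max_db_connections(web_processes:, web_threads:, sidekiq_processes:, sidekiq_concurrency:)
  web_connections = web_processes * web_threads
  job_connections = sidekiq_processes * sidekiq_concurrency
  web_connections + job_connections
end

# True when the worst case would exceed what the database allows.
# `reserved` is room kept aside for consoles, migrations and so on.
def connections_exhausted?(needed, database_limit, reserved: 0)
  needed > database_limit - reserved
end

needed = max_db_connections(
  web_processes: 4,        # illustrative numbers only
  web_threads: 5,
  sidekiq_processes: 2,
  sidekiq_concurrency: 10
)

puts "Worst case connections: #{needed}"
puts "Over the limit of 30?   #{connections_exhausted?(needed, 30, reserved: 5)}"
```

The point of the sketch is that scaling up processes or threads on either the web or the background side multiplies the connections needed, so the database limit has to be checked whenever those numbers change.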

To solve these kinds of issues, we need to know what's actually happening underneath. And that's what we will do. We will look under the hood to see how Rails works, how Puma works, how connection pooling works, how to tune Sidekiq, and how to measure what we need to know to make decisions.

It's a journey to understand, from the ground up, how to scale Rails applications. This page will link to all the posts in the series as they are published. You can follow the Scaling Rails series by joining the newsletter or following us on Twitter or LinkedIn. We even have an RSS feed.

Part 1 - Understanding Puma, Concurrency, and the Effect of the GVL on Performance

How Puma handles requests
Default Puma configuration in a new Rails application
Web applications and CPU usage
CPU bound or IO bound
Concurrency vs Parallelism
Understanding the GVL
GVL dictates how many processes you will need
Thread switching
Visualizing the effect of the GVL

Part 2 - Amdahl's Law: The Theoretical Relationship Between Speedup and Concurrency

Amdahl's law
Relationship between speedup gained and the number of threads
Ideal number of threads in a process
Request queue time

Part 3 - Finding ideal number of threads per process using GVL Instrumentation

GVL instrumentation using perfm
Determining the I/O workload of an application using the GVL data
Empirically determining the ideal number of Puma threads for an application

Part 4 - Understanding Active Record Connection Pooling

Database connection pooling
Active Record connection pool implementation
Connection pool configuration options
Active Record connection pool reaper
How many database connections will the web and background processes utilize at maximum?
How does using load_async affect the connection usage?
Setting database pool size configuration
PgBouncer
Tracking down ActiveRecord::ConnectionTimeoutError
Monitoring Active Record connection pool stats

Part 5 - Understanding Queueing Theory

Queueing systems
Basic terminology in queueing theory
Little's law
The knee curve
Theoretical parallelism
Concurrency and effective parallelism
