This blog is part of our Rails 7 series.
The default SQLite3 Active Record adapter connection configuration has been updated in Rails 7.1 to better tune it to work for modern Rails applications.
Before moving onto the configuration changes let us understand what a PRAGMA is. PRAGMA is a special kind of SQL statement(only available in SQLite) which is used to query or manipulate various aspects of the database's behaviour and configuration.
Configuration Changes
1. journal_mode is now set use WAL instead of using the rollback journal
journal_mode is a configuration setting in SQLite that determines how the database handles transactions and maintains data integrity in the face of system crashes or unexpected shutdowns. Previously, a brand new Rails application used to use the rollback journal with DELETE journal mode. Now it uses a much more efficient journaling method named Write-Ahead Logging (WAL). Let us understand the difference between them.
Rollback journal: In this implementation, the database engine first records the original unchanged database content in a rollback journal and the writes are made directly to the database file. If the system crashes, the rollback journal can be used to restore the database to the state it was in before the transaction began. The issue with this approach is that a writer can alter the database or readers can read from the database — but not both at the same time.
Write-ahead logging: In WAL mode, SQLite maintains a separate write-ahead log. Instead of writing directly to the database file, changes are written to the log first. When a reader needs a page of content, it first checks the WAL to see if the page appears there, and if so it pulls in the latest copy of the page in the WAL. If no copy of the page exists in the WAL, then the page is read from the original database file. So this means readers and writers can work together and there is no contention.
How is the database file made up-to-date with the WAL file?
SQLite routinely moves the WAL file transactions back into the database. This process is called checkpointing. By default, SQLite does a checkpoint automatically when the WAL file reaches a threshold size of 1000 pages.
If a system crash occurs, the last commit record has not been written to the WAL file. Without the commit record, the new data is not considered valid, and the database simply ignores it.
WAL is a better choice for web applications because of the increased concurrency it offers.
Here is a nice article on how WAL works under the hood.
2. synchronous is now set to NORMAL instead of FULL
The synchronous pragma controls how and when SQLite flushes content to disk. The two common options are FULL and NORMAL, which map to “sync on every write” and “sync every 1000 written pages” respectively. FULL synchronous is very safe but slow. When synchronous is NORMAL, the database engine will sync at the most critical moments, but less often than in FULL mode. We trade an aggressive approach to durability for speed.
The SQLite documentation suggests using NORMAL for applications running in WAL mode.
3. journal_size_limit is now capped at 64MB
The journal_size_limit pragma tells SQLite how much of the write-ahead log data to keep in the on-disk file. Previously, it was set to -1 which means there was no limit set on the journal size, which allows it to grow unbounded, thereby potentially affecting read performance. Now it is capped at an appropriate size of 64MB.
4. cache_size is now set to 8MB
The cache_size pragma sets the maximum number of database disk pages that SQLite will hold in memory at once, per open database file. The default value was -2000 i.e 2000 bytes. Please note that SQLite interprets a negative value as a byte limit and positive number as a page limit. Now the cache_size is set to 2000(pages) with a default page size of 4096 bytes, which means the cache limit is ~8MB.
5. mmap_size is now set to 128MB
The mmap_size pragma sets the maximum number of bytes that are set aside for memory-mapped I/O on a single database. Let us first understand what memory-mapped I/O is.
Memory-mapped (mmap) I/O is an OS-provided feature that maps the contents of a file on secondary storage into a program’s address space. The program then accesses pages via pointers as if the file resided entirely in memory. The OS transparently loads pages only when the program references them and automatically evicts pages if memory fills up. The advantage of use mmap is that it bypasses the step where we need to copy the pages from secondary to primary storage, thereby making it faster.
How is memory-mapped I/O implemented in SQLite?
SQLite accesses and updates database files using xRead() and xWrite() methods by default. These methods are typically implemented as read() and write() system calls which causes the OS to copy disk content between the kernel buffer cache and user space. SQLite also has the option of accessing disk content directly using memory-mapped I/O via the xFetch() and xUnfetch() methods. Using the legacy xRead() method in SQLite, a page-sized heap memory block is allocated, and the xRead() call copies the entire database page content into this allocated memory. Whereas if memory mapped I/O is enabled, it calls the xFetch() method. The xFetch() method asks the operating system to return a pointer to the requested page. If the requested page has been or can be mapped into the application address space, then xFetch() returns a pointer to that page for SQLite to use without having to copy anything. Skipping the copy step is what makes memory mapped I/O faster.
The mmap_size is the maximum number of bytes of the database file that SQLite will try to map into the process address space at one time. Now it is set to 128MB.
With these changes, there is a considerable improvement in performance. That being said, SQLite makes a strong case for single-node production applications, as it is highly performant, especially when used in conjunction with NVMe disks.
Please check out this pull request for more details.