October 31, 2023
This blog is part of our Rails 7 series.
The default SQLite3 Active Record adapter connection configuration has been updated in Rails 7.1 to better tune it to work for modern Rails applications.
Before moving onto the configuration changes let us understand what a PRAGMA is. PRAGMA is a special kind of SQL statement(only available in SQLite) which is used to query or manipulate various aspects of the database's behaviour and configuration.
journal_mode
is now set use WAL
instead of using the rollback journaljournal_mode is a
configuration setting in SQLite that determines how the database handles
transactions and maintains data integrity in the face of system crashes or
unexpected shutdowns. Previously, a brand new Rails application used to use the
rollback journal with DELETE
journal mode. Now it uses a much more efficient
journaling method named Write-Ahead Logging (WAL
). Let us understand the
difference between them.
Rollback journal: In this implementation, the database engine first records the original unchanged database content in a rollback journal and the writes are made directly to the database file. If the system crashes, the rollback journal can be used to restore the database to the state it was in before the transaction began. The issue with this approach is that a writer can alter the database or readers can read from the database — but not both at the same time.
Write-ahead logging: In WAL
mode, SQLite maintains a separate write-ahead
log. Instead of writing directly to the database file, changes are written to
the log first. When a reader needs a page of content, it first checks the WAL
to see if the page appears there, and if so it pulls in the latest copy of the
page in the WAL
. If no copy of the page exists in the WAL
, then the page is
read from the original database file. So this means readers and writers can work
together and there is no contention.
How is the database file made up-to-date with the WAL
file?
SQLite routinely moves the WAL
file transactions back into the database. This
process is called checkpointing. By default, SQLite does a checkpoint
automatically when the WAL
file reaches a threshold size of 1000 pages.
If a system crash occurs, the last commit record has not been written to the
WAL
file. Without the commit record, the new data is not considered valid, and
the database simply ignores it.
WAL
is a better choice for web applications because of the increased
concurrency it offers.
Here is a nice article on how WAL
works under the hood.
synchronous
is now set to NORMAL
instead of FULL
The synchronous pragma
controls how and when SQLite flushes content to disk. The two common options are
FULL
and NORMAL
, which map to “sync on every write” and “sync every 1000
written pages” respectively. FULL
synchronous is very safe but slow. When
synchronous is NORMAL
, the database engine will sync at the most critical
moments, but less often than in FULL
mode. We trade an aggressive approach to
durability for speed.
The SQLite documentation suggests using NORMAL
for applications running in
WAL
mode.
journal_size_limit
is now capped at 64MBThe
journal_size_limit
pragma tells SQLite how much of the write-ahead log data to keep in the on-disk
file. Previously, it was set to -1
which means there was no limit set on the
journal size, which allows it to grow unbounded, thereby potentially affecting
read performance. Now it is capped at an appropriate size of 64MB.
cache_size
is now set to 8MBThe cache_size pragma sets the maximum number of database disk pages that SQLite will hold in memory at once, per open database file. The default value was -2000 i.e 2000 bytes. Please note that SQLite interprets a negative value as a byte limit and positive number as a page limit. Now the cache_size is set to 2000(pages) with a default page size of 4096 bytes, which means the cache limit is ~8MB.
mmap_size
is now set to 128MBThe mmap_size pragma sets the maximum number of bytes that are set aside for memory-mapped I/O on a single database. Let us first understand what memory-mapped I/O is.
Memory-mapped (mmap) I/O is an OS-provided feature that maps the contents of a file on secondary storage into a program’s address space. The program then accesses pages via pointers as if the file resided entirely in memory. The OS transparently loads pages only when the program references them and automatically evicts pages if memory fills up. The advantage of use mmap is that it bypasses the step where we need to copy the pages from secondary to primary storage, thereby making it faster.
How is memory-mapped I/O implemented in SQLite?
SQLite accesses and updates database files using xRead()
and xWrite()
methods by default. These methods are typically implemented as read()
and
write()
system calls which causes the OS to copy disk content between the
kernel buffer cache and user space. SQLite also has the option of accessing disk
content directly using memory-mapped I/O via the xFetch()
and xUnfetch()
methods. Using the legacy xRead()
method in SQLite, a page-sized heap memory
block is allocated, and the xRead()
call copies the entire database page
content into this allocated memory. Whereas if memory mapped I/O is enabled, it
calls the xFetch()
method. The xFetch()
method asks the operating system to
return a pointer to the requested page. If the requested page has been or can be
mapped into the application address space, then xFetch()
returns a pointer to
that page for SQLite to use without having to copy anything. Skipping the copy
step is what makes memory mapped I/O faster.
The mmap_size
is the maximum number of bytes of the database file that SQLite
will try to map into the process address space at one time. Now it is set to
128MB.
With these changes, there is a considerable improvement in performance. That being said, SQLite makes a strong case for single-node production applications, as it is highly performant, especially when used in conjunction with NVMe disks.
Please check out this pull request for more details.
If this blog was helpful, check out our full blog archive.