---
title: "Rails 6 adds support for Optimizer Hints"
description: "Rails 6 has added support for Optimizer Hints"
canonical_url: "https://www.bigbinary.com/blog/rails-6-supports-optimizer-hints"
markdown_url: "https://www.bigbinary.com/blog/rails-6-supports-optimizer-hints.md"
---

# Rails 6 adds support for Optimizer Hints

Rails 6 has added support for Optimizer Hints

- Author: Vishal Telangre
- Published: July 30, 2019
- Categories: Rails 6, Rails

Rails 6 has added support to provide optimizer hints.

## What is Optimizer Hints?

Many relational database management systems (RDBMS) have a query optimizer. The
job of the query optimizer is to determine the most efficient and fast plan to
execute a given SQL query. Query optimizer has to consider all possible query
execution plans before it can determine which plan is the optimal plan for
executing the given SQL query and then compile and execute that query.

An optimal plan is chosen by the query optimizer by calculating the cost of each
possible plans. Typically, when the number of tables referenced in a join query
increases, then the time spent in query optimization grows exponentially which
often affects the system's performance. The fewer the execution plans the query
optimizer needs to evaluate, the lesser time is spent in compiling and executing
the query.

As an application designer, we might have more context about the data stored in
our database. With the contextual knowledge about our database, we might be able
to choose a more efficient execution plan than the query optimizer.

This is where the optimizer hints or optimizer guidelines come into picture.

Optimizer hints allow us to control the query optimizer to choose a certain
query execution plan based on the specific criteria. In other words, we can hint
the optimizer to use or ignore certain optimization plans using optimizer hints.

Usually, optimizer hints should be provided only when executing a complex query
involving multiple table joins.

Note that the optimizer hints only affect an individual SQL statement. To alter
the optimization strategies at the global level, there are different mechanisms
supported by different databases. Optimizer hints provide finer control over
other mechanisms which allow altering optimization plans by other means.

Optimizer hints are supported by many databases such as
[MySQL](https://dev.mysql.com/doc/refman/8.0/en/optimizer-hints.html),
[PostgreSQL with the help of `pg_hint_plan` extension](https://pghintplan.osdn.jp/pg_hint_plan.html),
[Oracle](https://docs.oracle.com/en/database/oracle/oracle-database/12.2/tgsql/influencing-the-optimizer.html),
[MS SQL](https://docs.microsoft.com/en-us/sql/t-sql/queries/hints-transact-sql-query?view=sql-server-2017),
[IBM DB2](https://www.ibm.com/support/knowledgecenter/en/SSEPGG_11.1.0/com.ibm.db2.luw.admin.perf.doc/doc/c0070117.html),
etc. with varying syntax and options.

## Optimizer Hints in Rails 6

Before Rails 6, we have to execute a raw SQL query to use the optimizer hints.

```ruby
query = "SELECT
            /*+ JOIN_ORDER(articles, users) MAX_EXECUTION_TIME(60000) */
            articles.*
         FROM articles
         INNER JOIN users
         ON users.id = articles.user_id
         WHERE (published_at > '2019-02-17 13:15:44')
        ".squish

ActiveRecord::Base.connection.execute(query)
```

In the above query, we provided two optimizer hints to MySQL .

```plaintext
/*+ HINT_HERE ANOTHER_HINT_HERE ... */
```

Another approach to use optimizer hints prior to Rails 6 is to
[use a monkey patch like this](https://gist.github.com/kamipo/4c8539f0ce4acf85075cf5a6b0d9712e).

In Rails 6, using optimizer hints is easier.

The same example looks like this in Rails 6.

```ruby
Article
  .joins(:user)
  .where("published_at > ?", 2.months.ago)
  .optimizer_hints(
    "JOIN_ORDER(articles, users)",
    "MAX_EXECUTION_TIME(60000)"
  )
```

This produces the same SQL query as above but the result is of type
`ActiveRecord::Relation`.

In PostgreSQL (using the `pg_hint_plan` extension), the optimizer hints have a
different syntax.

```ruby
Article
  .joins(:user)
  .where("published_at > ?", 2.months.ago)
  .optimizer_hints("Leading(articles users)", "SeqScan(articles)")
```

Please checkout the documentation of each database separately to learn the
support and syntax of optimizer hints.

To learn more, please
[checkout this PR](https://github.com/rails/rails/pull/35615) which introduced
the `#optimization_hints` method to Rails 6.

## Bonus example: Using optimizer hints to speedup a slow SQL statement in MySQL

Consider that we have `articles` table with some indexes.

```ruby
class CreateArticles < ActiveRecord::Migration[6.0]
  def change
    create_table :articles do |t|
      t.string :title, null: false
      t.string :slug, null: false
      t.references :user
      t.datetime :published_at
      t.text :description

      t.timestamps

      t.index :slug, unique: true
      t.index [:published_at]
      t.index [:slug, :user_id]
      t.index [:published_at, :user_id]
      t.index [:title, :slug]
    end
  end
end
```

Let's try to fetch all the articles which have been published in the last 2
months.

```ruby
>> Article.joins(:user).where("published_at > ?", 2.months.ago)
# Article Load (10.5ms)  SELECT `articles`.* FROM `articles` INNER JOIN `users` ON `users`.`id` = `articles`.`user_id` WHERE (published_at > '2019-02-17 11:38:18.647296')
=> #<ActiveRecord::Relation [#<Article id: 20, title: "Article 20", slug: "article-20", user_id: 1, ...]>
```

Let's use `EXPLAIN` to investigate why it is taking 10.5ms to execute this
query.

```ruby
>> Article.joins(:user).where("published_at > ?", 2.months.ago).explain
# Article Load (13.9ms)  SELECT `articles`.* FROM `articles` INNER JOIN `users` ON `users`.`id` = `articles`.`user_id` WHERE (published_at > '2019-02-17 11:39:05.380577')
=> # EXPLAIN for: SELECT `articles`.* FROM `articles` INNER JOIN `users` ON `users`.`id` = `articles`.`user_id` WHERE (published_at > '2019-02-17 11:39:05.380577')
# +--------+----------+----------------+-----------+------+----------+-------+
# | select |   table  | possible_keys  | key       | rows | filtered | Extra |
# | _type  |          |                |           |      |          |       |
# +--------+----------+----------------+-----------+------+----------+-------+
# | SIMPLE |   users  | PRIMARY        | PRIMARY   | 2    | 100.0    | Using |
# |        |          |                |           |      |          | index |
# +--------+----------+----------------+-----------+------+----------+-------+
# | SIMPLE | articles | index          | index     | 9866 | 10.0     | Using |
# |        |          | _articles      | _articles |      |          | where |
# |        |          | _on_user_id,   | _on       |      |          |       |
# |        |          | index          | _user_id  |      |          |       |
# |        |          | _articles      |           |      |          |       |
# |        |          | _on            |           |      |          |       |
# |        |          | _published_at, |           |      |          |       |
# |        |          | index          |           |      |          |       |
# |        |          | _articles      |           |      |          |       |
# |        |          | _on            |           |      |          |       |
# |        |          | _published_at  |           |      |          |       |
# |        |          | _and_user_id   |           |      |          |       |
# +--------+----------+----------------+-----------+------+----------+-------+
```

According to the above table, it appears that the query optimizer is considering
`users` table first and then the `articles` table.

The `rows` column indicates the estimated number of rows the query optimizer
must examine to execute the query.

The
[`filtered`](https://dev.mysql.com/doc/refman/8.0/en/explain-output.html#explain_filtered)
column indicates an estimated percentage of table rows that will be filtered by
the table condition.

The formula `rows x filtered` gives the number of rows that will be joined with
the following table.

Also,

- For `users` table, the number of rows to be joined with the following table is
  `2 x 100% = 2`,
- For `articles` table, the number of rows to be joined with the following table
  is `500 * 7.79 = 38.95`.

Since the `articles` tables contain more records which references very few
records from the `users` table, it would be better to consider the `articles`
table first and then the `users` table.

We can hint MySQL to consider the `articles` table first as follows.

```ruby
>> Article.joins(:user).where("published_at > ?", 2.months.ago).optimizer_hints("JOIN_ORDER(articles, users)")
# Article Load (2.2ms)  SELECT `articles`.* FROM `articles` INNER JOIN `users` ON `users`.`id` = `articles`.`user_id` WHERE (published_at > '2019-02-17 11:54:06.230651')
=> #<ActiveRecord::Relation [#<Article id: 20, title: "Article 20", slug: "article-20", user_id: 1, ...]>
```

Note that it took 2.2ms now to fetch the same records by providing
`JOIN_ORDER(articles, users)` optimization hint.

Let's try to `EXPLAIN` what changed by using this `JOIN_ORDER(articles, users)`
optimization hint.

```ruby
>> Article.joins(:user).where("published_at > ?", 2.months.ago).optimizer_hints("JOIN_ORDER(articles, users)").explain
# Article Load (4.1ms)  SELECT /*+ JOIN_ORDER(articles, users) */ `articles`.* FROM `articles` INNER JOIN `users` ON `users`.`id` = `articles`.`user_id` WHERE (published_at > '2019-02-17 11:55:24.335152')
=> # EXPLAIN for: SELECT /*+ JOIN_ORDER(articles, users) */ `articles`.* FROM `articles` INNER JOIN `users` ON `users`.`id` = `articles`.`user_id` WHERE (published_at > '2019-02-17 11:55:24.335152')
# +--------+----------+----------------+-----------+------+----------+--------+
# | select |   table  | possible_keys  | key       | rows | filtered | Extra  |
# | _type  |          |                |           |      |          |        |
# +--------+----------+----------------+-----------+------+----------+--------+
# | SIMPLE | articles | index          | index     | 769  | 100.0    | Using  |
# |        |          | _articles      | _articles |      |          | index  |
# |        |          | _on_user_id,   | _on       |      |          | condi  |
# |        |          | index          | _publish  |      |          | tion;  |
# |        |          | _articles      | ed_at,    |      |          | Using  |
# |        |          | _on            |           |      |          | where  |
# |        |          | _published_at, |           |      |          |        |
# |        |          | index          |           |      |          |        |
# |        |          | _articles      |           |      |          |        |
# |        |          | _on            |           |      |          |        |
# |        |          | _published_at  |           |      |          |        |
# |        |          | _and_user_id   |           |      |          |        |
# +--------+----------+----------------+-----------+------+----------+--------+
# | SIMPLE | users    | PRIMARY        | PRIMARY   | 2    | 100.0    | Using  |
# |        |          |                |           |      |          | index  |
# +--------+----------+----------------+-----------+------+----------+--------+
```

The result of the `EXPLAIN` query shows that the `articles` table was considered
first and then the `users` table as expected. We can also see that the
`index_articles_on_published_at` index key was considered from the possible keys
to execute the given query. The `filtered` column for both tables shows that the
number of filtered rows was 100% which means no filtering of rows occurred.

We hope this example helps in understanding how to use `#explain` and
`#optimization_hints` methods in order to investigate and debug the performance
issues and then fixing it.

## Links

- [Human page](https://www.bigbinary.com/blog/rails-6-supports-optimizer-hints)
