---
title: "Solr, Sunspot, Websolr and Delayed job"
description:
  "Handling Solr with Delayed job has a few gotchas which we are going to look
  into in this blog."
canonical_url: "https://www.bigbinary.com/blog/solr-sunspot-websolr-delayed-job"
markdown_url: "https://www.bigbinary.com/blog/solr-sunspot-websolr-delayed-job.md"
---

# Solr, Sunspot, Websolr and Delayed job

Handling Solr with Delayed Job has a few gotchas, which we are going to look
into in this blog post.

- Author: Neeraj Singh
- Published: October 11, 2012
- Categories: Rails

[Solr](http://lucene.apache.org/solr/) is an open source search platform from
Apache. It has a very powerful full-text search capability among other things.

Solr is written in Java and runs as a standalone search server inside a
servlet container such as Tomcat. When you are working on a Ruby on Rails
application, you usually do not want to maintain a Tomcat server. This is where
[websolr](https://websolr.com) comes into the picture. Websolr manages the index,
and the Rails application interacts with the index using a gem called
[sunspot_rails](https://github.com/outoftime/sunspot_rails).

## Getting started

```ruby
# Gemfile
gem 'sunspot_rails', '= 1.3.3' # search feature
```

Here I am interested in searching products.

```ruby
class Product < ActiveRecord::Base
  searchable do
    text :name, boost: 1.5
    text :description
  end
end
```

## Using sunspot gem

```bash
rails g sunspot_rails:install
```

The above command creates the `config/sunspot.yml` file. By default, this file
looks like the following.

```yaml
production:
  solr:
    hostname: localhost
    port: 8983
    log_level: WARNING

development:
  solr:
    hostname: localhost
    port: 8982
    log_level: INFO

test:
  solr:
    hostname: localhost
    port: 8981
    log_level: WARNING
```

The way sunspot works is that after every single web request it commits to Solr
the changes that took place in that request. This is not desirable. To turn
that off, set the `auto_commit_after_request` option to `false` in the
`config/sunspot.yml` file.

I would also change the `log_level` for development to `DEBUG`. The revised
`config/sunspot.yml` file would look like this.

```yaml
production:
  solr:
    hostname: localhost
    port: 8983
    log_level: WARNING
    auto_commit_after_request: false

development:
  solr:
    hostname: localhost
    port: 8982
    log_level: DEBUG
    auto_commit_after_request: false

test:
  solr:
    hostname: localhost
    port: 8981
    log_level: WARNING
    auto_commit_after_request: false
```

## Taking care of callbacks

In the above case, any time I create, update or destroy a product, Solr commands
are issued as part of the `after_save` callback. Since `after_save` callbacks
run inside the ActiveRecord transaction, this slows down the create, update and
destroy operations. I would like all of this work to happen in the background.

Here is how I handled it.

```ruby
class Product < ActiveRecord::Base
  searchable do
    text :name, boost: 1.5
    text :description
  end
  handle_asynchronously :solr_index, queue: 'indexing', priority: 50
  handle_asynchronously :solr_index!, queue: 'indexing', priority: 50
  handle_asynchronously :remove_from_index, queue: 'indexing', priority: 50
end
```

In the above case I used
[Delayed Job](https://github.com/collectiveidea/delayed_job) but you can use any
background job processing tool.

In Delayed Job, a higher priority value means a lower priority: jobs with
smaller values are picked up first. By bumping the priority value to 50, I'm
making sure that emails and other background jobs are processed before the Solr
work is taken up.
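
To make the ordering concrete, here is a minimal plain-Ruby sketch of the rule. It does not use Delayed Job at all; the `Job` struct, the `pick_order` helper and the job names are made up for illustration, and the real ordering is done by Delayed Job when it reserves jobs from its backend.

```ruby
# Illustration only: a hypothetical job with a name and a priority value.
Job = Struct.new(:name, :priority)

# Returns job names in the order a worker would take them:
# ascending by priority value, so smaller values come first.
def pick_order(jobs)
  jobs.sort_by(&:priority).map(&:name)
end

jobs = [
  Job.new("reindex product", 50),
  Job.new("send welcome email", 0),
  Job.new("generate report", 10)
]

puts pick_order(jobs)
```

With these values the reindexing job comes last, which is the effect the `priority: 50` setting is after.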

## Problem with `remove_from_index`

In the above case the call to `remove_from_index` has been deferred to Delayed
Job. However, by the time the job runs, the record has already been destroyed.
When Delayed Job takes up the work, it first tries to retrieve the record. The
record is missing, so the background job fails.
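
The failure can be reproduced in miniature without Rails. The sketch below is only a stand-in for what happens: `FakeTable` and `ReloadingJob` are hypothetical names, and the assumption is that the deferred job remembers just the record's id and looks the record up again when it runs.

```ruby
# Hypothetical in-memory stand-in for a database table; not part of Rails.
class FakeTable
  def initialize
    @rows = {}
  end

  def insert(id, attributes)
    @rows[id] = attributes
  end

  def delete(id)
    @rows.delete(id)
  end

  # Raises KeyError when the row is gone, much like a failed record lookup.
  def find(id)
    @rows.fetch(id)
  end
end

# A deferred job that stores only the id and reloads the row when it runs.
ReloadingJob = Struct.new(:table, :id) do
  def perform
    row = table.find(id)
    "removed #{row[:name]} from index"
  end
end

def run_after_destroy
  table = FakeTable.new
  table.insert(1, { name: "Widget" })
  job = ReloadingJob.new(table, 1) # enqueued while the row still exists
  table.delete(1)                  # the record is destroyed first
  job.perform                      # the worker runs later and cannot find it
rescue KeyError => e
  "job failed: #{e.class}"
end

puts run_after_destroy
```

The job has everything it needs except the data itself, which suggests that the data should travel with the job.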

Here is how we solved this problem.

```ruby
class Product < ActiveRecord::Base
  searchable do
    text :name, boost: 1.5
    text :description
  end
  handle_asynchronously :solr_index, queue: 'indexing', priority: 50
  handle_asynchronously :solr_index!, queue: 'indexing', priority: 50

  def remove_from_index_with_delayed
    Delayed::Job.enqueue(
      RemoveIndexJob.new(record_class: self.class.to_s, attributes: self.attributes),
      queue: 'indexing',
      priority: 50
    )
  end
  alias_method_chain :remove_from_index, :delayed
end
```

Add another worker named `remove_index.rb`.

```ruby
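# Removes a destroyed record from the Solr index without loading it from the database.
class RemoveIndexJob < Struct.new(:options)
  def perform
    return if options.nil?
    options.symbolize_keys!
    # Rebuild an unsaved copy of the record from the attributes captured before destroy.
    record = options[:record_class].constantize.new options[:attributes].except("id")
    # Set the id separately so the copy points at the same Solr document.
    record.id = options[:attributes]["id"]
    # Call the original synchronous method; the aliased one would enqueue another job.
    record.remove_from_index_without_delayed
  end
end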
```
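
The `remove_from_index_without_delayed` method used in the job is not written out anywhere; it is created by `alias_method_chain :remove_from_index, :delayed`. As a rough plain-Ruby equivalent, using only `alias_method` and a hypothetical `Indexable` class with placeholder return values, the chain boils down to this:

```ruby
# Hypothetical class standing in for a searchable model.
class Indexable
  # Stands in for the original method provided by sunspot.
  def remove_from_index
    "removed synchronously"
  end

  # Stands in for the replacement that enqueues a background job.
  def remove_from_index_with_delayed
    "enqueued for later"
  end

  # Roughly what `alias_method_chain :remove_from_index, :delayed` does:
  # keep the original under a new name, then point the old name at the replacement.
  alias_method :remove_from_index_without_delayed, :remove_from_index
  alias_method :remove_from_index, :remove_from_index_with_delayed
end

record = Indexable.new
puts record.remove_from_index                 # prints "enqueued for later"
puts record.remove_from_index_without_delayed # prints "removed synchronously"
```

Note that `alias_method_chain` was deprecated in Rails 5 in favor of `Module#prepend`, so on newer Rails versions the same wrapping would need to be written differently.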

## Connecting to websolr

It was not clear from the websolr documentation that the sunspot gem first looks
for an environment variable called `WEBSOLR_URL`. If that environment variable
has a value, sunspot assumes that the Solr index is at that URL. If no value is
found, it assumes that it is dealing with a local Solr instance.

So if you are using websolr, make sure that the `WEBSOLR_URL` environment
variable is properly configured for your application in the staging and
production environments.
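
The lookup order can be pictured with a small plain-Ruby sketch. This is a simplified illustration of the behavior described above, not sunspot's actual code: the `solr_url` helper, the fallback URL and the `index.example.com` address are all assumptions made for the example.

```ruby
# Simplified illustration: prefer WEBSOLR_URL, otherwise fall back to a local Solr.
# `env` is any hash-like object, such as ENV; `fallback` is an assumed local URL.
def solr_url(env, fallback = "http://localhost:8983/solr")
  url = env["WEBSOLR_URL"]
  return fallback if url.nil? || url.empty?

  url
end

# No variable set: the local instance is used.
puts solr_url({})

# Variable set (hypothetical URL): the hosted index is used.
puts solr_url({ "WEBSOLR_URL" => "http://index.example.com/solr/abc" })
```

In an application you would pass `ENV` itself, which makes it easy to see why a missing variable in staging or production silently points the app at `localhost`.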

