---
title: "Rex, Rexical and Rails routing"
description:
  "Rails routing uses its own grammar. To better understand how grammar works we
  will look at Rex and Rexical."
canonical_url: "https://www.bigbinary.com/blog/rex-rexical-and-rails-routing"
markdown_url: "https://www.bigbinary.com/blog/rex-rexical-and-rails-routing.md"
---

# Rex, Rexical and Rails routing

Rails routing uses its own grammar. To better understand how grammar works we
will look at Rex and Rexical.

- Author: Neeraj Singh
- Published: February 1, 2013
- Categories: Rails

_Please read [Journey into Rails routing](journey-into-rails-routing) to get
background on this Rails routing discussion._

## A new language

Let's say that the route definition looks like this.

```plaintext
/page/:id(/:action)(.:format)
```

The task at hand is to develop a new programming language which will understand
the rules of the route definitions. Since this language deals with `routes`,
let's call this language `Poutes`. Well, `Pout` sounds better, so let's roll
with that.

## It all begins with a scanner

[rexical](https://github.com/tenderlove/rexical) is a gem which generates
scanners; in other words, it is a scanner generator. Notice that `rexical` is
not a scanner itself. It will generate a scanner for the given rules. Let's give
it a try.

Create a folder called `pout_language` and in that folder create a file called
`pout_scanner.rex`. Notice that the extension of the file is `.rex`.

```ruby
class PoutScanner
end
```

Before we proceed any further, let's compile to make sure it works.

```plaintext
$ gem install rexical
$ rex pout_scanner.rex -o pout_scanner.rb
$ ls
pout_scanner.rb pout_scanner.rex
```

When installing the gem, do not run `gem install rex`. The gem we are
installing is called `rexical`, not `rex`.

## Time to add rules

Now it's time to add rules to our `pout_scanner.rex` file.

Let's try to develop a scanner which can detect the difference between integers
and strings.

```plaintext
class PoutScanner
rule
  \d+         { puts "Detected number" }
  [a-zA-Z]+   { puts "Detected string" }
end
```

Regenerate the scanner.

```plaintext
$ rex pout_scanner.rex -o pout_scanner.rb
```

Now let's put the scanner to the test. Let's create `pout.rb`.

```ruby
require './pout_scanner.rb'
class Pout
  @scanner = PoutScanner.new
  @scanner.tokenize("123")
end
```

You will get the error
``undefined method `tokenize' for #<PoutScanner:0x007f9630837980> (NoMethodError)``.

To fix this error, open `pout_scanner.rex` and add an `inner` section like this.

```ruby
class PoutScanner
rule
  \d+         { puts "Detected number" }
  [a-zA-Z]+   { puts "Detected string" }

inner
  def tokenize(code)
    scan_setup(code)
    tokens = []
    while token = next_token
      tokens << token
    end
    tokens
  end
end
```

Regenerate the scanner by executing `rex pout_scanner.rex -o pout_scanner.rb`.
Now let's try to run the `pout.rb` file.

```plaintext
$ ruby pout.rb
Detected number
```

So this time we got some result.

Now let's test for a string.

```ruby
require './pout_scanner.rb'

class Pout
  @scanner = PoutScanner.new
  @scanner.tokenize("hello")
end

# $ ruby pout.rb
# Detected string
```

So the scanner correctly distinguishes strings from integers. We are going to
add a lot more tests, so let's create a test file so that we do not have to keep
changing the `pout.rb` file.

## Tests and Rake file

This is our `pout_test.rb` file.

```ruby
require 'test/unit'
require './pout_scanner'

class PoutTest  < Test::Unit::TestCase
  def setup
    @scanner = PoutScanner.new
  end

  def test_standalone_string
    assert_equal [[:STRING, 'hello']], @scanner.tokenize("hello")
  end
end
```

And this is our `Rakefile`.

```ruby
require 'rake'
require 'rake/testtask'

task :generate_scanner do
  `rex pout_scanner.rex -o pout_scanner.rb`
end

task :default => [:generate_scanner, :test_units]

desc "Run basic tests"
Rake::TestTask.new("test_units") { |t|
  t.pattern = '*_test.rb'
  t.verbose = true
  t.warning = true
}
```

Also, let's change the `pout_scanner.rex` file to return an array instead of
calling `puts`. The array contains the type of the element and its value.

```ruby
class PoutScanner
rule
  \d+         { [:INTEGER, text.to_i] }
  [a-zA-Z]+   { [:STRING, text] }

inner
  def tokenize(code)
    scan_setup(code)
    tokens = []
    while token = next_token
      tokens << token
    end
    tokens
  end
end
```
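
To make the rules and the `inner` section less magical, here is a minimal
plain-Ruby sketch of roughly what a scanner for these two rules does. It is not
the code `rex` generates: the class name `SketchScanner` and its error message
are assumptions made for illustration, and it relies only on `StringScanner`
from Ruby's standard library.

```ruby
require 'strscan'

# Hypothetical stand-in for the generated PoutScanner (illustration only).
class SketchScanner
  # Wrap the input in a StringScanner, which tracks the current position.
  def scan_setup(code)
    @ss = StringScanner.new(code)
  end

  # Try each rule at the current position and return the first token found.
  # Returns nil once the whole input has been consumed.
  def next_token
    return nil if @ss.eos?

    if (text = @ss.scan(/\d+/))
      [:INTEGER, text.to_i]
    elsif (text = @ss.scan(/[a-zA-Z]+/))
      [:STRING, text]
    else
      raise "can not match: '#{@ss.rest}'"
    end
  end

  # Same loop as the inner section of pout_scanner.rex.
  def tokenize(code)
    scan_setup(code)
    tokens = []
    while (token = next_token)
      tokens << token
    end
    tokens
  end
end

p SketchScanner.new.tokenize("hello")
p SketchScanner.new.tokenize("123")
```

Each rule in the `.rex` file corresponds to one branch here: a regular
expression to try, and a block whose return value becomes the token.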

With all this setup in place, all we need to do is write a test and run `rake`.

## Tests for integer

I added the following test and it passed.

```ruby
def test_standalone_integer
  assert_equal [[:INTEGER, 123]], @scanner.tokenize("123")
end
```

However, the following test failed.

```ruby
def test_string_and_integer
  assert_equal [[:STRING, 'hello'], [:INTEGER, 123]], @scanner.tokenize("hello 123")
end
```

The test fails with the following message:

```plaintext
  1) Error:
test_string_and_integer(PoutTest):
PoutScanner::ScanError: can not match: ' 123'
```

Notice that in the error message there is a space before `123`. The scanner
does not know how to handle a space. Let's fix that.

Here is the updated rule. We do not want any action to be taken when whitespace
is detected, so the `\s+` rule has no action block. Now the test passes.

```ruby
class PoutScanner
rule
  \s+
  \d+         { [:INTEGER, text.to_i] }
  [a-zA-Z]+   { [:STRING, text] }

inner
  def tokenize(code)
    scan_setup(code)
    tokens = []
    while token = next_token
      tokens << token
    end
    tokens
  end
end
```
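
A rule with no action block means the matched text is consumed but no token is
produced. As a rough illustration in plain Ruby, `StringScanner#skip` can play
the role of the action-less `\s+` rule. The method name `tokenize_with_spaces`
is hypothetical, and this is a sketch of the idea rather than the generated
code.

```ruby
require 'strscan'

# Hypothetical helper, not part of rexical: tokenizes integers and words,
# silently discarding whitespace between them.
def tokenize_with_spaces(code)
  ss = StringScanner.new(code)
  tokens = []
  until ss.eos?
    # Like the action-less \s+ rule: advance past whitespace, emit nothing.
    next if ss.skip(/\s+/)

    if (text = ss.scan(/\d+/))
      tokens << [:INTEGER, text.to_i]
    elsif (text = ss.scan(/[a-zA-Z]+/))
      tokens << [:STRING, text]
    else
      raise "can not match: '#{ss.rest}'"
    end
  end
  tokens
end

p tokenize_with_spaces("hello 123")
```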

## Back to routing business

Now that we have some background on how scanning works, let's get back to the
business at hand. The task is to properly parse a routing statement like
`/page/:id(/:action)(.:format)`.

## Test for slash

The simplest route is just `/`. Let's write a test and then a rule for it.

```ruby
require 'test/unit'
require './pout_scanner'

class PoutTest  < Test::Unit::TestCase
  def setup
    @scanner = PoutScanner.new
  end

  def test_just_slash
    assert_equal [[:SLASH, '/']], @scanner.tokenize("/")
  end

end
```

And here is the `.rex` file.

```ruby
class PoutScanner
rule
  \/         { [:SLASH, text] }

inner
  def tokenize(code)
    scan_setup(code)
    tokens = []
    while token = next_token
      tokens << token
    end
    tokens
  end
end
```

## Test for /page

Here is the test for `/page`.

```ruby
def test_slash_and_literal
  assert_equal [[:SLASH, '/'], [:LITERAL, 'page']] , @scanner.tokenize("/page")
end
```

And here is the rule that was added.

```ruby
 [a-zA-Z]+  { [:LITERAL, text] }
```

## Test for /:page

Here is the test for `/:page`.

```ruby
def test_slash_and_symbol
  assert_equal [[:SLASH, '/'], [:SYMBOL, ':page']] , @scanner.tokenize("/:page")
end
```

And here are the rules.

```ruby
rule
  \/          { [:SLASH, text]   }
  \:[a-zA-Z]+ { [:SYMBOL, text]  }
  [a-zA-Z]+   { [:LITERAL, text] }
```

## Test for /(:page)

Here is the test for `/(:page)`.

```ruby
def test_symbol_with_paran
  assert_equal  [[[:SLASH, '/'], [:LPAREN, '('],  [:SYMBOL, ':page'], [:RPAREN, ')']]] , @scanner.tokenize("/(:page)")
end
```

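And here is the new rule. Because a single rule emits all four tokens at once,
the test expects them wrapped in one extra array.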

```ruby
  \/\(\:[a-z]+\) { [ [:SLASH, '/'], [:LPAREN, '('], [:SYMBOL, text[2..-2]], [:RPAREN, ')']] }
```
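
Where this rule is placed matters. The sketch below assumes the scanner tries
rules from top to bottom and takes the first one that matches at the current
position, which is consistent with the final `pout_scanner.rex` listing this
rule first. The helper `first_matching_rule` and the rule name `:PAREN_SYMBOL`
are hypothetical, written in plain Ruby for illustration.

```ruby
require 'strscan'

# Hypothetical helper: returns the name of the first rule whose pattern
# matches at the start of the input, trying rules in the order given.
def first_matching_rule(rules, input)
  ss = StringScanner.new(input)
  rules.each do |name, pattern|
    return name if ss.match?(pattern)
  end
  nil
end

PAREN_SYMBOL = [:PAREN_SYMBOL, /\/\(\:[a-z]+\)/]
SLASH        = [:SLASH, /\//]

# Specific rule first: the whole "/(:page)" is claimed by one rule.
p first_matching_rule([PAREN_SYMBOL, SLASH], "/(:page)")

# Generic rule first: "/" is consumed on its own, and the scanner would
# then be left with "(:page)", which none of the article's rules can match.
p first_matching_rule([SLASH, PAREN_SYMBOL], "/(:page)")
```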

We'll stop here and look at the final set of files.

## Final files

This is the `Rakefile`.

```ruby
require 'rake'
require 'rake/testtask'

task :generate_scanner do
  `rex pout_scanner.rex -o pout_scanner.rb`
end

task :default => [:generate_scanner, :test_units]

desc "Run basic tests"
Rake::TestTask.new("test_units") { |t|
  t.pattern = '*_test.rb'
  t.verbose = true
  t.warning = true
}
```

This is `pout_scanner.rex`.

```ruby
class PoutScanner
rule
  \/\(\:[a-z]+\) { [ [:SLASH, '/'], [:LPAREN, '('], [:SYMBOL, text[2..-2]], [:RPAREN, ')']] }
  \/          { [:SLASH, text]   }
  \:[a-zA-Z]+ { [:SYMBOL, text]  }
  [a-zA-Z]+   { [:LITERAL, text] }

inner
  def tokenize(code)
    scan_setup(code)
    tokens = []
    while token = next_token
      tokens << token
    end
    tokens
  end
end
```

This is `pout_test.rb`.

```ruby
require 'test/unit'
require './pout_scanner'

class PoutTest  < Test::Unit::TestCase
  def setup
    @scanner = PoutScanner.new
  end

  def test_just_slash
    assert_equal [[:SLASH, '/']] , @scanner.tokenize("/")
  end

  def test_slash_and_literal
    assert_equal [[:SLASH, '/'], [:LITERAL, 'page']] , @scanner.tokenize("/page")
  end

  def test_slash_and_symbol
    assert_equal [[:SLASH, '/'], [:SYMBOL, ':page']] , @scanner.tokenize("/:page")
  end

  def test_symbol_with_paran
    assert_equal  [[[:SLASH, '/'], [:LPAREN, '('],  [:SYMBOL, ':page'], [:RPAREN, ')']]] , @scanner.tokenize("/(:page)")
  end
end
```

## How the scanner works

Here we used `rex` to generate the scanner. Now take a look at the generated
`pout_scanner.rb`. Here is [that file](https://gist.github.com/4672018).
Please take a look at this file and study the code. It is only 91 lines of code.

If you look at the code, it is clear that scanning is not that hard. You can
hand roll it without using a tool like `rex`. And that's exactly what Aaron
Patterson did in [Journey](http://github.com/rails/journey). He hand rolled
the
[scanner](https://github.com/rails/journey/blob/master/lib/journey/scanner.rb).
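
To give a feel for what hand rolling involves, here is a minimal sketch of a
hand-written scanner built on `StringScanner`. It is not Journey's code: the
class name `HandRolledScanner`, the `:DOT` token, and the choice to emit
parentheses as separate flat tokens are assumptions made for this example.

```ruby
require 'strscan'

# Hypothetical hand-written scanner (illustration only, not Journey's code).
class HandRolledScanner
  # Each pattern is tried in order at the current position.
  RULES = [
    [:SLASH,   /\//],
    [:LPAREN,  /\(/],
    [:RPAREN,  /\)/],
    [:DOT,     /\./],
    [:SYMBOL,  /:[a-zA-Z]+/],
    [:LITERAL, /[a-zA-Z]+/]
  ].freeze

  def tokenize(route)
    ss = StringScanner.new(route)
    tokens = []
    until ss.eos?
      # Find the first rule that matches here, without consuming input yet.
      rule = RULES.find { |_type, pattern| ss.match?(pattern) }
      raise "can not match: '#{ss.rest}'" unless rule

      type, pattern = rule
      # Consume the matched text and record it as a [type, text] token.
      tokens << [type, ss.scan(pattern)]
    end
    tokens
  end
end

HandRolledScanner.new.tokenize("/page/:id(/:action)(.:format)").each do |token|
  p token
end
```

Emitting one token per character class keeps every rule tiny and leaves the
question of which tokens belong together inside parentheses to a later parsing
step.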

## Conclusion

In this blog we saw how to use `rex` to build a scanner that reads our routing
statements. In the next blog we'll see how to parse the routing statement and
how to find the matching routing statement for a given URL.

