This blog is part of our Ruby 2.4 series.
In Ruby, we commonly use the uniq method on an array to fetch the collection of all its unique elements. But there may be cases where we need the entries of a hash selected by the uniqueness of their values.
Let's consider an example of countries that have hosted the Olympics. We only want to know when each country hosted it for the first time.
```ruby
# given object
{ 1896 => 'Athens',
  1900 => 'Paris',
  1904 => 'Chicago',
  1906 => 'Athens',
  1908 => 'Rome' }

# expected outcome
{ 1896 => 'Athens',
  1900 => 'Paris',
  1904 => 'Chicago',
  1908 => 'Rome' }
```
One way to achieve this is to have a collection of unique country names and then check if that value is already taken while building the result.
```ruby
olympics = {
  1896 => 'Athens',
  1900 => 'Paris',
  1904 => 'Chicago',
  1906 => 'Athens',
  1908 => 'Rome'
}

unique_nations = olympics.values.uniq

olympics.select { |year, country| !unique_nations.delete(country).nil? }
#=> {1896=>"Athens", 1900=>"Paris", 1904=>"Chicago", 1908=>"Rome"}
```
As we can see, the above code requires constructing an additional array unique_nations.
When processing larger data sets, loading a considerably large array into memory and then carrying out further processing on it may result in performance and memory issues.
In Ruby 2.4, the Enumerable module introduces the uniq method, which collects unique elements while iterating over the enumerable object.
The usage is similar to that of Array#uniq. Uniqueness can be determined by the elements themselves or by the value returned from the block passed to the uniq method.
```ruby
olympics = {1896 => 'Athens', 1900 => 'Paris', 1904 => 'Chicago', 1906 => 'Athens', 1908 => 'Rome'}

olympics.uniq { |year, country| country }.to_h
#=> {1896=>"Athens", 1900=>"Paris", 1904=>"Chicago", 1908=>"Rome"}
```
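The element-based form, without a block, is available on any enumerable as well. As a minimal sketch, assuming a hypothetical `HostLog` class (not part of the example above) that mixes in Enumerable and defines only `each`, both forms could be compared like this on Ruby 2.4 or later:

```ruby
# Hypothetical class for illustration: any object that includes Enumerable
# and defines #each gets #uniq on Ruby 2.4+.
class HostLog
  include Enumerable

  def initialize(entries)
    @entries = entries
  end

  # Yields each [year, host] pair in insertion order.
  def each(&block)
    @entries.each(&block)
  end
end

log = HostLog.new([[1896, 'Athens'], [1900, 'Paris'], [1906, 'Athens'], [1900, 'Paris']])

# Without a block, whole elements are compared, so only the exact
# duplicate [1900, 'Paris'] is dropped.
by_element = log.uniq
#=> [[1896, "Athens"], [1900, "Paris"], [1906, "Athens"]]

# With a block, the value returned by the block (the host) decides
# uniqueness, and the first element seen for each host is kept.
by_host = log.uniq { |_year, host| host }
#=> [[1896, "Athens"], [1900, "Paris"]]
```

In both cases the result is a plain array, which is why the hash example calls to_h to get a hash back.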
A similar method is also implemented in the Enumerator::Lazy class. Hence we can now call uniq on lazy enumerators.
```ruby
(1..Float::INFINITY).lazy.uniq { |x| (x**2) % 10 }.first(6)
#=> [1, 2, 3, 4, 5, 10]
```
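Because the lazy version pulls only as many elements as the consumer asks for, it should also suit sources that are generated on demand. A rough sketch, assuming a hypothetical `readings` enumerator that stands in for an unbounded data source:

```ruby
# Hypothetical unbounded source: cycles through a few names forever,
# yielding [index, name] pairs.
readings = Enumerator.new do |yielder|
  names = %w[alpha beta alpha gamma beta delta]
  index = 0
  loop do
    yielder << [index, names[index % names.size]]
    index += 1
  end
end

# Lazy#uniq remembers the block values it has already seen and passes
# through only the first element for each new value.
first_seen = readings.lazy.uniq { |_index, name| name }.first(4)
#=> [[0, "alpha"], [1, "beta"], [3, "gamma"], [5, "delta"]]
```

One thing to keep in mind with this sketch: the source only ever produces four distinct names, so asking for more than four unique elements would never return.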