March 21, 2017
This blog is part of our Ruby 2.4 series.
In Ruby, we commonly call the uniq method on an array to fetch the collection of all its unique elements. But there may be cases where we need the entries of a hash that are unique by value.
Let's consider an example of countries that have hosted the Olympics. We only want to know when each country hosted it for the first time.
# given object
{ 1896 => 'Athens',
1900 => 'Paris',
1904 => 'Chicago',
1906 => 'Athens',
1908 => 'Rome' }
# expected outcome
{ 1896 => 'Athens',
1900 => 'Paris',
1904 => 'Chicago',
1908 => 'Rome' }
One way to achieve this is to have a collection of unique country names and then check if that value is already taken while building the result.
olympics =
{ 1896 => 'Athens',
1900 => 'Paris',
1904 => 'Chicago',
1906 => 'Athens',
1908 => 'Rome' }
unique_nations = olympics.values.uniq
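# delete returns the element the first time it is seen and nil afterwards,
# so only the first entry for each country is kept
olympics.select { |year, country| !unique_nations.delete(country).nil? }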
#=> {1896=>"Athens", 1900=>"Paris", 1904=>"Chicago", 1908=>"Rome"}
As we can see, the above code requires constructing an additional array, unique_nations.
When processing larger data sets, building a large array in memory and then carrying out further processing on it may result in performance and memory issues.
In Ruby 2.4, the Enumerable module introduces the uniq method, which collects unique elements while iterating over the enumerable object.
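To make "collects unique elements while iterating" concrete, here is a minimal plain-Ruby sketch of the idea. The first_by helper is hypothetical and only approximates the behaviour; it is not how Ruby implements uniq internally.

```ruby
# first_by is a hypothetical helper, not part of Ruby.
# It keeps the first element seen for each distinct key in a single pass.
def first_by(enumerable)
  seen = {}    # keys already encountered
  result = []  # elements kept, in original order
  enumerable.each do |element|
    # Use the block's return value as the key if a block was given,
    # otherwise compare the element itself.
    key = block_given? ? yield(element) : element
    next if seen.key?(key)
    seen[key] = true
    result << element
  end
  result
end

numbers = first_by([3, 1, 3, 2, 1])
#=> [3, 1, 2]
words = first_by(%w[apple avocado banana]) { |word| word[0] }
#=> ["apple", "banana"]
```

The sketch still has to remember every key it has seen, but it does not need a separate pass over the values first.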
The usage is similar to that of Array#uniq. Uniqueness can be determined by the elements themselves or by the value returned by the block passed to the uniq method.
olympics = {1896 => 'Athens', 1900 => 'Paris', 1904 => 'Chicago', 1906 => 'Athens', 1908 => 'Rome'}
olympics.uniq { |year, country| country }.to_h
#=> {1896=>"Athens", 1900=>"Paris", 1904=>"Chicago", 1908=>"Rome"}
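The to_h call is needed because uniq on a hash returns an array of [key, value] pairs rather than a hash. The sketch below shows that intermediate result, and also the block-less form on an object that only includes Enumerable. HostLog is a hypothetical class made up for this illustration; it assumes Ruby 2.4 or later.

```ruby
olympics = { 1896 => 'Athens', 1906 => 'Athens', 1908 => 'Rome' }

# uniq on a hash yields [key, value] pairs and returns an array of them.
pairs = olympics.uniq { |year, country| country }
#=> [[1896, "Athens"], [1908, "Rome"]]

# HostLog is a hypothetical class used only for illustration.
# Including Enumerable and defining each is enough to get uniq.
class HostLog
  include Enumerable

  def initialize(entries)
    @entries = entries
  end

  def each(&block)
    @entries.each(&block)
  end
end

log = HostLog.new(%w[Athens Paris Chicago Athens Rome])

# Without a block, the elements themselves are compared.
unique_hosts = log.uniq
#=> ["Athens", "Paris", "Chicago", "Rome"]
```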
A similar method is also implemented in the Enumerator::Lazy class. Hence we can now call uniq on lazy enumerators.
(1..Float::INFINITY).lazy.uniq { |x| (x**2) % 10 }.first(6)
#=> [1, 2, 3, 4, 5, 10]
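Because the enumerator is lazy, elements are pulled from the source only until enough unique values have been found. The following sketch records which inputs were actually consumed; the seen array is introduced here purely for illustration.

```ruby
seen = [] # records every input the lazy chain actually consumed

result = (1..Float::INFINITY).lazy
  .map { |x| seen << x; x % 3 }
  .uniq
  .first(3)
#=> [1, 2, 0]

# Only the first three inputs were needed to find three unique values.
consumed = seen
#=> [1, 2, 3]
```

One caveat with infinite sources: if we ask for more unique values than the source can ever produce, for example first(4) in the sketch above where only three distinct remainders exist, the call never returns.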