2.2 Equality of Objects

The equality operator is a method

It is easy to check for equality for standard Ruby objects like strings, numbers, arrays and hashes. Some quick examples:
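The original listing is not reproduced here, so the following is only a minimal sketch of the kind of comparison meant, using made-up literal values:

```ruby
# Built-in objects compare by value out of the box.
puts "foo" == "foo"            # true
puts "foo" == "bar"            # false
puts 1 == 1.0                  # true
puts [1, 2, 3] == [1, 2, 3]    # true
puts({ a: 1 } == { a: 1 })     # true (parentheses keep the braces from being read as a block)
```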

Arrays, strings and hashes are standard Ruby data structures, and Ruby already implements the equality operator == for them. But how do you check for equality between objects that you define yourself? Take a look at this simple Item class:
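The class itself is not shown in this text, so here is a minimal sketch of what it might look like. The attribute names item_name and qty are taken from later in the lesson; the constructor signature and the sample values are assumptions.

```ruby
# A hypothetical Item class with no equality methods of its own.
class Item
  attr_reader :item_name, :qty

  def initialize(item_name, qty)
    @item_name = item_name
    @qty = qty
  end
end

first  = Item.new("abcd", 1)
second = Item.new("abcd", 1)

# Same state, but == is inherited from Object and compares identity.
puts first == second   # false
```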

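That is the wrong answer! Both objects have exactly the same state and behaviour (since they belong to the same class) and should have been treated as equal. The reason is that, by default, == is inherited from Object and returns true only when both operands are the very same object.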

This is simple to fix. In Ruby, most binary operators (those which take two operands), including ==, are actually methods that get invoked on the operand on the left-hand side of the operator. In practice that means a == b is the same as a.==(b).

Do not believe it yet? Code speaks louder than words:
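Here is a rough sketch of such a demonstration, assuming the same two-attribute Item as before. The == method below is deliberately wrong: it announces that it ran and then always answers false.

```ruby
class Item
  attr_reader :item_name, :qty

  def initialize(item_name, qty)
    @item_name = item_name
    @qty = qty
  end

  # Deliberately broken: prints a message, then always returns false.
  def ==(other)
    puts "Item#== was called"
    false
  end
end

a = Item.new("abcd", 1)
b = Item.new("abcd", 1)

puts a == b     # operator syntax: prints the message, then false
puts a.==(b)    # explicit method call: exactly the same method runs
```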

The == method we just defined always returns false. Now fix it to return true if the item_name and qty of your object are the same as those of the object it is being compared with.
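One possible way to do this is sketched below. The is_a? guard is an addition not asked for by the exercise; it is there so that comparing an Item with an unrelated object returns false instead of raising NoMethodError.

```ruby
class Item
  attr_reader :item_name, :qty

  def initialize(item_name, qty)
    @item_name = item_name
    @qty = qty
  end

  # Two Items are equal when both attributes match.
  def ==(other)
    other.is_a?(Item) &&
      item_name == other.item_name &&
      qty == other.qty
  end
end

puts Item.new("abcd", 1) == Item.new("abcd", 1)   # true
puts Item.new("abcd", 1) == Item.new("abcd", 2)   # false
puts Item.new("abcd", 1) == "abcd"                # false
```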

Note that you can override almost every operator like this. For instance, if you need to be able to add two Items that have the same item name, you can implement the + operator on the Item class so that it returns a new Item object holding the combined quantity of both Items.
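A minimal sketch of such a + operator follows. Raising ArgumentError when the item names differ is an assumption; the lesson does not say what should happen in that case.

```ruby
class Item
  attr_reader :item_name, :qty

  def initialize(item_name, qty)
    @item_name = item_name
    @qty = qty
  end

  # Returns a new Item; neither operand is modified.
  def +(other)
    unless other.is_a?(Item) && other.item_name == item_name
      raise ArgumentError, "can only add Items with the same item name"
    end

    Item.new(item_name, qty + other.qty)
  end
end

apples = Item.new("apple", 2) + Item.new("apple", 3)
puts apples.item_name   # apple
puts apples.qty         # 5
```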

Object equality, the eql? method and hash codes

But wait! There is more to object equality. Even though overriding == worked for simple equality comparisons, there are some cases where that just isn't enough.

In the following example, we build an array of duplicate Item objects and apply uniq on it. See what happens to the uniq:
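The sketch below reconstructs that experiment under the same assumptions as before: a two-attribute Item that overrides == and nothing else.

```ruby
class Item
  attr_reader :item_name, :qty

  def initialize(item_name, qty)
    @item_name = item_name
    @qty = qty
  end

  # Only == is overridden; eql? and hash are still the defaults from Object.
  def ==(other)
    other.is_a?(Item) &&
      item_name == other.item_name &&
      qty == other.qty
  end
end

items = [Item.new("abcd", 1), Item.new("abcd", 1), Item.new("abcd", 1)]

puts items[0] == items[1]   # true
puts items.uniq.size        # 3, the duplicates are not removed
```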

We expected Array#uniq to return only one element since the rest were duplicates; but it returned everything. Clearly, #uniq did not work. We did override the == method to return true if the items are identical and we verified that it works by comparing an Item to its clone. So, what went wrong?

The short answer is that we failed to implement two other methods that are crucial to get object equality correct: the eql? and hash methods. Why do we need these two over and above the simple == ?

There are a lot of operations in Ruby that need to check the equality of two objects. While == serves the purpose well, it can be slow. For operations that might involve a large number of equality checks (like Array#uniq and Hash lookups), that cost adds up and becomes an overhead. To get around this, Ruby provides every object with a hash method. It returns an integer that is usually different for objects that are not equal.

In the following example, we print the hash values for different objects. Take a look:
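A small sketch of what that might look like, with arbitrary sample objects. The actual numbers are not shown because they differ from one Ruby process to the next; what stays constant is that equal built-in values report equal hash codes.

```ruby
# The values printed here vary between runs of the program.
puts 1.hash
puts "abcd".hash
puts [1, 2, 3].hash
puts Object.new.hash

# Equal built-in values share a hash code within the same process.
puts "abcd".hash == "abcd".hash          # true
puts [1, 2, 3].hash == [1, 2, 3].hash    # true
```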

Do not confuse the method hash, which returns a hash code, with the data structure Hash. A hash code of an object is usually a short (and in Ruby, always numeric) identifier of an object. Hash is a data structure that uses the hash code of objects for fast key lookup and thus derives the name.

So instead of comparing two objects using ==, which could be expensive when the objects are large, Ruby uses the hash of the object when possible. Being a simple numeric value, this comparison is almost always faster than comparing the various instance variables of the underlying object.

The Array#uniq method, as you might have guessed, uses the result of hash to compare objects and identify duplicates. Let us see how this works out in practice:
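What follows is a sketch rather than the lesson's original listing. It assumes the same two-attribute Item, adds a hash method built with XOR, and adds an eql? method that announces each call before delegating to ==.

```ruby
class Item
  attr_reader :item_name, :qty

  def initialize(item_name, qty)
    @item_name = item_name
    @qty = qty
  end

  def ==(other)
    other.is_a?(Item) &&
      item_name == other.item_name &&
      qty == other.qty
  end

  # Used by Array#uniq and Hash once two hash codes match.
  def eql?(other)
    puts "Item#eql? was called"
    self == other
  end

  # Combine the hash codes of the attributes that define equality.
  def hash
    item_name.hash ^ qty.hash
  end
end

items = [Item.new("abcd", 1), Item.new("abcd", 1), Item.new("abcd", 1)]
puts items.uniq.size   # 1
```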

Array#uniq now works correctly for the item object. This is because we implemented two methods: hash and eql?.

What is the hash method doing? The ^ operator it uses is bitwise XOR. The hash method returns the result of XORing the hash codes of all the instance variables that determine the state of the object. This means that when the state of the object changes, the hash code almost always changes as well. Having distinct hash codes for distinct objects is an extremely desirable property, because it makes operations on collections faster.

We also introduced the eql? method in the above example. In fact, it was called by Array#uniq twice to check the equality of the elements of the array. Matching hash codes only make two objects candidates for equality, so eql? is called to confirm the match. Even though we use == to check for equality of objects, routines like Array#uniq use eql? instead. This means that we must implement the eql? method as well whenever we override ==. In most cases, these two methods will be identical, so you can implement the actual comparison in one method and have the other method just call it.

To summarize, if you ever override any of the ==, eql? or the hash method, you must override the others as well.
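The same rule applies when your objects are used as Hash keys. The sketch below assumes the two-attribute Item used so far; the locations hash and the "shelf A" value are made up purely for illustration.

```ruby
class Item
  attr_reader :item_name, :qty

  def initialize(item_name, qty)
    @item_name = item_name
    @qty = qty
  end

  def ==(other)
    other.is_a?(Item) &&
      item_name == other.item_name &&
      qty == other.qty
  end

  # eql? simply delegates to ==.
  def eql?(other)
    self == other
  end

  def hash
    item_name.hash ^ qty.hash
  end
end

# A hypothetical lookup table keyed by Item.
locations = { Item.new("abcd", 1) => "shelf A" }

# A different but equal Item finds the same entry.
puts locations[Item.new("abcd", 1)]        # shelf A
puts locations.key?(Item.new("abcd", 2))   # false
```

Without the eql? and hash overrides, the first lookup would return nil, because the key and the lookup object would land in different buckets.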

Wrapping up Object Equality in Ruby

Here is the final exercise in this lesson. I have an Item class which stores the item name, quantity and price. You have to implement the equality methods for this object. Remember, you have to:

Define a == method that compares the state of your object with that of the other one and returns a boolean value.
Define an eql? method that simply calls == to do the actual comparison.
Define a hash method that returns the result of XORing (using the ^ operator) the hash codes of all the instance variables which together determine the state of the object.

Go forth, brave soldier!

Hint

You need to create getter methods for all the attributes to compare the values in the current object with the other object.
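If you get stuck, here is one possible shape for a solution. It is only a sketch: the attribute names, the order of the constructor arguments and the is_a? guard are assumptions, not something the exercise prescribes.

```ruby
class Item
  # Getters let == read the other object's state.
  attr_reader :item_name, :qty, :price

  def initialize(item_name, qty, price)
    @item_name = item_name
    @qty = qty
    @price = price
  end

  def ==(other)
    other.is_a?(Item) &&
      item_name == other.item_name &&
      qty == other.qty &&
      price == other.price
  end

  def eql?(other)
    self == other
  end

  def hash
    item_name.hash ^ qty.hash ^ price.hash
  end
end

a = Item.new("abcd", 1, 100)
b = Item.new("abcd", 1, 100)

puts a == b              # true
puts a.eql?(b)           # true
puts a.hash == b.hash    # true
puts [a, b].uniq.size    # 1
```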
