I wanted to define equality functions for unit testing purposes. The general pattern is that I’ll call a method that returns a complex data structure. I’ll want to test if the data structure matches the sample data structure in my test. Frequently the definition of Object.equals() won’t test equality in the way I wants to – it will depend on identity. I’ve defined an equality function that generally provides what I need, and can be easily specialised. There are quite a few things to consider, so I thought I’d write this article about it.

It’s well worth reading this guide to Java equality. It explains that the contract for Object.equals() requires that it is reflexive, symmetric, transitive and consistent. If you don’t know what all that means, have a read.

**Design objectives**

– Object identity should be irrelevent (but see graph isomorphism later)

– Implementation should be independent of Object.equals() because tests should not depend on the implementation of equality provided by the classes being tested.

– The Object.equals() contract should be satisfied. This means the function gives intuitive results, and also that it can be used to implement equals.

– Circular data structures can be evaluated. Object.equals() will fail on circular data structures (stack overflow), and I want to be able to compare circular data structures. For now, I’ve decided not to worry about stack overflow in general – if your data structure is too deep, a stack overflow will result. This is a future requirement.

– Easy specialization. It should be easy to add support for classes that need special handling.

– Support for collections, including the more specific equality contracts of the collections interfaces.

I looked at EqualsBuilder. This wasn’t suitable for for me because it uses Object.equals() on member variables. For the same reason it can’t handle circular data structures.

**Object equality**

Two objects are considered equal if they are the same type, and their fields are equal.

**Equivalence classes**

If I say something like ‘Lists are an equivalence class’ I mean Java classes implementing the Java List interface form an equivalence class. The various Java collections classes define equals using equivalence classes, so if I wanted to use my equals to verify that a new class satisfies the List.equals() contract, List must be an equivalence class.

**Collections**

I can’t define Collection equality, since because of the existence of Set and List, there is no useful reflexive definition. Instead I introduced the Bag interface.

Bag

I needed Bag semantics for equality – no dependence on order, but items can occur more than once, and equality requires the same number of occurrences of each. Java doesn’t have a Bag interface. I’ve ended up creating a Bag interface, and AbstractBag with an appropriate implementation of Object.equals(Object), as well as a Bag wrapper around a list.

So bag is an equivalence class. Bag equality is O(N^2).

List

Lists are an equivalence class. List equality is as required by List.equals(Object), and is O(N)

Set

Sets are another equivalence class.

I can’t use Set.contains(Object), since it uses Object.equals(Object) on its members. Actually, a java.util.Set might not be a set from our point of view at all.

There are several ways forward here:

1. Treat java.util.Set as Bag. If both arguments are in fact sets, this gives the correct answer. If they aren’t sets, it still seems to give a sensible answer.

2. Assume the set is a set. This makes sense if we reguard our definition of equals as being more restrictive than Object.equals, but that’s not true for the default implementation of Object.equals().

3. Detect when the arguments are not sets, and throw an exception

Anyway, for now, I’m going with 1.

Note: If you implement your own Set, using a comparator, watch out for AbstractSet.removeAll(Collection) and AbstractSet.containsAll(Collection) which may or may not use your comparator, depending on the size of the Collection argument!

SortedSet

SortedSet isn’t an equivalence class – we don’t do anything special for SortedSet. That’s because it uses the same definition of equals as Set. That is, SortedSet equality doesn’t depend on the Comparator. The API docs are careful to state that if Comparator for a set does not agree with equals of its members, then it does not satisfy the Set contract.

Map

Maps are an equivalence class. Map equality uses Java equality for keys. The alternative is to define equality for Map.Entry, and then test the equality of the entry set. This is O(N^2) rather than O(N) and doesn’t really make sense because Map.get(key) might return different results for the same key, even though the maps evaluate as equal.

**Circular data structures**

When testing the equality of two objects, we assume the equality of the objects when testing their members. This allows us to safely test circular data structures.

**Pluggable equality**

You can plug in your own Equality function to tackle your own types, requirements, or standard types I haven’t considered. Perhaps the easiest way to do this is using multi-methods. In fact, I’ve included a base class with the default behaviour described here, which you can subclass, and then use dynamic dispatch to generate your Equality.

**Identity**

This algorithm does not tackle graph isomorphism: Object identity is ignored (except as described in circular data structures above). For an immutable data structures, this is probably irrelevant. For a mutable data structure, there are cases where two objects are evaluated as equal, but applying the same operation to them might result in the objects not evaluating as equal. Obviously this is a problem worth addressing for Java in the future.

**Get the code**

There result of all this can be found in the LShift Java library.

## Comment

You may notice I didn’t mention arrays anywhere above. I’ve only just realized, and added support in the code. Get the new version here.