What you didn't think you needed to know about hashCode and equals
This article is a repost of my comments on the Stack Overflow question about how to implement hashCode and equals.
There are some issues worth noticing if you’re implementing hashCode and equals for classes that are persisted using an Object-Relational Mapper (ORM) like Hibernate. As if this topic weren’t stupidly overcomplicated already!
Lazy loaded objects are subclasses
If your objects are persisted using an ORM, in many cases you will be dealing with dynamic proxies that avoid loading objects from the data store too early. These proxies are implemented as subclasses of your own class. This means that the commonly recommended check this.getClass() == o.getClass() will return false. For example:
Person saved = new Person("John Doe");
Long key = dao.save(saved);
dao.flush();
Person retrieved = dao.retrieve(key);
saved.getClass().equals(retrieved.getClass());
// Returns false if retrieved is a lazily loaded proxy
If you’re dealing with an ORM, using o instanceof Person is the only thing that will behave correctly.
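To see the difference without an ORM on the classpath, here is a minimal sketch in which a hand-written subclass stands in for the proxy. PersonProxy and InstanceofEqualsDemo are hypothetical names used only for illustration; a real ORM generates its proxy classes at runtime.

```java
import java.util.Objects;

class InstanceofEqualsDemo {
    // Hypothetical entity whose equals uses instanceof, so subclasses can match.
    static class Person {
        private final String name;

        Person(String name) { this.name = name; }

        public String getName() { return name; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Person)) return false;
            Person other = (Person) o;
            return Objects.equals(getName(), other.getName());
        }

        @Override
        public int hashCode() { return Objects.hashCode(getName()); }
    }

    // Assumption: stands in for the subclass an ORM would generate for lazy loading.
    static class PersonProxy extends Person {
        PersonProxy(String name) { super(name); }
    }

    public static void main(String[] args) {
        Person saved = new Person("John Doe");
        Person retrieved = new PersonProxy("John Doe");

        System.out.println(saved.getClass() == retrieved.getClass()); // false
        System.out.println(saved.equals(retrieved));                  // true
        System.out.println(retrieved.equals(saved));                  // true
    }
}
```

An equals based on getClass() would have reported these two objects as different, even though they represent the same person.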
Lazy loaded objects have null-fields
ORMs usually use the getters to force loading of lazy loaded objects. This means that person.name will be null if person is lazy loaded, even after person.getName() forces loading and returns “John Doe”. In my experience, this crops up often in hashCode and equals.
If you’re dealing with an ORM, make sure to always use getters, and never field references, in hashCode and equals.
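The effect can be imitated with a hand-written proxy. This is only a sketch: LazyPersonProxy, fieldEquals and getterEquals are hypothetical names, and the proxy simply delegates its getter to a target object, which is an assumption about how lazy loading behaves rather than the code of any particular ORM.

```java
import java.util.Objects;

class LazyFieldDemo {
    static class Person {
        private String name;

        Person() { }

        Person(String name) { this.name = name; }

        public String getName() { return name; }

        // Fragile: reads the fields directly, so a proxy's null field is compared.
        boolean fieldEquals(Person other) {
            return Objects.equals(this.name, other.name);
        }

        // Safer: goes through the getters, which a proxy can intercept.
        boolean getterEquals(Person other) {
            return Objects.equals(getName(), other.getName());
        }
    }

    // Assumption: imitates a lazy proxy. Its own name field is never set;
    // the getter delegates to the object that would come from the data store.
    static class LazyPersonProxy extends Person {
        private final Person target;

        LazyPersonProxy(Person target) { this.target = target; }

        @Override
        public String getName() { return target.getName(); }
    }

    public static void main(String[] args) {
        Person saved = new Person("John Doe");
        Person lazy = new LazyPersonProxy(new Person("John Doe"));

        System.out.println(saved.fieldEquals(lazy));  // false, lazy.name is null
        System.out.println(saved.getterEquals(lazy)); // true
    }
}
```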
Saving an object will change its state
Persistent objects often use an id field to hold the key of the object. This field will be automatically updated when an object is first saved. Don’t use an id field in hashCode. But you can use it in equals.
A pattern I often use is
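if (this.getId() == null) {
    // Unsaved objects are only equal to themselves
    return this == other;
} else {
    // Compare key values with equals(); == on Long compares references
    return this.getId().equals(other.getId());
}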
But: You cannot include getId() in hashCode(). If you do, when an object is persisted, its hashCode changes. If the object is in a HashSet, you’ll “never” find it again.
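The problem is easy to reproduce without a database. In this sketch, the hypothetical Child class deliberately bases hashCode on the id, and calling setId stands in for what the ORM does on first save.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class MutableHashDemo {
    static class Child {
        private Long id; // null until the object is "saved"

        Long getId() { return id; }

        void setId(Long id) { this.id = id; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Child)) return false;
            Child other = (Child) o;
            return getId() != null && getId().equals(other.getId());
        }

        // Deliberately wrong: the hash changes when the id is assigned.
        @Override
        public int hashCode() { return Objects.hashCode(getId()); }
    }

    // Returns whether the set still finds the child after its id has changed.
    static boolean foundAfterSave() {
        Set<Child> children = new HashSet<>();
        Child child = new Child();
        children.add(child); // stored under the hash of a null id
        child.setId(42L);    // simulates the ORM assigning a key on save
        return children.contains(child);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterSave()); // false
    }
}
```

The child is still in the set, but the set now looks for it under the new hash and misses it.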
In my Person example, I probably would use getName() for hashCode and getId() plus getName() (just for paranoia) for equals. It’s okay if there is some risk of “collisions” for hashCode, but never okay for equals.
hashCode should use the non-changing subset of properties from equals
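Put together, the advice in this post could look something like the following sketch. It assumes that the name never changes after construction (hence the final field), which will not hold for every entity, and PersonEqualityDemo and stillFoundAfterSave are hypothetical names used only for the illustration.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class PersonEqualityDemo {
    static class Person {
        private Long id;           // assigned on first save
        private final String name; // assumption: never changes

        Person(String name) { this.name = name; }

        public Long getId() { return id; }

        public void setId(Long id) { this.id = id; }

        public String getName() { return name; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;                // covers unsaved objects
            if (!(o instanceof Person)) return false;  // proxy-friendly type check
            Person other = (Person) o;
            if (getId() == null) return false;         // unsaved: identity only
            return getId().equals(other.getId())
                    && Objects.equals(getName(), other.getName());
        }

        // Uses only the non-changing part of what equals looks at.
        @Override
        public int hashCode() { return Objects.hashCode(getName()); }
    }

    // Returns whether the set still finds the person after an id is assigned.
    static boolean stillFoundAfterSave() {
        Set<Person> people = new HashSet<>();
        Person person = new Person("John Doe");
        people.add(person);
        person.setId(1L); // simulates the ORM assigning a key on save
        return people.contains(person);
    }

    public static void main(String[] args) {
        System.out.println(stillFoundAfterSave()); // true
    }
}
```

Two people with the same name land in the same hash bucket, which is harmless, while equals still tells them apart by id.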
Comments:
[jakob] - Nov 3, 2008
rzei raises a valid point regarding changing fields. What is your advice regarding the implementation of hashcode()? Is the only way to compute and persist some kind of unchangeable UID for each object?
In a framework I have used, persistent keys were (wastefully) created with the object, so even if the object would never be persisted, it still had a PK. So one could use the PK in hashcode implementations.
jhannes - Nov 3, 2008
I’ve never had much luck with the precreated keys. But then again, I use the convention of a null-key to indicate unsaved objects, so I’d run into other problems. FWIW, it sounds like your approach would behave correctly.
jhannes - Nov 3, 2008
Hi, rzei
The example wasn’t very clear. I’ve run into the problem in the following situation: I have a Parent object that includes many Children in a HashSet. I construct a new Parent and add Children to it, then I persist the Parent for the first time, and the HashSet is configured to be persisted as well, through cascading.
The result is that the Child objects are given an id, their hashCode changes, and I can no longer find them.
Hope this was clearer?
[rzei] - Nov 3, 2008
From a Hibernate user’s standpoint, could you give an example of a situation where you need to store a HashSet of objects of which some are persisted, some get persisted while stored in the set, and some do not?
If you declare that id shouldn’t be used in hashCode; in the same way you could say that you cannot use any fields in hashCode, as every one of those can change.
The surrogate key should be the most stable property of a persisted entity, as it changes only once.
Great post though, too few have been written on the subject.
[Tomas] - Nov 3, 2008
I agree that too few posts have been written on this subject. I’ve run into problems with equals/hashCode and persistence several times, but I haven’t found very much useful info on the topic.