这里有一些关于JPA实体的讨论,以及应该为JPA实体类使用哪些hashCode()/equals()实现。它们中的大多数(如果不是全部)依赖于Hibernate,但是我想中立地讨论它们的jpa实现(顺便说一下,我使用的是EclipseLink)。

所有可能的实现都有其自身的优点和缺点:

hashCode()/equals()契约一致性(不可变性)用于列表/集操作 是否可以检测到相同的对象(例如来自不同会话的对象,来自惰性加载数据结构的动态代理) 实体在分离(或非持久化)状态下是否正确运行

在我看来,有三种选择:

Do not override them; rely on Object.equals() and Object.hashCode() hashCode()/equals() work cannot identify identical objects, problems with dynamic proxies no problems with detached entities Override them, based on the primary key hashCode()/equals() are broken correct identity (for all managed entities) problems with detached entities Override them, based on the Business-Id (non-primary key fields; what about foreign keys?) hashCode()/equals() are broken correct identity (for all managed entities) no problems with detached entities

我的问题是:

我是否错过了一个选择和/或赞成/反对的观点? 你选择了什么,为什么?

更新1:

通过“hashCode()/equals()是坏的”,我的意思是连续的hashCode()调用可能返回不同的值,这(当正确实现时)在对象API文档的意义上不是坏的,但是当试图从Map、Set或其他基于哈希的集合中检索更改的实体时,会导致问题。因此,JPA实现(至少是EclipseLink)在某些情况下不能正确工作。

更新2:

谢谢你的回答——大部分问题都很有质量。 不幸的是,我仍然不确定哪种方法最适合实际应用程序,或者如何确定最适合我的应用程序的方法。所以,我将保持这个问题的开放性,希望有更多的讨论和/或意见。


当前回答

Jakarta Persistence 3.0,第4.12节写道:

相同抽象模式类型的两个实体当且仅当它们具有相同的主键值时相等。

我看不出为什么Java代码的行为应该有所不同。

If the entity class is in a so called "transient" state, i.e. it's not yet persisted and it has no identifier, then the hashCode/equals methods can not return a value, they ought to blow up, ideally implicitly with a NullPointerException when the method attempts to traverse the ID. Either way, this will effectively stop application code from putting a non-managed entity into a hash-based data structure. In fact, why not go one step further and blow up if the class and identifier are equal, but other important attributes such as the version are unequal (IllegalStateException)! Fail-fast in a deterministic way is always the preferred option.

警告:也要记录下爆发行为。文档本身很重要,但它也希望能够阻止初级开发人员在未来对您的代码做一些愚蠢的事情(他们倾向于压制发生NullPointerException的地方,他们最不关心的是副作用,lol)。

哦,总是使用getClass()而不是instanceof。equals方法要求对称性。如果b等于a,那么a必须等于b。对于子类,instanceof打破了这种关系(a不是b的实例)。

尽管我个人总是使用getClass(),即使在实现非实体类(类型是状态,所以子类添加状态,即使子类是空的或只包含行为),只有当类是final时,instanceof才会很好。但实体类必须不是最终的(§2.1),所以我们真的别无选择。

Some folks may not like getClass(), because of the persistence provider's proxy wrapping the object. This might have been a problem in the past, but it really shouldn't be. A provider not returning different proxy classes for different entities, well, I'd say that's not a very smart provider lol. Generally, we shouldn't solve a problem until there is a problem. And, it seems like Hibernate's own documentation doesn't even see it worthwhile mentioning. In fact, they elegantly use getClass() in their own examples (see this).

Lastly, if one has an entity subclass that is an entity, and the inheritance mapping strategy used is not the default ("single table"), but configured to be a "joined subtype", then the primary key in that subclass table will be the same as the superclass table. If the mapping strategy is "table per concrete class", then the primary key may be the same as in the superclass. An entity subclass is very likely to be adding state and therefore just as likely to be logically a different thing. But an equals implementation using instanceof can not necessarily and secondarily rely on the ID only, as we saw may be the same for different entities.

在我看来,instanceof在非final Java类中根本没有位置。对于持久实体来说尤其如此。

其他回答

如果UUID是许多人的答案,为什么我们不使用业务层的工厂方法来创建实体并在创建时分配主键呢?

例如:

@ManagedBean
public class MyCarFacade {
  public Car createCar(){
    Car car = new Car();
    em.persist(car);
    return car;
  }
}

通过这种方式,我们可以从持久化提供程序获得实体的默认主键,并且我们的hashCode()和equals()函数可以依赖于它。

我们还可以声明Car的构造函数受保护,然后在业务方法中使用反射来访问它们。这样,开发人员就不会打算用new实例化Car,而是通过factory方法。

来说,如何?

虽然使用业务键(选项3)是最常推荐的方法(Hibernate社区wiki,“Java Persistence with Hibernate”第398页),而且这是我们最常用的方法,但Hibernate有一个错误会破坏急于获取的集:HHH-3799。在这种情况下,Hibernate可以在字段初始化之前将一个实体添加到集合中。我不确定为什么这个错误没有得到更多的关注,因为它确实使推荐的业务键方法出现了问题。

我认为问题的核心是equals和hashCode应该基于不可变状态(参考Odersky等人),而具有Hibernate管理的主键的Hibernate实体没有这样的不可变状态。当一个瞬态对象变成持久对象时,Hibernate会修改主键。当Hibernate在初始化过程中为对象补水时,业务键也会被Hibernate修改。

这就只剩下选项1了,基于对象身份继承java.lang.Object实现,或者使用James Brundege在“不要让Hibernate窃取你的身份”(Stijn Geukens的回答已经引用了)和Lance Arlaus在“对象生成:Hibernate集成的更好方法”中建议的应用程序管理的主键。

The biggest problem with option 1 is that detached instances can't be compared with persistent instances using .equals(). But that's OK; the contract of equals and hashCode leaves it up to the developer to decide what equality means for each class. So just let equals and hashCode inherit from Object. If you need to compare a detached instance to a persistent instance, you can create a new method explicitly for that purpose, perhaps boolean sameEntity or boolean dbEquivalent or boolean businessEquals.

我们通常在实体中有两个id:

仅用于持久化层(以便持久化提供程序和数据库能够找出对象之间的关系)。 是为了我们的应用程序需要(特别是equals()和hashCode())

来看看:

@Entity
public class User {

    @Id
    private int id;  // Persistence ID
    private UUID uuid; // Business ID

    // assuming all fields are subject to change
    // If we forbid users change their email or screenName we can use these
    // fields for business ID instead, but generally that's not the case
    private String screenName;
    private String email;

    // I don't put UUID generation in constructor for performance reasons. 
    // I call setUuid() when I create a new entity
    public User() {
    }

    // This method is only called when a brand new entity is added to 
    // persistence context - I add it as a safety net only but it might work 
    // for you. In some cases (say, when I add this entity to some set before 
    // calling em.persist()) setting a UUID might be too late. If I get a log 
    // output it means that I forgot to call setUuid() somewhere.
    @PrePersist
    public void ensureUuid() {
        if (getUuid() == null) {
            log.warn(format("User's UUID wasn't set on time. " 
                + "uuid: %s, name: %s, email: %s",
                getUuid(), getScreenName(), getEmail()));
            setUuid(UUID.randomUUID());
        }
    }

    // equals() and hashCode() rely on non-changing data only. Thus we 
    // guarantee that no matter how field values are changed we won't 
    // lose our entity in hash-based Sets.
    @Override
    public int hashCode() {
        return getUuid().hashCode();
    }

    // Note that I don't use direct field access inside my entity classes and
    // call getters instead. That's because Persistence provider (PP) might
    // want to load entity data lazily. And I don't use 
    //    this.getClass() == other.getClass() 
    // for the same reason. In order to support laziness PP might need to wrap
    // my entity object in some kind of proxy, i.e. subclassing it.
    @Override
    public boolean equals(final Object obj) {
        if (this == obj)
            return true;
        if (!(obj instanceof User))
            return false;
        return getUuid().equals(((User) obj).getUuid());
    }

    // Getters and setters follow
}

编辑:澄清我关于调用setUuid()方法的观点。下面是一个典型的场景:

User user = new User();
// user.setUuid(UUID.randomUUID()); // I should have called it here
user.setName("Master Yoda");
user.setEmail("yoda@jedicouncil.org");

jediSet.add(user); // here's bug - we forgot to set UUID and 
                   //we won't find Yoda in Jedi set

em.persist(user); // ensureUuid() was called and printed the log for me.

jediCouncilSet.add(user); // Ok, we got a UUID now

当我运行测试并看到日志输出时,我解决了这个问题:

User user = new User();
user.setUuid(UUID.randomUUID());

或者,也可以提供一个单独的构造函数:

@Entity
public class User {

    @Id
    private int id;  // Persistence ID
    private UUID uuid; // Business ID

    ... // fields

    // Constructor for Persistence provider to use
    public User() {
    }

    // Constructor I use when creating new entities
    public User(UUID uuid) {
        setUuid(uuid);
    }

    ... // rest of the entity.
}

我的例子是这样的:

User user = new User(UUID.randomUUID());
...
jediSet.add(user); // no bug this time

em.persist(user); // and no log output

我使用默认构造函数和setter,但您可能会发现双构造函数方法更适合您。

请考虑以下基于预定义类型标识符和ID的方法。

JPA的具体假设:

具有相同“类型”和相同非空ID的实体被认为是相等的 非持久化实体(假设没有ID)永远不等于其他实体

抽象实体:

@MappedSuperclass
public abstract class AbstractPersistable<K extends Serializable> {

  @Id @GeneratedValue
  private K id;

  @Transient
  private final String kind;

  public AbstractPersistable(final String kind) {
    this.kind = requireNonNull(kind, "Entity kind cannot be null");
  }

  @Override
  public final boolean equals(final Object obj) {
    if (this == obj) return true;
    if (!(obj instanceof AbstractPersistable)) return false;
    final AbstractPersistable<?> that = (AbstractPersistable<?>) obj;
    return null != this.id
        && Objects.equals(this.id, that.id)
        && Objects.equals(this.kind, that.kind);
  }

  @Override
  public final int hashCode() {
    return Objects.hash(kind, id);
  }

  public K getId() {
    return id;
  }

  protected void setId(final K id) {
    this.id = id;
  }
}

具体实体示例:

static class Foo extends AbstractPersistable<Long> {
  public Foo() {
    super("Foo");
  }
}

测试的例子:

@Test
public void test_EqualsAndHashcode_GivenSubclass() {
  // Check contract
  EqualsVerifier.forClass(Foo.class)
    .suppress(Warning.NONFINAL_FIELDS, Warning.TRANSIENT_FIELDS)
    .withOnlyTheseFields("id", "kind")
    .withNonnullFields("id", "kind")
    .verify();
  // Ensure new objects are not equal
  assertNotEquals(new Foo(), new Foo());
}

主要优势:

简单 确保子类提供类型标识 使用代理类预测行为

缺点:

要求每个实体调用super()

注:

使用继承时需要注意。例如,类A和类B扩展A的实例相等性可能取决于应用程序的具体细节。 理想情况下,使用业务密钥作为ID

期待您的评论。

阅读这篇关于主题的非常好的文章:不要让Hibernate窃取您的身份。

文章的结论是这样的:

Object identity is deceptively hard to implement correctly when objects are persisted to a database. However, the problems stem entirely from allowing objects to exist without an id before they are saved. We can solve these problems by taking the responsibility of assigning object IDs away from object-relational mapping frameworks such as Hibernate. Instead, object IDs can be assigned as soon as the object is instantiated. This makes object identity simple and error-free, and reduces the amount of code needed in the domain model.