在领域驱动设计中,似乎有很多一致认为实体不应该直接访问存储库。

这是出自Eric Evans的《领域驱动设计》一书,还是来自其他地方?

对其背后的原因有什么好的解释吗?

编辑:澄清一下:我不是在谈论将数据访问从业务逻辑分离到一个单独层的经典OO实践——我是在谈论DDD中的具体安排,即实体根本不应该与数据访问层对话(即它们不应该持有对Repository对象的引用)。

更新:我把奖金给了BacceSR,因为他的答案似乎最接近,但我仍然对此一无所知。如果这是一个如此重要的原则,网上肯定会有一些关于它的好文章,不是吗?

更新:2013年3月,对这个问题的点赞表明人们对此很感兴趣,即使有很多答案,我仍然认为如果人们对此有想法,还有更多的空间。


当前回答

起初,我相信允许我的一些实体访问存储库(即没有ORM的惰性加载)。后来我得出结论,我不应该这样做,我可以找到其他方法:

We should know our intentions in a request and what we want from the domain, therefore we can make repository calls before constructing or invoking Aggregate behavior. This also helps avoid the problem of inconsistent in-memory state and the need for lazy loading (see this article). The smell is that you cannot create an in memory instance of your entity anymore without worrying about data access. CQS can help reduce the need for wanting to call the repository for things in our entities. We can use a specification to encapsulate and communicate domain logic needs and pass that to the repository instead (a service can orchestrate these things for us). The specification can come from the entity that is in charge of maintaining that invariant. The repository will interpret parts of the specification into it's own query implementation and apply rules from the specification on query results. This aims to keep domain logic in the domain layer. It also serves the Ubiquitous Language and communication better. Imagine saying "overdue order specification" versus saying "filter order from tbl_order where placed_at is less than 30 minutes before sysdate" (see this answer). It makes reasoning about the behavior of entities more difficult since the Single-Responsibility Principle is violated. If you need to work out storage/persistence issues you know where to go and where not to go. It avoids the danger of giving an entity bi-directional access to global state (via the repository and domain services). You also don't want to break your transaction boundary.

据我所知,Vernon Vaughn在红皮书《实现领域驱动设计》中有两个地方提到了这个问题(注意:这本书完全得到了Evans的支持,你可以在前言中读到)。在第7章关于服务的章节中,他使用域服务和规范来解决聚合使用存储库和另一个聚合来确定用户是否经过身份验证的需求。引用他的话说:

根据经验,我们应该尽量避免使用存储库 (12)从聚合体内部,如果可能的话。

沃恩·弗农(2013-02-06)。实现领域驱动设计(Kindle位置6089)。培生教育。Kindle版。

在关于聚合的第10章中,在“模型导航”一节中,他说(就在他建议使用全局唯一id来引用其他聚合根之后):

通过标识引用并不完全阻止导航通过 该模型。有些人会在聚合中使用存储库(12) 查找。这种技术称为断开域模型 它实际上是惰性加载的一种形式。有一个不同的建议 但是,方法是:使用存储库或域服务(7)来查找 在调用聚合行为之前调用依赖对象。一个客户端 应用服务可以控制这个,然后分派到聚合:

他接着展示了一个代码示例:

public class ProductBacklogItemService ... { 
    ...
    @Transactional 
    public void assignTeamMemberToTask( 
        String aTenantId, 
        String aBacklogItemId, 
        String aTaskId, 
        String aTeamMemberId) { 

        BacklogItem backlogItem = backlogItemRepository.backlogItemOfId( 
            new TenantId(aTenantId), 
            new BacklogItemId(aBacklogItemId)); 

        Team ofTeam = teamRepository.teamOfId( 
            backlogItem.tenantId(), 
            backlogItem.teamId());

        backlogItem.assignTeamMemberToTask( 
            new TeamMemberId( aTeamMemberId), 
            ofTeam,
            new TaskId( aTaskId));
   } 
   ...
}

他接着还提到了另一种解决方案,即如何在聚合命令方法中使用域服务以及双重分派。(读他的书大有益处,我怎么推荐都不为过。在你厌倦了无休止地在网上翻找之后,拿出你应得的钱去读这本书。)

然后我和Marco Pivetta @Ocramius进行了一些讨论,他向我展示了一些从域中提取规范并使用它的代码:

不建议这样做:

$user->mountFriends(); // <-- has a repository call inside that loads friends? 

在域服务中,这是很好的:

public function mountYourFriends(MountFriendsCommand $mount) {
    $user = $this->users->get($mount->userId()); 
    $friends = $this->users->findBySpecification($user->getFriendsSpecification()); 
    array_map([$user, 'mount'], $friends); 
}

其他回答

对我来说,这似乎是与OOD相关的一般良好实践,而不是DDD特有的。

我能想到的原因是:

关注点分离(实体应该与它们的持久化方式分离。因为根据使用场景的不同,可能存在多种策略,其中相同的实体将被持久化) 从逻辑上讲,实体可以在存储库操作的级别之下的级别中看到。较低级别的组件不应该具有较高级别组件的知识。因此,条目不应该有关于存储库的知识。

实体只捕获与其有效状态相关的规则。其中的数据有效吗?其中的数据可以这样改变吗?

聚合根对一组实体执行相同的操作。汇总的数据有效吗?总体数据能以这种方式改变吗?

域服务捕获关于实体或聚合之间更改的规则。我们可以这样改变X和Y吗?

None of this ever requires access to a repository or to infrastructure. What you do is that an application service will offer up a domain use case, for that use case, the application service will gather all the needed data from the repositories, that will return it your domain entities and/or aggregate roots and their value objects. The entities/aggregate roots and value objects would have validated that they are in a good state when created by the repository. Then the application service will use a combination of those entities (some of them could be aggregate roots), to perform the domain use case. If the domain use case requires changing X, Y and Z, the application service will ask X, Y and Z entities/aggregate roots if the current use case request of changes can be made to X, Y and Z, and if so, how should it be made. Finally, the application service will commit those changes back to the repository.

如果某些更改跨越实体或聚合,应用程序服务将使用域服务询问是否可以进行更改以及如何进行更改,并再次使用存储库提交这些更改。

如果一个域用例跨越多个有界上下文,这意味着它需要跨有界上下文的信息或更改,这被称为流程,并且您可以让一个流程服务管理整个流程生命周期,它将利用多个有界上下文的应用程序服务来跨所有有界上下文协调整个流程。

Finally, the application service can also use other application services, could be other micro-services in a shared bounded context, that would imply they share the same domain model, or it could do so across to application services in other bounded contexts, in which case you'd want to model those within your own bounded context's domain model as well, you'd treat those other bounded contexts much like a repository in a way. The application service communicates with another bounded context to get info about that other context, it then creates a representation of that info within its own domain model, using its own entities and VOs, and aggregates, which will again validate that state within their context. Similarly, you can commit changes to your domain model to other bounded contexts by asking them to change accordingly. All this can be implemented with direct method calls, remote API calls, async events, shared kernel, etc.

And to answer why it is like so, that's because the whole point is building software that can evolve over time without it becoming slower to make changes to it and add/modify its behavior while retaining its current correctness with regards to its current functionality. A good way to do this is by making it a change in one place doesn't break things elsewhere. This is why bounded contexts exist, already changes are restricted to each context, so a change in one is less likely to break another. This is also why the domain model validates all changes to the domain state, so you can't change part of the state in ways that breaks other usage of it. This is why aggregates are used, to maintain a change boundary between the things that need one, and clearly not have one where it doesn't need one. Finally, by having the whole domain layer, with domain model and domain services, not depend on any infrastructure, like the repository (and thus the DB), a change to the DB or repository will also not be able to break your domain model or services.

P.S.: Also note I use the term "state" loosely. It doesn't have to be a static value; state could be the application of some dynamic computation or rules that generates state when requested. You can have something like totalItemsCount on some entity which computes it when asked about what is the current totalItemsCount for the entity. Again, the entity will make sure to return you valid state, that means it will know how to correctly count the total and make sure that what is returned is the correct application of the domain rules for totalItemsCount.

弗农·沃恩给出了一个解决方案:

使用存储库或域服务提前查找依赖对象 调用聚合行为。客户端应用程序服务可以 控制这个问题。

To cite Carolina Lilientahl, "Patterns should prevent cycles" https://www.youtube.com/watch?v=eJjadzMRQAk, where she refers to cyclic dependencies between classes. In case of repositories inside aggregates, there is a temptation to create cyclic dependencies out of conveniance of object navigation as the only reason. The pattern mentioned above by prograhammer, that was recommended by Vernon Vaughn, where other aggregates are referenced by ids instead of root instances, (is there a name for this pattern?) suggests an alternative that might guide into other solutions.

类之间循环依赖的例子(忏悔):

(Time0): Sample和Well这两个类彼此引用(循环依赖)。Well指的是Sample,而Sample指的是Well,这是为了方便(有时是对样品进行循环,有时是对一个板中的所有孔进行循环)。我无法想象样本不会指向它所在的井。

(Time1):一年之后,许多用例实现了....现在有一些情况下,样本不应该指向它所在的井。在一个工作步骤内有临时板。这里的孔指的是样品,而样品又指的是另一个盘子上的孔。正因为如此,当有人试图实现新功能时,有时会出现奇怪的行为。渗透需要时间。

我也从上面提到的这篇关于惰性加载的负面方面的文章中得到了帮助。

这是一个非常好的问题。我期待着就此进行一些讨论。但我想DDD的几本书里都提到了吉米·尼尔森斯和埃里克·埃文斯。我想通过示例也可以看到如何使用存储库模式。

BUT lets discuss. I think a very valid thought is why should an entity know about how to persist another entity? Important with DDD is that each entity has a responsibility to manage its own "knowledge-sphere" and shouldn't know anything about how to read or write other entities. Sure you can probably just add a repository interface to Entity A for reading Entities B. But the risk is that you expose knowledge for how to persist B. Will entity A also do validation on B before persisting B into db?

正如您所看到的,实体A可以更多地参与到实体B的生命周期中,这可以为模型增加更多的复杂性。

我猜(没有任何例子)单元测试将会更加复杂。

但是我确信总会有这样的场景:您想通过实体来使用存储库。你必须考虑每一种情况才能做出有效的判断。优点和缺点。但是在我看来,存储库实体解决方案有很多缺点。它一定是一个非常特殊的场景,优点抵消了缺点....