我记得在播客014中听到Joel Spolsky提到他几乎从未使用过外键(如果我没记错的话)。然而,对我来说,它们对于避免数据库中的重复和后续数据完整性问题非常重要。
人们是否有一些可靠的理由(以避免与Stack Overflow原则一致的讨论)?
编辑:“我还没有创建外键的理由,所以这可能是我真正建立一个外键的第一个理由。”
我记得在播客014中听到Joel Spolsky提到他几乎从未使用过外键(如果我没记错的话)。然而,对我来说,它们对于避免数据库中的重复和后续数据完整性问题非常重要。
人们是否有一些可靠的理由(以避免与Stack Overflow原则一致的讨论)?
编辑:“我还没有创建外键的理由,所以这可能是我真正建立一个外键的第一个理由。”
当前回答
@ emphasis——这正是导致维护噩梦的心态。
为什么,哦,为什么要忽略声明性引用完整性(其中数据至少可以保证一致),而支持所谓的“软件强制”,这充其量是一种薄弱的预防措施。
其他回答
我同意德米特里的回答,说得很好。
For those who are worried about the performance overhead FK's often bring, there's a way (in Oracle) you can get the query optimiser advantage of the FK constraint without the cost overhead of constraint validation during insert, delete or update. That is to create the FK constraint with the attributes RELY DISABLE NOVALIDATE. This means the query optimiser ASSUMES that the constraint has been enforced when building queries, without the database actually enforcing the constraint. You have to be very careful here to take the responsibility when you populate a table with an FK constraint like this to make absolutely sure you don't have data in your FK column(s) that violate the constraint, as if you do so you could get unreliable results from queries that involve the table this FK constraint is on.
我通常在数据集市模式中的某些表上使用这种策略,但在集成登台模式中不使用。我要确保复制数据的表已经强制执行了相同的约束,或者ETL例程强制执行了该约束。
使用外键的原因:
you won't get Orphaned Rows you can get nice "on delete cascade" behavior, automatically cleaning up tables knowing about the relationships between tables in the database helps the Optimizer plan your queries for most efficient execution, since it is able to get better estimates on join cardinality. FKs give a pretty big hint on what statistics are most important to collect on the database, which in turn leads to better performance they enable all kinds of auto-generated support -- ORMs can generate themselves, visualization tools will be able to create nice schema layouts for you, etc. someone new to the project will get into the flow of things faster since otherwise implicit relationships are explicitly documented
不使用外键的原因:
you are making the DB work extra on every CRUD operation because it has to check FK consistency. This can be a big cost if you have a lot of churn by enforcing relationships, FKs specify an order in which you have to add/delete things, which can lead to refusal by the DB to do what you want. (Granted, in such cases, what you are trying to do is create an Orphaned Row, and that's not usually a good thing). This is especially painful when you are doing large batch updates, and you load up one table before another, with the second table creating consistent state (but should you be doing that sort of thing if there is a possibility that the second load fails and your database is now inconsistent?). sometimes you know beforehand your data is going to be dirty, you accept that, and you want the DB to accept it you are just being lazy :-)
我认为(我不确定!)大多数已建立的数据库都提供了一种指定外键的方法,这种方法不是强制的,只是一些元数据。由于不强制执行消除了不使用fk的所有理由,如果第二部分中的任何理由适用,您可能应该走那条路。
One time when an FK might cause you a problem is when you have historical data that references the key (in a lookup table) even though you no longer want the key available. Obviously the solution is to design things better up front, but I am thinking of real world situations here where you don't always have control of the full solution. For example: perhaps you have a look up table customer_type that lists different types of customers - lets say you need to remove a certain customer type, but (due to business restraints) aren't able to update the client software, and nobody invisaged this situation when developing the software, the fact that it is a foreign key in some other table may prevent you from removing the row even though you know the historical data that references it is irrelevant. After being burnt with this a few times you probably lean away from db enforcement of relationships. (I'm not saying this is good - just giving a reason why you may decide to avoid FKs and db contraints in general)
更新:我现在总是使用外键。对于反对意见“他们使测试变得复杂”,我的回答是“编写单元测试,这样他们就根本不需要数据库。任何使用该数据库的测试都应该正确地使用它,这包括外键。如果准备工作很痛苦,那就找一种不那么痛苦的方式来做。”
外键使自动化测试复杂化
假设您正在使用外键。您正在编写一个自动测试,该测试表示“当我更新财务帐户时,它应该保存交易记录。”在这个测试中,您只关心两个表:帐户和事务。
但是,accounts对契约有一个外键,契约对客户有一个fk,客户对城市有一个fk,城市对州有一个fk。
现在,数据库将不允许您运行测试,除非在四个与测试无关的表中设置数据。
至少有两种可能的观点:
“这是一件好事:你的测试应该是现实的,这些数据限制将存在于生产中。” “这是一件坏事:你应该能够在不涉及其他部分的情况下对系统的各个部分进行单元测试。您可以为整个系统添加集成测试。”
也可以在运行测试时暂时关闭外键检查。至少MySQL支持这一点。
在这里回答问题的许多人都过于关注通过引用约束实现的引用完整性的重要性。在具有引用完整性的大型数据库上工作性能不佳。Oracle似乎特别不擅长级联删除。我的经验法则是,应用程序永远不应该直接更新数据库,而应该通过存储过程更新。这将代码库保存在数据库中,并意味着数据库保持其完整性。
在许多应用程序可能正在访问数据库的地方,由于引用完整性约束确实会出现问题,但这取决于控件。
还有一个更广泛的问题,应用程序开发人员可能有非常不同的需求,而数据库开发人员可能并不那么熟悉。