我可以在一个表中有多个主键吗?
当前回答
主键是一个非常不幸的符号,因为“主键”的内涵和逻辑模型的潜意识关联。因此我避免使用它。相反,我指的是物理模型的代理键和逻辑模型的自然键。
It is important that the Logical Model for every Entity have at least one set of "business attributes" which comprise a Key for the entity. Boyce, Codd, Date et al refer to these in the Relational Model as Candidate Keys. When we then build tables for these Entities their Candidate Keys become Natural Keys in those tables. It is only through those Natural Keys that users are able to uniquely identify rows in the tables; as surrogate keys should always be hidden from users. This is because Surrogate Keys have no business meaning.
However the Physical Model for our tables will in many instances be inefficient without a Surrogate Key. Recall that non-covered columns for a non-clustered index can only be found (in general) through a Key Lookup into the clustered index (ignore tables implemented as heaps for a moment). When our available Natural Key(s) are wide this (1) widens the width of our non-clustered leaf nodes, increasing storage requirements and read accesses for seeks and scans of that non-clustered index; and (2) reduces fan-out from our clustered index increasing index height and index size, again increasing reads and storage requirements for our clustered indexes; and (3) increases cache requirements for our clustered indexes. chasing other indexes and data out of cache.
This is where a small Surrogate Key, designated to the RDBMS as "the Primary Key" proves beneficial. When set as the clustering key, so as to be used for key lookups into the clustered index from non-clustered indexes and foreign key lookups from related tables, all these disadvantages disappear. Our clustered index fan-outs increase again to reduce clustered index height and size, reduce cache load for our clustered indexes, decrease reads when accessing data through any mechanism (whether index scan, index seek, non-clustered key lookup or foreign key lookup) and decrease storage requirements for both clustered and nonclustered indexes of our tables.
注意,这些好处只在代理键和集群键都很小的情况下才会出现。如果使用GUID作为集群键,情况通常会比使用最小的可用自然键更糟糕。如果表被组织为一个堆,那么8字节(堆)RowID将用于键查找,这比16字节的GUID更好,但性能不如4字节的整数。
如果由于业务限制必须使用GUID,那么搜索一个更好的集群键是值得的。例如,如果一个小的站点标识符和4字节的“站点序列号”是可行的,那么这种设计可能会比GUID作为代理键提供更好的性能。
如果堆的结果(可能是哈希连接)使其成为首选存储,那么需要在权衡分析中平衡更广泛的集群键的成本。
考虑这个例子:
ALTER TABLE Persons
ADD CONSTRAINT pk_PersonID PRIMARY KEY (P_Id,LastName)
where the tuple "(P_Id,LastName)" requires a uniqueness constraint, and may be a lengthy Unicode LastName plus a 4-byte integer, it would be desirable to (1) declaratively enforce this constraint as "ADD CONSTRAINT pk_PersonID UNIQUE NONCLUSTERED (P_Id,LastName)" and (2) separately declare a small Surrogate Key to be the "Primary Key" of a clustered index. It is worth noting that Anita possibly only wishes to add the LastName to this constraint in order to make that a covered field, which is unnecessary in a clustered index because ALL fields are covered by it.
在SQL Server中将主键指定为非聚集的能力是一个不幸的历史情况,这是由于“首选自然或候选键”(来自逻辑模型)的含义与“存储中的查找键”(来自物理模型)的含义的合并。我的理解是,最初SYBASE SQL Server总是使用一个4字节的RowID,无论是在堆中还是在聚集索引中,作为物理模型中的“存储中的查找键”。
其他回答
一个表可以有多个候选键。每个候选键都是一个列或一组列,它们是UNIQUE的,加在一起也是NOT NULL。因此,为任何候选键的所有列指定值就足以确定是否有一行满足条件,或者根本没有行。
候选键是关系数据模型中的一个基本概念。
通常的做法是,如果一个表中存在多个键,则将其中一个候选键指定为主键。通常的做法是使表的任何外键引用主键,而不是任何其他候选键。
我推荐这些实践,但是关系模型中没有要求在候选键中选择主键。
只能有一个主键,但可以在主键中有多个列。
你也可以在你的表上有唯一索引,这将有点像一个主键,因为它们将强制唯一的值,并将加快这些值的查询。
主键是唯一标识一条记录并在所有索引中使用的键。这就是为什么你只能有一个。它通常也是用于连接子表的键,但这不是必需的。PK的真正目的是确保某些东西允许您惟一地标识一条记录,这样数据更改就会影响正确的记录,从而可以创建索引。
However, you can put multiple fields in one primary key (a composite PK). This will make your joins slower (espcially if they are larger string type fields) and your indexes larger but it may remove the need to do joins in some of the child tables, so as far as performance and design, take it on a case by case basis. When you do this, each field itself is not unique, but the combination of them is. If one or more of the fields in a composite key should also be unique, then you need a unique index on it. It is likely though that if one field is unique, this is a better candidate for the PK.
Now at times, you have more than one candidate for the PK. In this case you choose one as the PK or use a surrogate key (I personally prefer surrogate keys for this instance). And (this is critical!) you add unique indexes to each of the candidate keys that were not chosen as the PK. If the data needs to be unique, it needs a unique index whether it is the PK or not. This is a data integrity issue. (Note this is also true anytime you use a surrogate key; people get into trouble with surrogate keys because they forget to create unique indexes on the candidate keys.)
There are occasionally times when you want more than one surrogate key (which are usually the PK if you have them). In this case what you want isn't more PK's, it is more fields with autogenerated keys. Most DBs don't allow this, but there are ways of getting around it. First consider if the second field could be calculated based on the first autogenerated key (Field1 * -1 for instance) or perhaps the need for a second autogenerated key really means you should create a related table. Related tables can be in a one-to-one relationship. You would enforce that by adding the PK from the parent table to the child table and then adding the new autogenerated field to the table and then whatever fields are appropriate for this table. Then choose one of the two keys as the PK and put a unique index on the other (the autogenerated field does not have to be a PK). And make sure to add the FK to the field that is in the parent table. In general if you have no additional fields for the child table, you need to examine why you think you need two autogenerated fields.
正如其他人所指出的,可以有多列主键。 但是需要注意的是,如果有一些函数依赖关系不是由键引入的,那么应该考虑规范化关系。
例子:
Person(id, name, email, street, zip_code, area)
id ->名称、电子邮件、街道、zip_code和区域之间可能存在功能依赖关系 但zip_code通常与区域相关联,因此zip_code ->区域之间存在内部函数依赖关系。
因此,可以考虑将它分割到另一个表中:
Person(id, name, email, street, zip_code)
Area(zip_code, name)
所以它和第三种形式是一致的。
他们给出了比我更好的技术答案。 我只能补充一下这个话题:
如果你想要一些不被允许/可接受的东西,这是一个很好的理由后退一步。
理解为什么这是不可接受的核心原因。 挖掘更多的文档/期刊文章/网络等。 分析/审查当前设计并指出主要缺陷。 在新设计中考虑和测试每一个步骤。 永远向前看,努力创造适应性的解决方案。
希望它能帮助到一些人。
推荐文章
- 非加密用途的最快哈希?
- 查询以列出数据库中每个表中的记录数量
- H2内存数据库。表未找到
- 如何在postgresql中创建数据库用户?
- 不可重复读和幻影读的区别是什么?
- 外键约束:何时使用ON UPDATE和ON DELETE
- 连接查询vs多个查询
- MySQL:在同一个MySQL实例上克隆MySQL数据库
- 优化PostgreSQL进行快速测试
- 表被标记为崩溃,应该修复
- 在Android SQLite中处理日期的最佳方法
- 使用{merge: true}设置的Firestore与更新之间的差异
- 向现有表添加主键
- mysql_connect():[2002]没有这样的文件或目录(试图通过unix:///tmp/mysql.sock连接)在
- 使用电子邮件地址为主键?