我可以在一个表中有多个主键吗?
当前回答
主键是一个非常不幸的符号,因为“主键”的内涵和逻辑模型的潜意识关联。因此我避免使用它。相反,我指的是物理模型的代理键和逻辑模型的自然键。
It is important that the Logical Model for every Entity have at least one set of "business attributes" which comprise a Key for the entity. Boyce, Codd, Date et al refer to these in the Relational Model as Candidate Keys. When we then build tables for these Entities their Candidate Keys become Natural Keys in those tables. It is only through those Natural Keys that users are able to uniquely identify rows in the tables; as surrogate keys should always be hidden from users. This is because Surrogate Keys have no business meaning.
However the Physical Model for our tables will in many instances be inefficient without a Surrogate Key. Recall that non-covered columns for a non-clustered index can only be found (in general) through a Key Lookup into the clustered index (ignore tables implemented as heaps for a moment). When our available Natural Key(s) are wide this (1) widens the width of our non-clustered leaf nodes, increasing storage requirements and read accesses for seeks and scans of that non-clustered index; and (2) reduces fan-out from our clustered index increasing index height and index size, again increasing reads and storage requirements for our clustered indexes; and (3) increases cache requirements for our clustered indexes. chasing other indexes and data out of cache.
This is where a small Surrogate Key, designated to the RDBMS as "the Primary Key" proves beneficial. When set as the clustering key, so as to be used for key lookups into the clustered index from non-clustered indexes and foreign key lookups from related tables, all these disadvantages disappear. Our clustered index fan-outs increase again to reduce clustered index height and size, reduce cache load for our clustered indexes, decrease reads when accessing data through any mechanism (whether index scan, index seek, non-clustered key lookup or foreign key lookup) and decrease storage requirements for both clustered and nonclustered indexes of our tables.
注意,这些好处只在代理键和集群键都很小的情况下才会出现。如果使用GUID作为集群键,情况通常会比使用最小的可用自然键更糟糕。如果表被组织为一个堆,那么8字节(堆)RowID将用于键查找,这比16字节的GUID更好,但性能不如4字节的整数。
如果由于业务限制必须使用GUID,那么搜索一个更好的集群键是值得的。例如,如果一个小的站点标识符和4字节的“站点序列号”是可行的,那么这种设计可能会比GUID作为代理键提供更好的性能。
如果堆的结果(可能是哈希连接)使其成为首选存储,那么需要在权衡分析中平衡更广泛的集群键的成本。
考虑这个例子:
ALTER TABLE Persons
ADD CONSTRAINT pk_PersonID PRIMARY KEY (P_Id,LastName)
where the tuple "(P_Id,LastName)" requires a uniqueness constraint, and may be a lengthy Unicode LastName plus a 4-byte integer, it would be desirable to (1) declaratively enforce this constraint as "ADD CONSTRAINT pk_PersonID UNIQUE NONCLUSTERED (P_Id,LastName)" and (2) separately declare a small Surrogate Key to be the "Primary Key" of a clustered index. It is worth noting that Anita possibly only wishes to add the LastName to this constraint in order to make that a covered field, which is unnecessary in a clustered index because ALL fields are covered by it.
在SQL Server中将主键指定为非聚集的能力是一个不幸的历史情况,这是由于“首选自然或候选键”(来自逻辑模型)的含义与“存储中的查找键”(来自物理模型)的含义的合并。我的理解是,最初SYBASE SQL Server总是使用一个4字节的RowID,无论是在堆中还是在聚集索引中,作为物理模型中的“存储中的查找键”。
其他回答
有些人使用术语“主键”来确切地表示一个整型列,它的值是由某种自动机制生成的。例如MySQL中的AUTO_INCREMENT或Microsoft SQL Server中的IDENTITY。你是在使用主键吗?
如果是,答案取决于您使用的数据库的品牌。在MySQL中,你不能这样做,你会得到一个错误:
mysql> create table foo (
id int primary key auto_increment,
id2 int auto_increment
);
ERROR 1075 (42000): Incorrect table definition;
there can be only one auto column and it must be defined as a key
在其他一些数据库中,您可以在一个表中定义多个自动生成列。
这是对主要问题和@Kalmi的问题的答案
拥有多个自动生成列的意义何在?
下面的代码有一个复合主键。其中一列是自动递增的。这将只在MyISAM工作。InnoDB将生成一个错误" error 1075(42000):错误的表定义;只能有一个auto列,它必须定义为一个键”。
DROP TABLE IF EXISTS `test`.`animals`;
CREATE TABLE `test`.`animals` (
`grp` char(30) NOT NULL,
`id` mediumint(9) NOT NULL AUTO_INCREMENT,
`name` char(30) NOT NULL,
PRIMARY KEY (`grp`,`id`)
) ENGINE=MyISAM;
INSERT INTO animals (grp,name) VALUES
('mammal','dog'),('mammal','cat'),
('bird','penguin'),('fish','lax'),('mammal','whale'),
('bird','ostrich');
SELECT * FROM animals ORDER BY grp,id;
Which returns:
+--------+----+---------+
| grp | id | name |
+--------+----+---------+
| fish | 1 | lax |
| mammal | 1 | dog |
| mammal | 2 | cat |
| mammal | 3 | whale |
| bird | 1 | penguin |
| bird | 2 | ostrich |
+--------+----+---------+
正如其他人所指出的,可以有多列主键。 但是需要注意的是,如果有一些函数依赖关系不是由键引入的,那么应该考虑规范化关系。
例子:
Person(id, name, email, street, zip_code, area)
id ->名称、电子邮件、街道、zip_code和区域之间可能存在功能依赖关系 但zip_code通常与区域相关联,因此zip_code ->区域之间存在内部函数依赖关系。
因此,可以考虑将它分割到另一个表中:
Person(id, name, email, street, zip_code)
Area(zip_code, name)
所以它和第三种形式是一致的。
同时有两个主键是不可能的。但是(假设你没有把复合键搞砸),你可能需要的是让一个属性是唯一的。
CREATE t1(
c1 int NOT NULL,
c2 int NOT NULL UNIQUE,
...,
PRIMARY KEY (c1)
);
However note that in relational database a 'super key' is a subset of attributes which uniquely identify a tuple or row in a table. A 'key' is a 'super key' that has an additional property that removing any attribute from the key, makes that key no more a 'super key'(or simply a 'key' is a minimal super key). If there are more keys, all of them are candidate keys. We select one of the candidate keys as a primary key. That's why talking about multiple primary keys for a one relation or table is being a conflict.
他们给出了比我更好的技术答案。 我只能补充一下这个话题:
如果你想要一些不被允许/可接受的东西,这是一个很好的理由后退一步。
理解为什么这是不可接受的核心原因。 挖掘更多的文档/期刊文章/网络等。 分析/审查当前设计并指出主要缺陷。 在新设计中考虑和测试每一个步骤。 永远向前看,努力创造适应性的解决方案。
希望它能帮助到一些人。
推荐文章
- 数据库触发器是必要的吗?
- 为什么我应该使用基于文档的数据库而不是关系数据库?
- 哪个更快/最好?SELECT *或SELECT columnn1, colum2, column3等
- 将值从同一表中的一列复制到另一列
- 什么是数据库池?
- 关于数据库,每个开发人员应该知道些什么?
- INT和VARCHAR主键之间有真正的性能差异吗?
- "where 1=1"语句
- 是使用各有一个模式的多个数据库更好,还是使用一个数据库有多个模式更好?
- 如何从Oracle的表中获取列名?
- 如何检查表上持有哪些锁
- 在MongoDB中查找重复的记录
- 对于PostgreSQL表来说,多大才算太大?
- 模式、表和数据库之间的区别是什么?
- 我看到VARCHAR(255)如此频繁地使用(而不是其他长度),有什么好的原因吗?