与自动递增的数字相比,电子邮件地址是一个糟糕的初选候选人吗?
我们的web应用程序需要电子邮件地址在系统中是唯一的。所以,我想到使用电子邮件地址为主键。然而,我的同事认为字符串比较将比整数比较慢。
这是一个有效的理由不使用电子邮件为主键吗?
我们使用的是PostgreSQL。
与自动递增的数字相比,电子邮件地址是一个糟糕的初选候选人吗?
我们的web应用程序需要电子邮件地址在系统中是唯一的。所以,我想到使用电子邮件地址为主键。然而,我的同事认为字符串比较将比整数比较慢。
这是一个有效的理由不使用电子邮件为主键吗?
我们使用的是PostgreSQL。
当前回答
使用电子邮件地址作为主键的缺点:
Slower when doing joins. Any other record with a posted foreign key now has a larger value, taking up more disk space. (Given the cost of disk space today, this is probably a trivial issue, except to the extent that the record now takes longer to read. See #1.) An email address could change, which forces all records using this as a foreign key to be updated. As email address don't change all that often, the performance problem is probably minor. The bigger problem is that you have to make sure to provide for it. If you have to write the code, this is more work and introduces the possibility of bugs. If your database engine supports "on update cascade", it's a minor issue.
使用电邮地址作主键的优点:
You may be able to completely eliminate some joins. If all you need from the "master record" is the email address, then with an abstract integer key you would have to do a join to retrieve it. If the key is the email address, then you already have it and the join is unnecessary. Whether this helps you any depends on how often this situation comes up. When you are doing ad hoc queries, it's easy for a human being to see what master record is being referenced. This can be a big help when trying to track down data problems. You almost certainly will need an index on the email address anyway, so making it the primary key eliminates one index, thus improving the performance of inserts as they now have only one index to update instead of two.
在我看来,这两种情况都不是十拿九稳的。当有实用的键时,我倾向于使用自然键,因为它们更容易使用,而且在大多数情况下,缺点并不太重要。
其他回答
使用电子邮件地址作为主键的缺点:
Slower when doing joins. Any other record with a posted foreign key now has a larger value, taking up more disk space. (Given the cost of disk space today, this is probably a trivial issue, except to the extent that the record now takes longer to read. See #1.) An email address could change, which forces all records using this as a foreign key to be updated. As email address don't change all that often, the performance problem is probably minor. The bigger problem is that you have to make sure to provide for it. If you have to write the code, this is more work and introduces the possibility of bugs. If your database engine supports "on update cascade", it's a minor issue.
使用电邮地址作主键的优点:
You may be able to completely eliminate some joins. If all you need from the "master record" is the email address, then with an abstract integer key you would have to do a join to retrieve it. If the key is the email address, then you already have it and the join is unnecessary. Whether this helps you any depends on how often this situation comes up. When you are doing ad hoc queries, it's easy for a human being to see what master record is being referenced. This can be a big help when trying to track down data problems. You almost certainly will need an index on the email address anyway, so making it the primary key eliminates one index, thus improving the performance of inserts as they now have only one index to update instead of two.
在我看来,这两种情况都不是十拿九稳的。当有实用的键时,我倾向于使用自然键,因为它们更容易使用,而且在大多数情况下,缺点并不太重要。
您应该使用整数主键。如果你需要电子邮件列是唯一的,为什么不简单地在该列上设置一个唯一索引呢?
我不知道这在您的设置中是否可能是一个问题,但根据您的RDBMS,列的值可能是区分大小写的。PostgreSQL文档说:“如果你声明一个列为UNIQUE或PRIMARY KEY,隐式生成的索引是区分大小写的”。换句话说,如果您在一个以email为主键的表中接受用户输入进行搜索,并且用户提供“John@Doe.com”,那么您将找不到“john@doe.com”。
是的,如果您使用整数来代替会更好。您还可以将电子邮件列设置为唯一约束。
是这样的:
CREATE TABLE myTable(
id integer primary key,
email text UNIQUE
);
是的,这是一个糟糕的主键,因为你的用户会想要更新他们的电子邮件地址。