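I'm not very familiar with databases and how they work. From a performance standpoint (insert/update/query), is using a string as a primary key slower than using an integer?
Current answer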
Strings are slower in joins, and in real life they are very rarely truly unique (even when they are supposed to be). Their only advantage is that they can reduce the number of joins if you are joining to the primary table only to get the name. However, strings are also often subject to change, which means every related record has to be fixed when the company name changes or a person gets married. This can be a huge performance hit, and if some of the tables that should be related are not actually related (this happens more often than you think), you can end up with data mismatches as well. An integer that never changes through the life of the record is a far safer choice from a data integrity standpoint as well as from a performance standpoint. Natural keys are usually not so good for maintenance of the data.
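I would also point out that the best of both worlds is often to use an auto-incrementing key (or, in some specialized cases, a GUID) as the PK and then put a unique index on the natural key. You get the faster joins, you don't get duplicate records, and you don't need to update a million child records because a company name changed.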
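To make that pattern concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `company` and `invoice` tables, their columns, and the index name are hypothetical, chosen only for illustration; other database engines spell auto-increment differently.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE company (
    id   INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key used for joins
    name TEXT NOT NULL                       -- natural key, may change over time
);
CREATE UNIQUE INDEX ux_company_name ON company(name);  -- still blocks duplicates

CREATE TABLE invoice (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    company_id INTEGER NOT NULL REFERENCES company(id),
    amount     REAL NOT NULL
);
""")

company_id = conn.execute(
    "INSERT INTO company (name) VALUES (?)", ("Acme Ltd",)
).lastrowid
conn.executemany(
    "INSERT INTO invoice (company_id, amount) VALUES (?, ?)",
    [(company_id, 10.0), (company_id, 25.5)],
)

# The unique index on the natural key rejects a second "Acme Ltd".
try:
    conn.execute("INSERT INTO company (name) VALUES (?)", ("Acme Ltd",))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Renaming the company updates one parent row; no child rows are touched.
rows_updated = conn.execute(
    "UPDATE company SET name = ? WHERE id = ?", ("Acme Holdings", company_id)
).rowcount

# The join still resolves through the unchanged integer key.
summary = conn.execute(
    "SELECT c.name, COUNT(*) FROM invoice i "
    "JOIN company c ON c.id = i.company_id GROUP BY c.name"
).fetchall()
```

Had `name` been the primary key, the rename would also have needed to rewrite the key stored in every `invoice` row.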
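Other answers
Another problem with using a string as a primary key is that the index is always kept in sorted order, so a newly created key has to be fitted into its sorted position, which can mean reshuffling part of the index. With an auto-number integer, the new key is simply added to the end of the index.
It doesn't matter what you use as a primary key, as long as it is UNIQUE. If you care about speed or good database design, use an int, unless you plan to replicate data, in which case use a GUID.
If this is an Access database or some small application, then who really cares? I think the reason most of us developers slap the old int or GUID at the front is that projects have a way of growing on us, and you want to leave yourself the option to grow.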
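Yes, but unless you expect to have millions of rows, avoiding a string-based key because it is slower is usually "premature optimization". After all, a string is stored as a large number, while a numeric key is usually stored as a smaller number.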
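As a rough illustration of the size difference, the sketch below compares the raw byte length of an integer key with that of a string key. The sample values are made up, and real engines add their own per-row and per-index overhead, so treat the numbers as relative only.

```python
import struct

# Hypothetical sample keys; the byte counts are raw encodings, not on-disk sizes.
int_key = struct.pack("<i", 123456)               # 32-bit integer
bigint_key = struct.pack("<q", 123456)            # 64-bit integer
string_key = "Acme Holdings Ltd".encode("utf-8")  # natural string key

key_sizes = {
    "int": len(int_key),
    "bigint": len(bigint_key),
    "string": len(string_key),
}
print(key_sizes)  # {'int': 4, 'bigint': 8, 'string': 17}
```

A wider key means fewer keys fit on each index page and more bytes are compared per lookup, which is where the extra cost of string keys comes from.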
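One thing to watch out for, though, is having a clustered index on an arbitrary key and doing a lot of non-sequential inserts into that index. Every row written will cause the index to be rewritten. If you are doing batch inserts, this can really slow the process down.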
An insert into a table with a clustered index, where the insertion lands in the middle of the sequence, DOES NOT cause the index to be rewritten. It does not cause the pages comprising the data to be rewritten either. If there is room on the page where the row belongs, the row is placed on that page, and that single page is reformatted to put the row in the right position. When the page is full, a page split happens, with half of the rows going to one page and half going to the other. The pages are then relinked into the linked list of pages that make up the data of the table with the clustered index. At most, you will end up writing 2 pages of the database.
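The behavior described above can be sketched with a toy model in which each page is a small sorted list. This is a simplification under stated assumptions: `PAGE_CAPACITY` and `insert_key` are hypothetical names, and a real engine would also maintain sibling pointers and upper-level index pages, which this sketch ignores.

```python
import bisect

PAGE_CAPACITY = 4  # hypothetical number of rows that fit on one page


def insert_key(pages, key, capacity=PAGE_CAPACITY):
    """Insert key into a sorted list of sorted pages.

    Returns the number of data pages written: 1 if the target page
    had room, 2 if the page had to be split in half.
    """
    if not pages:
        pages.append([key])
        return 1

    # Pick the first page whose largest key is >= key, else the last page.
    idx = len(pages) - 1
    for i, page in enumerate(pages):
        if key <= page[-1]:
            idx = i
            break

    page = pages[idx]
    bisect.insort(page, key)  # place the row in order within that one page
    if len(page) <= capacity:
        return 1

    # Page overflowed: split it into two halves and relink them in place.
    mid = len(page) // 2
    pages[idx:idx + 1] = [page[:mid], page[mid:]]
    return 2


pages = [[10, 20, 30, 40], [50, 60, 70, 80], [90, 100]]
split_writes = insert_key(pages, 25)  # first page is full, so it splits
room_writes = insert_key(pages, 95)   # last page has room, no split
```

In this model the other pages are never touched, whichever branch is taken, which matches the point that a mid-sequence insert does not rewrite the whole index.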