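I am not very familiar with databases and how they work. From a performance standpoint (insert/update/query), is using a string as the primary key slower than using an integer?
Current answer
From a performance standpoint: yes, a string primary key (PK) will perform worse than an integer PK.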
From a requirements standpoint: although this is not part of your question, I would still like to mention it. When handling large amounts of data across many tables, we usually look for the candidate set of keys for each table. Most tables are related to others through foreign keys, so we cannot always choose a single integer as the primary key. Sometimes we instead use a combination of three, four, or five attributes as the primary key of a table, and those keys can then serve as foreign keys when relating its records to other tables.
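Therefore, for optimized usage, we typically combine one or two integer attributes with one or two string attributes, but again only when it is required.
Other answers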
Inserting into a table that has a clustered index, where the insertion occurs in the middle of the sequence, DOES NOT cause the index to be rewritten. It does not cause the pages comprising the data to be rewritten either. If there is room on the page where the row belongs, the row is placed on that page, and only that single page is reformatted to put the row in the right position. When the page is full, a page split happens: half of the rows stay on one page and half go to a new one. The pages are then relinked into the linked list of pages that make up the data of the table with the clustered index. At most, you will end up writing 2 pages of the database.
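The page behavior described above can be sketched with a toy model. This is only an illustration under stated assumptions: `PAGE_CAPACITY`, `insert_key`, and the list-of-lists layout are hypothetical names and structures, not how any real database engine stores pages.

```python
from bisect import insort

PAGE_CAPACITY = 4  # hypothetical rows per page, kept tiny for illustration


def insert_key(pages, key):
    """Insert key into a non-empty list of sorted pages.

    Returns the number of pages written by this insert (1 or 2).
    """
    # Pick the first page whose largest key is >= key; otherwise the last page.
    target = len(pages) - 1
    for i, page in enumerate(pages):
        if page and key <= page[-1]:
            target = i
            break

    page = pages[target]
    if len(page) < PAGE_CAPACITY:
        insort(page, key)  # room on the page: only this page is rewritten
        return 1

    # Page is full: split it in half, then place the key in the correct half.
    mid = len(page) // 2
    left, right = page[:mid], page[mid:]
    insort(left if key <= left[-1] else right, key)
    pages[target:target + 1] = [left, right]  # relink both halves in order
    return 2


pages = [[10, 20, 30, 40], [50, 60]]
writes_split = insert_key(pages, 25)  # lands in the full first page
writes_plain = insert_key(pages, 55)  # lands in a page with free space
```

In this model the insert into the full page writes two pages and the other insert writes one, and the remaining page is never touched.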
Strings are slower in joins, and in real life they are very rarely truly unique (even when they are supposed to be). Their only advantage is that they can reduce the number of joins if you are joining to the primary table only to get the name. However, strings are also often subject to change, which creates the problem of having to fix all related records when the company name changes or the person gets married. This can be a huge performance hit, and if some of the tables that should be related are not actually related (this happens more often than you think), you might end up with data mismatches as well. An integer that will never change through the life of the record is a far safer choice from a data integrity standpoint as well as from a performance standpoint. Natural keys are usually not so good for maintenance of the data.
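A minimal sketch of the rename problem, using Python's built-in `sqlite3` module. The `company` and `employee` tables and their columns are hypothetical, and SQLite does not enforce foreign keys unless they are switched on, which is why the manual two-step update below is accepted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (name TEXT PRIMARY KEY);
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY,
        full_name TEXT NOT NULL,
        company_name TEXT NOT NULL REFERENCES company(name)
    );
""")
conn.execute("INSERT INTO company (name) VALUES ('Acme')")
conn.executemany(
    "INSERT INTO employee (full_name, company_name) VALUES (?, 'Acme')",
    [("Ann",), ("Bob",), ("Cy",)],
)

# Renaming the company means rewriting the parent row...
parent_rows = conn.execute(
    "UPDATE company SET name = 'Acme Global' WHERE name = 'Acme'"
).rowcount
# ...and every child row that stores the old name as its foreign key.
child_rows = conn.execute(
    "UPDATE employee SET company_name = 'Acme Global' WHERE company_name = 'Acme'"
).rowcount

rows_touched = parent_rows + child_rows
```

With three employees, one rename touches four rows. The cost grows with the number of child rows, and any table that gets missed is left pointing at a name that no longer exists.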
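I would also point out that the best of both worlds is often to use an auto-incrementing key (or, in some specialized cases, a GUID) as the PK and then put a unique index on the natural key. You get faster joins, you don't get duplicate records, and you don't have to update a million child records because a company name changed.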
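The surrogate-key approach can be sketched with Python's built-in `sqlite3` module. The table, column, and index names here are hypothetical, and other database engines spell auto-increment differently (for example `IDENTITY` or `SERIAL`).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (
        id INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        name TEXT NOT NULL                     -- natural key
    );
    CREATE UNIQUE INDEX company_name_uq ON company (name);
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        full_name TEXT NOT NULL,
        company_id INTEGER NOT NULL REFERENCES company(id)
    );
""")

company_id = conn.execute(
    "INSERT INTO company (name) VALUES ('Acme')"
).lastrowid
conn.executemany(
    "INSERT INTO employee (full_name, company_id) VALUES (?, ?)",
    [("Ann", company_id), ("Bob", company_id)],
)

# The unique index on the natural key still rejects duplicates.
try:
    conn.execute("INSERT INTO company (name) VALUES ('Acme')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# A rename touches exactly one row; child rows keep pointing at the same id.
renamed_rows = conn.execute(
    "UPDATE company SET name = 'Acme Global' WHERE id = ?", (company_id,)
).rowcount

joined = conn.execute(
    "SELECT e.full_name, c.name FROM employee e "
    "JOIN company c ON c.id = e.company_id ORDER BY e.id"
).fetchall()
```

After the rename, the join returns the new company name for both employees without any employee row having been updated.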
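Yes, but unless you expect to have millions of rows, avoiding a string-based key because it is slower is usually "premature optimization". After all, strings are stored as big numbers, while numeric keys are usually stored as smaller numbers.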
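As a rough illustration of the size difference, the sketch below compares the raw bytes of a 64-bit integer with those of a short company name. The sample values are made up, and real engines add their own encoding, collation, and per-row overhead, so treat the numbers as indicative only.

```python
import struct

# A 64-bit signed integer key always packs into 8 bytes.
int_key = struct.pack("<q", 1_000_000)

# A string key occupies at least one byte per character in UTF-8.
str_key = "Acme Global Holdings Ltd".encode("utf-8")

int_key_bytes = len(int_key)
str_key_bytes = len(str_key)

# Read as one big number, the string needs far more than 64 bits.
str_key_bits = int.from_bytes(str_key, "big").bit_length()
```

Comparing two such keys means comparing more bytes for the string, which is where the extra cost in joins and index lookups comes from.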
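One thing to watch out for, though, is having a clustered index on an arbitrary key while doing a lot of non-sequential inserts into that index. Every row written will cause the index to be rewritten. If you are doing batch inserts, this can really slow the process down.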
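There are two reasons to use an integer for the PK column. First, we can set an identity on an auto-incrementing integer field. Second, when we create a PK, the database creates an index (clustered or non-clustered) that keeps the data sorted before it is stored in the table. With an identity on the PK, the optimizer does not need to check the sort order before saving a record, which improves performance on large tables.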
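Another problem with using a string as the primary key is that, because the index is constantly kept in sorted order, the index has to be resequenced whenever a new key is created. If you use an auto-number integer, the new key is simply added to the end of the index.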
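To see where new keys land, here is a small sketch that uses a sorted Python list as a stand-in for an index. That is an assumption made for illustration: real indexes are B-trees, and `count_middle_inserts` is a hypothetical helper.

```python
import random
from bisect import bisect_left


def count_middle_inserts(keys):
    """Insert keys one by one into a sorted list.

    Returns how many inserts landed somewhere other than the end.
    """
    index = []
    middle = 0
    for key in keys:
        pos = bisect_left(index, key)
        if pos < len(index):
            middle += 1  # the key belongs before some existing entry
        index.insert(pos, key)
    return middle


rng = random.Random(0)  # fixed seed so the run is repeatable
alphabet = "abcdefghijklmnopqrstuvwxyz"
string_keys = [
    "".join(rng.choice(alphabet) for _ in range(8)) for _ in range(1000)
]
int_keys = list(range(1, 1001))  # what an auto-number column would produce

middle_for_ints = count_middle_inserts(int_keys)
middle_for_strings = count_middle_inserts(string_keys)
```

The sequential integers always append, so `middle_for_ints` is 0, while nearly all of the random strings land in the middle of the existing order.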