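I've heard that using SELECT * is generally bad practice when writing SQL commands, because it is more efficient to select only the columns you specifically need.

If I need to select every column in a table, should I use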

SELECT * FROM TABLE

or

SELECT column1, column2, column3, etc. FROM TABLE

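Does efficiency really matter in this case? I'd think SELECT * would be more optimal internally if you really need all of the data, but I'm saying this with no real understanding of databases.

I'm curious to know what the best practice is in this case.

UPDATE: I probably should have specified that the only situation where I would really want to do a SELECT * is when I'm selecting data from one table where I know all columns will always need to be retrieved, even when new columns are added.

Given the responses I've seen, however, this still seems like a bad idea, and SELECT * should not be used, for many more technical reasons than I had ever thought about.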


Current answer

There are cases where SELECT * is appropriate for maintenance purposes, but in general it should be avoided.

These are special cases like views or stored procedures where you want changes in underlying tables to propagate without needing to go and change every view and stored procedure that uses the table. Even then, this can cause problems of its own, as in the case where you have two views which are joined. One underlying table changes and now the view is ambiguous because both tables have a column with the same name. (Note that this can happen any time you don't qualify all your columns with table prefixes.) Even with prefixes, if you have a construct like:

SELECT A.*, B.* - you can have problems where the client now has difficulty selecting the right field.
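As a rough illustration of that client-side difficulty, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in database (the answer does not name an engine, and other engines may label result columns differently). The tables a and b and their columns are hypothetical. Both tables have id and name columns, so the result of SELECT a.*, b.* carries duplicate column names, and a client that keys values by name quietly loses some of them.

```python
import sqlite3

# Hypothetical tables "a" and "b"; both happen to have "id" and "name" columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER, name TEXT);
    INSERT INTO a VALUES (1, 'from a');
    INSERT INTO b VALUES (10, 1, 'from b');
""")

# Star version: SQLite reports bare column names, so "id" and "name" repeat.
cur = conn.execute("SELECT a.*, b.* FROM a JOIN b ON b.a_id = a.id")
star_names = [d[0] for d in cur.description]
# Keying by name keeps only the last value seen for each duplicated name.
star_by_name = dict(zip(star_names, cur.fetchone()))

# Explicit version: every column gets a unique alias chosen by the query author.
cur = conn.execute(
    "SELECT a.id AS a_id, a.name AS a_name, b.id AS b_id, b.name AS b_name "
    "FROM a JOIN b ON b.a_id = a.id"
)
explicit_names = [d[0] for d in cur.description]
explicit_by_name = dict(zip(explicit_names, cur.fetchone()))

print(star_names)
print(star_by_name)
print(explicit_by_name)
```

With explicit aliases the client never has to guess which id or name it received.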

In general, I do not use SELECT * unless I am making a conscious design decision and counting on the related risks being low.

Other answers

The result is too large. It is slow to generate the result in the SQL engine and send it to the client.

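The client side, being a generic programming environment, is not and should not be designed to filter and process the results (e.g. the WHERE clause, the ORDER BY clause), because the number of rows can be huge (e.g. tens of millions of rows).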
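To make the contrast concrete, here is a small sketch, again assuming Python's sqlite3 module as a stand-in engine and a hypothetical orders table. It computes the same result twice: once by pulling every row to the client and filtering and sorting there, and once by letting WHERE and ORDER BY do the work in the engine.

```python
import sqlite3

# Hypothetical "orders" table with 10,000 rows spread over 100 customers.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [(f"customer{i % 100}", float(i)) for i in range(10_000)],
)

# Client-side: every row is handed to the client, which then filters and sorts.
all_rows = conn.execute("SELECT * FROM orders").fetchall()
client_side = sorted(
    (r for r in all_rows if r[1] == "customer7"),
    key=lambda r: r[2],
    reverse=True,
)

# Server-side: the engine filters and sorts; only matching rows are returned.
server_side = conn.execute(
    "SELECT id, customer, total FROM orders "
    "WHERE customer = ? ORDER BY total DESC",
    ("customer7",),
).fetchall()

print(len(all_rows), len(server_side))
```

Both approaches give the same rows, but the first one moves 10,000 rows to produce 100. With tens of millions of rows the gap grows accordingly.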

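Remember that if you have an inner join, then by definition you do not need all the columns, because the data in the join columns is repeated.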

It's not like listing columns in SQL Server is hard or even time-consuming. You just drag them over from the object browser (you can get them all in one go by dragging from the word "Columns"). Using SELECT * puts a permanent performance hit on your system, because it can reduce the use of indexes and because sending unneeded data over the network is costly. It also makes it more likely that you will have unexpected problems as the database changes (for instance, columns sometimes get added that you do not want the user to see). Doing all that just to save less than a minute of development time is short-sighted and unprofessional.
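The "unexpected problems as the database changes" point can be sketched as follows. This assumes Python's sqlite3 module rather than SQL Server, and the users table, the password_hash column and the helper functions are all hypothetical. After a column is added, the SELECT * query starts returning it, and code that assumed two columns breaks, while the query with an explicit column list is unaffected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")


def names_via_star(conn):
    # Assumes SELECT * returns exactly two columns: id and name.
    return [name for _id, name in conn.execute("SELECT * FROM users")]


def names_via_columns(conn):
    # Asks for exactly the columns it unpacks.
    return [name for _id, name in conn.execute("SELECT id, name FROM users")]


before = names_via_star(conn)

# A later schema change adds a column that users should never see.
conn.execute("ALTER TABLE users ADD COLUMN password_hash TEXT")

after_columns = names_via_columns(conn)
star_columns = [d[0] for d in conn.execute("SELECT * FROM users").description]

try:
    names_via_star(conn)
    star_broke = False
except ValueError:
    # Unpacking three values into two names fails.
    star_broke = True

print(before, after_columns, star_columns, star_broke)
```

Here the failure is at least loud. A client that simply renders whatever columns come back would instead start displaying password_hash without any error.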

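Let's think about which is faster. If you can select just the data you need, then it is faster. However, during testing you can pull all the data to judge which data can be filtered out based on business needs.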

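It depends on the version of your DB server, but modern versions of SQL can cache the plan either way. I'd say go with whatever is most maintainable for your data access code.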

SELECT * is bad practice even if the query is not sent over a network.

Selecting more data than you need makes the query less efficient - the server has to read and transfer extra data, so it takes time and creates unnecessary load on the system (not only the network, as others mentioned, but also disk, CPU etc.). Additionally, the server is unable to optimize the query as well as it might (for example, by using a covering index for the query). After some time your table structure might change, so SELECT * will return a different set of columns. Your application might then get a dataset of unexpected structure and break somewhere downstream. Explicitly stating the columns guarantees that you either get a dataset of known structure, or get a clear error at the database level (like 'column not found').
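The covering index point can be observed directly in some engines. The sketch below assumes SQLite (through Python's sqlite3 module) and a hypothetical people table with an index on (email, name). Other database servers expose their plans differently, so treat this as an illustration of the idea rather than of any particular server's behavior.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, email TEXT, name TEXT, bio TEXT);
    CREATE INDEX idx_people_email_name ON people (email, name);
""")


def plan(sql):
    # The last field of each EXPLAIN QUERY PLAN row is a readable description.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, ("a@example.com",)).fetchall()
    return " | ".join(r[-1] for r in rows)


# Needs "bio", which is not in the index, so the table itself must be read.
star_plan = plan("SELECT * FROM people WHERE email = ?")

# Needs only "name", which the index already holds: a covering index lookup.
narrow_plan = plan("SELECT name FROM people WHERE email = ?")

print(star_plan)
print(narrow_plan)
```

In this sketch the narrow query is answered from the index alone, while the SELECT * version has to go back to the table for the remaining columns.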

Of course, none of this matters much for a small and simple system.