我们所有使用关系数据库的人都知道(或正在学习)SQL是不同的。获得期望的结果,并有效地这样做,涉及到一个乏味的过程,其部分特征是学习不熟悉的范例,并发现一些我们最熟悉的编程模式在这里不起作用。常见的反模式是什么?


当前回答

在他们职业生涯的前6个月学习SQL,在接下来的10年里从不学习其他任何东西。特别是没有学习或有效地使用窗口/分析SQL特性。特别是over()和partition by的使用。

窗口函数,比如聚合 函数时,对对象进行聚合 定义的行集(组),但是 而不是返回一个值 组,窗口函数可以返回 每个组有多个值。

请参阅O'Reilly SQL Cookbook附录A,以获得窗口函数的良好概述。

其他回答

没有注释的存储过程或函数…

使用@@IDENTITY代替SCOPE_IDENTITY()

引自以下回答:

@@IDENTITY returns the last identity value generated for any table in the current session, across all scopes. You need to be careful here, since it's across scopes. You could get a value from a trigger, instead of your current statement. SCOPE_IDENTITY returns the last identity value generated for any table in the current session and the current scope. Generally what you want to use. IDENT_CURRENT returns the last identity value generated for a specific table in any session and any scope. This lets you specify which table you want the value from, in case the two above aren't quite what you need (very rare). You could use this if you want to get the current IDENTITY value for a table that you have not inserted a record into.

我看到视图定义是这样的:

CREATE OR REPLACE FORCE VIEW PRICE (PART_NUMBER, PRICE_LIST, LIST_VERSION ...)
AS
  SELECT sp.MKT_PART_NUMBER,
    sp.PRICE_LIST,
    sp.LIST_VERSION,
    sp.MIN_PRICE,
    sp.UNIT_PRICE,
    sp.MAX_PRICE,
...

视图中大约有50个列。有些开发人员以不提供列别名而折磨他人为傲,因此必须计算两个位置的列偏移量,以便能够找出视图中对应的列。

编写查询的开发人员没有很好地了解SQL应用程序(包括单个查询和多用户系统)的快慢。这包括对以下方面的无知:

physical I/O minimization strategies, given that most queries' bottleneck is I/O not CPU perf impact of different kinds of physical storage access (e.g. lots of sequential I/O will be faster than lots of small random I/O, although less so if your physical storage is an SSD!) how to hand-tune a query if the DBMS produces a poor query plan how to diagnose poor database performance, how to "debug" a slow query, and how to read a query plan (or EXPLAIN, depending on your DBMS of choice) locking strategies to optimize throughput and avoid deadlocks in multi-user applications importance of batching and other tricks to handle processing of data sets table and index design to best balance space and performance (e.g. covering indexes, keeping indexes small where possible, reducing data types to minimum size needed, etc.)

没有使用With子句或适当的连接并依赖子查询。

反模式:

select 
 ...
from data
where RECORD.STATE IN (
          SELECT STATEID
            FROM STATE
           WHERE NAME IN
                    ('Published to test',
                     'Approved for public',
                     'Published to public',
                     'Archived'
                    ))

好: 我喜欢使用with子句使我的意图更易于阅读。

with valid_states as (
          SELECT STATEID
            FROM STATE
           WHERE NAME IN
                    ('Published to test',
                     'Approved for public',
                     'Published to public',
                     'Archived'
                    )
select  ... from data, valid_states
where data.state = valid_states.state

最好的:

select 
  ... 
from data join states using (state)
where 
states.state in  ('Published to test',
                     'Approved for public',
                     'Published to public',
                     'Archived'
                    )