What are common database development mistakes made by application developers?


Current answer

Poor performance caused by correlated subqueries

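Most of the time you want to avoid correlated subqueries. A subquery is correlated if it contains a reference to a column from the outer query. When this happens, the subquery is executed at least once for every row returned, and it can be executed more times if other conditions are applied after the condition containing the correlated subquery.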

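Forgive the contrived example and the Oracle syntax, but suppose you want to find all the employees hired in any of your stores since the last day on which that store did less than $10,000 in sales.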

select e.first_name, e.last_name
from employee e
where e.start_date > 
        (select max(ds.transaction_date)
         from daily_sales ds
         where ds.store_id = e.store_id and
               ds.total < 10000)

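The subquery in this example is correlated to the outer query by store_id and would be executed for every employee in your system. One way to optimize this query is to move the subquery into an inline view.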

select e.first_name, e.last_name
from employee e,
     (select ds.store_id,
             max(ds.transaction_date) transaction_date
      from daily_sales ds
      where ds.total < 10000
      group by ds.store_id) dsx
where e.store_id = dsx.store_id and
      e.start_date > dsx.transaction_date

In this example, the query in the from clause is now an inline view (again, some Oracle-specific syntax) and is executed only once. Depending on your data model, this query will probably execute much faster, and its advantage over the first query grows as the number of employees grows. The first query could actually perform better if there were few employees and many stores (and perhaps many of the stores had no employees) and the daily_sales table was indexed on store_id. This is not a likely scenario, but it shows how a correlated query could outperform an alternative.

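I have seen junior developers correlate subqueries many times, and it usually has a severe impact on performance. However, when removing a correlated subquery, be sure to look at the explain plan before and after to make sure you are not making performance worse.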
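The before-and-after comparison can be tried on a small scale. The following is a minimal sketch, not the answer author's setup: it uses Python's built-in sqlite3 module rather than Oracle, a few hypothetical rows in tables shaped like the queries above, and SQLite's EXPLAIN QUERY PLAN as a stand-in for an Oracle explain plan. It checks that both forms return the same employees and collects each plan for inspection.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the queries above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table employee (first_name text, last_name text,
                           store_id integer, start_date text);
    create table daily_sales (store_id integer, transaction_date text,
                              total integer);
    insert into employee values
        ('Ann', 'Lee',  1, '2024-03-01'),
        ('Bob', 'Ruiz', 1, '2024-01-10'),
        ('Cy',  'Wong', 2, '2024-02-15');
    insert into daily_sales values
        (1, '2024-02-01',  9000),
        (1, '2024-02-02', 15000),
        (2, '2024-03-01',  8000);
""")

CORRELATED = """
    select e.first_name, e.last_name
    from employee e
    where e.start_date >
            (select max(ds.transaction_date)
             from daily_sales ds
             where ds.store_id = e.store_id and
                   ds.total < 10000)
"""

INLINE_VIEW = """
    select e.first_name, e.last_name
    from employee e,
         (select ds.store_id,
                 max(ds.transaction_date) transaction_date
          from daily_sales ds
          where ds.total < 10000
          group by ds.store_id) dsx
    where e.store_id = dsx.store_id and
          e.start_date > dsx.transaction_date
"""

def rows(sql):
    """Run a query and return its rows in a stable order."""
    return sorted(conn.execute(sql).fetchall())

def plan(sql):
    """Return the readable step text of each EXPLAIN QUERY PLAN row."""
    return [r[-1] for r in conn.execute("explain query plan " + sql)]

correlated_rows = rows(CORRELATED)
inline_rows = rows(INLINE_VIEW)
correlated_plan = plan(CORRELATED)
inline_plan = plan(INLINE_VIEW)

for step in correlated_plan + ["--"] + inline_plan:
    print(step)
```

Plan text differs between engines and versions, so the useful habit is the comparison itself: confirm the rewritten query returns the same rows, then read both plans against realistic data volumes.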

Other answers

For SQL-based databases:

1) Not taking advantage of CLUSTERED INDEXES, or choosing the wrong column(s) to CLUSTER.

2) Not using a SERIAL (autonumber) datatype as a PRIMARY KEY to join to a FOREIGN KEY (INT) in a parent/child table relationship.

3) Not UPDATING STATISTICS on a table when many records have been INSERTED or DELETED.

4) Not reorganizing (i.e. unloading, dropping, re-creating, loading and re-indexing) tables when many rows have been inserted or deleted (some engines physically keep deleted rows in a table with a delete flag).

5) Not taking advantage of FRAGMENT ON EXPRESSION (if supported) on large tables which have high transaction rates.

6) Choosing the wrong datatype for a column.

7) Not choosing a proper column name.

8) Not adding new columns at the end of the table.

9) Not creating proper indexes to support frequently used queries.

10) Creating indexes on columns with few possible values, and creating unnecessary indexes.

...more to be added.
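The point about indexes that support frequently used queries can be checked directly. This is a minimal sketch under stated assumptions: it uses SQLite through Python's sqlite3 module, and the `orders` table and `idx_orders_customer` index are hypothetical names invented for the example. Other engines word their plans differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table orders (id integer primary key, customer_id integer, total integer)"
)
conn.executemany(
    "insert into orders (customer_id, total) values (?, ?)",
    [(i % 100, i) for i in range(1000)],
)

# A query the application is assumed to run often.
QUERY = "select total from orders where customer_id = 42"

def plan(sql):
    """Return the readable step text of each EXPLAIN QUERY PLAN row."""
    return [r[-1] for r in conn.execute("explain query plan " + sql)]

# Without an index the engine has to scan the whole table.
plan_without_index = plan(QUERY)

# With an index on the filtered column it can seek to the matching rows.
conn.execute("create index idx_orders_customer on orders (customer_id)")
plan_with_index = plan(QUERY)

print(plan_without_index)
print(plan_with_index)
```

The same check works in reverse for the last item in the list: an index that never shows up in the plans of real queries is a candidate for removal.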

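Not understanding the database's concurrency model and how it affects development. It is easy to add indexes and tweak queries after the fact. However, an application designed without proper consideration for hotspots, resource contention and correct operation (assuming what you just read is still valid!) can require significant changes in both the database and the application tier to correct later.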

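Very large transactions: inserting or updating a large amount of data and then reloading it. Basically, this comes from not considering the multi-user environment the database works in. Overuse of functions, specifically as results in selects and in where clauses, which causes the function to be called over and over again to produce the results. This, I think, fits the general case of developers trying to work in the procedural fashion they are more used to rather than using SQL to its full advantage.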
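The repeated function calls are easy to observe. The sketch below is an illustration under assumptions, not part of the original answer: it registers a hypothetical Python function `normalize` with SQLite through sqlite3's `create_function`, counts how many times the database invokes it in a where clause, and compares that with filtering on a hypothetical pre-normalized column `email_norm`.

```python
import sqlite3

counter = {"calls": 0}

def normalize(value):
    """Hypothetical helper; counts how often the database invokes it."""
    counter["calls"] += 1
    return value.strip().lower()

conn = sqlite3.connect(":memory:")
conn.create_function("normalize", 1, normalize)
conn.execute(
    "create table customer (id integer primary key, email text, email_norm text)"
)
conn.executemany(
    "insert into customer (email, email_norm) values (?, ?)",
    [(f" User{i}@Example.com ", f"user{i}@example.com") for i in range(500)],
)

# Function applied to the column: it is evaluated for every row scanned.
slow = conn.execute(
    "select id from customer where normalize(email) = 'user7@example.com'"
).fetchall()
calls_in_where = counter["calls"]

# Comparing a stored, pre-normalized column needs no function call at all.
counter["calls"] = 0
fast = conn.execute(
    "select id from customer where email_norm = 'user7@example.com'"
).fetchall()
calls_without_function = counter["calls"]

print(calls_in_where, calls_without_function)
```

Both queries find the same customer; the difference is the per-row work, which grows with the table.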

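1) Poor understanding of how to interact properly between Java and the database.

2) Over-parsing, and improper or no reuse of SQL.

3) Failing to use BIND variables.

4) Implementing procedural logic in Java when SQL set logic in the database would have worked (better).

5) Failing to do any reasonable performance or scalability testing before going into production.

6) Using Crystal Reports and failing to set the schema name properly in the reports.

7) Implementing SQL with Cartesian products because the execution plan was not understood (did you even look at the EXPLAIN PLAN?).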
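The Cartesian product in the last item is simple to reproduce. This is a minimal sketch using SQLite through Python's sqlite3 module, with hypothetical `store` and `employee` tables; the answer itself is about Java and a server database, so treat this only as an illustration of the row-count effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table store (store_id integer primary key, name text);
    create table employee (emp_id integer primary key, store_id integer, name text);
    insert into store values (1, 'North'), (2, 'South'), (3, 'East');
    insert into employee values
        (1, 1, 'Ann'), (2, 1, 'Bob'), (3, 2, 'Cy'), (4, 3, 'Di');
""")

# Join predicate forgotten: every employee is paired with every store.
cartesian = conn.execute(
    "select e.name, s.name from employee e, store s"
).fetchall()

# Join predicate present: each employee is paired with its own store only.
joined = conn.execute(
    "select e.name, s.name from employee e, store s "
    "where e.store_id = s.store_id"
).fetchall()

print(len(cartesian), len(joined))  # 12 4
```

With 4 employees and 3 stores the difference is 12 rows against 4; with production-sized tables the same mistake multiplies millions of rows, which is why the plan is worth reading first.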

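Not using parameterized queries. They are very handy for stopping SQL injection.

This is a specific example of not sanitizing input data, which is mentioned in another answer.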
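A short sketch of the difference, assuming SQLite through Python's sqlite3 module and a hypothetical `users` table: the same hostile input is first spliced into the SQL text and then passed as a bound parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table users (name text, secret text);
    insert into users values ('alice', 'a-secret'), ('bob', 'b-secret');
""")

# Hostile input that tries to turn the filter into an always-true condition.
user_input = "nobody' or '1'='1"

# Unsafe: the input is spliced into the SQL text and changes its meaning.
unsafe_sql = "select secret from users where name = '" + user_input + "'"
leaked = conn.execute(unsafe_sql).fetchall()

# Safe: the input is bound as a value and is only ever compared as a string.
safe = conn.execute(
    "select secret from users where name = ?", (user_input,)
).fetchall()

print(len(leaked), len(safe))  # 2 0
```

The placeholder syntax varies by driver (`?` here, named or numbered markers elsewhere), but the principle is the same: the statement text stays constant and only the bound values change, which also lets the database reuse the parsed statement.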