应用程序开发人员常见的数据库开发错误有哪些?


当前回答

非常大的事务,插入/更新大量数据,然后重新加载它。基本上,这是因为没有考虑数据库工作的多用户环境。 过度使用函数,特别是作为select和where子句中的结果,导致函数被反复调用以获取结果。我认为,这符合他们试图以他们更习惯的过程方式工作而不是充分利用SQL的一般情况。

其他回答

1 -在where子句中的值上不必要地使用函数,导致该索引未被使用。

例子:

where to_char(someDate,'YYYYMMDD') between :fromDate and :toDate

而不是

where someDate >= to_date(:fromDate,'YYYYMMDD') and someDate < to_date(:toDate,'YYYYMMDD')+1

在较小的程度上:不向需要函数索引的值添加函数索引……

2 -没有添加检查约束以确保数据的有效性。查询优化器可以使用约束,它们确实有助于确保您可以信任您的不变量。没有理由不使用它们。

3 -纯粹出于懒惰或时间压力而向表中添加未规范化的列。事情通常不是这样设计的,而是演变成这样的。最终的结果是,当您在未来的演进中受到丢失的数据完整性的困扰时,将会有大量的工作试图清理混乱。

想想看,重新设计一个没有数据的表是非常便宜的。一个有数百万条记录的表,没有完整性……重新设计并不便宜。因此,在创建列或表时执行正确的设计是可以分摊的。

4 -不是关于数据库本身,但确实令人讨厌。不关心SQL的代码质量。SQL是用文本表示的事实并不意味着可以将逻辑隐藏在大量的字符串操作算法中。完全有可能用文本编写SQL,使您的程序员同事能够读懂。

我想补充一点: 偏好“优雅”代码而不是高性能代码。在应用程序开发人员看来,对数据库最有效的代码通常是丑陋的。

Believing that nonsense about premature optimization. Databases must consider performance in the original design and in any subsequent development. Performance is 50% of database design (40% is data integrity and the last 10% is security) in my opinion. Databases which are not built from the bottom up to perform will perform badly once real users and real traffic are placed against the database. Premature optimization doesn't mean no optimization! It doesn't mean you should write code that will almost always perform badly because you find it easier (cursors for example which should never be allowed in a production database unless all else has failed). It means you don't need to look at squeezing out that last little bit of performance until you need to. A lot is known about what will perform better on databases, to ignore this in design and development is short-sighted at best.

相关子查询导致的性能差

大多数情况下,您希望避免相关子查询。如果子查询中存在对外部查询的列的引用,则子查询是相关的。当发生这种情况时,对于返回的每一行至少执行一次子查询,如果在应用包含相关子查询的条件之后应用其他条件,则可以执行更多次。

请原谅这个不自然的示例和Oracle语法,但假设您想要找到自上次商店每天销售额低于10,000美元以来在任何商店中雇用的所有员工。

select e.first_name, e.last_name
from employee e
where e.start_date > 
        (select max(ds.transaction_date)
         from daily_sales ds
         where ds.store_id = e.store_id and
               ds.total < 10000)

本例中的子查询通过store_id与外部查询相关联,并将对系统中的每个员工执行。优化此查询的一种方法是将子查询移动到内联视图。

select e.first_name, e.last_name
from employee e,
     (select ds.store_id,
             max(s.transaction_date) transaction_date
      from daily_sales ds
      where ds.total < 10000
      group by s.store_id) dsx
where e.store_id = dsx.store_id and
      e.start_date > dsx.transaction_date

In this example, the query in the from clause is now an inline-view (again some Oracle specific syntax) and is only executed once. Depending on your data model, this query will probably execute much faster. It would perform better than the first query as the number of employees grew. The first query could actually perform better if there were few employees and many stores (and perhaps many of stores had no employees) and the daily_sales table was indexed on store_id. This is not a likely scenario but shows how a correlated query could possibly perform better than an alternative.

我曾多次看到初级开发人员关联子查询,这通常会对性能产生严重影响。但是,当删除一个相关的子查询时,一定要查看之前和之后的解释计划,以确保您没有使性能变差。

使用一些疯狂的构造和应用逻辑,而不是简单的COALESCE。

忘记在表之间建立关系。我记得当我刚开始在我现在的雇主工作时,我不得不清理这些东西。