应用程序开发人员常见的数据库开发错误有哪些?


当前回答

有一件事我想补充,学习使用分析函数,如分区BY, RANK, DENSE_RANK (Oracle)。它们对于复杂的查询是绝对必要的。

其他建议是,如果可能的话,在您的开发团队中有一个专门的数据库开发人员,他应该是SQL、数据库建模、调优等方面的专家(但不是DBA)。这种技能是一笔巨大的财富。

其他回答

许多开发人员倾向于对数据库执行多个查询(通常查询一个或两个表),提取结果并在java/c/c++中执行简单的操作——所有这些都可以用一条SQL语句完成。

许多开发人员通常没有意识到,在开发环境中,数据库和应用程序服务器在他们的笔记本电脑上——但在生产环境中,数据库和应用程序服务器将在不同的机器上。因此,对于每个查询,在应用程序服务器和数据库服务器之间传递的数据都有额外的n/w开销。我惊奇地发现,为了向用户呈现一个页面,应用程序服务器对数据库服务器进行了大量的数据库调用!

这里有一个视频链接,名为“经典数据库开发错误和克服它们的五种方法”,作者是Scott Walz

忘记在表之间建立关系。我记得当我刚开始在我现在的雇主工作时,我不得不清理这些东西。

I think the biggest mistakes that all developers and DBAs do is believing too much on conventions. What I mean by that is that convention are only guide lines that for most cases will work but not necessarily always. I great example is normalization and foreign keys, I know most people wont like this, but normalization can cause complexity and cause loss of performance as well, so if there is no reason to move a phone number to a phones table, don't do it. On the foreign keys, they are great for most cases, but if you are trying to create something that can work by it self when needed the foreign key will be a problem in the future, and also you loose performance. Anyways, as I sad rules and conventions are there to guide, and they should always be though of but not necessarily implemented, analysis of each case is what should always be done.

相关子查询导致的性能差

大多数情况下,您希望避免相关子查询。如果子查询中存在对外部查询的列的引用,则子查询是相关的。当发生这种情况时,对于返回的每一行至少执行一次子查询,如果在应用包含相关子查询的条件之后应用其他条件,则可以执行更多次。

请原谅这个不自然的示例和Oracle语法,但假设您想要找到自上次商店每天销售额低于10,000美元以来在任何商店中雇用的所有员工。

select e.first_name, e.last_name
from employee e
where e.start_date > 
        (select max(ds.transaction_date)
         from daily_sales ds
         where ds.store_id = e.store_id and
               ds.total < 10000)

本例中的子查询通过store_id与外部查询相关联,并将对系统中的每个员工执行。优化此查询的一种方法是将子查询移动到内联视图。

select e.first_name, e.last_name
from employee e,
     (select ds.store_id,
             max(s.transaction_date) transaction_date
      from daily_sales ds
      where ds.total < 10000
      group by s.store_id) dsx
where e.store_id = dsx.store_id and
      e.start_date > dsx.transaction_date

In this example, the query in the from clause is now an inline-view (again some Oracle specific syntax) and is only executed once. Depending on your data model, this query will probably execute much faster. It would perform better than the first query as the number of employees grew. The first query could actually perform better if there were few employees and many stores (and perhaps many of stores had no employees) and the daily_sales table was indexed on store_id. This is not a likely scenario but shows how a correlated query could possibly perform better than an alternative.

我曾多次看到初级开发人员关联子查询,这通常会对性能产生严重影响。但是,当删除一个相关的子查询时,一定要查看之前和之后的解释计划,以确保您没有使性能变差。