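I've worked on a lot of web applications driven by databases of varying complexity on the back end. Typically, there's an ORM layer separate from the business and presentation logic. This makes unit-testing the business logic fairly straightforward; things can be implemented in discrete modules, and any data needed for a test can be faked through object mocking.
But testing the ORM and the database itself has always been fraught with problems and compromises.
Over the years I've tried a few strategies, none of which completely satisfied me.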
Load a test database with known data. Run tests against the ORM and confirm that the right data comes back. The disadvantage here is that your test DB has to keep up with any schema changes in the application database, and might get out of sync. It also relies on artificial data, and may not expose bugs that occur due to stupid user input. Finally, if the test database is small, it won't reveal inefficiencies like a missing index. (OK, that last one isn't really what unit testing should be used for, but it doesn't hurt.)
Load a copy of the production database and test against that. The problem here is that you may have no idea what's in the production DB at any given time; your tests may need to be rewritten if data changes over time.
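Some people have pointed out that both of these strategies rely on specific data, and that a unit test should test only functionality. To that end, I've seen this suggested:
Use a mock database server, and check only that the ORM sends the correct queries in response to a given method call.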
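To make the mock-server idea concrete, here is a minimal sketch in Python using `unittest.mock`. The `UserRepository` class, the `users` table, and the query text are hypothetical stand-ins for whatever your ORM layer really does; the point is only that the test asserts on the SQL sent, not on any data in a real database.

```python
from unittest.mock import MagicMock


class UserRepository:
    """Hypothetical data-access class standing in for an ORM mapper."""

    def __init__(self, connection):
        self.connection = connection

    def find_by_email(self, email):
        cursor = self.connection.cursor()
        cursor.execute("SELECT id, email FROM users WHERE email = ?", (email,))
        return cursor.fetchone()


def check_find_by_email_sends_expected_query():
    # The mock plays the role of the database connection; no server is involved.
    connection = MagicMock()
    cursor = connection.cursor.return_value
    cursor.fetchone.return_value = (1, "a@example.com")

    repo = UserRepository(connection)
    row = repo.find_by_email("a@example.com")

    # Verify only that the expected query and parameters were sent.
    cursor.execute.assert_called_once_with(
        "SELECT id, email FROM users WHERE email = ?", ("a@example.com",)
    )
    return row
```

Note that a test like this passes even if the SQL is wrong for the real schema, which is the usual trade-off with this strategy.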
What strategies have you used for testing database-driven applications? What has worked best for you?
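I'm always asking this question myself, but I don't think there's a silver bullet for it.
What I currently do is mock the DAO objects and keep an in-memory representation of a good collection of objects that represent the interesting cases of data that could live in the database.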
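Roughly, that looks like the sketch below. The `Customer` type, the `InMemoryCustomerDao` class, and the `customers_missing_email` function are all made up for illustration; a real DAO interface would of course be larger.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Customer:
    id: int
    name: str
    email: Optional[str]


class InMemoryCustomerDao:
    """Hypothetical fake DAO: same interface as the real one, backed by a dict."""

    def __init__(self, customers):
        self._by_id = {c.id: c for c in customers}

    def get(self, customer_id):
        return self._by_id.get(customer_id)

    def all(self):
        return list(self._by_id.values())


def customers_missing_email(dao):
    """Business logic under test; it depends only on the DAO interface."""
    return [c.name for c in dao.all() if not c.email]


# A small collection covering the "interesting" data situations:
# a normal row, a NULL email, and an empty-string email.
INTERESTING_CUSTOMERS = [
    Customer(1, "Alice", "alice@example.com"),
    Customer(2, "Bob", None),
    Customer(3, "O'Brien", ""),
]
```

A test then builds `InMemoryCustomerDao(INTERESTING_CUSTOMERS)` and passes it to the code under test in place of the real DAO.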
The main problem I see with that approach is that you cover only the code that interacts with your DAO layer and never test the DAO itself, and in my experience a lot of errors happen in that layer too. I also keep a few unit tests that run against the database (for TDD or quick local testing), but those tests never run on my continuous integration server, since we don't keep a database for that purpose and I think tests that run on a CI server should be self-contained.
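Another approach that I find very interesting, but not always worth it since it is a bit time-consuming, is to create the same schema you use in production on an embedded database that runs only within the unit tests.
Even though there's no question that this approach improves your coverage, it has a few drawbacks, since you have to stay as close as possible to ANSI SQL to make it work with both your current DBMS and the embedded replacement.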
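As a rough illustration of the embedded-database idea, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. SQLite is my assumption for the embedded engine, and the `orders` table and `OrderDao` class are hypothetical. The DDL and queries stick to fairly plain SQL for the portability reason mentioned above.

```python
import sqlite3

# Assumed to mirror the production schema; kept close to ANSI SQL.
SCHEMA = """
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer VARCHAR(100) NOT NULL,
    total DECIMAL(10, 2) NOT NULL
);
"""


class OrderDao:
    """Hypothetical DAO; here the real SQL runs, unlike with a mocked DAO."""

    def __init__(self, connection):
        self.connection = connection

    def add(self, customer, total):
        cursor = self.connection.execute(
            "INSERT INTO orders (customer, total) VALUES (?, ?)",
            (customer, total),
        )
        return cursor.lastrowid

    def total_for(self, customer):
        row = self.connection.execute(
            "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer = ?",
            (customer,),
        ).fetchone()
        return row[0]


def make_test_connection():
    # A fresh in-memory database per test keeps tests isolated from each other.
    connection = sqlite3.connect(":memory:")
    connection.executescript(SCHEMA)
    return connection
```

Each test calls `make_test_connection()`, so nothing has to be installed or cleaned up on the CI server.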
Whichever approach you think is more relevant for your code, there are a few projects out there that can make it easier, like DbUnit.
I've actually used your first approach with quite some success, but in a slightly different way that I think would solve some of your problems:
Keep the entire schema and scripts for creating it in source control so that anyone can create the current database schema after a check out. In addition, keep sample data in data files that get loaded by part of the build process. As you discover data that causes errors, add it to your sample data to check that errors don't re-emerge.
Use a continuous integration server to build the database schema, load the sample data, and run tests. This is how we keep our test database in sync (rebuilding it at every test run). Though this requires that the CI server have access to and ownership of its own dedicated database instance, I'd say that having our db schema built 3 times a day has dramatically helped find errors that probably would not have been found till just before delivery (if not later). I can't say that I rebuild the schema before every commit. Does anybody? With this approach you won't have to (well, maybe we should, but it's not a big deal if someone forgets).
For my group, user input is done at the application level (not db) so this is tested via standard unit tests.
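The rebuild step from the first two points can be sketched as below. This is only an outline under some assumptions: SQLite stands in for the dedicated database instance, and the schema script and sample data are inlined as strings where a real build would read them from files under source control. The `users` table and both function names are hypothetical.

```python
import csv
import io
import sqlite3

# In a real project this would be the schema script checked into source control.
SCHEMA_SQL = """
DROP TABLE IF EXISTS users;
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
"""

# Sample data file; rows that once triggered bugs get appended here
# so the same errors cannot quietly come back.
SAMPLE_USERS_CSV = """\
id,name
1,Alice
2,O'Brien
3,Zoë
"""


def rebuild_database(connection, schema_sql, users_csv):
    """Drop and recreate the schema, then load the sample data."""
    connection.executescript(schema_sql)
    rows = [
        (int(record["id"]), record["name"])
        for record in csv.DictReader(io.StringIO(users_csv))
    ]
    connection.executemany("INSERT INTO users (id, name) VALUES (?, ?)", rows)
    connection.commit()
    return len(rows)


def find_user_ids_by_name(connection, name):
    cursor = connection.execute(
        "SELECT id FROM users WHERE name = ? ORDER BY id", (name,)
    )
    return [row[0] for row in cursor]
```

Because the script drops and recreates everything, running it at the start of every CI test run means the test database can never drift from the schema in source control.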
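Loading a copy of the production database:
This was the approach used at my last job. It was a huge pain because of a couple of issues:
The copy would get out of date relative to the production version.
Changes would be made to the copy's schema and wouldn't get propagated to the production systems. At that point we'd have diverging schemas. Not fun.
Mocking the database server:
We also do this at my current job. After every commit we execute unit tests against the application code with mock db accessors injected. Then three times a day we execute the full db build described above. I definitely recommend both approaches.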