我不是数据库专家,也没有正式的计算机科学背景,所以请原谅我。我想知道如果您使用v4之前的旧版本MongoDB(不兼容ACID),会发生哪些现实世界中的负面事情。这适用于任何不符合ACID要求的数据库。

我知道MongoDB可以执行原子操作,但它们不“支持传统的锁定和复杂的事务”,主要是出于性能原因。我也理解数据库事务的重要性,举个例子,当你的数据库是一家银行的,你正在更新几个记录,这些记录都需要同步,如果停电,你希望事务恢复到初始状态,所以信用等于购买,等等。

但是当我开始谈论MongoDB时,我们中那些不知道数据库如何实际实现的技术细节的人开始抛出这样的语句:

MongoDB比MySQL和Postgres快得多,但它“不能正确保存”的几率很小,比如百万分之一。

That "won't save correctly" part is referring to this understanding: If there's a power outage right at the instant you're writing to MongoDB, there's a chance for a particular record (say you're tracking pageviews in documents with 10 attributes each), that one of the documents only saved 5 of the attributes… which means over time your pageview counters are going to be "slightly" off. You'll never know by how much, you know they'll be 99.999% correct, but not 100%. This is because, unless you specifically made this a mongodb atomic operation, the operation is not guaranteed to have been atomic.

所以我的问题是,如何正确解释MongoDB何时以及为什么不能“正确保存”?它不满足ACID的哪些部分,在什么情况下,您如何知道0.001%的数据何时失效?这难道不能解决吗?如果没有,这似乎意味着您不应该在MongoDB中存储用户表之类的东西,因为记录可能无法保存。但话又说回来,1/ 100万用户可能只需要“再试一次注册”,不是吗?

我只是在寻找可能的一个列表,当/为什么负面的事情发生在一个ACID不合规的数据库,如MongoDB,理想情况下,如果有一个标准的解决方案(如运行后台作业来清理数据,或仅使用SQL等)。


当前回答

The only reason atomic modifies work against a single-collection is because the mongodb developers recently exchanged a database lock with a collection wide write-lock. Deciding that the increased concurrency here was worth the trade-off. At it's core, mongodb is a memory-mapped file: they've delegated the buffer-pool management to the machine's vm subsystem. Because it's always in memory, they're able to get away with very course grained locks: you'll be performing in-memory only operations while holding it, which will be extremely fast. This differs significantly from a traditional database system which is sometimes forced to perform I/O while holding a pagelock or a rowlock.

其他回答

“在MongoDB中,单个文档上的操作是原子的”——这是过去的事情

在MongoDB 4.0的新版本中,您可以:

However, for situations that require atomicity for updates to multiple documents or consistency between reads to multiple documents, MongoDB provides the ability to perform multi-document transactions against replica sets. Multi-document transactions can be used across multiple operations, collections, databases, and documents. Multi-document transactions provide an “all-or-nothing” proposition. When a transaction commits, all data changes made in the transaction are saved. If any operation in the transaction fails, the transaction aborts and all data changes made in the transaction are discarded without ever becoming visible. Until a transaction commits, no write operations in the transaction are visible outside the transaction.

尽管对于如何执行和执行什么操作有一些限制。

检查Mongo Doc。 https://docs.mongodb.com/master/core/transactions/

The only reason atomic modifies work against a single-collection is because the mongodb developers recently exchanged a database lock with a collection wide write-lock. Deciding that the increased concurrency here was worth the trade-off. At it's core, mongodb is a memory-mapped file: they've delegated the buffer-pool management to the machine's vm subsystem. Because it's always in memory, they're able to get away with very course grained locks: you'll be performing in-memory only operations while holding it, which will be extremely fast. This differs significantly from a traditional database system which is sometimes forced to perform I/O while holding a pagelock or a rowlock.

在“星巴克不使用两阶段提交”中有一个很好的解释。

这不是关于NoSQL数据库,但它确实说明了一点,有时您可以承担丢失事务或使数据库暂时处于不一致的状态。

我不认为这是需要“修复”的东西。解决办法是使用acid兼容的关系数据库。当NoSQL的行为符合应用程序需求时,您可以选择一个NoSQL替代方案。

如果您的存储支持每个键的线性化和比较和设置(对于MongoDB是这样的),您可以在客户端实现原子多键更新(可序列化的事务)。这种方法在谷歌的Percolator和CockroachDB中使用,但没有什么可以阻止你在MongoDB中使用它。

我已经创建了这种交易的一步一步的可视化。我希望它能帮助你理解它们。

如果您不介意读提交隔离级别,那么有必要看看Peter Bailis的RAMP事务。它们也可以在客户端为MongoDB实现。

我认为其他人已经给出了很好的答案。 然而,我想补充的是,有ACID NOSQL数据库(如http://ravendb.net/)。所以这不仅仅是NOSQL - no ACID vs关系型ACID....的决定