我如何在MongoDB中执行SQL Join等效?

例如,假设你有两个集合(用户和评论),我想拉pid=444的所有评论以及每个评论的用户信息。

comments
  { uid:12345, pid:444, comment="blah" }
  { uid:12345, pid:888, comment="asdf" }
  { uid:99999, pid:444, comment="qwer" }

users
  { uid:12345, name:"john" }
  { uid:99999, name:"mia"  }

是否有一种方法可以一次性提取具有特定字段的所有评论(例如. ...find({pid:444}))和与每个评论相关的用户信息?

目前,我首先获取符合条件的注释,然后找出结果集中的所有uid,获取用户对象,并将它们与注释的结果合并。看来我做错了。


当前回答

mongodb官方网站上的这个页面恰好解决了这个问题:

https://mongodb-documentation.readthedocs.io/en/latest/ecosystem/tutorial/model-data-for-ruby-on-rails.html

When we display our list of stories, we'll need to show the name of the user who posted the story. If we were using a relational database, we could perform a join on users and stores, and get all our objects in a single query. But MongoDB does not support joins and so, at times, requires bit of denormalization. Here, this means caching the 'username' attribute. Relational purists may be feeling uneasy already, as if we were violating some universal law. But let’s bear in mind that MongoDB collections are not equivalent to relational tables; each serves a unique design objective. A normalized table provides an atomic, isolated chunk of data. A document, however, more closely represents an object as a whole. In the case of a social news site, it can be argued that a username is intrinsic to the story being posted.

其他回答

这取决于你想做什么。

目前您已经将其设置为规范化数据库,这很好,而且您的操作方式也很合适。

然而,还有其他的方法。

您可以有一个帖子集合,其中为每个帖子嵌入了评论,并引用了您可以迭代查询以获得的用户。您可以将用户名与注释一起存储,您可以将它们都存储在一个文档中。

The thing with NoSQL is it's designed for flexible schemas and very fast reading and writing. In a typical Big Data farm the database is the biggest bottleneck, you have fewer database engines than you do application and front end servers...they're more expensive but more powerful, also hard drive space is very cheap comparatively. Normalization comes from the concept of trying to save space, but it comes with a cost at making your databases perform complicated Joins and verifying the integrity of relationships, performing cascading operations. All of which saves the developers some headaches if they designed the database properly.

With NoSQL, if you accept that redundancy and storage space aren't issues because of their cost (both in processor time required to do updates and hard drive costs to store extra data), denormalizing isn't an issue (for embedded arrays that become hundreds of thousands of items it can be a performance issue, but most of the time that's not a problem). Additionally you'll have several application and front end servers for every database cluster. Have them do the heavy lifting of the joins and let the database servers stick to reading and writing.

TL;DR:你现在做的很好,还有其他的方法。查看mongodb文档的数据模型模式以获得一些很棒的示例。http://docs.mongodb.org/manual/data-modeling/

mongodb官方网站上的这个页面恰好解决了这个问题:

https://mongodb-documentation.readthedocs.io/en/latest/ecosystem/tutorial/model-data-for-ruby-on-rails.html

When we display our list of stories, we'll need to show the name of the user who posted the story. If we were using a relational database, we could perform a join on users and stores, and get all our objects in a single query. But MongoDB does not support joins and so, at times, requires bit of denormalization. Here, this means caching the 'username' attribute. Relational purists may be feeling uneasy already, as if we were violating some universal law. But let’s bear in mind that MongoDB collections are not equivalent to relational tables; each serves a unique design objective. A normalized table provides an atomic, isolated chunk of data. A document, however, more closely represents an object as a whole. In the case of a social news site, it can be argued that a username is intrinsic to the story being posted.

从Mongo 3.2开始,这个问题的答案大多不再正确。添加到聚合管道中的新的$lookup操作符本质上与左外连接相同:

https://docs.mongodb.org/master/reference/operator/aggregation/lookup/#pipe._S_lookup

从文档中可以看出:

{
   $lookup:
     {
       from: <collection to join>,
       localField: <field from the input documents>,
       foreignField: <field from the documents of the "from" collection>,
       as: <output array field>
     }
}

当然,MongoDB不是一个关系数据库,开发人员正在谨慎地推荐$lookup的特定用例,但至少在3.2中,使用MongoDB进行连接是可能的。

您可以使用聚合管道来实现它,但是自己编写它很麻烦。

您可以使用mongo-join-query从您的查询自动创建聚合管道。

这是你的查询的样子:

const mongoose = require("mongoose");
const joinQuery = require("mongo-join-query");

joinQuery(
    mongoose.models.Comment,
    {
        find: { pid:444 },
        populate: ["uid"]
    },
    (err, res) => (err ? console.log("Error:", err) : console.log("Success:", res.results))
);

您的结果将在uid字段中有user对象,您可以链接任意多的层次。您可以填充对用户的引用,从而引用一个Team,再引用其他东西,等等。

免责声明:我编写了mongo-join-query来解决这个问题。

我们可以使用mongoDB子查询来合并两个集合。举个例子, 评论,

`db.commentss.insert([
  { uid:12345, pid:444, comment:"blah" },
  { uid:12345, pid:888, comment:"asdf" },
  { uid:99999, pid:444, comment:"qwer" }])`

用户——

db.userss.insert([
  { uid:12345, name:"john" },
  { uid:99999, name:"mia"  }])

MongoDB子查询JOIN——

`db.commentss.find().forEach(
    function (newComments) {
        newComments.userss = db.userss.find( { "uid": newComments.uid } ).toArray();
        db.newCommentUsers.insert(newComments);
    }
);`

从新生成的Collection中获取结果

db.newCommentUsers.find().pretty()

结果——

`{
    "_id" : ObjectId("5511236e29709afa03f226ef"),
    "uid" : 12345,
    "pid" : 444,
    "comment" : "blah",
    "userss" : [
        {
            "_id" : ObjectId("5511238129709afa03f226f2"),
            "uid" : 12345,
            "name" : "john"
        }
    ]
}
{
    "_id" : ObjectId("5511236e29709afa03f226f0"),
    "uid" : 12345,
    "pid" : 888,
    "comment" : "asdf",
    "userss" : [
        {
            "_id" : ObjectId("5511238129709afa03f226f2"),
            "uid" : 12345,
            "name" : "john"
        }
    ]
}
{
    "_id" : ObjectId("5511236e29709afa03f226f1"),
    "uid" : 99999,
    "pid" : 444,
    "comment" : "qwer",
    "userss" : [
        {
            "_id" : ObjectId("5511238129709afa03f226f3"),
            "uid" : 99999,
            "name" : "mia"
        }
    ]
}`

希望这能有所帮助。