我如何在MongoDB中执行SQL Join等效?

例如,假设你有两个集合(用户和评论),我想拉pid=444的所有评论以及每个评论的用户信息。

comments
  { uid:12345, pid:444, comment="blah" }
  { uid:12345, pid:888, comment="asdf" }
  { uid:99999, pid:444, comment="qwer" }

users
  { uid:12345, name:"john" }
  { uid:99999, name:"mia"  }

是否有一种方法可以一次性提取具有特定字段的所有评论(例如. ...find({pid:444}))和与每个评论相关的用户信息?

目前,我首先获取符合条件的注释,然后找出结果集中的所有uid,获取用户对象,并将它们与注释的结果合并。看来我做错了。


你必须按照你描述的方法去做。MongoDB是非关系数据库,不支持连接。


有一个很多驱动程序都支持的规范叫做DBRef。

DBRef是用于在文档之间创建引用的更正式的规范。DBRefs(通常)包括集合名称和对象id。大多数开发人员只在集合可以从一个文档更改到下一个文档时才使用DBRefs。如果您引用的集合总是相同的,那么上面概述的手动引用更有效。

摘自MongoDB文档:数据模型>数据模型参考> 数据库的引用


mongodb官方网站上的这个页面恰好解决了这个问题:

https://mongodb-documentation.readthedocs.io/en/latest/ecosystem/tutorial/model-data-for-ruby-on-rails.html

When we display our list of stories, we'll need to show the name of the user who posted the story. If we were using a relational database, we could perform a join on users and stores, and get all our objects in a single query. But MongoDB does not support joins and so, at times, requires bit of denormalization. Here, this means caching the 'username' attribute. Relational purists may be feeling uneasy already, as if we were violating some universal law. But let’s bear in mind that MongoDB collections are not equivalent to relational tables; each serves a unique design objective. A normalized table provides an atomic, isolated chunk of data. A document, however, more closely represents an object as a whole. In the case of a social news site, it can be argued that a username is intrinsic to the story being posted.


下面是一个“join”* Actors和Movies集合的例子:

https://github.com/mongodb/cookbook/blob/master/content/patterns/pivot.txt

它使用了.mapReduce()方法

join -在面向文档的数据库中加入的替代方案


playORM可以为您使用S-SQL(可伸缩SQL),它只是添加分区,这样您就可以在分区内进行连接。


你可以使用Postgres中的mongo_fdw在MongoDB上运行包括join在内的SQL查询。


这取决于你想做什么。

目前您已经将其设置为规范化数据库,这很好,而且您的操作方式也很合适。

然而,还有其他的方法。

您可以有一个帖子集合,其中为每个帖子嵌入了评论,并引用了您可以迭代查询以获得的用户。您可以将用户名与注释一起存储,您可以将它们都存储在一个文档中。

The thing with NoSQL is it's designed for flexible schemas and very fast reading and writing. In a typical Big Data farm the database is the biggest bottleneck, you have fewer database engines than you do application and front end servers...they're more expensive but more powerful, also hard drive space is very cheap comparatively. Normalization comes from the concept of trying to save space, but it comes with a cost at making your databases perform complicated Joins and verifying the integrity of relationships, performing cascading operations. All of which saves the developers some headaches if they designed the database properly.

With NoSQL, if you accept that redundancy and storage space aren't issues because of their cost (both in processor time required to do updates and hard drive costs to store extra data), denormalizing isn't an issue (for embedded arrays that become hundreds of thousands of items it can be a performance issue, but most of the time that's not a problem). Additionally you'll have several application and front end servers for every database cluster. Have them do the heavy lifting of the joins and let the database servers stick to reading and writing.

TL;DR:你现在做的很好,还有其他的方法。查看mongodb文档的数据模型模式以获得一些很棒的示例。http://docs.mongodb.org/manual/data-modeling/


我们可以使用mongodb客户端控制台在几行中使用一个简单的函数合并/连接一个集合中的所有数据,现在我们可以执行所需的查询。 下面是一个完整的例子,

——作者:

db.authors.insert([
    {
        _id: 'a1',
        name: { first: 'orlando', last: 'becerra' },
        age: 27
    },
    {
        _id: 'a2',
        name: { first: 'mayra', last: 'sanchez' },
        age: 21
    }
]);

——类:

db.categories.insert([
    {
        _id: 'c1',
        name: 'sci-fi'
    },
    {
        _id: 'c2',
        name: 'romance'
    }
]);

——书

db.books.insert([
    {
        _id: 'b1',
        name: 'Groovy Book',
        category: 'c1',
        authors: ['a1']
    },
    {
        _id: 'b2',
        name: 'Java Book',
        category: 'c2',
        authors: ['a1','a2']
    },
]);

-图书借阅

db.lendings.insert([
    {
        _id: 'l1',
        book: 'b1',
        date: new Date('01/01/11'),
        lendingBy: 'jose'
    },
    {
        _id: 'l2',
        book: 'b1',
        date: new Date('02/02/12'),
        lendingBy: 'maria'
    }
]);

-神奇之处:

db.books.find().forEach(
    function (newBook) {
        newBook.category = db.categories.findOne( { "_id": newBook.category } );
        newBook.lendings = db.lendings.find( { "book": newBook._id  } ).toArray();
        newBook.authors = db.authors.find( { "_id": { $in: newBook.authors }  } ).toArray();
        db.booksReloaded.insert(newBook);
    }
);

-获取新的收集数据:

db.booksReloaded.find().pretty()

-回复:)

{
    "_id" : "b1",
    "name" : "Groovy Book",
    "category" : {
        "_id" : "c1",
        "name" : "sci-fi"
    },
    "authors" : [
        {
            "_id" : "a1",
            "name" : {
                "first" : "orlando",
                "last" : "becerra"
            },
            "age" : 27
        }
    ],
    "lendings" : [
        {
            "_id" : "l1",
            "book" : "b1",
            "date" : ISODate("2011-01-01T00:00:00Z"),
            "lendingBy" : "jose"
        },
        {
            "_id" : "l2",
            "book" : "b1",
            "date" : ISODate("2012-02-02T00:00:00Z"),
            "lendingBy" : "maria"
        }
    ]
}
{
    "_id" : "b2",
    "name" : "Java Book",
    "category" : {
        "_id" : "c2",
        "name" : "romance"
    },
    "authors" : [
        {
            "_id" : "a1",
            "name" : {
                "first" : "orlando",
                "last" : "becerra"
            },
            "age" : 27
        },
        {
            "_id" : "a2",
            "name" : {
                "first" : "mayra",
                "last" : "sanchez"
            },
            "age" : 21
        }
    ],
    "lendings" : [ ]
}

希望这句话能帮到你。


我们可以使用mongoDB子查询来合并两个集合。举个例子, 评论,

`db.commentss.insert([
  { uid:12345, pid:444, comment:"blah" },
  { uid:12345, pid:888, comment:"asdf" },
  { uid:99999, pid:444, comment:"qwer" }])`

用户——

db.userss.insert([
  { uid:12345, name:"john" },
  { uid:99999, name:"mia"  }])

MongoDB子查询JOIN——

`db.commentss.find().forEach(
    function (newComments) {
        newComments.userss = db.userss.find( { "uid": newComments.uid } ).toArray();
        db.newCommentUsers.insert(newComments);
    }
);`

从新生成的Collection中获取结果

db.newCommentUsers.find().pretty()

结果——

`{
    "_id" : ObjectId("5511236e29709afa03f226ef"),
    "uid" : 12345,
    "pid" : 444,
    "comment" : "blah",
    "userss" : [
        {
            "_id" : ObjectId("5511238129709afa03f226f2"),
            "uid" : 12345,
            "name" : "john"
        }
    ]
}
{
    "_id" : ObjectId("5511236e29709afa03f226f0"),
    "uid" : 12345,
    "pid" : 888,
    "comment" : "asdf",
    "userss" : [
        {
            "_id" : ObjectId("5511238129709afa03f226f2"),
            "uid" : 12345,
            "name" : "john"
        }
    ]
}
{
    "_id" : ObjectId("5511236e29709afa03f226f1"),
    "uid" : 99999,
    "pid" : 444,
    "comment" : "qwer",
    "userss" : [
        {
            "_id" : ObjectId("5511238129709afa03f226f3"),
            "uid" : 99999,
            "name" : "mia"
        }
    ]
}`

希望这能有所帮助。


MongoDB不允许连接,但是你可以使用插件来处理。检查mongo-join插件。这是最好的,我已经用过了。你可以直接使用npm安装它,就像这个npm install mongo-join。您可以通过示例查看完整的文档。

(++)非常有用的工具,当我们需要加入(N)个集合

(——)我们可以只在查询的顶层应用条件

例子

var Join = require('mongo-join').Join, mongodb = require('mongodb'), Db = mongodb.Db, Server = mongodb.Server;
db.open(function (err, Database) {
    Database.collection('Appoint', function (err, Appoints) {

        /* we can put conditions just on the top level */
        Appoints.find({_id_Doctor: id_doctor ,full_date :{ $gte: start_date },
            full_date :{ $lte: end_date }}, function (err, cursor) {
            var join = new Join(Database).on({
                field: '_id_Doctor', // <- field in Appoints document
                to: '_id',         // <- field in User doc. treated as ObjectID automatically.
                from: 'User'  // <- collection name for User doc
            }).on({
                field: '_id_Patient', // <- field in Appoints doc
                to: '_id',         // <- field in User doc. treated as ObjectID automatically.
                from: 'User'  // <- collection name for User doc
            })
            join.toArray(cursor, function (err, joinedDocs) {

                /* do what ever you want here */
                /* you can fetch the table and apply your own conditions */
                .....
                .....
                .....


                resp.status(200);
                resp.json({
                    "status": 200,
                    "message": "success",
                    "Appoints_Range": joinedDocs,


                });
                return resp;


            });

    });

As others have pointed out you are trying to create a relational database from none relational database which you really don't want to do but anyways, if you have a case that you have to do this here is a solution you can use. We first do a foreach find on collection A( or in your case users) and then we get each item as an object then we use object property (in your case uid) to lookup in our second collection (in your case comments) if we can find it then we have a match and we can print or do something with it. Hope this helps you and good luck :)

db.users.find().forEach(
function (object) {
    var commonInBoth=db.comments.findOne({ "uid": object.uid} );
    if (commonInBoth != null) {
        printjson(commonInBoth) ;
        printjson(object) ;
    }else {
        // did not match so we don't care in this case
    }
});

不,看起来你并没有做错。MongoDB连接是“客户端”。就像你说的

目前,我首先获取符合条件的注释,然后找出结果集中的所有uid,获取用户对象,并将它们与注释的结果合并。看来我做错了。

1) Select from the collection you're interested in.
2) From that collection pull out ID's you need
3) Select from other collections
4) Decorate your original results.

它不是一个“真正的”连接,但它实际上比SQL连接有用得多,因为您不必处理“多”面连接的重复行,而是修饰最初选择的集合。

这一页上有很多废话和FUD。结果5年后,MongoDB仍然存在。


我认为,如果你需要规范化的数据表-你需要尝试一些其他的数据库解决方案。

但是我在Git上找到了MOngo的解决方案 顺便说一下,在插入代码-它有电影的名称,但没有电影的ID。

问题

你有一个演员集合和他们所做的电影数组。

您希望生成一个Movies集合,每个Movies中都包含一个actor数组。

一些示例数据

 db.actors.insert( { actor: "Richard Gere", movies: ['Pretty Woman', 'Runaway Bride', 'Chicago'] });
 db.actors.insert( { actor: "Julia Roberts", movies: ['Pretty Woman', 'Runaway Bride', 'Erin Brockovich'] });

解决方案

我们需要循环遍历Actor文档中的每个电影,并分别发出每个电影。

这里的问题是在减少阶段。我们不能从reduce阶段发出一个数组,因此必须在返回的“value”文档中构建一个Actors数组。

The code
map = function() {
  for(var i in this.movies){
    key = { movie: this.movies[i] };
    value = { actors: [ this.actor ] };
    emit(key, value);
  }
}

reduce = function(key, values) {
  actor_list = { actors: [] };
  for(var i in values) {
    actor_list.actors = values[i].actors.concat(actor_list.actors);
  }
  return actor_list;
}

注意,actor_list实际上是一个包含数组的javascript对象。还要注意map发出相同的结构。

执行以下命令执行map / reduce,将其输出到“pivot”集合并打印结果:

printjson (db.actors。mapReduce(map, reduce, "pivot")); db.pivot.find () .forEach (printjson);

以下是输出示例,请注意《风月俏佳人》和《逃跑新娘》中都有“理查德·基尔”和“茱莉亚·罗伯茨”。

{ "_id" : { "movie" : "Chicago" }, "value" : { "actors" : [ "Richard Gere" ] } }
{ "_id" : { "movie" : "Erin Brockovich" }, "value" : { "actors" : [ "Julia Roberts" ] } }
{ "_id" : { "movie" : "Pretty Woman" }, "value" : { "actors" : [ "Richard Gere", "Julia Roberts" ] } }
{ "_id" : { "movie" : "Runaway Bride" }, "value" : { "actors" : [ "Richard Gere", "Julia Roberts" ] } }


从Mongo 3.2开始,这个问题的答案大多不再正确。添加到聚合管道中的新的$lookup操作符本质上与左外连接相同:

https://docs.mongodb.org/master/reference/operator/aggregation/lookup/#pipe._S_lookup

从文档中可以看出:

{
   $lookup:
     {
       from: <collection to join>,
       localField: <field from the input documents>,
       foreignField: <field from the documents of the "from" collection>,
       as: <output array field>
     }
}

当然,MongoDB不是一个关系数据库,开发人员正在谨慎地推荐$lookup的特定用例,但至少在3.2中,使用MongoDB进行连接是可能的。


你可以在Mongo中使用3.2版本提供的查找来连接两个集合。在您的情况下,查询将是

db.comments.aggregate({
    $lookup:{
        from:"users",
        localField:"uid",
        foreignField:"uid",
        as:"users_comments"
    }
})

或者你也可以加入关于用户,然后会有一个小的变化如下所示。

db.users.aggregate({
    $lookup:{
        from:"comments",
        localField:"uid",
        foreignField:"uid",
        as:"users_comments"
    }
})

它的工作原理与SQL中的左连接和右连接一样。


在3.2.6之前,Mongodb不像mysql那样支持join查询。下面是适合你的解决方案。

 db.getCollection('comments').aggregate([
        {$match : {pid : 444}},
        {$lookup: {from: "users",localField: "uid",foreignField: "uid",as: "userData"}},
   ])

通过正确组合$lookup, $project和$match,您可以在多个参数上连接多个表。这是因为它们可以被链接多次。

假设我们想做以下(引用)

SELECT S.* FROM LeftTable S
LEFT JOIN RightTable R ON S.ID = R.ID AND S.MID = R.MID  
WHERE R.TIM > 0 AND S.MOB IS NOT NULL

步骤1:链接所有表

您可以根据需要查找任意数量的表。

$lookup -查询中的每个表一个

$unwind -正确地反规格化数据,否则它将被包装在数组中

Python代码. .

db.LeftTable.aggregate([
                        # connect all tables

                        {"$lookup": {
                          "from": "RightTable",
                          "localField": "ID",
                          "foreignField": "ID",
                          "as": "R"
                        }},
                        {"$unwind": "R"}
                   
                        ])

步骤2:定义所有条件

$project:在这里定义所有的条件语句,加上所有你想选择的变量。

Python代码. .

db.LeftTable.aggregate([
                        # connect all tables

                        {"$lookup": {
                          "from": "RightTable",
                          "localField": "ID",
                          "foreignField": "ID",
                          "as": "R"
                        }},
                        {"$unwind": "R"},

                        # define conditionals + variables

                        {"$project": {
                          "midEq": {"$eq": ["$MID", "$R.MID"]},
                          "ID": 1, "MOB": 1, "MID": 1
                        }}
                        ])

第三步:连接所有的条件句

$match -使用OR或AND等连接所有条件可以有很多个。

$project:取消所有的条件

完整的Python代码。

db.LeftTable.aggregate([
                        # connect all tables

                        {"$lookup": {
                          "from": "RightTable",
                          "localField": "ID",
                          "foreignField": "ID",
                          "as": "R"
                        }},
                        {"$unwind": "$R"},

                        # define conditionals + variables

                        {"$project": {
                          "midEq": {"$eq": ["$MID", "$R.MID"]},
                          "ID": 1, "MOB": 1, "MID": 1
                        }},

                        # join all conditionals

                        {"$match": {
                          "$and": [
                            {"R.TIM": {"$gt": 0}}, 
                            {"MOB": {"$exists": True}},
                            {"midEq": {"$eq": True}}
                        ]}},

                        # undefine conditionals

                        {"$project": {
                          "midEq": 0
                        }}

                        ])

几乎任何表、条件和连接的组合都可以用这种方式完成。


您可以使用聚合管道来实现它,但是自己编写它很麻烦。

您可以使用mongo-join-query从您的查询自动创建聚合管道。

这是你的查询的样子:

const mongoose = require("mongoose");
const joinQuery = require("mongo-join-query");

joinQuery(
    mongoose.models.Comment,
    {
        find: { pid:444 },
        populate: ["uid"]
    },
    (err, res) => (err ? console.log("Error:", err) : console.log("Success:", res.results))
);

您的结果将在uid字段中有user对象,您可以链接任意多的层次。您可以填充对用户的引用,从而引用一个Team,再引用其他东西,等等。

免责声明:我编写了mongo-join-query来解决这个问题。


查找美元(聚合)

对同一数据库中的未分片集合执行左外连接,以从“已连接”集合中筛选文档进行处理。$查找阶段向每个输入文档添加一个新的数组字段,其元素是“已加入”集合中的匹配文档。$查找阶段将这些重新塑造的文档传递给下一个阶段。 $查找阶段的语法如下:

平等的比赛

要在输入文档中的字段与" joined "集合中的文档中的字段之间执行相等匹配,$lookup stage的语法如下:

{
   $lookup:
     {
       from: <collection to join>,
       localField: <field from the input documents>,
       foreignField: <field from the documents of the "from" collection>,
       as: <output array field>
     }
}

该操作将对应于以下伪sql语句:

SELECT *, <output array field>
FROM collection
WHERE <output array field> IN (SELECT <documents as determined from the pipeline>
                               FROM <collection to join>
                               WHERE <pipeline> );

蒙哥URL