我如何(在MongoDB)结合数据从多个集合到一个集合?
我可以使用地图减少,如果是,然后如何?
我非常感谢一些例子,因为我是一个新手。
我如何(在MongoDB)结合数据从多个集合到一个集合?
我可以使用地图减少,如果是,然后如何?
我非常感谢一些例子,因为我是一个新手。
当前回答
Mongorestore有这样一个特性,即在数据库中已经存在的数据之上追加数据,所以这个行为可以用于组合两个集合:
mongodump文物 collection2.rename(文物) mongorestore
还没有尝试过,但它可能比map/reduce方法执行得更快。
其他回答
非常基本的$lookup示例。
db.getCollection('users').aggregate([
{
$lookup: {
from: "userinfo",
localField: "userId",
foreignField: "userId",
as: "userInfoData"
}
},
{
$lookup: {
from: "userrole",
localField: "userId",
foreignField: "userId",
as: "userRoleData"
}
},
{ $unwind: { path: "$userInfoData", preserveNullAndEmptyArrays: true }},
{ $unwind: { path: "$userRoleData", preserveNullAndEmptyArrays: true }}
])
这里用到了
{ $unwind: { path: "$userInfoData", preserveNullAndEmptyArrays: true }},
{ $unwind: { path: "$userRoleData", preserveNullAndEmptyArrays: true }}
而不是
{ $unwind:"$userRoleData"}
{ $unwind:"$userRoleData"}
因为{$unwind:"$userRoleData"}如果在$lookup中没有找到匹配的记录,将返回空或0结果。
在MongoDB中以“SQL UNION”的方式进行联合,可以在单个查询中使用聚合和查找。下面是我测试过的MongoDB 4.0的一个例子:
// Create employees data for testing the union.
db.getCollection('employees').insert({ name: "John", type: "employee", department: "sales" });
db.getCollection('employees').insert({ name: "Martha", type: "employee", department: "accounting" });
db.getCollection('employees').insert({ name: "Amy", type: "employee", department: "warehouse" });
db.getCollection('employees').insert({ name: "Mike", type: "employee", department: "warehouse" });
// Create freelancers data for testing the union.
db.getCollection('freelancers').insert({ name: "Stephany", type: "freelancer", department: "accounting" });
db.getCollection('freelancers').insert({ name: "Martin", type: "freelancer", department: "sales" });
db.getCollection('freelancers').insert({ name: "Doug", type: "freelancer", department: "warehouse" });
db.getCollection('freelancers').insert({ name: "Brenda", type: "freelancer", department: "sales" });
// Here we do a union of the employees and freelancers using a single aggregation query.
db.getCollection('freelancers').aggregate( // 1. Use any collection containing at least one document.
[
{ $limit: 1 }, // 2. Keep only one document of the collection.
{ $project: { _id: '$$REMOVE' } }, // 3. Remove everything from the document.
// 4. Lookup collections to union together.
{ $lookup: { from: 'employees', pipeline: [{ $match: { department: 'sales' } }], as: 'employees' } },
{ $lookup: { from: 'freelancers', pipeline: [{ $match: { department: 'sales' } }], as: 'freelancers' } },
// 5. Union the collections together with a projection.
{ $project: { union: { $concatArrays: ["$employees", "$freelancers"] } } },
// 6. Unwind and replace root so you end up with a result set.
{ $unwind: '$union' },
{ $replaceRoot: { newRoot: '$union' } }
]);
下面是它的工作原理:
Instantiate an aggregate out of any collection of your database that has at least one document in it. If you can't guarantee any collection of your database will not be empty, you can workaround this issue by creating in your database some sort of 'dummy' collection containing a single empty document in it that will be there specifically for doing union queries. Make the first stage of your pipeline to be { $limit: 1 }. This will strip all the documents of the collection except the first one. Strip all the fields of the remaining document by using a $project stage: { $project: { _id: '$$REMOVE' } } Your aggregate now contains a single, empty document. It's time to add lookups for each collection you want to union together. You may use the pipeline field to do some specific filtering, or leave localField and foreignField as null to match the whole collection. { $lookup: { from: 'collectionToUnion1', pipeline: [...], as: 'Collection1' } }, { $lookup: { from: 'collectionToUnion2', pipeline: [...], as: 'Collection2' } }, { $lookup: { from: 'collectionToUnion3', pipeline: [...], as: 'Collection3' } } You now have an aggregate containing a single document that contains 3 arrays like this: { Collection1: [...], Collection2: [...], Collection3: [...] } You can then merge them together into a single array using a $project stage along with the $concatArrays aggregation operator: { "$project" : { "Union" : { $concatArrays: ["$Collection1", "$Collection2", "$Collection3"] } } } You now have an aggregate containing a single document, into which is located an array that contains your union of collections. What remains to be done is to add an $unwind and a $replaceRoot stage to split your array into separate documents: { $unwind: "$Union" }, { $replaceRoot: { newRoot: "$Union" } } Voilà. You now have a result set containing the collections you wanted to union together. You can then add more stages to filter it further, sort it, apply skip() and limit(). Pretty much anything you want.
从Mongo 4.4开始,我们可以通过将新的$unionWith聚合阶段与$group的新的$accumulator操作符耦合在聚合管道中实现这种连接:
// > db.users.find()
// [{ user: 1, name: "x" }, { user: 2, name: "y" }]
// > db.books.find()
// [{ user: 1, book: "a" }, { user: 1, book: "b" }, { user: 2, book: "c" }]
// > db.movies.find()
// [{ user: 1, movie: "g" }, { user: 2, movie: "h" }, { user: 2, movie: "i" }]
db.users.aggregate([
{ $unionWith: "books" },
{ $unionWith: "movies" },
{ $group: {
_id: "$user",
user: {
$accumulator: {
accumulateArgs: ["$name", "$book", "$movie"],
init: function() { return { books: [], movies: [] } },
accumulate: function(user, name, book, movie) {
if (name) user.name = name;
if (book) user.books.push(book);
if (movie) user.movies.push(movie);
return user;
},
merge: function(userV1, userV2) {
if (userV2.name) userV1.name = userV2.name;
userV1.books.concat(userV2.books);
userV1.movies.concat(userV2.movies);
return userV1;
},
lang: "js"
}
}
}}
])
// { _id: 1, user: { books: ["a", "b"], movies: ["g"], name: "x" } }
// { _id: 2, user: { books: ["c"], movies: ["h", "i"], name: "y" } }
$unionWith combines records from the given collection within documents already in the aggregation pipeline. After the 2 union stages, we thus have all users, books and movies records within the pipeline. We then $group records by $user and accumulate items using the $accumulator operator allowing custom accumulations of documents as they get grouped: the fields we're interested in accumulating are defined with accumulateArgs. init defines the state that will be accumulated as we group elements. the accumulate function allows performing a custom action with a record being grouped in order to build the accumulated state. For instance, if the item being grouped has the book field defined, then we update the books part of the state. merge is used to merge two internal states. It's only used for aggregations running on sharded clusters or when the operation exceeds memory limits.
MongoDB 3.2现在允许通过$lookup聚合阶段将多个集合中的数据合并为一个集合。作为一个实际的例子,假设您有关于书籍的数据,分为两个不同的集合。
第一种藏书,称为书籍,有以下资料:
{
"isbn": "978-3-16-148410-0",
"title": "Some cool book",
"author": "John Doe"
}
{
"isbn": "978-3-16-148999-9",
"title": "Another awesome book",
"author": "Jane Roe"
}
第二个集合名为books_selling_data,包含以下数据:
{
"_id": ObjectId("56e31bcf76cdf52e541d9d26"),
"isbn": "978-3-16-148410-0",
"copies_sold": 12500
}
{
"_id": ObjectId("56e31ce076cdf52e541d9d28"),
"isbn": "978-3-16-148999-9",
"copies_sold": 720050
}
{
"_id": ObjectId("56e31ce076cdf52e541d9d29"),
"isbn": "978-3-16-148999-9",
"copies_sold": 1000
}
要合并这两个集合,只需按以下方式使用$lookup:
db.books.aggregate([{
$lookup: {
from: "books_selling_data",
localField: "isbn",
foreignField: "isbn",
as: "copies_sold"
}
}])
在此聚合之后,图书集合将看起来如下所示:
{
"isbn": "978-3-16-148410-0",
"title": "Some cool book",
"author": "John Doe",
"copies_sold": [
{
"_id": ObjectId("56e31bcf76cdf52e541d9d26"),
"isbn": "978-3-16-148410-0",
"copies_sold": 12500
}
]
}
{
"isbn": "978-3-16-148999-9",
"title": "Another awesome book",
"author": "Jane Roe",
"copies_sold": [
{
"_id": ObjectId("56e31ce076cdf52e541d9d28"),
"isbn": "978-3-16-148999-9",
"copies_sold": 720050
},
{
"_id": ObjectId("56e31ce076cdf52e541d9d28"),
"isbn": "978-3-16-148999-9",
"copies_sold": 1000
}
]
}
有几件事值得注意:
“from”集合(在本例中为books_selling_data)不能被分片。 “as”字段将是一个数组,如上面的例子所示。 $lookup stage上的"localField"和"foreignField"选项如果在各自的集合中不存在,出于匹配目的将被视为空($lookup文档中有一个完美的例子)。
因此,作为结论,如果您想合并两个集合,在这种情况下,有一个平坦的copies_sold字段,其中包含售出的总副本,您将不得不多做一些工作,可能会使用一个中间集合,然后$out到最终集合。
你必须在你的应用层做这些。如果您正在使用ORM,它可以使用注释(或类似的东西)来提取存在于其他集合中的引用。我只使用过Morphia, @Reference注释在查询时获取被引用的实体,因此我能够避免自己在代码中这样做。