How do I get a random record from MongoDB?

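I want to get a random record from a huge collection (100 million records).

What is the fastest and most efficient way to do so?

The data is already there, and there is no field in which I can generate a random number and use it to fetch a random row.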

Current answer

Without seeing the data this is tricky. What are the _id fields? Are they MongoDB ObjectIds? If so, you could get the highest and lowest values:

lowest = db.coll.find().sort({_id:1}).limit(1).next()._id;
highest = db.coll.find().sort({_id:-1}).limit(1).next()._id;

Then, if you assume the ids are uniformly distributed (they aren't, but at least it's a start):

unsigned long long L = first_8_bytes_of(lowest)
unsigned long long H = first_8_bytes_of(highest)

V = (H - L) * random_from_0_to_1();
N = L + V;
oid = N concat random_4_bytes();

randomobj = db.coll.find({_id:{$gte:oid}}).limit(1);
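
The pseudocode above can be sketched in Python on the hex form of the ids. This is only an illustration under assumptions: both ids are passed in as 24-character hex strings, and the helper name random_oid_hex is hypothetical. The skew from non-uniform ids noted above still applies.

```python
import random
import secrets

def random_oid_hex(lowest_hex, highest_hex):
    """Return a 24-char hex id whose first 8 bytes lie between the two inputs."""
    # The first 8 bytes of an id are its first 16 hex characters
    low = int(lowest_hex[:16], 16)
    high = int(highest_hex[:16], 16)
    # Integer arithmetic avoids the precision loss of multiplying by a float
    n = low + random.randrange(high - low + 1)
    # Append 4 random bytes (8 hex characters) to complete the 12-byte id
    return format(n, "016x") + secrets.token_hex(4)
```

With pymongo installed, the returned string could be wrapped in bson.ObjectId(...) and used in the $gte query shown above.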

2010-05-13 13:48:41

Other answers

Using pymongo in Python:

import random

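def get_random_doc(collection):
    # Collection.count() was removed in PyMongo 4; count_documents({}) replaces it
    count = collection.count_documents({})
    # Indexing the cursor skips to a random position and returns one document
    return collection.find()[random.randrange(count)]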

2015-01-24 14:38:26

Here is a way using the default ObjectId values for _id and a little math and logic.

// Get the "min" and "max" timestamp values from the _id in the collection and the 
// diff between.
// 4-bytes from a hex string is 8 characters

var min = parseInt(db.collection.find()
        .sort({ "_id": 1 }).limit(1).toArray()[0]._id.str.substr(0,8),16)*1000,
    max = parseInt(db.collection.find()
        .sort({ "_id": -1 }).limit(1).toArray()[0]._id.str.substr(0,8),16)*1000,
    diff = max - min;

// Get a random value from diff and divide/multiply by 1000 for the "_id" precision:
var random = Math.floor(Math.floor(Math.random()*diff)/1000)*1000;

// Use "random" in the range and pad the hex string to a valid ObjectId
var _id = new ObjectId(((min + random)/1000).toString(16) + "0000000000000000")

// Then query for the single document:
var randomDoc = db.collection.find({ "_id": { "$gte": _id } })
   .sort({ "_id": 1 }).limit(1).toArray()[0];

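That is the general logic in shell notation, and it is easy to adapt.

So, in points:

- Find the minimum and maximum primary key values in the collection.
- Generate a random number that falls between the timestamps of those documents.
- Add the random number to the minimum value and find the first document that is greater than or equal to that value.

This "pads" the "hex" timestamp value to form a valid ObjectId, since that is what we are looking for. Using integers as the _id value is inherently simpler, but the basic idea in the points is the same.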
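
The timestamp step can also be worked through outside the shell. A minimal Python sketch, assuming the smallest and largest ids are available as 24-character hex strings; the function name is hypothetical, and it works in whole seconds directly instead of multiplying by 1000 as the shell version does.

```python
import random

def random_oid_hex_from_timestamps(min_id_hex, max_id_hex):
    """Build a zero-padded 24-char hex id with a random timestamp in range."""
    # The first 4 bytes (8 hex characters) of an ObjectId are seconds since the epoch
    min_ts = int(min_id_hex[:8], 16)
    max_ts = int(max_id_hex[:8], 16)
    # randint is inclusive at both ends
    ts = random.randint(min_ts, max_ts)
    # Zero-fill the remaining 8 bytes so a $gte query matches the first
    # document created at or after that second
    return format(ts, "08x") + "0" * 16
```

The result has the same shape as the padded string built in the shell code, so it could be used the same way in a $gte query on _id.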

2015-06-26 11:06:04

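To get a set number of random documents without duplicates:

- first get all ids
- get the number of documents
- loop, drawing a random index and skipping duplicates

var number_of_docs = 7;
// Older Node.js drivers take the projection as the second argument;
// newer ones expect { projection: { _id: 1 } } instead.
db.collection('preguntas').find({}, { _id: 1 }).toArray(function(err, arr) {
    if (err) { console.log(err); return; }
    var count = arr.length;
    var idsram = [];  // _id values of the chosen documents
    var rans = [];    // indexes already drawn
    // Cap the target so the loop ends when fewer documents exist
    number_of_docs = Math.min(number_of_docs, count);
    while (number_of_docs != 0) {
        var R = Math.floor(Math.random() * count);
        if (rans.indexOf(R) > -1) {
            continue;
        } else {
            rans.push(R);
            idsram.push(arr[R]._id);
            number_of_docs--;
        }
    }
    // Fetch only the documents whose _id was drawn
    db.collection('preguntas').find({ _id: { $in: idsram } }).toArray(function(err1, doc1) {
        if (err1) { console.log(err1); return; }
        res.send(doc1);
    });
});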
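
The draw-without-repeats loop is shorter in Python, where random.sample picks distinct positions in one call. A minimal sketch, assuming the ids have already been loaded into a list; pick_random_ids is a hypothetical helper name.

```python
import random

def pick_random_ids(ids, number_of_docs):
    """Return up to number_of_docs distinct ids chosen at random."""
    # Never ask for more ids than exist
    k = min(number_of_docs, len(ids))
    # random.sample never repeats a position, so no duplicate check is needed
    return random.sample(ids, k)
```

The chosen ids could then go into a filter such as {"_id": {"$in": chosen}} to fetch the matching documents.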

2015-12-19 20:13:54

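An efficient and reliable way is this:

Add a field called "random" to each document and assign it a random value, add an index on that field, and proceed as follows:

Let's assume we have a collection of web links called "links" and we want a random link from it: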

link = db.links.find().sort({random: 1}).limit(1)[0]

To make sure the same link doesn't pop up a second time, update its random field with a new random number:

db.links.update({_id: link._id}, {$set: {random: Math.random()}})
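
The select-then-reshuffle steps could look like this with pymongo. This is a sketch under assumptions: `links` is a pymongo Collection whose documents carry an indexed "random" field, and pop_random_link is a hypothetical helper name.

```python
import random

def pop_random_link(links):
    """Return the link with the smallest "random" value, then re-randomize it."""
    # Assumes `links` is a pymongo Collection with an index on "random"
    link = links.find_one(sort=[("random", 1)])
    if link is not None:
        # Give the returned document a fresh value so it moves in the sort order
        links.update_one(
            {"_id": link["_id"]},
            {"$set": {"random": random.random()}},
        )
    return link
```

Matching on _id and using $set changes only the random field and leaves the rest of the document untouched.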

2011-03-25 13:56:27

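In my case, I wanted to get the same records in a random order, so I created an empty array to hold the sort, then generated a random number between 1 and 7 (I have seven fields). Each time I get a different value, I assign a different sort. It's a layman's approach, but it worked for me.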

//generate a random number between 1 and 7 (one value per sortable field)
const randomval = Math.floor(Math.random() * 7) + 1;

//declare sort array and initialize to empty
const sort = [];

//use a conditional if/else to decide which sort to use
if(randomval == 1)
{
   sort.push(...['createdAt',1]);
}
else if(randomval == 2)
{
   sort.push(...['_id',1]);
}

....
else if(randomval == n)
{
   sort.push(...['n',1]);
}
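
The if/else chain amounts to choosing one sort key at random, which can be written more compactly. A minimal Python sketch; random_sort_spec is a hypothetical name, and apart from 'createdAt' and '_id' any field names passed in are assumptions.

```python
import random

def random_sort_spec(fields):
    """Pick one field at random and return an ascending sort specification."""
    # random.choice replaces the if/else chain: one entry per sortable field
    return [(random.choice(fields), 1)]
```

The list-of-pairs form is what pymongo's cursor sort() accepts. As in the answer, this only varies the ordering among a fixed set of sort keys; it is not a uniform shuffle of the records.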

2021-11-06 09:15:57
