我想获得MongoDB集合中所有键的名称。

例如,从这个:

db.things.insert( { type : ['dog', 'cat'] } );
db.things.insert( { egg : ['cat'] } );
db.things.insert( { type : [] } );
db.things.insert( { hello : []  } );

我想获得唯一的键:

type, egg, hello

当前回答

以Kristina的回答为灵感,我创建了一个名为Variety的开源工具,它的功能就是:https://github.com/variety/variety

其他回答

要获得所有键减去_id的列表,可以考虑运行以下聚合管道:

var keys = db.collection.aggregate([
    { "$project": {
       "hashmaps": { "$objectToArray": "$$ROOT" } 
    } }, 
    { "$group": {
        "_id": null,
        "fields": { "$addToSet": "$hashmaps.k" }
    } },
    { "$project": {
            "keys": {
                "$setDifference": [
                    {
                        "$reduce": {
                            "input": "$fields",
                            "initialValue": [],
                            "in": { "$setUnion" : ["$$value", "$$this"] }
                        }
                    },
                    ["_id"]
                ]
            }
        }
    }
]).toArray()[0]["keys"];

这对我来说很有效:

var arrayOfFieldNames = [];

var items = db.NAMECOLLECTION.find();

while(items.hasNext()) {
  var item = items.next();
  for(var index in item) {
    arrayOfFieldNames[index] = index;
   }
}

for (var index in arrayOfFieldNames) {
  print(index);
}

我知道我来晚了,但如果你想在python中快速找到所有键(甚至嵌套的键),你可以用递归函数来做:

def get_keys(dl, keys=None):
    keys = keys or []
    if isinstance(dl, dict):
        keys += dl.keys()
        list(map(lambda x: get_keys(x, keys), dl.values()))
    elif isinstance(dl, list):
        list(map(lambda x: get_keys(x, keys), dl))
    return list(set(keys))

像这样使用它:

dl = db.things.find_one({})
get_keys(dl)

如果你的文件没有相同的密钥,你可以这样做:

dl = db.things.find({})
list(set(list(map(get_keys, dl))[0]))

但是这个解决方案肯定是可以优化的。

一般来说,这个解决方案基本上是解决在嵌套字典中查找键,所以这不是mongodb特定的。

使用python。返回集合中所有顶级键的集合:

#Using pymongo and connection named 'db'

reduce(
    lambda all_keys, rec_keys: all_keys | set(rec_keys), 
    map(lambda d: d.keys(), db.things.find()), 
    set()
)

基于@Wolkenarchitekt的回答:https://stackoverflow.com/a/48117846/8808983,我写了一个脚本,可以在db中找到所有键的模式,我认为它可以帮助其他人阅读这个线程:

"""
Python 3
This script get list of patterns and print the collections that contains fields with this patterns.
"""

import argparse

import pymongo
from bson import Code


# initialize mongo connection:
def get_db():
    client = pymongo.MongoClient("172.17.0.2")
    db = client["Data"]
    return db


def get_commandline_options():
    description = "To run use: python db_fields_pattern_finder.py -p <list_of_patterns>"
    parser = argparse.ArgumentParser(description=description)
    parser.add_argument('-p', '--patterns', nargs="+", help='List of patterns to look for in the db.', required=True)
    return parser.parse_args()


def report_matching_fields(relevant_fields_by_collection):
    print("Matches:")

    for collection_name in relevant_fields_by_collection:
        if relevant_fields_by_collection[collection_name]:
            print(f"{collection_name}: {relevant_fields_by_collection[collection_name]}")

    # pprint(relevant_fields_by_collection)


def get_collections_names(db):
    """
    :param pymongo.database.Database db:
    :return list: collections names
    """
    return db.list_collection_names()


def get_keys(db, collection):
    """
    See: https://stackoverflow.com/a/48117846/8808983
    :param db:
    :param collection:
    :return:
    """
    map = Code("function() { for (var key in this) { emit(key, null); } }")
    reduce = Code("function(key, stuff) { return null; }")
    result = db[collection].map_reduce(map, reduce, "myresults")
    return result.distinct('_id')


def get_fields(db, collection_names):
    fields_by_collections = {}
    for collection_name in collection_names:
        fields_by_collections[collection_name] = get_keys(db, collection_name)
    return fields_by_collections


def get_matches_fields(fields_by_collections, patterns):
    relevant_fields_by_collection = {}
    for collection_name in fields_by_collections:
        relevant_fields = [field for field in fields_by_collections[collection_name] if
                           [pattern for pattern in patterns if
                            pattern in field]]
        relevant_fields_by_collection[collection_name] = relevant_fields

    return relevant_fields_by_collection


def main(patterns):
    """
    :param list patterns: List of strings to look for in the db.
    """
    db = get_db()

    collection_names = get_collections_names(db)
    fields_by_collections = get_fields(db, collection_names)
    relevant_fields_by_collection = get_matches_fields(fields_by_collections, patterns)

    report_matching_fields(relevant_fields_by_collection)


if __name__ == '__main__':
    args = get_commandline_options()
    main(args.patterns)