关于使用fs.readdir进行异步目录搜索有什么想法吗?我意识到我们可以引入递归,并调用read目录函数来读取下一个目录,但我有点担心它不是异步的…
什么好主意吗?我已经看了node-walk,它很棒,但它不能像readdir那样只给我数组中的文件。虽然
寻找这样的输出…
['file1.txt', 'file2.txt', 'dir/file3.txt']
关于使用fs.readdir进行异步目录搜索有什么想法吗?我意识到我们可以引入递归,并调用read目录函数来读取下一个目录,但我有点担心它不是异步的…
什么好主意吗?我已经看了node-walk,它很棒,但它不能像readdir那样只给我数组中的文件。虽然
寻找这样的输出…
['file1.txt', 'file2.txt', 'dir/file3.txt']
当前回答
这是另一个实现。上述解决方案都没有任何限制,因此如果您的目录结构很大,它们都会崩溃并最终耗尽资源。
var async = require('async');
var fs = require('fs');
var resolve = require('path').resolve;
var scan = function(path, concurrency, callback) {
var list = [];
var walker = async.queue(function(path, callback) {
fs.stat(path, function(err, stats) {
if (err) {
return callback(err);
} else {
if (stats.isDirectory()) {
fs.readdir(path, function(err, files) {
if (err) {
callback(err);
} else {
for (var i = 0; i < files.length; i++) {
walker.push(resolve(path, files[i]));
}
callback();
}
});
} else {
list.push(path);
callback();
}
}
});
}, concurrency);
walker.push(path);
walker.drain = function() {
callback(list);
}
};
使用50的并发工作得非常好,并且几乎和小型目录结构的简单实现一样快。
其他回答
还有一种方法。我把它放在这里。也许将来它会对某人有用。
const fs = require("fs");
const { promisify } = require("util");
const p = require("path");
const readdir = promisify(fs.readdir);
async function getFiles(path) {
try {
const entries = await readdir(path, { withFileTypes: true });
const files = entries
.filter((file) => !file.isDirectory())
.map((file) => ({
path: `${path}/${file.name}`,
ext: p.extname(`${path}/${file.name}`),
pathDir: path,
}));
const folders = entries.filter((folder) => folder.isDirectory());
for (const folder of folders) {
files.push(...(await getFiles(`${path}/${folder.name}`)));
}
return files;
} catch (error) {
return error;
}
}
用法:
getFiles(rootFolderPath)
.then()
.catch()
另一个简单而有用的方法
function walkDir(root) {
const stat = fs.statSync(root);
if (stat.isDirectory()) {
const dirs = fs.readdirSync(root).filter(item => !item.startsWith('.'));
let results = dirs.map(sub => walkDir(`${root}/${sub}`));
return [].concat(...results);
} else {
return root;
}
}
递归-readdir模块具有此功能。
There are basically two ways of accomplishing this. In an async environment you'll notice that there are two kinds of loops: serial and parallel. A serial loop waits for one iteration to complete before it moves onto the next iteration - this guarantees that every iteration of the loop completes in order. In a parallel loop, all the iterations are started at the same time, and one may complete before another, however, it is much faster than a serial loop. So in this case, it's probably better to use a parallel loop because it doesn't matter what order the walk completes in, just as long as it completes and returns the results (unless you want them in order).
一个平行循环看起来是这样的:
var fs = require('fs');
var path = require('path');
var walk = function(dir, done) {
var results = [];
fs.readdir(dir, function(err, list) {
if (err) return done(err);
var pending = list.length;
if (!pending) return done(null, results);
list.forEach(function(file) {
file = path.resolve(dir, file);
fs.stat(file, function(err, stat) {
if (stat && stat.isDirectory()) {
walk(file, function(err, res) {
results = results.concat(res);
if (!--pending) done(null, results);
});
} else {
results.push(file);
if (!--pending) done(null, results);
}
});
});
});
};
一个串行循环看起来像这样:
var fs = require('fs');
var path = require('path');
var walk = function(dir, done) {
var results = [];
fs.readdir(dir, function(err, list) {
if (err) return done(err);
var i = 0;
(function next() {
var file = list[i++];
if (!file) return done(null, results);
file = path.resolve(dir, file);
fs.stat(file, function(err, stat) {
if (stat && stat.isDirectory()) {
walk(file, function(err, res) {
results = results.concat(res);
next();
});
} else {
results.push(file);
next();
}
});
})();
});
};
并且在你的主目录中测试它(警告:如果你的主目录中有很多东西,结果列表将会非常大):
walk(process.env.HOME, function(err, results) {
if (err) throw err;
console.log(results);
});
编辑:改进的示例。
有一个名为cup-readdir的新模块,可以快速递归地搜索目录。它使用异步承诺,在处理深层目录结构时性能优于许多流行的模块。
它可以返回数组中的所有文件,并根据它们的属性对它们进行排序,但缺乏文件过滤和进入符号链接目录等功能。这对于只想从目录中获取每个文件的大型项目非常有用。这里是他们项目主页的链接。