关于使用fs.readdir进行异步目录搜索有什么想法吗?我意识到我们可以引入递归,并调用read目录函数来读取下一个目录,但我有点担心它不是异步的…

什么好主意吗?我已经看了node-walk,它很棒,但它不能像readdir那样只给我数组中的文件。虽然

寻找这样的输出…

['file1.txt', 'file2.txt', 'dir/file3.txt']

当前回答

用递归

var fs = require('fs')
var path = process.cwd()
var files = []

var getFiles = function(path, files){
    fs.readdirSync(path).forEach(function(file){
        var subpath = path + '/' + file;
        if(fs.lstatSync(subpath).isDirectory()){
            getFiles(subpath, files);
        } else {
            files.push(path + '/' + file);
        }
    });     
}

调用

getFiles(path, files)
console.log(files) // will log all files in directory

其他回答

因为每个人都应该写自己的,所以我写了一个。

步行(dir, cb, endCb) cb(文件) 零endCb (err |)

module.exports = walk;

function walk(dir, cb, endCb) {
  var fs = require('fs');
  var path = require('path');

  fs.readdir(dir, function(err, files) {
    if (err) {
      return endCb(err);
    }

    var pending = files.length;
    if (pending === 0) {
      endCb(null);
    }
    files.forEach(function(file) {
      fs.stat(path.join(dir, file), function(err, stats) {
        if (err) {
          return endCb(err)
        }

        if (stats.isDirectory()) {
          walk(path.join(dir, file), cb, function() {
            pending--;
            if (pending === 0) {
              endCb(null);
            }
          });
        } else {
          cb(path.join(dir, file));
          pending--;
          if (pending === 0) {
            endCb(null);
          }
        }
      })
    });

  });
}

只是简单的散步

let pending = [baseFolderPath]
function walk () {
    pending.shift();
    // do stuffs width pending[0] and change pending items
    if (pending[0]) walk(pending[0])
}
walk(pending[0])

There are basically two ways of accomplishing this. In an async environment you'll notice that there are two kinds of loops: serial and parallel. A serial loop waits for one iteration to complete before it moves onto the next iteration - this guarantees that every iteration of the loop completes in order. In a parallel loop, all the iterations are started at the same time, and one may complete before another, however, it is much faster than a serial loop. So in this case, it's probably better to use a parallel loop because it doesn't matter what order the walk completes in, just as long as it completes and returns the results (unless you want them in order).

一个平行循环看起来是这样的:

var fs = require('fs');
var path = require('path');
var walk = function(dir, done) {
  var results = [];
  fs.readdir(dir, function(err, list) {
    if (err) return done(err);
    var pending = list.length;
    if (!pending) return done(null, results);
    list.forEach(function(file) {
      file = path.resolve(dir, file);
      fs.stat(file, function(err, stat) {
        if (stat && stat.isDirectory()) {
          walk(file, function(err, res) {
            results = results.concat(res);
            if (!--pending) done(null, results);
          });
        } else {
          results.push(file);
          if (!--pending) done(null, results);
        }
      });
    });
  });
};

一个串行循环看起来像这样:

var fs = require('fs');
var path = require('path');
var walk = function(dir, done) {
  var results = [];
  fs.readdir(dir, function(err, list) {
    if (err) return done(err);
    var i = 0;
    (function next() {
      var file = list[i++];
      if (!file) return done(null, results);
      file = path.resolve(dir, file);
      fs.stat(file, function(err, stat) {
        if (stat && stat.isDirectory()) {
          walk(file, function(err, res) {
            results = results.concat(res);
            next();
          });
        } else {
          results.push(file);
          next();
        }
      });
    })();
  });
};

并且在你的主目录中测试它(警告:如果你的主目录中有很多东西,结果列表将会非常大):

walk(process.env.HOME, function(err, results) {
  if (err) throw err;
  console.log(results);
});

编辑:改进的示例。

另一个很好的npm包是glob。

npm公司

它非常强大,应该能满足你所有的递归需求。

编辑:

实际上我对glob不是很满意,所以我创建了readdirp。

我非常有信心,它的API使得递归地查找文件和目录以及应用特定的过滤器非常容易。

阅读它的文档,以更好地了解它的功能和安装方式:

NPM安装readdirp

这是另一个实现。上述解决方案都没有任何限制,因此如果您的目录结构很大,它们都会崩溃并最终耗尽资源。

var async = require('async');
var fs = require('fs');
var resolve = require('path').resolve;

var scan = function(path, concurrency, callback) {
    var list = [];

    var walker = async.queue(function(path, callback) {
        fs.stat(path, function(err, stats) {
            if (err) {
                return callback(err);
            } else {
                if (stats.isDirectory()) {
                    fs.readdir(path, function(err, files) {
                        if (err) {
                            callback(err);
                        } else {
                            for (var i = 0; i < files.length; i++) {
                                walker.push(resolve(path, files[i]));
                            }
                            callback();
                        }
                    });
                } else {
                    list.push(path);
                    callback();
                }
            }
        });
    }, concurrency);

    walker.push(path);

    walker.drain = function() {
        callback(list);
    }
};

使用50的并发工作得非常好,并且几乎和小型目录结构的简单实现一样快。