关于使用fs.readdir进行异步目录搜索有什么想法吗?我意识到我们可以引入递归,并调用read目录函数来读取下一个目录,但我有点担心它不是异步的…

什么好主意吗?我已经看了node-walk,它很棒,但它不能像readdir那样只给我数组中的文件。虽然

寻找这样的输出…

['file1.txt', 'file2.txt', 'dir/file3.txt']

当前回答

因为每个人都应该写自己的,所以我写了一个。

步行(dir, cb, endCb) cb(文件) 零endCb (err |)

module.exports = walk;

function walk(dir, cb, endCb) {
  var fs = require('fs');
  var path = require('path');

  fs.readdir(dir, function(err, files) {
    if (err) {
      return endCb(err);
    }

    var pending = files.length;
    if (pending === 0) {
      endCb(null);
    }
    files.forEach(function(file) {
      fs.stat(path.join(dir, file), function(err, stats) {
        if (err) {
          return endCb(err)
        }

        if (stats.isDirectory()) {
          walk(path.join(dir, file), cb, function() {
            pending--;
            if (pending === 0) {
              endCb(null);
            }
          });
        } else {
          cb(path.join(dir, file));
          pending--;
          if (pending === 0) {
            endCb(null);
          }
        }
      })
    });

  });
}

其他回答

因为每个人都应该写自己的,所以我写了一个。

步行(dir, cb, endCb) cb(文件) 零endCb (err |)

module.exports = walk;

function walk(dir, cb, endCb) {
  var fs = require('fs');
  var path = require('path');

  fs.readdir(dir, function(err, files) {
    if (err) {
      return endCb(err);
    }

    var pending = files.length;
    if (pending === 0) {
      endCb(null);
    }
    files.forEach(function(file) {
      fs.stat(path.join(dir, file), function(err, stats) {
        if (err) {
          return endCb(err)
        }

        if (stats.isDirectory()) {
          walk(path.join(dir, file), cb, function() {
            pending--;
            if (pending === 0) {
              endCb(null);
            }
          });
        } else {
          cb(path.join(dir, file));
          pending--;
          if (pending === 0) {
            endCb(null);
          }
        }
      })
    });

  });
}

为了以防有人发现它有用,我还整理了一个同步版本。

var walk = function(dir) {
    var results = [];
    var list = fs.readdirSync(dir);
    list.forEach(function(file) {
        file = dir + '/' + file;
        var stat = fs.statSync(file);
        if (stat && stat.isDirectory()) { 
            /* Recurse into a subdirectory */
            results = results.concat(walk(file));
        } else { 
            /* Is a file */
            results.push(file);
        }
    });
    return results;
}

提示:在筛选时使用更少的资源。这个函数本身的过滤器。例如:替换results.push(文件);下面的代码。根据需要调整:

    file_type = file.split(".").pop();
    file_name = file.split(/(\\|\/)/g).pop();
    if (file_type == "json") results.push(file);

使用bluebird promise.coroutine:

let promise = require('bluebird'),
    PC = promise.coroutine,
    fs = promise.promisifyAll(require('fs'));
let getFiles = PC(function*(dir){
    let files = [];
    let contents = yield fs.readdirAsync(dir);
    for (let i = 0, l = contents.length; i < l; i ++) {
        //to remove dot(hidden) files on MAC
        if (/^\..*/.test(contents[i])) contents.splice(i, 1);
    }
    for (let i = 0, l = contents.length; i < l; i ++) {
        let content = path.resolve(dir, contents[i]);
        let contentStat = yield fs.statAsync(content);
        if (contentStat && contentStat.isDirectory()) {
            let subFiles = yield getFiles(content);
            files = files.concat(subFiles);
        } else {
            files.push(content);
        }
    }
    return files;
});
//how to use
//easy error handling in one place
getFiles(your_dir).then(console.log).catch(err => console.log(err));

There are basically two ways of accomplishing this. In an async environment you'll notice that there are two kinds of loops: serial and parallel. A serial loop waits for one iteration to complete before it moves onto the next iteration - this guarantees that every iteration of the loop completes in order. In a parallel loop, all the iterations are started at the same time, and one may complete before another, however, it is much faster than a serial loop. So in this case, it's probably better to use a parallel loop because it doesn't matter what order the walk completes in, just as long as it completes and returns the results (unless you want them in order).

一个平行循环看起来是这样的:

var fs = require('fs');
var path = require('path');
var walk = function(dir, done) {
  var results = [];
  fs.readdir(dir, function(err, list) {
    if (err) return done(err);
    var pending = list.length;
    if (!pending) return done(null, results);
    list.forEach(function(file) {
      file = path.resolve(dir, file);
      fs.stat(file, function(err, stat) {
        if (stat && stat.isDirectory()) {
          walk(file, function(err, res) {
            results = results.concat(res);
            if (!--pending) done(null, results);
          });
        } else {
          results.push(file);
          if (!--pending) done(null, results);
        }
      });
    });
  });
};

一个串行循环看起来像这样:

var fs = require('fs');
var path = require('path');
var walk = function(dir, done) {
  var results = [];
  fs.readdir(dir, function(err, list) {
    if (err) return done(err);
    var i = 0;
    (function next() {
      var file = list[i++];
      if (!file) return done(null, results);
      file = path.resolve(dir, file);
      fs.stat(file, function(err, stat) {
        if (stat && stat.isDirectory()) {
          walk(file, function(err, res) {
            results = results.concat(res);
            next();
          });
        } else {
          results.push(file);
          next();
        }
      });
    })();
  });
};

并且在你的主目录中测试它(警告:如果你的主目录中有很多东西,结果列表将会非常大):

walk(process.env.HOME, function(err, results) {
  if (err) throw err;
  console.log(results);
});

编辑:改进的示例。

a .看一下文件模块。它有一个叫walk的函数:

文件。步行(开始,回调) 导航文件树,为每个目录调用回调,传入 (null, dirPath, dirs, files)。

这可能是为你准备的!是的,它是异步的。但是,如果需要的话,我认为您必须自己聚合完整的路径。

B.另一种选择,甚至是我的最爱之一:使用unix find来查找。为什么要再做一件已经编程好的事情呢?也许不是你真正需要的,但仍然值得一试:

var execFile = require('child_process').execFile;
execFile('find', [ 'somepath/' ], function(err, stdout, stderr) {
  var file_list = stdout.split('\n');
  /* now you've got a list with full path file names */
});

Find有一个很好的内置缓存机制,使得后续搜索非常快,只要只有少数文件夹被更改。