How to download a file with Node.js (without using third-party libraries)?

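How do I download a file with Node.js without using third-party libraries?

I don't need anything special. I only want to download a file from a given URL, and then save it to a given directory.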

Current answer

A solution with a timeout, to prevent a memory leak:

The following code is based on Brandon Tilley's answer:

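var http = require('http'),
    fs = require('fs');

var request = http.get("http://example12345.com/yourfile.html", function(response) {
    // Only create the file when the server answers with 200 OK.
    if (response.statusCode === 200) {
        var file = fs.createWriteStream("copy.html");
        response.pipe(file);
    } else {
        response.resume(); // discard the body so the socket is freed
    }
});

// Set the timeout on the request itself, so it also covers a server that never responds.
request.setTimeout(12000, function () {
    request.destroy(); // request.abort() is deprecated in current Node.js
});

// A failed request (including one destroyed before any response) emits 'error'.
request.on('error', function (err) {
    console.error(err.message);
});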

Don't create the file when you get an error, and prefer using a timeout to close your request after X seconds.
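
The advice above can be sketched as a reusable function. This is a minimal sketch rather than the original answer's code: the name `downloadWithTimeout`, its parameters, and the choice to delete a partial file on failure are assumptions made for illustration.

```javascript
const http = require('http');
const fs = require('fs');

// Hypothetical helper: download `url` to `dest`, giving up after `timeoutMs`
// of socket inactivity. `cb` is called exactly once, with an Error or null.
function downloadWithTimeout(url, dest, timeoutMs, cb) {
  let file = null;
  let done = false;

  function finish(err) {
    if (done) return; // guard against a second call
    done = true;
    if (err && file) {
      // A partial file may exist: drop it before reporting the error.
      file.destroy();
      fs.unlink(dest, () => cb(err));
    } else {
      cb(err || null);
    }
  }

  const request = http.get(url, (response) => {
    if (response.statusCode !== 200) {
      response.resume(); // discard the body so the socket is released
      return finish(new Error(`Unexpected status ${response.statusCode}`));
    }
    // The file is only created once the response is known to be usable.
    file = fs.createWriteStream(dest);
    file.on('error', finish);
    response.on('error', finish);
    response.pipe(file);
    file.on('finish', () => file.close(finish));
  });

  request.on('error', finish);
  request.setTimeout(timeoutMs, () => {
    request.destroy(new Error(`Timed out after ${timeoutMs} ms`));
  });
}
```

A call might look like `downloadWithTimeout('http://example12345.com/yourfile.html', 'copy.html', 12000, (err) => { if (err) console.error(err.message); })`.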

2014-10-07 09:49:22

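Other answers

Using the http2 module

I saw answers using the http, https, and request modules. I'd like to add one using yet another native NodeJS module that supports either the http or https protocol:

Solution

I've referenced the official NodeJS API, as well as some of the other answers to this question. The following is the test I wrote to try it out, which worked as intended: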

import * as fs from 'fs';
import * as _path from 'path';
import * as http2 from 'http2';

/* ... */

async function download( host, query, destination )
{
    return new Promise
    (
        ( resolve, reject ) =>
        {
            // Connect to the server:
            const client = http2.connect( host );
            client.on( 'error', error => reject( error ) );

            // Prepare a write stream:
            const fullPath = _path.join( fs.realpathSync( '.' ), destination );
            const file = fs.createWriteStream( fullPath, { flags: "wx" } );
            file.on( 'error', error => reject( error ) );

            // Create a request:
            const request = client.request( { [':path']: query } );

            // On initial response handle non-success (!== 200) status error:
            request.on
            (
                'response',
                ( headers/*, flags*/ ) =>
                {
                    if( headers[':status'] !== 200 )
                    {
                        file.close();
                        fs.unlink( fullPath, () => {} );
                        reject( new Error( `Server responded with ${headers[':status']}` ) );
                    }
                }
            );

            // Leave the encoding unset so chunks arrive as Buffers;
            // request.setEncoding( 'utf8' ) would corrupt binary files.

            // Write the payload to file:
            request.on( 'data', chunk => file.write( chunk ) );

            // Handle ending the request
            request.on
            (
                'end',
                () =>
                {
                    file.close();
                    client.close();
                    resolve( { result: true } );
                }
            );

            /* 
                You can use request.setTimeout( 12000, () => {} ) for aborting
                after period of inactivity
            */

            // Fire off [flush] the request:
            request.end();
        }
    );
}

Then, for example:

/* ... */

let downloaded = await download( 'https://gitlab.com', '/api/v4/...', 'tmp/tmpFile' );

if( downloaded.result )
{
    // Success!
}

// ...

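External references

https://nodejs.org/api/http2.html#http2_client_side_example
https://nodejs.org/api/http2.html#http2_clienthttp2session_request_headers_options

Edit information

The solution was written for TypeScript, with the function as a class method. Without noting this, the solution would not have worked for the presumed JavaScript user without a proper function declaration, which our contributor has so promptly added. Thanks!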

2020-12-06 14:52:49

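I wrote my own solution since the existing ones didn't fit my requirements.

What this covers:

- HTTPS download (switch the package to http for HTTP downloads)
- Promise-based function
- Handling of forwarded paths (status 302)
- Browser header, required on a few CDNs
- Filename taken from the URL (or hardcoded)
- Error handling

It's typed, so it's safer. If you're working with plain JS (no Flow, no TS), feel free to remove the types or move them to a .d.ts file.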

index.js

import httpsDownload from "./httpsDownload";
httpsDownload('https://example.com/file.zip', './');

httpsDownload.[js|ts]

import https from "https";
import fs from "fs";
import path from "path";

function download(
  url: string,
  folder?: string,
  filename?: string
): Promise<void> {
  return new Promise((resolve, reject) => {
    const req = https
      .request(url, { headers: { "User-Agent": "javascript" } }, (response) => {
        if (response.statusCode === 302 && response.headers.location != null) {
          download(
            buildNextUrl(url, response.headers.location),
            folder,
            filename
          )
            .then(resolve)
            .catch(reject);
          return;
        }

        const file = fs.createWriteStream(
          buildDestinationPath(url, folder, filename)
        );
        response.pipe(file);
        file.on("finish", () => {
          file.close();
          resolve();
        });
      })
      .on("error", reject);
    req.end();
  });
}

function buildNextUrl(current: string, next: string) {
  const isNextUrlAbsolute = RegExp("^(?:[a-z]+:)?//").test(next);
  if (isNextUrlAbsolute) {
    return next;
  } else {
    const currentURL = new URL(current);
    const fullHost = `${currentURL.protocol}//${currentURL.hostname}${
      currentURL.port ? ":" + currentURL.port : ""
    }`;
    return `${fullHost}${next}`;
  }
}

function buildDestinationPath(url: string, folder?: string, filename?: string) {
  return path.join(folder ?? "./", filename ?? generateFilenameFromPath(url));
}

function generateFilenameFromPath(url: string): string {
  const urlParts = url.split("/");
  return urlParts[urlParts.length - 1] ?? "";
}

export default download;
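
For resolving the `Location` header, the WHATWG `URL` constructor built into Node.js accepts a base URL as its second argument, which may be a simpler alternative to the regular-expression check in `buildNextUrl`. The sketch below is a plain-JavaScript variant offered under that assumption, not part of the original answer; the names `resolveRedirect` and `filenameFromUrl` and the `fallback` parameter are hypothetical.

```javascript
const path = require('path');

// Hypothetical alternative to buildNextUrl: resolve a Location header
// (absolute, protocol-relative, root-relative, or relative) against the
// URL that produced the redirect.
function resolveRedirect(current, location) {
  return new URL(location, current).href;
}

// Hypothetical alternative to generateFilenameFromPath: use only the path
// component, so query strings and fragments do not end up in the filename.
function filenameFromUrl(url, fallback = 'download') {
  const name = path.posix.basename(new URL(url).pathname);
  return name || fallback;
}

console.log(resolveRedirect('https://example.com/a/file.zip', '/b/file.zip'));
console.log(resolveRedirect('https://example.com/a/file.zip', 'https://cdn.example.net/file.zip'));
console.log(filenameFromUrl('https://example.com/a/file.zip?token=abc'));
```

Unlike string concatenation, this also handles a `Location` value that is relative to the current directory of the URL.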

2020-08-03 17:21:25

Edit, late 2022:

Node v18 and later ship with built-in Fetch API support. Use that.
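
A minimal sketch of what that could look like, assuming Node v18 or later where `fetch` is available as a global. The function name `downloadWithFetch` and its parameters are illustrative. Note that this version buffers the whole response in memory before writing it, which suits small files better than very large ones.

```javascript
const fs = require('fs');

// Hypothetical helper built on the global fetch (assumes Node v18+).
async function downloadWithFetch(url, dest) {
  const res = await fetch(url);
  if (!res.ok) {
    throw new Error(`Unexpected status ${res.status} for ${url}`);
  }
  // arrayBuffer() reads the full body; wrap it in a Buffer for fs.
  const bytes = Buffer.from(await res.arrayBuffer());
  await fs.promises.writeFile(dest, bytes);
  return bytes.length; // number of bytes written
}

// Example call (the URL is a placeholder):
// downloadWithFetch('https://example.com/file.zip', './file.zip')
//   .then((n) => console.log(`wrote ${n} bytes`))
//   .catch((e) => console.error(e));
```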

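Original answer:

For Node versions with Promise support, a simple Node shim for (part of) the Fetch API requires only a little extra code compared to the other answers: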

const fs = require(`fs`);
const http = require(`http`);
const https = require(`https`);

module.exports = function fetch(url) {
  return new Promise((resolve, reject) => {
    const data = [];
    const client = url.startsWith("https") ? https : http;
    client
      .request(url, (res) => {
        res.on(`data`, (chunk) => data.push(chunk));
        res.on(`end`, () => {
          const asBytes = Buffer.concat(data);
          const asString = asBytes.toString(`utf8`);
          resolve({
            arrayBuffer: async () => asBytes,
            json: async () => JSON.parse(asString),
            text: async () => asString,
          });
        });
        res.on(`error`, (e) => reject(e));
      })
      .on(`error`, (e) => reject(e)) // connection-level failures (DNS, refused, reset)
      .end();
  });
};

You can then use it for whatever you need, with the normal fetch syntax:

const fs = require(`fs`);
const fetch = require(`./tiny-fetch.js`);

fetch(`https://placekitten.com/200/300`)
  .then(res => res.arrayBuffer())
  .then(bytes => fs.writeFileSync(`kitten.jpg`, bytes))
  .catch(e => console.error(e));

fetch(`https://jsonplaceholder.typicode.com/todos/1`)
  .then(res => res.json())
  .then(obj => console.log(obj))
  .catch(e => console.error(e));

// etc.

2022-09-04 19:50:12

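gfxmonk's answer has a very tight data race between the callback and the completion of file.close(). file.close() actually takes a callback that is called once the close has completed. Otherwise, using the file immediately afterwards may fail (very rarely!).

A complete solution is: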

var http = require('http');
var fs = require('fs');

var download = function(url, dest, cb) {
  var file = fs.createWriteStream(dest);
  var request = http.get(url, function(response) {
    response.pipe(file);
    file.on('finish', function() {
      file.close(cb);  // close() is async, call cb after close completes.
    });
  });
}

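Without waiting for the finish event, naive scripts may end up with an incomplete file. Without scheduling the cb callback via close, you may get a race between accessing the file and the file actually being ready.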
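
A variant that also reports errors could use `stream.pipeline` from the Node.js standard library, which invokes its callback once the pipeline has finished or failed. The following is a sketch under that assumption, not the answer's original code; the name `downloadTo`, the status check, and the decision to remove a partial file on failure are illustrative choices.

```javascript
const http = require('http');
const fs = require('fs');
const { pipeline } = require('stream');

// Hypothetical helper: download `url` to `dest`, calling `cb` exactly once
// with an Error on failure or null on success.
function downloadTo(url, dest, cb) {
  let called = false;
  const done = (err) => {
    if (called) return; // request and pipeline errors may both fire
    called = true;
    cb(err || null);
  };

  http.get(url, (response) => {
    if (response.statusCode !== 200) {
      response.resume(); // discard the body so the socket is released
      return done(new Error(`Unexpected status ${response.statusCode}`));
    }
    pipeline(response, fs.createWriteStream(dest), (err) => {
      if (err) {
        // Remove the partial file, then report the original error.
        return fs.unlink(dest, () => done(err));
      }
      done(null);
    });
  }).on('error', done);
}
```

With this shape there is no separate finish handler to forget, and a failed transfer does not leave a truncated file behind.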

2014-04-01 18:11:58

As Michelle Tilley said, but with the appropriate control flow:

var http = require('http');
var fs = require('fs');

var download = function(url, dest, cb) {
  var file = fs.createWriteStream(dest);
  http.get(url, function(response) {
    response.pipe(file);
    file.on('finish', function() {
      file.close(cb);
    });
  });
}

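Without waiting for the finish event, naive scripts may end up with an incomplete file.

Edit: Thanks to @Augusto Roman for pointing out that cb should be passed to file.close, not called explicitly.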

2013-07-16 12:40:54
