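Multiple simultaneous downloads using Wget?

I'm using wget to download website content, but wget downloads the files one at a time.

How can I make wget download using 4 simultaneous connections?

Current answer

Try pcurl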

http://sourceforge.net/projects/pcurl/

It uses curl instead of wget and downloads in 10 parallel segments.

2012-07-16 07:44:24

Other answers

I use GNU parallel

cat listoflinks.txt | parallel --bar -j ${MAX_PARALLEL:-$(nproc)} wget -nv {}

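cat pipes the newline-separated list of URLs to parallel. The --bar flag shows a progress bar for the parallel run. The MAX_PARALLEL environment variable sets the maximum number of parallel downloads; use it with care. Here it defaults to the number of CPUs on the machine.

Tip: use --dry-run to see what would happen if the command were executed.

cat listoflinks.txt | parallel --dry-run --bar -j ${MAX_PARALLEL:-$(nproc)} wget -nv {}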
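The `${MAX_PARALLEL:-$(nproc)}` expansion used above can be tried on its own. This is a minimal sketch: `job_count` is a hypothetical helper name, and it assumes `nproc` is available (it ships with GNU coreutils, so it may be missing on some systems).

```shell
# Sketch: how ${MAX_PARALLEL:-$(nproc)} chooses the number of jobs.
# job_count is a hypothetical helper, not part of GNU parallel.
job_count() {
    # Use MAX_PARALLEL when it is set and non-empty,
    # otherwise fall back to the CPU count reported by nproc.
    echo "${MAX_PARALLEL:-$(nproc)}"
}

MAX_PARALLEL=4 job_count   # variable set: prints 4
unset MAX_PARALLEL
job_count                  # variable unset: prints the nproc value
```

Because the fallback is the CPU count rather than anything related to bandwidth, setting MAX_PARALLEL explicitly is probably the safer choice when downloading from a single server.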

2021-06-18 12:12:04

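If you are doing a recursive download and don't yet know all the URLs, wget is perfect.

If you already have a list of every URL you want to download, skip down to the cURL section below.

Multiple simultaneous downloads using Wget recursively (unknown list of URLs)

# Multiple simultaneous downloads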

URL=ftp://ftp.example.com

for i in {1..10}; do
    wget --no-clobber --recursive "${URL}" &
done

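The loop above starts 10 wget processes, each recursively downloading from the same website. A file that one process has already saved is not downloaded again by another.

That is the job of --no-clobber: it stops each of the 10 wget processes from fetching a file that already exists locally (matched by its full relative URL path).

The & forks each wget into the background, which lets you run several simultaneous downloads from the same website with wget.

Using curl from a list of URLs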
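To show how `&` behaves, and how a script can block until every background job has finished, here is a minimal offline sketch. `fetch_stub` and `run_workers` are hypothetical names; `fetch_stub` only stands in for wget, writing a marker file instead of downloading anything.

```shell
# Sketch: start N background workers, then wait for all of them.
# fetch_stub is a hypothetical stand-in for wget so this runs without a network.
fetch_stub() {
    # $1: worker number, $2: output directory
    sleep 0.1
    echo "worker $1 done" > "$2/worker_$1.txt"
}

run_workers() {
    # $1: number of workers, $2: output directory
    i=1
    while [ "$i" -le "$1" ]; do
        fetch_stub "$i" "$2" &    # & sends this worker to the background
        i=$((i + 1))
    done
    wait                          # block until every background job has exited
}

outdir="$(mktemp -d)"
run_workers 4 "$outdir"
ls "$outdir"
```

Without the final `wait`, a script would continue (or exit) while the downloads are still running, so adding it after the wget loop is usually worthwhile.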

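If you already have a list of URLs you want to download, curl -Z is parallel curl, and by default it runs up to 50 downloads at a time.

For curl, however, the list must be in this format: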

url = https://example.com/1.html
-O
url = https://example.com/2.html
-O

So if you already have a list of URLs to download, just convert it to that format and then run cURL:

cat url_list.txt
#https://example.com/1.html
#https://example.com/2.html

# Write the whole loop's output with one redirect, so re-running
# overwrites the file instead of appending duplicate entries
while read -r URL; do
    echo "url = ${URL}"
    echo "-O"
done < url_list.txt > url_list_formatted.txt

Download in parallel from the list of URLs using curl:

curl -Z --parallel-max 100 -K url_list_formatted.txt

For example,

$ curl -Z --parallel-max 100 -K url_list_formatted.txt
DL% UL%  Dled  Uled  Xfers  Live   Qd Total     Current  Left    Speed
100 --   2512     0     2     0     0  0:00:01  0:00:01 --:--:--  1973

$ ls
1.html  2.html  url_list_formatted.txt  url_list.txt

2022-10-09 18:28:25

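You can use xargs

-P is the number of processes. For example, with -P 4 four links are downloaded at the same time; with -P 0, xargs starts as many processes as possible and downloads all the links at once.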

cat links.txt | xargs -P 4 -I{} wget {}
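Before launching real downloads, it can help to preview what xargs would run. The sketch below puts `echo` in front of wget so the commands are only printed; `preview_downloads` is a hypothetical helper and the example.com URLs are placeholders. It uses `-n 1` instead of `-I{}`, which likewise passes one URL to each invocation.

```shell
# Sketch: print the wget commands xargs would start, without downloading.
# preview_downloads is a hypothetical helper; remove "echo" to really download.
preview_downloads() {
    # $1: file with one URL per line, $2: number of parallel processes
    xargs -P "$2" -n 1 echo wget -nv < "$1"
}

list="$(mktemp)"
printf '%s\n' \
    'https://example.com/a.iso' \
    'https://example.com/b.iso' \
    'https://example.com/c.iso' > "$list"

# The order of the printed lines may vary because the processes run in parallel.
preview_downloads "$list" 4
```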

2021-03-05 19:05:14

I found (probably) a solution

In the process of downloading a few thousand log files from one server to the next I suddenly had the need to do some serious multithreaded downloading in BSD, preferably with Wget as that was the simplest way I could think of handling this. A little looking around led me to this little nugget:

wget -r -np -N [url] & wget -r -np -N [url] & wget -r -np -N [url] & wget -r -np -N [url]

Just repeat the wget -r -np -N [url] for as many threads as you need... Now given this isn't pretty and there are surely better ways to do this, but if you want something quick and dirty it should do the trick...

Note: the -N option makes wget download only "newer" files, which means it won't overwrite or re-download files unless their timestamp has changed on the server.

2011-10-04 08:37:45
