我试图在脚本中从谷歌驱动器下载一个文件,我这样做有点麻烦。我要下载的文件在这里。

我在网上搜了很多,终于下载了其中一个。我得到了文件的uid,较小的文件(1.6MB)下载正常,但较大的文件(3.7GB)总是重定向到一个页面,询问我是否想在不进行病毒扫描的情况下继续下载。谁能帮我跳过那个屏幕?

下面是我如何让第一个文件工作-

curl -L "https://docs.google.com/uc?export=download&id=0Bz-w5tutuZIYeDU0VDRFWG9IVUE" > phlat-1.0.tar.gz

当我对另一个文件进行同样操作时,

curl -L "https://docs.google.com/uc?export=download&id=0Bz-w5tutuZIYY3h5YlMzTjhnbGM" > index4phlat.tar.gz

我得到以下输出-

我注意到在链接的第三行到最后一行,有一个&confirm=JwkK,这是一个随机的4个字符的字符串,但建议有一种方法添加到我的URL确认。我访问的一个链接建议&confirm=no_antivirus,但这不起作用。

我希望这里有人能帮忙!


当前回答

解决方案只使用谷歌驱动器API

在运行下面的代码之前,您必须激活谷歌驱动器API,安装依赖项并验证您的帐户。说明可以在原来的谷歌驱动器API指南页面上找到

import io
import os
import pickle
import sys, argparse
from googleapiclient.discovery import build
from google.auth.transport.requests import Request
from googleapiclient.http import MediaIoBaseDownload
from google_auth_oauthlib.flow import InstalledAppFlow

# If modifying these scopes, delete the file token.pickle.
SCOPES = ['https://www.googleapis.com/auth/drive.readonly']


def _main(file_id, output):
    """ Shows basic usage of the Drive v3 API.
        Prints the names and ids of the first 10 files the user has access to.
    """
    if not file_id:
        sys.exit('\nMissing arguments. Correct usage:\ndrive_api_download.py --file_id <file_id> [--output output_name]\n')
    elif not output:
        output = "./" + file_id
    
    creds = None
    # The file token.pickle stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)

    service = build('drive', 'v3', credentials=creds)

    # Downloads file
    request = service.files().get_media(fileId=file_id)
    fp = open(output, "wb")
    downloader = MediaIoBaseDownload(fp, request)
    done = False
    while done is False:
        status, done = downloader.next_chunk(num_retries=3)
        print("Download %d%%." % int(status.progress() * 100))

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('-i', '--file_id')
    parser.add_argument('-o', '--output')
    args = parser.parse_args()
    
    _main(args.file_id, args.output)

其他回答

解决方案只使用谷歌驱动器API

在运行下面的代码之前,您必须激活谷歌驱动器API,安装依赖项并验证您的帐户。说明可以在原来的谷歌驱动器API指南页面上找到

import io
import os
import pickle
import sys, argparse
from googleapiclient.discovery import build
from google.auth.transport.requests import Request
from googleapiclient.http import MediaIoBaseDownload
from google_auth_oauthlib.flow import InstalledAppFlow

# If modifying these scopes, delete the file token.pickle.
SCOPES = ['https://www.googleapis.com/auth/drive.readonly']


def _main(file_id, output):
    """ Shows basic usage of the Drive v3 API.
        Prints the names and ids of the first 10 files the user has access to.
    """
    if not file_id:
        sys.exit('\nMissing arguments. Correct usage:\ndrive_api_download.py --file_id <file_id> [--output output_name]\n')
    elif not output:
        output = "./" + file_id
    
    creds = None
    # The file token.pickle stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)

    service = build('drive', 'v3', credentials=creds)

    # Downloads file
    request = service.files().get_media(fileId=file_id)
    fp = open(output, "wb")
    downloader = MediaIoBaseDownload(fp, request)
    done = False
    while done is False:
        status, done = downloader.next_chunk(num_retries=3)
        print("Download %d%%." % int(status.progress() * 100))

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('-i', '--file_id')
    parser.add_argument('-o', '--output')
    args = parser.parse_args()
    
    _main(args.file_id, args.output)

有一个更简单的方法。

从firefox/chrome扩展安装cliget/CURLWGET。

从浏览器下载文件。这将创建一个curl/wget链接,用于记住下载文件时使用的cookie和头文件。使用此命令从任何shell下载

2022年6月

你可以用gdown。也可以考虑访问该页面以获得完整的说明;这只是一个总结,源回购可能有更多最新的说明。


指令

使用以下命令安装:

pip install gdown

在此之后,您可以通过运行以下命令之一从谷歌驱动器下载任何文件:

gdown https://drive.google.com/uc?id=<file_id>  # for files
gdown <file_id>                                 # alternative format
gdown --folder https://drive.google.com/drive/folders/<file_id>  # for folders
gdown --folder --id <file_id>                                   # this format works for folders too

示例:从该目录下载自述文件

gdown https://drive.google.com/uc?id=0B7EVK8r0v71pOXBhSUdJWU1MYUk

file_id应该类似于0Bz8a_Dbh9QhbNU3SGlFaDg。您可以通过右键单击感兴趣的文件并选择Get link来找到这个ID。自2021年11月起,该链接的形式为:

# Files
https://drive.google.com/file/d/<file_id>/view?usp=sharing
# Folders
https://drive.google.com/drive/folders/<file_id>

警告

只对开放文件有效。(“任何有链接的人都可以查看”) 不能下载超过50个文件到一个文件夹。 如果您可以访问源文件,您可以考虑使用tar/zip将其变成一个单独的文件来解决这个限制。

2020年7月- Windows用户批处理文件解决方案

我想为windows用户添加一个简单的批处理文件解决方案,因为我只发现了linux解决方案,我花了几天时间来学习为windows创建解决方案的所有这些东西。因此,为了避免其他人可能需要它,这里是。

你需要的工具

wget for windows (5KB exe小程序,无需安装) 从这里下载。 https://eternallybored.org/misc/wget/ jrepl for windows (117KB的批处理程序,无需安装) 该工具类似于linux的sed工具。 从这里下载: https://www.dostips.com/forum/viewtopic.php?t=6044

假设

%filename% -你想下载的文件将被保存到的文件名。 %fileid% =谷歌文件id(前面已经解释过了)

批量代码下载小文件从谷歌驱动器

wget -O "%filename%" "https://docs.google.com/uc?export=download&id=%fileid%"        

批量代码下载大文件从谷歌驱动器

set cookieFile="cookie.txt"
set confirmFile="confirm.txt"
   
REM downlaod cooky and message with request for confirmation
wget --quiet --save-cookies "%cookieFile%" --keep-session-cookies --no-check-certificate "https://docs.google.com/uc?export=download&id=%fileid%" -O "%confirmFile%"
   
REM extract confirmation key from message saved in confirm file and keep in variable resVar
jrepl ".*confirm=([0-9A-Za-z_]+).*" "$1" /F "%confirmFile%" /A /rtn resVar
   
REM when jrepl writes to variable, it adds carriage return (CR) (0x0D) and a line feed (LF) (0x0A), so remove these two last characters
set confirmKey=%resVar:~0,-2%
   
REM download the file using cookie and confirmation key
wget --load-cookies "%cookieFile%" -O "%filename%" "https://docs.google.com/uc?export=download&id=%fileid%&confirm=%confirmKey%"
   
REM clear temporary files 
del %cookieFile%
del %confirmFile%

下面是我写的一个小bash脚本,它今天完成了这项工作。它适用于大文件,也可以恢复部分获取的文件。它有两个参数,第一个是file_id,第二个是输出文件的名称。与之前的答案相比,主要的改进是它可以在大文件上工作,只需要常用的工具:bash, curl, tr, grep, du, cut和mv。

#!/usr/bin/env bash
fileid="$1"
destination="$2"

# try to download the file
curl -c /tmp/cookie -L -o /tmp/probe.bin "https://drive.google.com/uc?export=download&id=${fileid}"
probeSize=`du -b /tmp/probe.bin | cut -f1`

# did we get a virus message?
# this will be the first line we get when trying to retrive a large file
bigFileSig='<!DOCTYPE html><html><head><title>Google Drive - Virus scan warning</title><meta http-equiv="content-type" content="text/html; charset=utf-8"/>'
sigSize=${#bigFileSig}

if (( probeSize <= sigSize )); then
  virusMessage=false
else
  firstBytes=$(head -c $sigSize /tmp/probe.bin)
  if [ "$firstBytes" = "$bigFileSig" ]; then
    virusMessage=true
  else
    virusMessage=false
  fi
fi

if [ "$virusMessage" = true ] ; then
  confirm=$(tr ';' '\n' </tmp/probe.bin | grep confirm)
  confirm=${confirm:8:4}
  curl -C - -b /tmp/cookie -L -o "$destination" "https://drive.google.com/uc?export=download&id=${fileid}&confirm=${confirm}"
else
  mv /tmp/probe.bin "$destination"
fi