我试图在脚本中从谷歌驱动器下载一个文件,我这样做有点麻烦。我要下载的文件在这里。

我在网上搜了很多,终于下载了其中一个。我得到了文件的uid,较小的文件(1.6MB)下载正常,但较大的文件(3.7GB)总是重定向到一个页面,询问我是否想在不进行病毒扫描的情况下继续下载。谁能帮我跳过那个屏幕?

下面是我如何让第一个文件工作-

curl -L "https://docs.google.com/uc?export=download&id=0Bz-w5tutuZIYeDU0VDRFWG9IVUE" > phlat-1.0.tar.gz

当我对另一个文件进行同样操作时,

curl -L "https://docs.google.com/uc?export=download&id=0Bz-w5tutuZIYY3h5YlMzTjhnbGM" > index4phlat.tar.gz

我得到以下输出-

我注意到在链接的第三行到最后一行,有一个&confirm=JwkK,这是一个随机的4个字符的字符串,但建议有一种方法添加到我的URL确认。我访问的一个链接建议&confirm=no_antivirus,但这不起作用。

我希望这里有人能帮忙!


当前回答

这是我从谷歌驱动器下载文件到我的谷歌云Linux外壳的解决方案。

使用高级共享将文件共享给PUBLIC并具有Edit权限。 你会得到一个共享链接,它会有一个ID。参见链接:- drive.google.com/file/d/ (ID) /视图? usp =分享 复制该ID并粘贴在以下链接:-

googledrive.com/host/ (ID)

以上链接为我们的下载链接。 使用wget下载文件:-

wget https://googledrive.com/host/ [ID]

该命令将下载名称为[ID]的文件,没有扩展名,但文件大小与运行wget命令的位置相同。 实际上,我在实习时下载了一个压缩文件夹。所以我重命名了这个尴尬的文件使用:-

mv [id] 1.zip

然后使用

1.压缩解压缩

我们会拿到文件的。

其他回答

解决方案只使用谷歌驱动器API

在运行下面的代码之前,您必须激活谷歌驱动器API,安装依赖项并验证您的帐户。说明可以在原来的谷歌驱动器API指南页面上找到

import io
import os
import pickle
import sys, argparse
from googleapiclient.discovery import build
from google.auth.transport.requests import Request
from googleapiclient.http import MediaIoBaseDownload
from google_auth_oauthlib.flow import InstalledAppFlow

# If modifying these scopes, delete the file token.pickle.
SCOPES = ['https://www.googleapis.com/auth/drive.readonly']


def _main(file_id, output):
    """ Shows basic usage of the Drive v3 API.
        Prints the names and ids of the first 10 files the user has access to.
    """
    if not file_id:
        sys.exit('\nMissing arguments. Correct usage:\ndrive_api_download.py --file_id <file_id> [--output output_name]\n')
    elif not output:
        output = "./" + file_id
    
    creds = None
    # The file token.pickle stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)

    service = build('drive', 'v3', credentials=creds)

    # Downloads file
    request = service.files().get_media(fileId=file_id)
    fp = open(output, "wb")
    downloader = MediaIoBaseDownload(fp, request)
    done = False
    while done is False:
        status, done = downloader.next_chunk(num_retries=3)
        print("Download %d%%." % int(status.progress() * 100))

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('-i', '--file_id')
    parser.add_argument('-o', '--output')
    args = parser.parse_args()
    
    _main(args.file_id, args.output)

下面是我写的一个小bash脚本,它今天完成了这项工作。它适用于大文件,也可以恢复部分获取的文件。它有两个参数,第一个是file_id,第二个是输出文件的名称。与之前的答案相比,主要的改进是它可以在大文件上工作,只需要常用的工具:bash, curl, tr, grep, du, cut和mv。

#!/usr/bin/env bash
fileid="$1"
destination="$2"

# try to download the file
curl -c /tmp/cookie -L -o /tmp/probe.bin "https://drive.google.com/uc?export=download&id=${fileid}"
probeSize=`du -b /tmp/probe.bin | cut -f1`

# did we get a virus message?
# this will be the first line we get when trying to retrive a large file
bigFileSig='<!DOCTYPE html><html><head><title>Google Drive - Virus scan warning</title><meta http-equiv="content-type" content="text/html; charset=utf-8"/>'
sigSize=${#bigFileSig}

if (( probeSize <= sigSize )); then
  virusMessage=false
else
  firstBytes=$(head -c $sigSize /tmp/probe.bin)
  if [ "$firstBytes" = "$bigFileSig" ]; then
    virusMessage=true
  else
    virusMessage=false
  fi
fi

if [ "$virusMessage" = true ] ; then
  confirm=$(tr ';' '\n' </tmp/probe.bin | grep confirm)
  confirm=${confirm:8:4}
  curl -C - -b /tmp/cookie -L -o "$destination" "https://drive.google.com/uc?export=download&id=${fileid}&confirm=${confirm}"
else
  mv /tmp/probe.bin "$destination"
fi

2020年7月- Windows用户批处理文件解决方案

我想为windows用户添加一个简单的批处理文件解决方案,因为我只发现了linux解决方案,我花了几天时间来学习为windows创建解决方案的所有这些东西。因此,为了避免其他人可能需要它,这里是。

你需要的工具

wget for windows (5KB exe小程序,无需安装) 从这里下载。 https://eternallybored.org/misc/wget/ jrepl for windows (117KB的批处理程序,无需安装) 该工具类似于linux的sed工具。 从这里下载: https://www.dostips.com/forum/viewtopic.php?t=6044

假设

%filename% -你想下载的文件将被保存到的文件名。 %fileid% =谷歌文件id(前面已经解释过了)

批量代码下载小文件从谷歌驱动器

wget -O "%filename%" "https://docs.google.com/uc?export=download&id=%fileid%"        

批量代码下载大文件从谷歌驱动器

set cookieFile="cookie.txt"
set confirmFile="confirm.txt"
   
REM downlaod cooky and message with request for confirmation
wget --quiet --save-cookies "%cookieFile%" --keep-session-cookies --no-check-certificate "https://docs.google.com/uc?export=download&id=%fileid%" -O "%confirmFile%"
   
REM extract confirmation key from message saved in confirm file and keep in variable resVar
jrepl ".*confirm=([0-9A-Za-z_]+).*" "$1" /F "%confirmFile%" /A /rtn resVar
   
REM when jrepl writes to variable, it adds carriage return (CR) (0x0D) and a line feed (LF) (0x0A), so remove these two last characters
set confirmKey=%resVar:~0,-2%
   
REM download the file using cookie and confirmation key
wget --load-cookies "%cookieFile%" -O "%filename%" "https://docs.google.com/uc?export=download&id=%fileid%&confirm=%confirmKey%"
   
REM clear temporary files 
del %cookieFile%
del %confirmFile%

警告:此功能已弃用。见下面评论中的警告。


看看这个问题:直接从谷歌驱动器使用谷歌驱动器API下载

基本上,你必须创建一个公共目录,并通过相对引用来访问你的文件

wget https://googledrive.com/host/LARGEPUBLICFOLDERID/index4phlat.tar.gz

或者,您可以使用这个脚本:https://github.com/circulosmeos/gdown.pl

我无法让Nanoix的perl脚本工作,或者我看到的其他curl示例,所以我开始自己用python研究api。这适用于小文件,但大文件阻塞了可用的ram,所以我找到了一些其他不错的分块代码,使用api的部分下载功能。要点: https://gist.github.com/csik/c4c90987224150e4a0b2

注意从API接口下载client_secret json文件到本地目录的部分。

$ cat gdrive_dl.py
from pydrive.auth import GoogleAuth  
from pydrive.drive import GoogleDrive    

"""API calls to download a very large google drive file.  The drive API only allows downloading to ram 
   (unlike, say, the Requests library's streaming option) so the files has to be partially downloaded
   and chunked.  Authentication requires a google api key, and a local download of client_secrets.json
   Thanks to Radek for the key functions: http://stackoverflow.com/questions/27617258/memoryerror-how-to-download-large-file-via-google-drive-sdk-using-python
"""

def partial(total_byte_len, part_size_limit):
    s = []
    for p in range(0, total_byte_len, part_size_limit):
        last = min(total_byte_len - 1, p + part_size_limit - 1)
        s.append([p, last])
    return s

def GD_download_file(service, file_id):
  drive_file = service.files().get(fileId=file_id).execute()
  download_url = drive_file.get('downloadUrl')
  total_size = int(drive_file.get('fileSize'))
  s = partial(total_size, 100000000) # I'm downloading BIG files, so 100M chunk size is fine for me
  title = drive_file.get('title')
  originalFilename = drive_file.get('originalFilename')
  filename = './' + originalFilename
  if download_url:
      with open(filename, 'wb') as file:
        print "Bytes downloaded: "
        for bytes in s:
          headers = {"Range" : 'bytes=%s-%s' % (bytes[0], bytes[1])}
          resp, content = service._http.request(download_url, headers=headers)
          if resp.status == 206 :
                file.write(content)
                file.flush()
          else:
            print 'An error occurred: %s' % resp
            return None
          print str(bytes[1])+"..."
      return title, filename
  else:
    return None          


gauth = GoogleAuth()
gauth.CommandLineAuth() #requires cut and paste from a browser 

FILE_ID = 'SOMEID' #FileID is the simple file hash, like 0B1NzlxZ5RpdKS0NOS0x0Ym9kR0U

drive = GoogleDrive(gauth)
service = gauth.service
#file = drive.CreateFile({'id':FILE_ID})    # Use this to get file metadata
GD_download_file(service, FILE_ID)