wget/curl large file from Google Drive

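I'm trying to download a file from Google Drive in a script, and I'm having a little trouble doing so. The files I'm trying to download are here.

I've searched online extensively and finally managed to get one of them to download. I got the UIDs of the files, and the smaller one (1.6MB) downloads fine, but the larger one (3.7GB) always redirects to a page that asks whether I want to proceed with the download without a virus scan. Could someone help me get past that screen?

Here's how I got the first file working: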

curl -L "https://docs.google.com/uc?export=download&id=0Bz-w5tutuZIYeDU0VDRFWG9IVUE" > phlat-1.0.tar.gz

When I run the same command on the other file,

curl -L "https://docs.google.com/uc?export=download&id=0Bz-w5tutuZIYY3h5YlMzTjhnbGM" > index4phlat.tar.gz

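I get the following output:

I notice that on the third-to-last line, in the link, there is a &confirm=JwkK. It is a random 4-character string, but it suggests there is a way to add a confirmation to my URL. One of the links I visited suggested &confirm=no_antivirus, but that does not work.

I hope someone here can help with this!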

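Current answer

I was unable to get Nanoix's perl script to work, or the other curl examples I had seen, so I started looking into the API myself in Python. This worked fine for small files, but large files exhausted the available RAM, so I found some other nice chunking code that uses the API's partial download capability. Gist: https://gist.github.com/csik/c4c90987224150e4a0b2

Note the part about downloading the client_secret json file from the API interface to your local directory.

Source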

$ cat gdrive_dl.py
from pydrive.auth import GoogleAuth  
from pydrive.drive import GoogleDrive    

"""API calls to download a very large google drive file.  The drive API only allows downloading to ram 
   (unlike, say, the Requests library's streaming option) so the file has to be partially downloaded
   and chunked.  Authentication requires a google api key, and a local download of client_secrets.json
   Thanks to Radek for the key functions: http://stackoverflow.com/questions/27617258/memoryerror-how-to-download-large-file-via-google-drive-sdk-using-python
"""

def partial(total_byte_len, part_size_limit):
    s = []
    for p in range(0, total_byte_len, part_size_limit):
        last = min(total_byte_len - 1, p + part_size_limit - 1)
        s.append([p, last])
    return s

def GD_download_file(service, file_id):
  drive_file = service.files().get(fileId=file_id).execute()
  download_url = drive_file.get('downloadUrl')
  total_size = int(drive_file.get('fileSize'))
  s = partial(total_size, 100000000) # I'm downloading BIG files, so 100M chunk size is fine for me
  title = drive_file.get('title')
  originalFilename = drive_file.get('originalFilename')
  filename = './' + originalFilename
  if download_url:
      with open(filename, 'wb') as out_file:
        print("Bytes downloaded: ")
        for first, last in s:
          # Request one inclusive byte range per chunk
          headers = {"Range": 'bytes=%s-%s' % (first, last)}
          resp, content = service._http.request(download_url, headers=headers)
          if resp.status == 206:  # 206 Partial Content: the range was honored
            out_file.write(content)
            out_file.flush()
          else:
            print('An error occurred: %s' % resp)
            return None
          print(str(last) + "...")
      return title, filename
  else:
    return None          


gauth = GoogleAuth()
gauth.CommandLineAuth() #requires cut and paste from a browser 

FILE_ID = 'SOMEID' #FileID is the simple file hash, like 0B1NzlxZ5RpdKS0NOS0x0Ym9kR0U

drive = GoogleDrive(gauth)
service = gauth.service
#file = drive.CreateFile({'id':FILE_ID})    # Use this to get file metadata
GD_download_file(service, FILE_ID)
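
To make the chunking concrete, here is a minimal sketch of how the byte ranges become HTTP Range headers. It reuses the partial function from the script above; the range_headers helper and the 250-byte file size are made up for illustration and are not part of the original script.

```python
def partial(total_byte_len, part_size_limit):
    """Split [0, total_byte_len) into inclusive [first, last] byte ranges."""
    ranges = []
    for start in range(0, total_byte_len, part_size_limit):
        last = min(total_byte_len - 1, start + part_size_limit - 1)
        ranges.append([start, last])
    return ranges


def range_headers(total_byte_len, part_size_limit):
    """Hypothetical helper: format each range as a Range header value."""
    return ["bytes=%d-%d" % (first, last)
            for first, last in partial(total_byte_len, part_size_limit)]


# A hypothetical 250-byte file fetched in 100-byte parts
for header in range_headers(250, 100):
    print(header)
# bytes=0-99
# bytes=100-199
# bytes=200-249
```

Both ends of each range are inclusive, which is why the last range stops at total_byte_len - 1 rather than at the file size itself.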

2015-01-02 22:39:25

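Other answers

The easiest way is to put whatever you want to download in a folder. Share that folder, then grab the folder ID from the URL bar.

Then go to https://googledrive.com/host/[ID] (replace ID with your folder ID). You should see a list of all the files in that folder; click the one you want to download. The download should then show up on your downloads page (Ctrl+J in Chrome). Copy the download link from there and use wget "download link".

Enjoy :)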

2016-09-06 12:41:06

Based on Roshan Sethia's answer

May 2018

Using WGET:

1. Create a shell script called wgetgdrive.sh as below:

#!/bin/bash
# Get files from Google Drive
# $1 = file ID
# $2 = file name
URL="https://docs.google.com/uc?export=download&id=$1"
wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate "$URL" -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=$1" -O "$2" && rm -rf /tmp/cookies.txt

2. Give the script the right permissions to execute.

3. In a terminal, run:

./wgetgdrive.sh <file ID> <filename>

for example:

./wgetgdrive.sh 1lsDPURlTNzS62xEOAIG98gsaW6x2PYd2 images.zip
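
For readers who prefer Python, the token extraction done by the sed expression above can be sketched as follows. This is only an illustration: the sample_page string is a made-up stand-in for the warning page, and extract_confirm_token and confirmed_url are hypothetical helper names. Unlike the greedy sed pattern, which keeps the last confirm= value on a line, this sketch takes the first match.

```python
import re

# Same character class as the sed expression in the script above
CONFIRM_RE = re.compile(r"confirm=([0-9A-Za-z_]+)")


def extract_confirm_token(html):
    """Return the first confirm token found in the page, or None."""
    match = CONFIRM_RE.search(html)
    return match.group(1) if match else None


def confirmed_url(file_id, token):
    """Build the second-request URL that carries the confirm token."""
    return ("https://docs.google.com/uc?export=download"
            "&confirm=%s&id=%s" % (token, file_id))


# Hypothetical fragment of the "can't scan for viruses" page
sample_page = ('<a href="/uc?export=download&amp;confirm=JwkK'
               '&amp;id=FILE_ID">Download anyway</a>')

token = extract_confirm_token(sample_page)
print(confirmed_url("FILE_ID", token))
# https://docs.google.com/uc?export=download&confirm=JwkK&id=FILE_ID
```

The second request only works if it carries the cookies saved by the first one, which is what --save-cookies and --load-cookies do in the shell script.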

2018-05-28 21:11:41

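The answers above are outdated as of April 2020, since Google Drive now uses a redirect to the actual location of the file.

Working as of April 2020 on macOS 10.15.4 for public documents: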

# this is used for direct Drive downloads
function download-google(){
  echo "https://drive.google.com/uc?export=download&id=$1"
  mkdir -p .tmp
  curl -c .tmp/$1cookies "https://drive.google.com/uc?export=download&id=$1" > .tmp/$1intermezzo.html;
  curl -L -b .tmp/$1cookies "$(egrep -o "https.+download" .tmp/$1intermezzo.html)" > $2;
}

# some files are shared using an indirect download
function download-google-2(){
  echo "https://drive.google.com/uc?export=download&id=$1"
  mkdir -p .tmp
  curl -c .tmp/$1cookies "https://drive.google.com/uc?export=download&id=$1" > .tmp/$1intermezzo.html;
  code=$(egrep -o "confirm=(.+)&amp;id=" .tmp/$1intermezzo.html | cut -d"=" -f2 | cut -d"&" -f1)
  curl -L -b .tmp/$1cookies "https://drive.google.com/uc?export=download&confirm=$code&id=$1" > $2;
}

# used like this
download-google <id> <name of item.extension>

2020-04-14 05:35:53

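Get the shareable link and open it in an incognito window (very important). It will say that it cannot scan the file.

Open the inspector and track the network traffic. Click the "Download anyway" button.

Copy the URL of the last request made. This is your link. Use it in wget.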

2018-12-06 00:17:31

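Updated as of March 2018.

I tried the various techniques given in other answers to download my file (6 GB) directly from Google Drive to my AWS EC2 instance, but none of them worked (possibly because they are old).

So, for the benefit of others, here is how I did it successfully: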

1. Right-click on the file you want to download, click share, and under the link sharing section select "anyone with this link can edit".

2. Copy the link. It should be in this format: https://drive.google.com/file/d/FILEIDENTIFIER/view?usp=sharing

3. Copy the FILEIDENTIFIER portion from the link.

4. Copy the below script to a file. It uses curl and processes the cookie to automate the downloading of the file.

#!/bin/bash
fileid="FILEIDENTIFIER"
filename="FILENAME"
curl -c ./cookie -s -L "https://drive.google.com/uc?export=download&id=${fileid}" > /dev/null
curl -Lb ./cookie "https://drive.google.com/uc?export=download&confirm=`awk '/download/ {print $NF}' ./cookie`&id=${fileid}" -o ${filename}

5. As shown above, paste the FILEIDENTIFIER in the script. Remember to keep the double quotes!

6. Provide a name for the file in place of FILENAME. Remember to keep the double quotes and also include the extension in FILENAME (for example, myfile.zip).

7. Now, save the file and make it executable by running this command in a terminal: sudo chmod +x download-gdrive.sh

8. Run the script using ./download-gdrive.sh
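
The two pieces of text handling in the steps above, pulling FILEIDENTIFIER out of the share link and reading the confirm value from the cookie file, can be sketched in Python as follows. This is an illustration under assumptions: the function names are hypothetical, and the sample cookie line is invented, modeled on the tab-separated Netscape cookie file format that curl writes. Note that the awk command prints the last field of every line containing "download", while this sketch returns only the first one.

```python
import re


def file_id_from_share_link(link):
    """Extract the ID between /file/d/ and the next slash, or None."""
    match = re.search(r"/file/d/([^/?#]+)", link)
    return match.group(1) if match else None


def token_from_cookie_file(cookie_text):
    """Mimic awk '/download/ {print $NF}' for the first matching line."""
    for line in cookie_text.splitlines():
        if "download" in line:
            fields = line.split()
            if fields:
                return fields[-1]  # cookie value is the last field
    return None


share_link = "https://drive.google.com/file/d/FILEIDENTIFIER/view?usp=sharing"

# Hypothetical cookie file contents; the cookie name and value are invented
sample_cookie_file = "\n".join([
    "# Netscape HTTP Cookie File",
    "\t".join(["#HttpOnly_.drive.google.com", "TRUE", "/uc", "TRUE", "0",
               "download_warning_EXAMPLE", "AbCd"]),
])

print(file_id_from_share_link(share_link))         # FILEIDENTIFIER
print(token_from_cookie_file(sample_cookie_file))  # AbCd
```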

PS: Here is the GitHub gist for the script given above: https://gist.github.com/amit-chahar/db49ce64f46367325293e4cce13d2424

2018-03-23 07:58:03
