我如何能看到什么是在S3桶与boto3?(例如,写一个“ls”)?

做以下事情:

import boto3
s3 = boto3.resource('s3')
my_bucket = s3.Bucket('some/path/')

返回:

s3.Bucket(name='some/path/')

我如何看到它的内容?


当前回答

也可以这样做:

csv_files = s3.list_objects_v2(s3_bucket_path)
    for obj in csv_files['Contents']:
        key = obj['Key']

其他回答

一种更节俭的方法,而不是通过一个for循环来迭代,你也可以只打印原始对象,其中包含S3桶中的所有文件:

session = Session(aws_access_key_id=aws_access_key_id,aws_secret_access_key=aws_secret_access_key)
s3 = session.resource('s3')
bucket = s3.Bucket('bucket_name')

files_in_s3 = bucket.objects.all() 
#you can print this iterable with print(list(files_in_s3))

查看内容的一种方法是:

for my_bucket_object in my_bucket.objects.all():
    print(my_bucket_object)
import boto3
s3 = boto3.resource('s3')

## Bucket to use
my_bucket = s3.Bucket('city-bucket')

## List objects within a given prefix
for obj in my_bucket.objects.filter(Delimiter='/', Prefix='city/'):
  print obj.key

输出:

city/pune.csv
city/goa.csv

我的s3键实用函数本质上是@Hephaestus的答案的优化版本:

import boto3


s3_paginator = boto3.client('s3').get_paginator('list_objects_v2')


def keys(bucket_name, prefix='/', delimiter='/', start_after=''):
    prefix = prefix.lstrip(delimiter)
    start_after = (start_after or prefix) if prefix.endswith(delimiter) else start_after
    for page in s3_paginator.paginate(Bucket=bucket_name, Prefix=prefix, StartAfter=start_after):
        for content in page.get('Contents', ()):
            yield content['Key']

在我的测试(boto3 1.9.84)中,它比等效的(但更简单)代码要快得多:

import boto3


def keys(bucket_name, prefix='/', delimiter='/'):
    prefix = prefix.lstrip(delimiter)
    bucket = boto3.resource('s3').Bucket(bucket_name)
    return (_.key for _ in bucket.objects.filter(Prefix=prefix))

由于S3保证UTF-8二进制排序结果,因此在第一个函数中添加了start_after优化。

也可以这样做:

csv_files = s3.list_objects_v2(s3_bucket_path)
    for obj in csv_files['Contents']:
        key = obj['Key']