我如何能看到什么是在S3桶与boto3?(例如,写一个“ls”)?

做以下事情:

import boto3
s3 = boto3.resource('s3')
my_bucket = s3.Bucket('some/path/')

返回:

s3.Bucket(name='some/path/')

我如何看到它的内容?


当前回答

这是解决方案

import boto3

s3=boto3.resource('s3')
BUCKET_NAME = 'Your S3 Bucket Name'
allFiles = s3.Bucket(BUCKET_NAME).objects.all()
for file in allFiles:
    print(file.key)

其他回答

我只是这样做的,包括身份验证方法:

s3_client = boto3.client(
                's3',
                aws_access_key_id='access_key',
                aws_secret_access_key='access_key_secret',
                config=boto3.session.Config(signature_version='s3v4'),
                region_name='region'
            )

response = s3_client.list_objects(Bucket='bucket_name', Prefix=key)
if ('Contents' in response):
    # Object / key exists!
    return True
else:
    # Object / key DOES NOT exist!
    return False

查看内容的一种方法是:

for my_bucket_object in my_bucket.objects.all():
    print(my_bucket_object)

我以前是这样做的:

import boto3
s3 = boto3.resource('s3')
bucket=s3.Bucket("bucket_name")
contents = [_.key for _ in bucket.objects.all() if "subfolders/ifany/" in _.key]

我的s3键实用函数本质上是@Hephaestus的答案的优化版本:

import boto3


s3_paginator = boto3.client('s3').get_paginator('list_objects_v2')


def keys(bucket_name, prefix='/', delimiter='/', start_after=''):
    prefix = prefix.lstrip(delimiter)
    start_after = (start_after or prefix) if prefix.endswith(delimiter) else start_after
    for page in s3_paginator.paginate(Bucket=bucket_name, Prefix=prefix, StartAfter=start_after):
        for content in page.get('Contents', ()):
            yield content['Key']

在我的测试(boto3 1.9.84)中,它比等效的(但更简单)代码要快得多:

import boto3


def keys(bucket_name, prefix='/', delimiter='/'):
    prefix = prefix.lstrip(delimiter)
    bucket = boto3.resource('s3').Bucket(bucket_name)
    return (_.key for _ in bucket.objects.filter(Prefix=prefix))

由于S3保证UTF-8二进制排序结果,因此在第一个函数中添加了start_after优化。

import boto3
s3 = boto3.resource('s3')

## Bucket to use
my_bucket = s3.Bucket('city-bucket')

## List objects within a given prefix
for obj in my_bucket.objects.filter(Delimiter='/', Prefix='city/'):
  print obj.key

输出:

city/pune.csv
city/goa.csv