我如何能看到什么是在S3桶与boto3?(例如,写一个“ls”)?

做以下事情:

import boto3
s3 = boto3.resource('s3')
my_bucket = s3.Bucket('some/path/')

返回:

s3.Bucket(name='some/path/')

我如何看到它的内容?


当前回答

我以前是这样做的:

import boto3
s3 = boto3.resource('s3')
bucket=s3.Bucket("bucket_name")
contents = [_.key for _ in bucket.objects.all() if "subfolders/ifany/" in _.key]

其他回答

在上面的注释中对@Hephaeastus的代码进行了少许修改,编写了下面的方法来列出给定路径中的文件夹和对象(文件)。类似s3 ls命令。

from boto3 import session

def s3_ls(profile=None, bucket_name=None, folder_path=None):
    folders=[]
    files=[]
    result=dict()
    bucket_name = bucket_name
    prefix= folder_path
    session = boto3.Session(profile_name=profile)
    s3_conn   = session.client('s3')
    s3_result =  s3_conn.list_objects_v2(Bucket=bucket_name, Delimiter = "/", Prefix=prefix)
    if 'Contents' not in s3_result and 'CommonPrefixes' not in s3_result:
        return []

    if s3_result.get('CommonPrefixes'):
        for folder in s3_result['CommonPrefixes']:
            folders.append(folder.get('Prefix'))

    if s3_result.get('Contents'):
        for key in s3_result['Contents']:
            files.append(key['Key'])

    while s3_result['IsTruncated']:
        continuation_key = s3_result['NextContinuationToken']
        s3_result = s3_conn.list_objects_v2(Bucket=bucket_name, Delimiter="/", ContinuationToken=continuation_key, Prefix=prefix)
        if s3_result.get('CommonPrefixes'):
            for folder in s3_result['CommonPrefixes']:
                folders.append(folder.get('Prefix'))
        if s3_result.get('Contents'):
            for key in s3_result['Contents']:
                files.append(key['Key'])

    if folders:
        result['folders']=sorted(folders)
    if files:
        result['files']=sorted(files)
    return result

这将列出给定路径下的所有对象/文件夹。Folder_path可以默认为None, method将列出桶根目录的即时内容。

也可以这样做:

csv_files = s3.list_objects_v2(s3_bucket_path)
    for obj in csv_files['Contents']:
        key = obj['Key']

如果你想传递ACCESS和SECRET密钥(你不应该这样做,因为这是不安全的):

from boto3.session import Session

ACCESS_KEY='your_access_key'
SECRET_KEY='your_secret_key'

session = Session(aws_access_key_id=ACCESS_KEY,
                  aws_secret_access_key=SECRET_KEY)
s3 = session.resource('s3')
your_bucket = s3.Bucket('your_bucket')

for s3_file in your_bucket.objects.all():
    print(s3_file.key)

为了处理大型键列表(即当目录列表大于1000项时),我使用以下代码将多个列表中的键值(即文件名)累积起来(感谢上面的阿梅里奥的第一行)。代码是针对python3的:

    from boto3  import client
    bucket_name = "my_bucket"
    prefix      = "my_key/sub_key/lots_o_files"

    s3_conn   = client('s3')  # type: BaseClient  ## again assumes boto.cfg setup, assume AWS S3
    s3_result =  s3_conn.list_objects_v2(Bucket=bucket_name, Prefix=prefix, Delimiter = "/")

    if 'Contents' not in s3_result:
        #print(s3_result)
        return []

    file_list = []
    for key in s3_result['Contents']:
        file_list.append(key['Key'])
    print(f"List count = {len(file_list)}")

    while s3_result['IsTruncated']:
        continuation_key = s3_result['NextContinuationToken']
        s3_result = s3_conn.list_objects_v2(Bucket=bucket_name, Prefix=prefix, Delimiter="/", ContinuationToken=continuation_key)
        for key in s3_result['Contents']:
            file_list.append(key['Key'])
        print(f"List count = {len(file_list)}")
    return file_list

ObjectSummary:

有两个标识符附加到ObjectSummary:

bucket_name 关键

boto3 S3: ObjectSummary

有关AWS S3文档中的对象键的更多信息:

Object Keys: When you create an object, you specify the key name, which uniquely identifies the object in the bucket. For example, in the Amazon S3 console (see AWS Management Console), when you highlight a bucket, a list of objects in your bucket appears. These names are the object keys. The name for a key is a sequence of Unicode characters whose UTF-8 encoding is at most 1024 bytes long. The Amazon S3 data model is a flat structure: you create a bucket, and the bucket stores objects. There is no hierarchy of subbuckets or subfolders; however, you can infer logical hierarchy using key name prefixes and delimiters as the Amazon S3 console does. The Amazon S3 console supports a concept of folders. Suppose that your bucket (admin-created) has four objects with the following object keys: Development/Projects1.xls Finance/statement1.pdf Private/taxdocument.pdf s3-dg.pdf Reference: AWS S3: Object Keys

下面是一些示例代码,演示如何获取桶名和对象键。

例子:

import boto3
from pprint import pprint

def main():

    def enumerate_s3():
        s3 = boto3.resource('s3')
        for bucket in s3.buckets.all():
             print("Name: {}".format(bucket.name))
             print("Creation Date: {}".format(bucket.creation_date))
             for object in bucket.objects.all():
                 print("Object: {}".format(object))
                 print("Object bucket_name: {}".format(object.bucket_name))
                 print("Object key: {}".format(object.key))

    enumerate_s3()


if __name__ == '__main__':
    main()