Getting a report of all objects within a bucket¶
Feature overview¶
🚀 Method | bucket_objects_report() |
📄 Description | A method that enables users to get a pandas DataFrame with a friendly report of all objects within a bucket (and optionally within a bucket prefix) |
📦 Acessible from | cloudgeass.aws.s3.S3Client |
Feature demo¶
In this demo, we will get a pandas DataFrame with useful details about all objects within a bucket called datadelivery-sor-data-282495905450-us-east-1
(adding a filter prefix to get only objects inside br_ecommerce/
).
# Importing the client class
from cloudgeass.aws.s3 import S3Client
# Creating a class instance
s3 = S3Client()
# Getting a report of objects within a bucket
df_objects_report = s3.bucket_objects_report(
bucket_name="datadelivery-sor-data-282495905450-us-east-1",
prefix="br_ecommerce"
)
# Checking the result
df_objects_report.head()
Additional comments¶
This method uses a source boto3 s3 client list_objects_v2() with a bucket name and a prefix parameteres to filter the data analyzed.
By using the bucket_objects_report()
method, cloudgeass users can have in hands a pandas DataFrame in a friendly format that enables the extraction of many insights, such as:
- The size of the largest objects in the bucket
- The name and the file format of all objects stored in the given bucket
- The storage class of all objects (optimization opportunities)
- and many others