This article was published as a part of the Data Science Blogathon.
Introduction
AWS S3 is an object storage service that lets users store and retrieve files quickly and securely from anywhere. Users can combine S3 with other services to build a number of applications. Boto3 is the AWS SDK for Python. It helps developers create, configure, and manage AWS services, including S3, making it easy to integrate them with Python applications, libraries, or scripts. This article explains how boto3 works and how it helps with S3 operations such as creating, listing, and deleting buckets and objects.
What is boto3?
Boto3 is a Python SDK (library) that allows you to manage and access various AWS services such as Amazon S3, EC2, DynamoDB, SQS, and CloudWatch from Python scripts. Boto3 takes a data-driven approach: its classes are generated at runtime from JSON description files that are shared between the AWS SDKs. Because of this, users get fast updates for the latest services and consistent APIs across services. Boto3 provides both an object-oriented, easy-to-use API and direct low-level service access.
Main features of boto3
- It’s built on top of botocore, a Python library used to send API requests to AWS and receive responses from services.
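- Runs on Python 3. Early releases also supported Python 2.7 and 3.4, but those versions have since been dropped, so check the current requirements before installing.
- Provides sessions with per-session credentials and configuration, and takes care of essentials such as authentication, parameter handling, and response handling.
- Has a consistent and modern interface.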
Using AWS S3 and Boto3
Source: https://dashbird.io/blog/boto3-aws-python/
Using Boto3 with Amazon S3 lets users create, delete, and update S3 buckets, objects, bucket policies, and more from Python programs or scripts. Boto3 has two abstractions: clients and resources. A user can choose the client abstraction when working with a single S3 file, or the resource abstraction when working with multiple S3 buckets. A client provides a low-level interface to AWS services, while a resource is a higher-level abstraction built on top of the client.
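To make the difference between the two abstractions concrete, here is a minimal sketch that lists bucket names both ways. The helper names are illustrative and not part of boto3. Each function takes an already-built client or resource, so nothing talks to AWS until you pass one in.

```python
def bucket_names_via_client(s3_client):
    """Low-level: call the ListBuckets API and unpack the response dict."""
    response = s3_client.list_buckets()
    return [bucket["Name"] for bucket in response["Buckets"]]


def bucket_names_via_resource(s3_resource):
    """Higher-level: iterate over the Bucket objects exposed by the resource."""
    return [bucket.name for bucket in s3_resource.buckets.all()]


# Typical usage (requires boto3 and configured AWS credentials):
#   import boto3
#   print(bucket_names_via_client(boto3.client("s3")))
#   print(bucket_names_via_resource(boto3.resource("s3")))
```

The client version works with plain dictionaries that mirror the API response, while the resource version hands back objects with attributes such as `name`.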
Install boto3 and build AWS S3 client
Install boto3 in your application:
In the terminal, run:
pip list
The above command lists the installed packages. If Boto3 is not installed, install it with the following command.
pip install boto3
Build an S3 client that accesses the service methods.
Create an S3 client to access the objects stored in your S3 environment. To access your S3 bucket and run the following code, you need credentials: an access key (aws_access_key_id) and a secret key (aws_secret_access_key).
# Import the necessary packages
import boto3

# Now, build a client (named s3, as used in the rest of the article)
s3 = boto3.client(
    "s3",
    aws_access_key_id="enter your_aws_access_key_id",
    aws_secret_access_key="enter your_aws_secret_access_key",
    region_name="enter your_aws_region_name",
)
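Keys typed directly into source code are easy to leak. As an alternative, boto3 can find credentials on its own through environment variables, the shared credentials file, or an IAM role. The sketch below relies on that behavior. The helper name `make_s3_client` and the profile name "dev" are assumptions for illustration only.

```python
def make_s3_client(region_name=None, profile_name=None):
    """Build an S3 client without embedding keys in the source code.

    With no explicit keys, boto3 searches its default credential chain
    (environment variables, shared credentials file, IAM role, and so on).
    """
    # Imported inside the function so this sketch can be loaded even on a
    # machine where boto3 is not installed yet.
    import boto3

    session = boto3.Session(
        profile_name=profile_name,  # None means the default profile
        region_name=region_name,    # None means the configured default region
    )
    return session.client("s3")


# Typical usage, assuming a profile called "dev" exists in ~/.aws/credentials:
#   s3 = make_s3_client(region_name="us-east-1", profile_name="dev")
```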
AWS S3 operations using boto3
Create Bucket:
To create an S3 bucket, use the create_bucket() method with the Bucket and ACL parameters. ACL stands for Access Control List, which manages access to S3 buckets and objects. Note that bucket names must be globally unique, not just unique within your own AWS account.
my_bucket = "enter your s3 bucket name that has to be created"
bucket = s3.create_bucket(
    ACL="private",
    Bucket=my_bucket,
)
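The call above works as written when the client targets us-east-1. For any other region, S3 expects the region to be named in a CreateBucketConfiguration argument. A minimal sketch, assuming the client was created for the same region that is passed in (the helper name is illustrative):

```python
def create_bucket_in_region(s3_client, bucket_name, region):
    """Create a private bucket, adding a LocationConstraint when needed."""
    params = {"Bucket": bucket_name, "ACL": "private"}
    # us-east-1 is the default location and must not be sent as a
    # LocationConstraint; every other region has to be named explicitly.
    if region != "us-east-1":
        params["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return s3_client.create_bucket(**params)


# Typical usage (requires boto3 and configured AWS credentials):
#   import boto3
#   s3 = boto3.client("s3", region_name="eu-west-1")
#   create_bucket_in_region(s3, "my-example-bucket", "eu-west-1")
```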
List Buckets:
Use the list_buckets() method to list all available buckets.
bucket_response = s3.list_buckets()

# Output the bucket names
print("Existing buckets are:")
for bucket in bucket_response["Buckets"]:
    print(f'  {bucket["Name"]}')
Delete Bucket:
Buckets in S3 can be deleted using the delete_bucket() method. The bucket must be empty, that is, it must not contain any objects, before it can be deleted.
my_bucket = "enter your s3 bucket name that has to be deleted"
response = s3.delete_bucket(Bucket=my_bucket)
print("Bucket has been deleted successfully !!!")
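Because a bucket has to be empty before it can be deleted, a common pattern is to remove its objects first. The following is a sketch of that idea, not a complete solution: the helper name is illustrative, and it does not handle versioned buckets, where old versions and delete markers remain.

```python
def empty_and_delete_bucket(s3_client, bucket_name):
    """Delete every object in the bucket, then delete the bucket itself.

    Returns the number of objects that were removed.
    """
    deleted = 0
    while True:
        # Each call returns at most 1,000 keys, which is also the
        # largest batch that delete_objects accepts.
        page = s3_client.list_objects_v2(Bucket=bucket_name)
        contents = page.get("Contents", [])
        if not contents:
            break
        result = s3_client.delete_objects(
            Bucket=bucket_name,
            Delete={"Objects": [{"Key": obj["Key"]} for obj in contents]},
        )
        # Stop instead of looping forever if some keys could not be deleted.
        if result.get("Errors"):
            raise RuntimeError(f"Could not delete: {result['Errors']}")
        deleted += len(contents)
    s3_client.delete_bucket(Bucket=bucket_name)
    return deleted
```

This permanently removes data, so it is worth trying on a throwaway bucket first.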
List files from a bucket:
Files or objects in an S3 bucket can be listed using the list_objects() or list_objects_v2() methods. AWS recommends the newer list_objects_v2() for new code.
my_bucket = "enter your s3 bucket name from which objects or files has to be listed out"
response = s3.list_objects(
    Bucket=my_bucket,
    MaxKeys=10,
    Prefix="only_files_starting_with_this_string",
)
The MaxKeys argument sets the maximum number of objects to return. The Prefix argument restricts the listing to objects whose keys (names) start with the given prefix.
Another way to list objects:
s3 = boto3.client("s3")
my_bucket = "enter your s3 bucket name from which objects or files has to be listed out"
response = s3.list_objects_v2(Bucket=my_bucket)
# "Contents" is missing when the bucket is empty, so fall back to an empty list
files = response.get("Contents", [])
for file in files:
    print(f"file_name: {file['Key']}, size: {file['Size']}")
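A single listing call returns at most 1,000 objects, so larger buckets need more than one request. One way to handle this is the paginator that the boto3 client provides. A minimal sketch (the helper name is illustrative):

```python
def iter_object_keys(s3_client, bucket_name, prefix=""):
    """Yield every key under `prefix`, following pagination automatically."""
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket_name, Prefix=prefix):
        # "Contents" is missing from a page when nothing matches.
        for obj in page.get("Contents", []):
            yield obj["Key"]


# Typical usage (requires boto3 and configured AWS credentials):
#   import boto3
#   for key in iter_object_keys(boto3.client("s3"), "my-example-bucket", "logs/"):
#       print(key)
```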
File upload:
To upload a file to an S3 bucket, use the upload_file() method with the following parameters.
- Filename: The path of the local file to upload
- Key: The unique identifier (name) of the object within the bucket
- Bucket: The name of the bucket to upload the file to
my_bucket = "enter your bucket name to which files has to be uploaded"
file_name = "enter your file path name to be uploaded"
key_name = "enter unique identifier"
s3.upload_file(Filename=file_name, Bucket=my_bucket, Key=key_name)
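upload_file() also accepts an optional ExtraArgs dictionary for object settings such as the content type. The sketch below guesses the content type from the file extension using Python's standard mimetypes module. The helper name is an assumption for illustration.

```python
import mimetypes


def upload_with_content_type(s3_client, file_path, bucket_name, key_name):
    """Upload a file and set its Content-Type based on the file extension."""
    content_type, _ = mimetypes.guess_type(file_path)
    # Only pass ExtraArgs when a type could actually be guessed.
    extra_args = {"ContentType": content_type} if content_type else None
    s3_client.upload_file(
        Filename=file_path,
        Bucket=bucket_name,
        Key=key_name,
        ExtraArgs=extra_args,
    )
    return content_type
```

Setting the content type matters mostly when objects are served to browsers, which use it to decide how to display a file.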
File download:
To download a file or object from a bucket to your local machine, use the download_file() method with the Key, Bucket, and Filename parameters. Here, Filename is the local path where the downloaded object is saved.
my_bucket = "enter your s3 bucket name from which object or files has to be downloaded"
file_name = "enter the local path where the downloaded file should be saved"
key_name = "enter unique identifier"
s3.download_file(Filename=file_name, Bucket=my_bucket, Key=key_name)
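download_file() writes to a local path, so the target directory has to exist beforehand. A small sketch that creates the directory first and names the local file after the last segment of the key (the helper name is illustrative):

```python
from pathlib import Path


def download_to_dir(s3_client, bucket_name, key_name, dest_dir):
    """Download one object into dest_dir, creating the directory if needed."""
    # Use only the last path segment of the key as the local file name.
    target = Path(dest_dir) / Path(key_name).name
    target.parent.mkdir(parents=True, exist_ok=True)
    s3_client.download_file(
        Bucket=bucket_name,
        Key=key_name,
        Filename=str(target),
    )
    return target
```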
File deletion:
To delete a file or object from a bucket, use the delete_object() method with the Key and Bucket parameters.
my_bucket = "enter your s3 bucket name from which objects or files has to be deleted"
key_name = "enter unique identifier"
s3.delete_object(Bucket=my_bucket, Key=key_name)
Get the metadata of an object:
To get file or object details such as the last modification time, storage class, and content length (size in bytes), use the head_object() method with the Key and Bucket parameters.
my_bucket = "enter your s3 bucket name from which objects or file's metadata has to be obtained"
key_name = "enter unique identifier"
response = s3.head_object(Bucket=my_bucket, Key=key_name)
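The head_object() response is a dictionary. As a sketch of how the details mentioned above might be read from it, the helper below (an illustrative name, not part of boto3) collects a few common fields:

```python
def summarize_object(s3_client, bucket_name, key_name):
    """Return a few commonly used fields from a head_object response."""
    response = s3_client.head_object(Bucket=bucket_name, Key=key_name)
    return {
        "size_bytes": response["ContentLength"],
        "last_modified": response["LastModified"],  # a datetime object
        "content_type": response.get("ContentType"),
        # S3 omits StorageClass for objects stored in the STANDARD class.
        "storage_class": response.get("StorageClass", "STANDARD"),
    }


# Typical usage (requires boto3 and configured AWS credentials):
#   import boto3
#   print(summarize_object(boto3.client("s3"), "my-example-bucket", "docs/notes.txt"))
```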
Conclusion
AWS S3 is one of the most reliable, flexible and durable object storage systems for users to store and retrieve data. AWS provides boto3 as a Python library or SDK (Software Development Kit) to create, manage and configure AWS services including S3. Boto3 lets applications and services work with AWS services programmatically.
Important points:
- AWS S3 is an object storage service that helps you store and retrieve files quickly.
- Boto3 is a Python SDK or library that can manage Amazon S3, EC2, DynamoDB, SQS, CloudWatch, and more.
- The Boto3 client provides a low-level interface to AWS services, while resources are a higher-level abstraction than the client.
- Using Boto3 with Amazon S3 lets users create, list, delete, and update S3 buckets, objects, bucket policies, and more from Python programs or scripts.
Media shown in this article are not owned by Analytics Vidhya and are used at the author’s discretion.