DEV Community

Renaldi for AWS Community Builders

Leveraging Amazon SageMaker for Image Classification: A Hands-On Guide with CIFAR-10 Dataset

Our friends, the Artificial Intelligence

Motivation

Today, we are diving into a hands-on project: using Amazon SageMaker to train and deploy an image classification model on the CIFAR-10 dataset, a staple of the computer vision community for benchmarking new algorithms. The project should be useful to anyone interested in deep learning and computer vision. SageMaker provides a streamlined way to build, train, and deploy the model, and AWS handles the heavy lifting of infrastructure management so that developers and data scientists can focus on training a robust model. Using one of SageMaker's built-in algorithms also keeps the project accessible to people with limited machine learning experience. It leaves plenty of room for learning and experimentation: you can change variables such as the number of layers in the neural network or the instance type used for training, observe how the results change, and perhaps gain insights worth sharing with the wider community.

Beyond being a learning exercise, this project is a starting point for real-world solutions. Image recognition systems are used in autonomous vehicles, healthcare, security, and e-commerce, to name a few domains. By walking through a start-to-finish pipeline, from setting up an environment to making predictions on new data, the project demystifies much of the complexity of deploying machine learning models. It also lays a foundation for more complex systems: developers can extend this baseline by adding more advanced features, experimenting with different algorithms, or training on larger datasets.

CIFAR-10 is well known in the machine learning community. It contains 60,000 32x32 color images spread evenly across 10 categories (50,000 for training and 10,000 for testing), providing a diverse base for training our model.

So without further ado, let’s delve right into it!


What's Amazon SageMaker?

Amazon SageMaker is a fully managed machine learning (ML) service provided by Amazon Web Services (AWS), designed to enable developers and data scientists to build, train, and deploy machine learning models quickly and at scale. It abstracts and automates the complex processes usually encountered in the machine learning pipeline, making it easier to develop high-quality models.


Technical Requirements

  1. Amazon SageMaker: This fully managed service is central to the project as it facilitates the building, training, and deployment of the machine learning model.

  2. AWS Identity and Access Management (IAM): It is essential for setting up the necessary permissions that allow SageMaker to interact with other AWS services.

  3. Amazon S3: This service will be utilized for storing the dataset and the output results from the SageMaker training jobs.

  4. Python Environment: You will need a Python environment to run your Jupyter Notebook, inclusive of Python libraries such as MXNet, OpenCV, Boto3, and Torchvision which are required for various tasks including working with images and interacting with AWS services.

  5. AWS SDK for Python (Boto3): This SDK is essential for Python developers to write software that uses services like Amazon S3 and Amazon EC2.

  6. Amazon EC2: The instances from this service are used both in the training and deployment stages; specific instances used in the script are ml.p2.xlarge for training and ml.m4.xlarge for deployment.

  7. MXNet and Torchvision: These frameworks are required for working with neural networks and for loading and preparing the dataset respectively.

  8. OpenCV: This library is used for working with images, including reading and resizing images to the correct format.

  9. SageMaker Runtime: This API, reached through the boto3 'runtime.sagemaker' client, is used to invoke the deployed endpoint and make predictions on new data.

  10. CIFAR-10 Dataset: This dataset is integral to the project, offering a standardized collection of images for training and validating the image classification model.

  11. Jupyter Notebook: You would ideally be running this script in a Jupyter Notebook hosted on an Amazon SageMaker Notebook Instance.


Building, Training, and Validation

Model Selection
In our project, we utilized SageMaker's built-in image classification algorithm. However, SageMaker supports various algorithms and frameworks, empowering you to choose the one that suits your task best.

Hyperparameter Tuning
Set the necessary hyperparameters for the algorithm. In our script, the number of layers was set to 18, and the model was trained for 10 epochs. Fine-tuning these hyperparameters can help in achieving better accuracy.

Training
Initiate the training process by defining the training inputs and fitting the model using the SageMaker Estimator. The model leverages the training and validation datasets to learn the underlying patterns and optimize its parameters.

During training, SageMaker manages the underlying infrastructure: it provisions the requested instances, runs the training container, and releases the instances when the job finishes, so you pay only for the time the job runs.

Validation
It is essential to validate your model on data it was not trained on, to check that it generalizes well to new, unseen images. In our script, the CIFAR-10 test split (10,000 images) is passed to the training job as the validation channel, so validation accuracy is reported during training.


Time to Code it Up!

Step 1: Setting Up
Ensure you have an AWS account with SageMaker and S3 services enabled. Spin up a SageMaker notebook and get ready to input code. Install the necessary Python packages in your Jupyter notebook on SageMaker by running:

!pip install mxnet opencv-python boto3 torchvision

Next, we import all the dependencies used in this walkthrough.

import io
import os

import boto3
import mxnet as mx
import numpy as np
import sagemaker
from mxnet import recordio
from PIL import Image
from sagemaker import get_execution_role
from sagemaker.amazon.amazon_estimator import get_image_uri
from sagemaker.inputs import TrainingInput
# torchvision supplies the CIFAR-10 dataset and the transforms used below
from torchvision import datasets, transforms

Firstly, the get_execution_role() function is invoked to retrieve the AWS Identity and Access Management (IAM) role associated with the current SageMaker notebook instance, establishing the permissions that SageMaker has for accessing other AWS services. Next, a SageMaker session is initiated using sagemaker.Session(), which will store the configuration state and facilitate interactions with SageMaker services, embodying the SageMaker environment where operations such as training and deployment will occur. Lastly, the AWS region connected to the current session is determined through boto3.Session().region_name, designating the geographical AWS region where the SageMaker resources will be allocated.

role = get_execution_role()
sess = sagemaker.Session()
region = boto3.Session().region_name

Step 2: Fetch and Convert the CIFAR-10 Dataset
Fetch the CIFAR-10 dataset and convert it to the RecordIO format, using torchvision to download the data and mxnet to write the records. The built-in image classification algorithm accepts RecordIO input (content type application/x-recordio), which packs many small images into a single file that is efficient to store and stream.

We can use this script to download and convert the CIFAR-10 data and save it on the notebook instance.

def write_recordio(dataset, rec_path):
    """Pack every (image, label) pair of a torchvision dataset into one RecordIO file."""
    record = recordio.MXRecordIO(rec_path, 'w')
    for i in range(len(dataset)):
        data, label = dataset[i]
        # ToTensor() yields float CHW values in [0, 1]; convert to uint8 HWC in [0, 255]
        data = np.rint(data.numpy() * 255).astype(np.uint8).transpose(1, 2, 0)
        # pack_img encodes through OpenCV, which assumes BGR channel order,
        # so reverse the RGB channels that torchvision returns
        data = np.ascontiguousarray(data[:, :, ::-1])
        header = recordio.IRHeader(0, label, i, 0)
        record.write(recordio.pack_img(header, data, quality=100))
    record.close()

def fetch_and_convert_to_recordio():
    training_dataset = datasets.CIFAR10(root='data', train=True, transform=transforms.ToTensor(), download=True)
    # The test split ships in the same archive, so no second download is needed
    validation_dataset = datasets.CIFAR10(root='data', train=False, transform=transforms.ToTensor())

    os.makedirs('data/train', exist_ok=True)
    os.makedirs('data/validation', exist_ok=True)

    write_recordio(training_dataset, 'data/train/train.rec')
    write_recordio(validation_dataset, 'data/validation/validation.rec')

fetch_and_convert_to_recordio()

Next, we upload the data to Amazon S3. We do not specify a bucket here, so SageMaker uploads to the session's default bucket.

# Upload the data to S3
training_data = sess.upload_data(path='data/train', key_prefix='data/cifar10/train')
validation_data = sess.upload_data(path='data/validation', key_prefix='data/cifar10/validation')
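
For reference, upload_data returns the S3 URI of the uploaded prefix, which is what the training step consumes later. The sketch below shows how such a URI is composed, assuming the default bucket follows the sagemaker-{region}-{account_id} naming pattern. The region and account ID are placeholders, and both helper functions are hypothetical illustrations rather than part of the SDK.

```python
def default_bucket_name(region: str, account_id: str) -> str:
    """Name of the session's default bucket (assumed naming pattern)."""
    return f"sagemaker-{region}-{account_id}"


def build_s3_uri(bucket: str, key_prefix: str) -> str:
    """Compose the URI of an S3 prefix, as returned for a directory upload."""
    return f"s3://{bucket}/{key_prefix.strip('/')}"


# Placeholder region and account ID, for illustration only
bucket = default_bucket_name("us-east-1", "123456789012")
print(build_s3_uri(bucket, "data/cifar10/train"))
print(build_s3_uri(bucket, "data/cifar10/validation"))
```

In the notebook, printing training_data and validation_data should show URIs of this shape for your own account and region.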

Step 3: Training the Model
We use the built-in image classification algorithm of SageMaker for this project. Set the hyperparameters like the number of layers, image shape, and epochs as needed for your task.

Next, we use the get_image_uri function to retrieve the URI (Uniform Resource Identifier) of the pre-built Docker image that contains the SageMaker image classification algorithm. The first argument, region, specifies the AWS region of the Amazon Elastic Container Registry (ECR) that hosts the image. The second argument, 'image-classification', selects SageMaker's built-in image classification algorithm, and repo_version='latest' requests the latest version of that image. The URI is stored in the training_image variable, which is used to configure the training job. The function is part of the SageMaker Python SDK; note that version 2 of the SDK deprecates it in favor of sagemaker.image_uris.retrieve, so you may see a deprecation warning.

training_image = get_image_uri(region, 'image-classification', repo_version="latest")

Next, we create the estimator with the sagemaker.estimator.Estimator class, which configures and launches training jobs in SageMaker. The first argument, training_image, is the URI of the Docker image that contains the training algorithm (here, image classification). The role argument is the IAM role that grants SageMaker permission to access resources in your AWS account. The instance_count and instance_type parameters specify that the training job runs on one ml.p2.xlarge instance. Lastly, sagemaker_session passes in the session object, sess, which holds the session's configuration and credentials. The resulting image_classifier object is then used to set hyperparameters, fit the model to the training data, and deploy the model to an endpoint.

image_classifier = sagemaker.estimator.Estimator(training_image,
                                                 role,
                                                 instance_count=1,
                                                 instance_type='ml.p2.xlarge',
                                                 sagemaker_session=sess)

Next, we call the set_hyperparameters method on the image_classifier estimator to specify the hyperparameters for the training job. num_layers is set to 18, the number of layers in the neural network. image_shape is set to '3,32,32', meaning 3 color channels and 32x32 pixels. num_classes is set to 10, the number of categories the model should recognize. num_training_samples is set to 50,000, the number of images in the training set. Finally, epochs is set to 10, the number of passes the algorithm makes over the entire training dataset. These settings shape the training process and affect the accuracy of the final model.

image_classifier.set_hyperparameters(num_layers=18,
                                     image_shape='3,32,32',
                                     num_classes=10,
                                     num_training_samples=50000,
                                     epochs=10)
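
To get a feel for what these settings imply, the sketch below estimates how many mini-batch updates the job performs. It assumes a mini_batch_size of 32, which we believe is the algorithm's default; the tutorial does not set that hyperparameter, and the sketch counts a final partial batch as a full step, so treat the result as a rough illustration.

```python
import math


def training_steps(num_training_samples: int, epochs: int, mini_batch_size: int = 32) -> int:
    """Rough number of mini-batch updates over the whole training job."""
    # Assumption: a trailing partial batch still counts as one step
    steps_per_epoch = math.ceil(num_training_samples / mini_batch_size)
    return steps_per_epoch * epochs


# Values from the tutorial: 50,000 training images, 10 epochs
print(training_steps(50000, 10))  # 15630
```

Doubling the batch size roughly halves the number of updates, which is one reason batch size and learning rate are usually adjusted together.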

We then create two TrainingInput objects to specify the configurations for the training and validation datasets that are to be used in a SageMaker training job. The training_data and validation_data parameters are S3 URIs pointing to the respective datasets. The distribution parameter is set to 'FullyReplicated', which means each training instance will receive the entire dataset. The content_type parameter is set to 'application/x-recordio', indicating the MIME type of the data. The s3_data_type parameter is set to 'S3Prefix', indicating that the S3 URIs point to prefixes in S3. After specifying the training and validation data inputs, a dictionary named data_channels is created to hold them, using the keys 'train' and 'validation'. Finally, the fit method is called on the image_classifier object (an instance of a SageMaker Estimator) with the data_channels dictionary passed as the inputs argument to initiate the training job, and logs parameter set to True to enable logging. This method starts the model training process with the defined data channels and hyperparameters, with real-time logs providing insights into the training process.

# Describe the two input channels (each is a RecordIO file under an S3 prefix)
train_input = TrainingInput(training_data, distribution='FullyReplicated',
                            content_type='application/x-recordio', s3_data_type='S3Prefix')
validation_input = TrainingInput(validation_data, distribution='FullyReplicated',
                                 content_type='application/x-recordio', s3_data_type='S3Prefix')

data_channels = {'train': train_input, 'validation': validation_input}

# Start the training job and stream its logs to the notebook
image_classifier.fit(inputs=data_channels, logs=True)

Step 4: Deploying the Model
Once the model is trained, deploy it on a SageMaker endpoint. You can use various instance types depending on your preferences and requirements.

predictor = image_classifier.deploy(instance_type='ml.m4.xlarge', initial_instance_count=1)

Step 5: Making Predictions
After deploying the model, it's time to make predictions. Make sure the endpoint is up, then send it an input image and receive the prediction as a list of class probabilities.
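
One way to check that the endpoint is up is to poll its status through the SageMaker API. The sketch below is a minimal helper under stated assumptions: sm_client is expected to be a boto3 'sagemaker' client, and the function name, polling interval, and retry limit are hypothetical choices. The deploy call normally waits for the endpoint on its own, so this is mainly useful when you reconnect to an endpoint from a fresh session.

```python
import time


def wait_until_in_service(sm_client, endpoint_name, poll_seconds=30, max_polls=40):
    """Poll the endpoint status; return True once it reports InService."""
    for _ in range(max_polls):
        description = sm_client.describe_endpoint(EndpointName=endpoint_name)
        status = description["EndpointStatus"]
        if status == "InService":
            return True
        if status == "Failed":
            return False
        time.sleep(poll_seconds)
    # Gave up before the endpoint became available
    return False


# Usage in the notebook (requires AWS credentials):
# import boto3
# sm_client = boto3.client("sagemaker")
# wait_until_in_service(sm_client, predictor.endpoint_name)
```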

An image of a dog is loaded from the specified file path with cv2.imread() and resized to 32x32 pixels to match the input size the model was trained on. The image is then JPEG-encoded into a byte array, a format suitable for sending over a network. Next, a boto3 SageMaker Runtime client is created to invoke the endpoint that hosts the deployed image classification model. The encoded image is sent as a binary payload (content type 'application/x-image'), and the model returns its prediction in the response. The endpoint is identified by its name, which SageMaker generated with a timestamp when the endpoint was created; replace it with the name of your own endpoint. In short, this code tests the endpoint by sending it a single image and waiting for the classification result.

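import cv2
import boto3
import ast
import numpy as np

# Load the image (cv2.imread returns None if the file cannot be read)
img = cv2.imread('data/images/dog.jpg')
if img is None:
    raise FileNotFoundError('Could not read data/images/dog.jpg')
img = cv2.resize(img, (32, 32))  # Resize to the 32x32 input size used in training

# Encode the image as JPEG bytes
_, img_encoded = cv2.imencode('.jpg', img)
img_bytes = img_encoded.tobytes()  # tostring() is a deprecated alias of tobytes()

# Send the request with the SageMaker Runtime client
client = boto3.client('runtime.sagemaker')
response = client.invoke_endpoint(
    EndpointName='image-classification-2023-08-31-13-46-20-134',  # replace with your endpoint name
    Body=img_bytes,
    ContentType='application/x-image',
)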

Finally, we define the CIFAR-10 classes and use the response from the endpoint to predict which class the image falls under.

import json
import numpy as np

# Define the CIFAR-10 classes, in label order
classes = ['Airplane', 'Automobile', 'Bird', 'Cat', 'Deer', 'Dog', 'Frog', 'Horse', 'Ship', 'Truck']

# response is the result of invoke_endpoint; its body is a JSON list of class probabilities
response_body = response['Body'].read()
print(response_body)
probabilities = json.loads(response_body.decode('utf-8'))

# Get the index with the maximum probability
predicted_class_index = int(np.argmax(probabilities))

# Get the name of the predicted class
predicted_class_name = classes[predicted_class_index]

# Print the predicted class
print(f"The model predicts the image belongs to class: {predicted_class_name} ({predicted_class_index})")
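
If you plan to classify more than one image, it can help to wrap the request and the parsing into a function and to look at the runner-up classes as well. The sketch below is one possible shape for that, not the notebook's own code: classify_image and top_k are hypothetical helper names, and runtime_client is assumed to be a boto3 'runtime.sagemaker' client like the one created above.

```python
import json

CLASSES = ['Airplane', 'Automobile', 'Bird', 'Cat', 'Deer',
           'Dog', 'Frog', 'Horse', 'Ship', 'Truck']


def top_k(probabilities, class_names, k=3):
    """Return the k most likely (class name, probability) pairs, best first."""
    ranked = sorted(zip(class_names, probabilities),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def classify_image(runtime_client, endpoint_name, image_bytes, k=3):
    """Send one encoded image to the endpoint and rank the returned classes."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType='application/x-image',
        Body=image_bytes,
    )
    # The endpoint is assumed to answer with a JSON list of probabilities
    probabilities = json.loads(response['Body'].read())
    return top_k(probabilities, CLASSES, k)


# Usage in the notebook (requires the deployed endpoint):
# for name, p in classify_image(client, 'your-endpoint-name', img_bytes):
#     print(f"{name}: {p:.3f}")
```

Looking at the top few classes rather than only the maximum makes it easier to spot typical confusions, such as cats being mistaken for dogs.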

Step 6: Cleanup
Remember to delete the endpoint and the SageMaker notebook after your experiments to avoid unnecessary charges.
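
A minimal sketch of the endpoint cleanup is shown below. It assumes sm_client is a boto3 'sagemaker' client; delete_endpoint_resources is a hypothetical helper name. If the predictor object from Step 4 is still available in your notebook, calling predictor.delete_endpoint() is a shorter route. The notebook instance itself still has to be stopped or deleted separately, for example from the SageMaker console.

```python
def delete_endpoint_resources(sm_client, endpoint_name):
    """Delete an endpoint together with its endpoint configuration."""
    # Look up the configuration name before the endpoint disappears
    description = sm_client.describe_endpoint(EndpointName=endpoint_name)
    config_name = description["EndpointConfigName"]

    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)


# Usage in the notebook (requires AWS credentials):
# import boto3
# delete_endpoint_resources(boto3.client("sagemaker"), "your-endpoint-name")
```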

Comparing with Other Algorithms

The built-in image classification algorithm, which is based on ResNet, offers a quick and easy way to set up a classification model. Other architectures, such as VGG, deeper ResNet variants, or custom CNNs, might offer better performance, albeit potentially at a higher computational cost. You might explore these to attain higher accuracy.


Future Directions

Looking ahead, there are numerous pathways to enhance this project:

Hyperparameter Tuning: Leveraging SageMaker’s hyperparameter tuning functionality can assist in finding the optimal set of hyperparameters, enhancing model performance.
Transfer Learning: You might consider using transfer learning to leverage pre-trained models and fine-tune them for your specific task.
Enhanced Preprocessing: Augment the data preprocessing step with more sophisticated techniques to potentially improve the model’s generalization.
Deployment Options: Explore different deployment options such as deploying your model as a microservice using AWS Lambda and API Gateway for a more scalable and maintainable solution.
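
As a starting point for the hyperparameter tuning idea, the sketch below wires the estimator from Step 3 into the SDK's HyperparameterTuner. The search ranges and job counts are illustrative assumptions rather than tuned recommendations, build_tuner is a hypothetical helper name, and the objective metric 'validation:accuracy' is the one we understand the built-in algorithm to report.

```python
# Candidate search space; the bounds are illustrative assumptions
SEARCH_SPACE = {
    "learning_rate": (0.0001, 0.1),
    "mini_batch_size": (32, 128),
}


def build_tuner(estimator, max_jobs=6, max_parallel_jobs=2):
    """Create a tuner that maximizes validation accuracy for the estimator."""
    # Imported here so this file can be loaded without the SageMaker SDK
    from sagemaker.tuner import (ContinuousParameter, HyperparameterTuner,
                                 IntegerParameter)

    ranges = {
        "learning_rate": ContinuousParameter(*SEARCH_SPACE["learning_rate"]),
        "mini_batch_size": IntegerParameter(*SEARCH_SPACE["mini_batch_size"]),
    }
    return HyperparameterTuner(
        estimator=estimator,
        objective_metric_name="validation:accuracy",
        hyperparameter_ranges=ranges,
        objective_type="Maximize",
        max_jobs=max_jobs,
        max_parallel_jobs=max_parallel_jobs,
    )


# Usage in the notebook (each tuning job is a billed training job):
# tuner = build_tuner(image_classifier)
# tuner.fit(inputs=data_channels)
```

Keeping max_jobs small limits cost while you check that the setup works; the ranges can be widened once the first results look sensible.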


Conclusion

In conclusion, this project showcases the ease of use and powerful functionalities of Amazon SageMaker in building and deploying machine learning models. It provides a fantastic starting point for developers looking to dive into machine learning on AWS.

We encourage developers to take this project further by experimenting with different algorithms, optimizations, and deployment strategies. Let’s continue pushing the boundaries of what we can achieve with AWS!

Feel free to share your experiences, insights, or ask questions in the comment section below.


The Code

For the code, you can find the Jupyter Notebook at https://github.com/renaldig/image-recognition/blob/master/Image%20Recognition%20Notebook.ipynb.
