
Amazon SageMaker - Training ML Models
You can train machine learning models easily using Amazon SageMaker's fully managed training service.
To train an ML model, you can either use one of Amazon SageMaker's built-in algorithms or bring your own model. In both cases, Amazon SageMaker lets you run training jobs efficiently on managed infrastructure.
How to Train Models Using Amazon SageMaker?
Let's understand how to train a model using Amazon SageMaker with the help of the following Python program −
Step 1: Prepare Your Data
First, prepare your data and store it in Amazon S3 in CSV format or any other format the algorithm supports. Amazon SageMaker reads training data directly from S3.
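As a minimal sketch of this step: SageMaker's built-in XGBoost algorithm expects CSV training data with the label in the first column and no header row. The feature values below are made-up illustration data, and the upload call at the end is commented out because it needs live AWS credentials ("your-bucket" is a placeholder).

```python
import csv

# Built-in XGBoost expects: label first, no header row.
# These rows are purely illustrative.
rows = [
    [1, 0.5, 2.3, 1.1],  # label, feature1, feature2, feature3
    [0, 1.2, 0.7, 3.4],
    [1, 0.9, 1.8, 0.2],
]

with open("train.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)

# Upload to S3 so a training job can read it (requires AWS credentials):
# import sagemaker
# s3_uri = sagemaker.Session().upload_data(
#     "train.csv", bucket="your-bucket", key_prefix="train"
# )
```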
Step 2: Define the Estimator
Now, you need to define the estimator. The Estimator object configures the training job. For this example, we'll train a model using the built-in XGBoost algorithm as follows −
```python
import sagemaker
from sagemaker import get_execution_role
from sagemaker.inputs import TrainingInput

# Define your Amazon SageMaker session and role
session = sagemaker.Session()
role = get_execution_role()

# Define the XGBoost estimator
xgboost = sagemaker.estimator.Estimator(
    image_uri=sagemaker.image_uris.retrieve(
        "xgboost", session.boto_region_name, version="1.5-1"
    ),
    role=role,
    instance_count=1,
    instance_type="ml.m4.xlarge",
    output_path="s3://your-bucket/output",
    sagemaker_session=session,
)

# Set hyperparameters
xgboost.set_hyperparameters(objective="binary:logistic", num_round=100)
```
Step 3: Specify Training Data
Next, specify the training data. Use the TrainingInput class to point the training job at your data in S3 as follows −
```python
# Specify training data in S3
train_input = TrainingInput(s3_data="s3://your-bucket/train", content_type="csv")
validation_input = TrainingInput(s3_data="s3://your-bucket/validation", content_type="csv")
```
Step 4: Train the Model
Finally, start the training job by calling the fit method as follows −
```python
# Train the model
xgboost.fit({"train": train_input, "validation": validation_input})
```
Once you call fit, Amazon SageMaker automatically provisions the resources, runs the training job, and saves the model output to the specified S3 location.
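To see where that output lands: SageMaker writes the trained model under the configured output_path as `<job-name>/output/model.tar.gz`. The sketch below just builds that path as a string (the job name shown is hypothetical); with a live estimator you can read the same location from the `model_data` attribute instead.

```python
# The estimator's output_path from the example above
output_path = "s3://your-bucket/output"

# Hypothetical auto-generated training job name
job_name = "sagemaker-xgboost-2024-01-01-00-00-00-000"

# SageMaker stores the trained model artifact at:
model_artifact = f"{output_path}/{job_name}/output/model.tar.gz"
print(model_artifact)

# With a real, fitted estimator you can read it directly:
# model_artifact = xgboost.model_data
```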
Distributed Training with Amazon SageMaker
Amazon SageMaker supports distributed training, which lets you scale a training job across multiple instances. This is useful when you are working with large datasets or deep learning models. Amazon SageMaker provides framework containers, such as TensorFlow and PyTorch, that support distributed training.
To enable distributed training, you can increase the instance_count parameter in the Estimator object.
Example
Given below is an example using TensorFlow −
```python
from sagemaker.tensorflow import TensorFlow

# Define the TensorFlow estimator with distributed training
tensorflow_estimator = TensorFlow(
    entry_point="train.py",
    role=role,
    instance_count=2,
    instance_type="ml.p3.2xlarge",
    framework_version="2.3",
    py_version="py37",
)

# Train the model on multiple instances
tensorflow_estimator.fit({"train": train_input, "validation": validation_input})
```
In this example, Amazon SageMaker runs the training job on two ml.p3.2xlarge instances, which can reduce training time for large models.
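Beyond raising instance_count, framework estimators also accept a distribution argument that selects a distributed-training strategy. The dictionaries below are a sketch of two configurations the SageMaker Python SDK supports; actually passing them to an estimator requires the sagemaker SDK and AWS credentials, so here they are shown only as plain dicts.

```python
# Classic parameter-server strategy for TensorFlow estimators
parameter_server = {"parameter_server": {"enabled": True}}

# SageMaker's data-parallel library (supported GPU instance types only)
data_parallel = {"smdistributed": {"dataparallel": {"enabled": True}}}

# Usage sketch (requires the sagemaker SDK):
# TensorFlow(..., instance_count=2, distribution=parameter_server)
```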