Transition your Amazon Forecast usage to Amazon SageMaker Canvas

Amazon Forecast is a fully managed service that uses statistical and machine learning (ML) algorithms to deliver highly accurate time series forecasts. Launched in August 2019, Forecast predates Amazon SageMaker Canvas, a popular low-code no-code AWS tool for building, customizing, and deploying ML models, including time series forecasting models.

With SageMaker Canvas, you get faster model building, cost-effective predictions, advanced features such as a model leaderboard and algorithm selection, and enhanced transparency. You can also either use the SageMaker Canvas UI, which provides a visual interface for building and deploying models without needing to write any code or have any ML expertise, or use its automated machine learning (AutoML) APIs for programmatic interactions.

In this post, we provide an overview of the benefits SageMaker Canvas offers and details on how Forecast users can transition their use cases to SageMaker Canvas.

Benefits of SageMaker Canvas

Forecast customers have been seeking greater transparency, lower costs, faster training, and enhanced controls for building time series ML models. In response to this feedback, we have made next-generation time series forecasting capabilities available in SageMaker Canvas, which already offers a robust platform for preparing data and building and deploying ML models. With the addition of forecasting, you can now access end-to-end ML capabilities for a broad set of model types—including regression, multi-class classification, computer vision (CV), natural language processing (NLP), and generative artificial intelligence (AI)—within the unified user-friendly platform of SageMaker Canvas.

SageMaker Canvas offers up to 50% faster model building performance and up to 45% quicker predictions on average for time series models compared to Forecast across various benchmark datasets. Generating predictions is significantly more cost-effective than Forecast, because costs are based solely on the Amazon SageMaker compute resources used. SageMaker Canvas also provides excellent model transparency by offering direct access to trained models, which you can deploy at your chosen location, along with numerous model insight reports, including access to validation data, model- and item-level performance metrics, and hyperparameters employed during training.

SageMaker Canvas includes the key capabilities found in Forecast, including the ability to train an ensemble of forecasting models using both statistical and neural network algorithms. It creates the best model for your dataset by generating base models for each algorithm, evaluating their performance, and then combining the top-performing models into an ensemble. This approach leverages the strengths of different models to produce more accurate and robust forecasts. You have the flexibility to select one or several algorithms for model creation, along with the capability to evaluate the impact of model features on prediction accuracy. SageMaker Canvas simplifies your data preparation with automated solutions for filling in missing values, making your forecasting efforts as seamless as possible. It facilitates an out-of-the-box integration of external information, such as country-specific holidays, through simple UI options or API configurations. You can also take advantage of its data flow feature to connect with external data providers’ APIs to import data, such as weather information. Furthermore, you can conduct what-if analyses directly in the SageMaker Canvas UI to explore how various scenarios might affect your outcomes.

We will continue to innovate and deliver cutting-edge, industry-leading forecasting capabilities through SageMaker Canvas by lowering latency, reducing training and prediction costs, and improving accuracy. This includes expanding the range of forecasting algorithms we support and incorporating new advanced algorithms to further enhance the model building and prediction experience.

Transitioning from Forecast to SageMaker Canvas

Today, we’re releasing a transition package comprising two resources to help you transition your usage from Forecast to SageMaker Canvas. The first component includes a workshop to get hands-on experience with the SageMaker Canvas UI and APIs and to learn how to transition your usage from Forecast to SageMaker Canvas. We also provide a Jupyter notebook that shows how to transform your existing Forecast training datasets to the SageMaker Canvas format.

Before we learn how to build forecast models in SageMaker Canvas using your Forecast input datasets, let’s understand some key differences between Forecast and SageMaker Canvas:

Dataset types – Forecast uses multiple datasets – target time series, related time series (optional), and item metadata (optional). In contrast, SageMaker Canvas requires only one dataset, eliminating the need for managing multiple datasets.

Model invocation – SageMaker Canvas allows you to invoke the model for a single dataset or a batch of datasets using the UI as well as the APIs. Unlike Forecast, which requires you to first create a forecast and then query it, you simply use the UI or API to invoke the endpoint where the model is deployed to generate forecasts. The SageMaker Canvas UI also gives you the option to deploy the model for inference on SageMaker real-time endpoints. With just a few clicks, you can receive an HTTPS endpoint that can be invoked from within your application to generate forecasts.

In the following sections, we discuss the high-level steps for transforming your data, building a model, and deploying a model using SageMaker Canvas using either the UI or APIs.

Build and deploy a model using the SageMaker Canvas UI

We recommend reorganizing your data sources to directly create a single dataset for use with SageMaker Canvas. Refer to Time Series Forecasts in Amazon SageMaker Canvas for guidance on structuring your input dataset to build a forecasting model in SageMaker Canvas. However, if you prefer to continue using multiple datasets as you do in Forecast, you have the following options to merge them into a single dataset supported by SageMaker Canvas:

SageMaker Canvas UI – Use the SageMaker Canvas UI to join the target time series, related time series, and item metadata datasets into one dataset. The following screenshot shows an example dataflow created in SageMaker Canvas to merge the three datasets into one SageMaker Canvas dataset.

Python script – Use a Python script to merge the datasets. For sample code and hands-on experience in transforming multiple Forecast datasets into one dataset for SageMaker Canvas, refer to this workshop.

When the dataset is ready, use the SageMaker Canvas UI, available on the SageMaker console, to load the dataset into the SageMaker Canvas application, which uses AutoML to train, build, and deploy the model for inference. The workshop shows how to merge your datasets and build the forecasting model.

After the model is built, there are multiple ways to generate and consume forecasts:

Make an in-app prediction – You can generate forecasts using the SageMaker Canvas UI and export them to Amazon QuickSight using built-in integration or download the prediction file to your local desktop. You can also access the generated predictions from the Amazon Simple Storage Service (Amazon S3) storage location where SageMaker Canvas is configured to store model artifacts, datasets, and other application data. Refer to Configure your Amazon S3 storage to learn more about the Amazon S3 storage location used by SageMaker Canvas.
Deploy the model to a SageMaker endpoint – You can deploy the model to SageMaker real-time endpoints directly from the SageMaker Canvas UI. These endpoints can be queried by developers in their applications with a few lines of code. You can update the code in your existing application to invoke the deployed model. Refer to the workshop for more details.

Build and deploy a model using the SageMaker Canvas (Autopilot) APIs

You can use the sample code provided in the notebook in the GitHub repo to process your datasets, including target time series data, related time series data, and item metadata, into a single dataset needed by SageMaker Canvas APIs.

Next, use the SageMaker AutoML API for time series forecasting to process the data, train the ML model, and deploy the model programmatically. Refer to the sample notebook in the GitHub repo for a detailed implementation on how to train a time series model and produce predictions using the model.

Refer to the workshop for more hands-on experience.

Conclusion

In this post, we outlined steps to transition from Forecast and build time series ML models in SageMaker Canvas, and provided a data transformation notebook and prescriptive guidance through a workshop. After the transition, you can benefit from a more accessible UI, cost-effectiveness, and higher transparency of the underlying AutoML API in SageMaker Canvas, democratizing time series forecasting within your organization and saving time and resources on model training and deployment.

SageMaker Canvas can be accessed from the SageMaker console. Time series forecasting with Canvas is available in all regions where SageMaker Canvas is available. For more information about AWS Region availability, see AWS Services by Region.

Resources

For more information, see the following resources:

Refer to the workshop to get hands-on experience on using SageMaker Canvas
Refer to Time Series Forecasts in Amazon SageMaker Canvas for information on time series forecasting using the SageMaker Canvas UI
Refer to Create an AutoML job for time-series forecasting using the API for information on time series forecasting using the SageMaker Canvas API
Refer to Time-Series Forecasting with Amazon SageMaker Autopilot on GitHub for a notebook showing a sample implementation to train a time series model and produce predictions using AutoML APIs
To learn how to include weather data in your forecasting model, see Use weather data to improve forecasts with Amazon SageMaker Canvas
To learn how to set up monitoring for your forecasting model for accuracy drift and automatically retrain the model based on the drift threshold, refer to Automated Time-series Performance Monitoring and Retraining using Amazon SageMaker Autopilot on GitHub

About the Authors

Nirmal Kumar is Sr. Product Manager for the Amazon SageMaker service. Committed to broadening access to AI/ML, he steers the development of no-code and low-code ML solutions. Outside work, he enjoys travelling and reading non-fiction.

Dan Sinnreich is a Sr. Product Manager for Amazon SageMaker, focused on expanding no-code / low-code services. He is dedicated to making ML and generative AI more accessible and applying them to solve challenging problems. Outside of work, he can be found playing hockey, scuba diving, and reading science fiction.

Davide Gallitelli is a Specialist Solutions Architect for AI/ML in the EMEA region. He is based in Brussels and works closely with customer throughout Benelux. He has been a developer since very young, starting to code at the age of 7. He started learning AI/ML in his later years of university, and has fallen in love with it since then.

Biswanath Hore is a Solutions Architect at Amazon Web Services. He works with customers early in their AWS journey, helping them adopt cloud solutions to address their business needs. He is passionate about Machine Learning and, outside of work, loves spending time with his family.