Blog_dumb

Elevate customer experience by using the Amazon Q Business custom plugin for New Relic AI

Digital experience interruptions can harm customer satisfaction and business performance across industries. Application failures, slow load times, and service unavailability can lead to user frustration, decreased engagement, and revenue loss. The risk and impact of outages increase during peak usage periods, which vary by industry—from ecommerce sales events to financial quarter-ends or major product launches. …

Amazon SageMaker launches the updated inference optimization toolkit for generative AI

Today, Amazon SageMaker is excited to announce updates to the inference optimization toolkit, providing new functionality and enhancements to help you optimize generative AI models even faster. These updates build on the capabilities introduced in the original launch of the inference optimization toolkit (to learn more, see Achieve up to ~2x higher throughput while reducing …

Syngenta develops a generative AI assistant to support sales representatives using Amazon Bedrock Agents

This post was written with Zach Marston and Serg Masis from Syngenta. Syngenta and AWS collaborated to develop Cropwise AI, an innovative solution powered by Amazon Bedrock Agents, to accelerate their sales reps’ ability to place Syngenta seed products with growers across North America. Cropwise AI harnesses the power of generative AI using AWS to …

Speed up your AI inference workloads with new NVIDIA-powered capabilities in Amazon SageMaker

This post is co-written with Abhishek Sawarkar, Eliuth Triana, Jiahong Liu and Kshitiz Gupta from NVIDIA. At re:Invent 2024, we are excited to announce new capabilities to speed up your AI inference workloads with NVIDIA accelerated computing and software offerings on Amazon SageMaker. These advancements build upon our collaboration with NVIDIA, which includes adding support …

Unlock cost savings with the new scale down to zero feature in SageMaker Inference

Today at AWS re:Invent 2024, we are excited to announce a new feature for Amazon SageMaker inference endpoints: the ability to scale SageMaker inference endpoints to zero instances. This long-awaited capability is a game changer for our customers using the power of AI and machine learning (ML) inference in the cloud. Previously, SageMaker inference endpoints …

Supercharge your auto scaling for generative AI inference – Introducing Container Caching in SageMaker Inference

Today at AWS re:Invent 2024, we are excited to announce the new Container Caching capability in Amazon SageMaker, which significantly reduces the time required to scale generative AI models for inference. This innovation allows you to scale your models faster, observing up to 56% reduction in latency when scaling a new model copy and up …

Introducing Fast Model Loader in SageMaker Inference: Accelerate autoscaling for your Large Language Models (LLMs) – part 1

The generative AI landscape has been rapidly evolving, with large language models (LLMs) at the forefront of this transformation. These models have grown exponentially in size and complexity, with some now containing hundreds of billions of parameters and requiring hundreds of gigabytes of memory. As LLMs continue to expand, AI engineers face increasing challenges in …

Introducing Fast Model Loader in SageMaker Inference: Accelerate autoscaling for your Large Language Models (LLMs) – Part 2

In Part 1 of this series, we introduced Amazon SageMaker Fast Model Loader, a new capability in Amazon SageMaker that significantly reduces the time required to deploy and scale large language models (LLMs) for inference. We discussed how this innovation addresses one of the major bottlenecks in LLM deployment: the time required to load massive models …

Fast and accurate zero-shot forecasting with Chronos-Bolt and AutoGluon

Chronos-Bolt is the newest addition to AutoGluon-TimeSeries, delivering accurate zero-shot forecasting up to 250 times faster than the original Chronos models [1]. Time series forecasting plays a vital role in guiding key business decisions across industries such as retail, energy, finance, and healthcare. Traditionally, forecasting has relied on statistical models [2] like ETS and ARIMA, …
