Accelerate custom LLM deployment: Fine-tune with Oumi and deploy to Amazon Bedrock

This post is cowritten by David Stewart and Matthew Persons from Oumi. Fine-tuning open source large language models (LLMs) often stalls between experimentation and production. Training configurations, artifact management, and scalable deployment each require different tools, creating friction when moving from rapid experimentation to secure, enterprise-grade environments. In this post, we show how to fine-tune …

Run NVIDIA Nemotron 3 Nano as a fully managed serverless model on Amazon Bedrock

This post is co-written with Abdullahi Olaoye, Curtice Lockhart, and Nirmal Kumar Juluru from NVIDIA. We are excited to announce that NVIDIA’s Nemotron 3 Nano is now available as a fully managed and serverless model in Amazon Bedrock. This follows our earlier announcement at AWS re:Invent supporting NVIDIA Nemotron 2 Nano 9B and NVIDIA Nemotron 2 …

Access Anthropic Claude models in India on Amazon Bedrock with Global cross-Region inference

Adoption of generative AI inference has accelerated as organizations build more operational workloads that use AI capabilities in production at scale. To help customers scale their generative AI applications, Amazon Bedrock offers cross-Region inference (CRIS) profiles. CRIS is a powerful feature that organizations can use to seamlessly distribute inference …
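
As a rough illustration of how an application targets CRIS: the request looks like an ordinary Bedrock Converse call, except the `modelId` field carries an inference profile ID instead of a foundation model ID. This is a minimal sketch; the profile ID and Region below are illustrative placeholders, not the actual identifiers for Claude in India.

```python
"""Sketch: calling Amazon Bedrock through a cross-Region inference profile.

The profile ID used here is illustrative; real global CRIS profile IDs are
listed in the Bedrock console and documentation.
"""

def build_converse_request(profile_id: str, prompt: str) -> dict:
    # With CRIS, the inference profile ID goes where a model ID normally would.
    return {
        "modelId": profile_id,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": 256},
    }

def converse(client, profile_id: str, prompt: str) -> str:
    """Send the request via the bedrock-runtime Converse API.

    Expects a client such as:
        client = boto3.client("bedrock-runtime", region_name="ap-south-1")
    """
    response = client.converse(**build_converse_request(profile_id, prompt))
    return response["output"]["message"]["content"][0]["text"]
```

Because the profile ID is just the `modelId`, no other application code changes: Bedrock routes the request across Regions behind that single identifier.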

Drive organizational growth with Amazon Lex multi-developer CI/CD pipeline

As your conversational AI initiatives evolve, developing Amazon Lex assistants becomes increasingly complex. When multiple developers work on the same shared Lex instance, the result is configuration conflicts, overwritten changes, and slower iteration cycles. Scaling Amazon Lex development requires isolated environments, version control, and automated deployment pipelines. By adopting well-structured continuous integration and continuous delivery (CI/CD) practices, …
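
In a pipeline like this, each developer's bot is typically exported from its isolated environment and imported into the next stage. The following is a hedged sketch of the export step using the Lex V2 model-building API; the bot ID and version are hypothetical placeholders.

```python
"""Sketch: one pipeline stage that exports an Amazon Lex V2 bot as a
LexJson archive so it can be versioned and imported downstream."""

def build_export_spec(bot_id: str, bot_version: str) -> dict:
    # Resource specification for the Lex V2 CreateExport API (bot-level export).
    return {
        "botExportSpecification": {
            "botId": bot_id,
            "botVersion": bot_version,
        }
    }

def export_bot(client, bot_id: str, bot_version: str) -> str:
    """Start an export and return its export ID.

    Expects a client such as:
        client = boto3.client("lexv2-models")
    """
    response = client.create_export(
        resourceSpecification=build_export_spec(bot_id, bot_version),
        fileFormat="LexJson",
    )
    return response["exportId"]
```

A CI job would poll the export until it completes, download the archive, commit it to version control, and call the corresponding import API in the target environment.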

Building a custom model provider for Strands Agents with LLMs hosted on SageMaker AI endpoints

Organizations increasingly deploy custom large language models (LLMs) on Amazon SageMaker AI real-time endpoints using their preferred serving frameworks—such as SGLang, vLLM, or TorchServe—to help gain greater control over their deployments, optimize costs, and align with compliance requirements. However, this flexibility introduces a critical technical challenge: response format incompatibility with Strands agents. While these custom …
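
One common shape of that incompatibility: OpenAI-compatible servers such as vLLM and SGLang return a `choices`-style chat completion payload, which an agent framework must translate into its own message format. The sketch below shows that translation layer around a SageMaker runtime call; the endpoint name and payload fields are assumptions about a typical vLLM deployment, not the actual Strands Agents model interface.

```python
"""Sketch: adapting an OpenAI-style response from a SageMaker AI endpoint
into a plain {role, content} message an agent framework can consume."""
import json

def extract_message(raw_body: bytes) -> dict:
    # Normalize an OpenAI-compatible chat completion (as emitted by vLLM or
    # SGLang serving containers) into a simple role/content dict.
    payload = json.loads(raw_body)
    message = payload["choices"][0]["message"]
    return {"role": message["role"], "content": message["content"]}

def invoke_sagemaker_chat(runtime, endpoint_name: str, prompt: str) -> dict:
    """Call a SageMaker real-time endpoint and normalize its response.

    Expects a client such as:
        runtime = boto3.client("sagemaker-runtime")
    """
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,  # e.g. a hypothetical "my-vllm-endpoint"
        ContentType="application/json",
        Body=json.dumps({
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": 128,
        }),
    )
    return extract_message(response["Body"].read())
```

A real Strands model provider would implement the SDK's model interface and stream tokens, but the core adaptation work, parsing the serving framework's payload into the agent's message shape, looks like `extract_message` above.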
