Blog_dumb

Efficiently serve dozens of fine-tuned models with vLLM on Amazon SageMaker AI and Amazon Bedrock

Organizations and individuals running multiple custom AI models, especially recent Mixture of Experts (MoE) model families, can face the challenge of paying for idle GPU capacity when the individual models don’t receive enough traffic to saturate a dedicated compute endpoint. To solve this problem, we have partnered with the vLLM community and developed an efficient …

Building intelligent event agents using Amazon Bedrock AgentCore and Amazon Bedrock Knowledge Bases

Large conferences and events generate overwhelming amounts of information—from hundreds of sessions and workshops to speaker profiles, venue maps, and constantly updating schedules. While basic AI assistants can answer simple questions about event logistics, most fail to deliver the personalized guidance and contextual awareness that attendees need to navigate complex, multi-day conferences effectively. More importantly, …

Build an intelligent photo search using Amazon Rekognition, Amazon Neptune, and Amazon Bedrock

Managing large photo collections presents significant challenges for organizations and individuals. Traditional approaches rely on manual tagging, basic metadata, and folder-based organization, which can become impractical when dealing with thousands of images containing multiple people and complex relationships. Intelligent photo search systems address these challenges by combining computer vision, graph databases, and natural language processing …

Train CodeFu-7B with veRL and Ray on Amazon SageMaker Training jobs

The rapid advancement of artificial intelligence (AI) has created unprecedented demand for specialized models capable of complex reasoning tasks, particularly in competitive programming where models must generate functional code through algorithmic reasoning rather than pattern memorization. Reinforcement learning (RL) enables models to learn through trial and error by receiving rewards based on actual code execution, …

Generate structured output from LLMs with Dottxt Outlines in AWS

This post is cowritten with Remi Louf, CEO and technical founder of Dottxt. Structured output in AI applications refers to AI-generated responses that conform to predefined, validated, and often strictly typed formats. This can include the schema for the output, or the way specific fields in the output should be mapped. Structured outputs are essential …

Global cross-Region inference for latest Anthropic Claude Opus, Sonnet and Haiku models on Amazon Bedrock in Thailand, Malaysia, Singapore, Indonesia, and Taiwan

Organizations in Thailand, Malaysia, Singapore, Indonesia, and Taiwan can now access Anthropic Claude Opus 4.6, Sonnet 4.6, and Claude Haiku 4.5 through Global cross-Region inference (CRIS) on Amazon Bedrock, which delivers foundation models through a globally distributed inference architecture designed for scale. Global CRIS offers three key advantages: higher quotas, cost efficiency, and intelligent request routing …
