Blog

Nova Forge SDK series part 2: Practical guide to fine-tune Nova models using data mixing capabilities

This hands-on guide walks through every step of fine-tuning an Amazon Nova model with the Amazon Nova Forge SDK, from data preparation to training with data mixing to evaluation, giving you a repeatable playbook you can adapt to your own use case. This is the second part in our Nova Forge SDK series, building on …

From hours to minutes: How Agentic AI gave marketers time back for what matters

Your marketing team loses hours to page assembly, coordination emails, and review cycles. These manual workflows keep teams from their most important work: identifying what problems customers face, crafting messages that resonate, and building campaigns that drive meaningful engagement. In this post, we share how AWS Marketing’s Technology, AI, and Analytics (TAA) team worked with …

Cost-efficient custom text-to-SQL using Amazon Nova Micro and Amazon Bedrock on-demand inference

Text-to-SQL generation remains a persistent challenge in enterprise AI applications, particularly when working with custom SQL dialects or domain-specific database schemas. While foundation models (FMs) demonstrate strong performance on standard SQL, achieving production-grade accuracy for specialized dialects requires fine-tuning. However, fine-tuning introduces an operational trade-off: hosting custom models on persistent infrastructure incurs continuous costs, even during …

Transform retail with AWS generative AI services

Online retailers face a persistent challenge: shoppers struggle to judge how products will fit and look when ordering online, leading to increased returns and decreased purchase confidence. The cost? Lost revenue, operational overhead, and customer frustration. Meanwhile, consumers increasingly expect immersive, interactive shopping experiences that bridge the gap between online and in-store retail. Retailers implementing virtual try-on …

How Automated Reasoning checks in Amazon Bedrock transform generative AI compliance

Compliance teams in regulated industries spend weeks on manual reviews, pay for outside consultants, and still face audit gaps when AI outputs lack formal proof. Automated Reasoning checks in Amazon Bedrock Guardrails address this by replacing probabilistic AI validation with mathematical verification, turning AI-generated decisions into provably correct, auditable results. In this post, you’ll learn …

Create rich, custom tooltips in Amazon Quick Sight

Amazon Quick Sight, the business intelligence (BI) capability of Amazon Quick, provides modern interactive dashboards, natural language querying, pixel-perfect reports, machine learning (ML) insights, and embedded analytics at scale. Amazon Quick brings together AI agents for business insights, research, and automation in one integrated experience, helping you work smarter …

Accelerating decode-heavy LLM inference with speculative decoding on AWS Trainium and vLLM

Practical benchmarks show lower inter-token latency when deploying Qwen3 models with vLLM, Kubernetes, and AWS AI chips. Speculative decoding on AWS Trainium can accelerate token generation by up to 3x for decode-heavy workloads, helping reduce the cost per output token and improve throughput without sacrificing output quality. If you build AI writing assistants, coding agents, …
