Our latest investment in open source security for the AI era
Google is making new investments, building new tools, and developing code security to improve open source security.
We’re expanding Personal Intelligence across AI Mode in Search, the Gemini app and Gemini in Chrome.
This post is co-written with Mark Ross from Atos. Organizations pursuing AI transformation can face a familiar challenge: how to upskill their workforce at scale in a way that changes how teams build, deploy, and use AI. Traditional AI training approaches—online courses, certification programs, and classroom-based instruction—are necessary, but often insufficient. While they build foundational …
AWS AI League: Atos fine-tunes approach to AI education
AI is moving fast, and for most of our customers, the real opportunity isn’t in experimenting with it—it’s in running AI in production where it drives meaningful business outcomes. This means building systems that run reliably, perform at scale, and meet your organization’s security and compliance requirements. Today at NVIDIA GTC 2026, AWS and NVIDIA …
AWS and NVIDIA deepen strategic collaboration to accelerate AI from pilot to production
This is Part II of a two-part series from the AWS Generative AI Innovation Center. If you missed Part I, refer to Operationalizing Agentic AI Part 1: A Stakeholder’s Guide. The biggest barrier to agentic AI isn’t the technology—it’s the operating model. In Part I, we established that organizations generating real value from agents share …
Agentic AI in the Enterprise Part 2: Guidance by Persona
We thank Greg Pereira and Robert Shaw from the llm-d team for their support in bringing llm-d to AWS. In the agentic and reasoning era, large language models (LLMs) generate 10x more tokens and compute through complex reasoning chains compared to single-shot replies. Agentic AI workflows also create highly variable demands and another exponential increase in …
Introducing Disaggregated Inference on AWS powered by llm-d
This post is co-written with Ilija Subanovic and Michael Rice from Workhuman. Workhuman’s customer service and analytics team was drowning in one-time reporting requests from seven million users worldwide—a common challenge with legacy reporting tools at scale. Business intelligence (BI) admins faced mounting pressure as their teams became overwhelmed with these requests. By rebuilding their …
Building and managing machine learning (ML) features at scale is one of the most critical and complex challenges in modern data science workflows. Organizations often struggle with fragmented feature pipelines, inconsistent data definitions, and redundant engineering efforts across teams. Without a centralized system for storing and reusing features, models risk being trained on outdated or …
EAGLE is the state-of-the-art method for speculative decoding in large language model (LLM) inference, but its autoregressive drafting creates a hidden bottleneck: the more tokens you speculate, the more sequential forward passes the drafter needs. Eventually that overhead eats into your gains. P-EAGLE removes this ceiling by generating all K draft tokens in a …
P-EAGLE: Faster LLM inference with Parallel Speculative Decoding in vLLM
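The ceiling described above can be made concrete with a back-of-the-envelope cost model. This is an illustrative sketch with assumed numbers, not the P-EAGLE implementation or measured results: with a per-token acceptance rate `a`, the standard speculative-decoding expectation for tokens committed per verify step is `(1 - a**(K+1)) / (1 - a)`. An autoregressive drafter pays `K` sequential drafter passes per step, while a parallel drafter pays one, so increasing K eventually hurts the sequential scheme but keeps helping the parallel one.

```python
# Illustrative cost model for speculative decoding (assumed costs, not benchmarks).
# a        : probability each draft token is accepted (assumed i.i.d.)
# t_draft  : cost of one drafter forward pass
# t_verify : cost of one target-model verify pass

def expected_accepted(a: float, k: int) -> float:
    """Expected tokens committed per verify step with K speculated tokens."""
    return (1 - a ** (k + 1)) / (1 - a)

def throughput(a: float, k: int, t_draft: float, t_verify: float,
               parallel: bool) -> float:
    """Tokens per unit time: sequential drafting costs K passes, parallel costs 1."""
    draft_cost = t_draft if parallel else k * t_draft
    return expected_accepted(a, k) / (draft_cost + t_verify)

if __name__ == "__main__":
    a, t_draft, t_verify = 0.8, 1.0, 10.0  # assumed: drafter 10x cheaper than target
    for k in (2, 4, 8, 16):
        seq = throughput(a, k, t_draft, t_verify, parallel=False)
        par = throughput(a, k, t_draft, t_verify, parallel=True)
        print(f"K={k:2d}  sequential={seq:.3f}  parallel={par:.3f}")
```

Under these assumptions, sequential drafting peaks around a moderate K and then declines as drafter passes dominate, while the parallel variant keeps improving with K — the ceiling the teaser refers to.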
As organizations scale their generative AI workloads on Amazon Bedrock, operational visibility into inference performance and resource consumption becomes critical. Teams running latency-sensitive applications must understand how quickly models begin generating responses. Teams managing high-throughput workloads must understand how their requests consume quota so they can avoid unexpected throttling. Until now, gaining this visibility required …