Blog_dumb

Monitor embedding drift for LLMs deployed from Amazon SageMaker JumpStart

Monitor embedding drift for LLMs deployed from Amazon SageMaker JumpStart

One of the most useful application patterns for generative AI workloads is Retrieval Augmented Generation (RAG). In the RAG pattern, we find pieces of reference content related to an input prompt by performing similarity searches on embeddings. Embeddings capture the information content in bodies of text, allowing natural language processing (NLP) models to work with …

Monitor embedding drift for LLMs deployed from Amazon SageMaker JumpStart Read More »

Designing generative AI workloads for resilience

Designing generative AI workloads for resilience

Resilience plays a pivotal role in the development of any workload, and generative AI workloads are no different. There are unique considerations when engineering generative AI workloads through a resilience lens. Understanding and prioritizing resilience is crucial for generative AI workloads to meet organizational availability and business continuity requirements. In this post, we discuss the …

Designing generative AI workloads for resilience Read More »

Scroll to Top