This post is co-written with Ilan Geller, Shuyu Yang and Richa Gupta from Accenture.
Bringing innovative new pharmaceutical drugs to market is a long and stringent process. Companies face complex regulations and extensive approval requirements from governing bodies like the US Food and Drug Administration (FDA). A key part of the submission process is authoring regulatory documents like the Common Technical Document (CTD), a comprehensive, standard-format document for submitting applications, amendments, supplements, and reports to the FDA. This document contains over 100 highly detailed technical reports created during the process of drug research and testing. Manually creating CTDs is incredibly labor-intensive, requiring up to 100,000 hours per year for a typical large pharma company. The tedious process of compiling hundreds of documents is also prone to errors.
Accenture built a regulatory document authoring solution using automated generative AI that enables researchers and testers to produce CTDs efficiently. By extracting key data from testing reports, the system uses Amazon SageMaker JumpStart and other AWS AI services to generate CTDs in the proper format. This revolutionary approach compresses the time and effort spent on CTD authoring. Users can quickly review and adjust the computer-generated reports before submission.
Because of the sensitive nature of the data and effort involved, pharmaceutical companies need a higher level of control, security, and auditability. This solution relies on the AWS Well-Architected principles and guidelines to enable the control, security, and auditability requirements. The user-friendly system also employs encryption for security.
By harnessing AWS generative AI, Accenture aims to transform efficiency for regulated industries like pharmaceuticals. Automating the frustrating CTD document process accelerates new product approvals so innovative treatments can get to patients faster. AI delivers a major leap forward.
This post provides an overview of an end-to-end generative AI solution developed by Accenture for regulatory document authoring using SageMaker JumpStart and other AWS services.
Solution overview
Accenture built an AI-based solution that automatically generates a CTD document in the required format, along with the flexibility for users to review and edit the generated content. The preliminary value is estimated at a 40–45% reduction in authoring time.
This generative AI-based solution extracts information from the technical reports produced as part of the testing process and delivers the detailed dossier in a common format required by the central governing bodies. Users then review and edit the documents where necessary and submit them to the central governing bodies. This solution uses the SageMaker JumpStart AI21 Jurassic Jumbo Instruct and AI21 Summarize models to extract and create the documents.
The following diagram illustrates the solution architecture.
The workflow consists of the following steps:
A user accesses the regulatory document authoring tool from their computer browser.
A React application is hosted on AWS Amplify and is accessed from the user’s computer (for DNS, use Amazon Route 53).
The React application uses the Amplify authentication library to detect whether the user is authenticated.
Amazon Cognito provides a local user pool or can be federated with the user’s active directory.
The application uses the Amplify libraries for Amazon Simple Storage Service (Amazon S3) and uploads documents provided by users to Amazon S3.
The application writes the job details (app-generated job ID and Amazon S3 source file location) to an Amazon Simple Queue Service (Amazon SQS) queue. It captures the message ID returned by Amazon SQS. Amazon SQS enables a fault-tolerant decoupled architecture. Even if backend errors occur while processing a job, the job record remains in Amazon SQS so that the job can be retried.
Using the job ID and message ID returned by the previous request, the client connects to the WebSocket API and sends the job ID and message ID to the WebSocket connection.
The WebSocket triggers an AWS Lambda function, which creates a record in Amazon DynamoDB. The record is a key-value mapping of the job ID to the WebSocket connection ID and the message ID.
Another Lambda function gets triggered with a new message in the SQS queue. The Lambda function reads the job ID and invokes an AWS Step Functions workflow for processing data files.
The Step Functions state machine invokes a Lambda function to process the source documents. The function code invokes Amazon Textract to analyze the documents. The response data is stored in DynamoDB. Based on specific requirements with processing data, it can also be stored in Amazon S3 or Amazon DocumentDB (with MongoDB compatibility).
A Lambda function invokes the Amazon Textract AnalyzeDocument API to parse tabular data from source documents and stores the extracted data in DynamoDB.
A Lambda function processes the data based on mapping rules stored in a DynamoDB table.
A Lambda function invokes the prompt libraries and a series of actions using generative AI with a large language model hosted through Amazon SageMaker for data summarization.
The document writer Lambda function writes a consolidated document in an S3 processed folder.
The job callback Lambda function retrieves the callback connection details from the DynamoDB table, passing the job ID. Then the Lambda function makes a callback to the WebSocket endpoint and provides the processed document link from Amazon S3.
A Lambda function deletes the message from the SQS queue so that it’s not reprocessed.
A document generator web module converts the JSON data into a Microsoft Word document, saves it, and renders the processed document on the web browser.
The user can view, edit, and save the documents back to the S3 bucket from the web module. This helps in reviews and corrections needed, if any.
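The job-tracking steps above (queue the job, record which WebSocket connection is waiting on it, and call back with the processed document link) can be sketched in Python roughly as follows. This is a minimal illustration, not Accenture's implementation: the function names, the item fields (`jobId`, `connectionId`, `messageId`, `sourceFile`, `documentUrl`), and the table key are hypothetical, and the callback assumes the WebSocket API is served by Amazon API Gateway. The clients are passed in as arguments so the logic can be exercised with stand-ins before it is wired to boto3.

```python
import json
import uuid


def enqueue_job(sqs_client, queue_url, source_s3_uri):
    """Queue a job; return the app-generated job ID and the SQS message ID."""
    job_id = str(uuid.uuid4())
    # Hypothetical message shape: job ID plus the S3 location of the source file.
    body = json.dumps({"jobId": job_id, "sourceFile": source_s3_uri})
    response = sqs_client.send_message(QueueUrl=queue_url, MessageBody=body)
    return job_id, response["MessageId"]


def register_connection(table, job_id, connection_id, message_id):
    """Record which WebSocket connection is waiting on which job."""
    # Assumes a DynamoDB table keyed on "jobId" (hypothetical schema).
    table.put_item(
        Item={"jobId": job_id, "connectionId": connection_id, "messageId": message_id}
    )


def notify_client(table, websocket_client, job_id, document_url):
    """Look up the waiting connection and push the processed document link."""
    record = table.get_item(Key={"jobId": job_id})["Item"]
    payload = json.dumps({"jobId": job_id, "documentUrl": document_url})
    # Assumes an API Gateway Management API client for the WebSocket endpoint.
    websocket_client.post_to_connection(
        ConnectionId=record["connectionId"], Data=payload.encode("utf-8")
    )
    return record
```

In a deployment, `sqs_client` would typically be `boto3.client("sqs")`, `table` a `boto3.resource("dynamodb").Table(...)` object, and `websocket_client` a `boto3.client("apigatewaymanagementapi", endpoint_url=...)` client. Those wiring details are assumptions rather than details given in this post.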
The solution also uses SageMaker notebooks (labeled T in the preceding architecture) to perform domain adaptation, fine-tune the models, and deploy the SageMaker endpoints.
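Once an endpoint is deployed, the summarization step in the workflow could call it along the following lines. This is a hedged sketch only: the function name, the prompt wording, and the request fields (`prompt`, `maxTokens`) are assumptions, and the real request and response schema depends on the model behind the endpoint. The function therefore returns the decoded JSON as is rather than assuming where the summary text lives.

```python
import json


def summarize(runtime_client, endpoint_name, text, max_tokens=200):
    """Send report text to a deployed endpoint and return the decoded JSON reply."""
    # Hypothetical request shape; check the deployed model's documentation.
    request = {
        "prompt": "Summarize the following report section:\n" + text,
        "maxTokens": max_tokens,
    }
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(request).encode("utf-8"),
    )
    # The response body is a stream; read it fully before decoding.
    return json.loads(response["Body"].read())
```

Here `runtime_client` would typically be `boto3.client("sagemaker-runtime")`. Passing the client in keeps the function testable without a live endpoint.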
Conclusion
In this post, we showcased how Accenture is using AWS generative AI services to implement an end-to-end approach towards a regulatory document authoring solution. This solution in early testing has demonstrated a 40–45% reduction in the time required for authoring CTDs. We identified the gaps in traditional regulatory governing platforms and augmented them with generative AI for faster response times, and we are continuously improving the system while engaging with users across the globe. Reach out to the Accenture Center of Excellence team to dive deeper into the solution and deploy it for your clients.
This joint program focused on generative AI will help accelerate the time-to-value for joint customers of Accenture and AWS. The effort builds on the 15-year strategic relationship between the companies and uses the same proven mechanisms and accelerators built by the Accenture AWS Business Group (AABG).
Connect with the AABG team at accentureaws@amazon.com to drive business outcomes by transforming to an intelligent data enterprise on AWS.
For further information about generative AI on AWS using Amazon Bedrock or SageMaker, refer to Generative AI on AWS: Technology and Get started with generative AI on AWS using Amazon SageMaker JumpStart.
You can also sign up for the AWS generative AI newsletter, which includes educational resources, blogs, and service updates.
About the Authors
Ilan Geller is a Managing Director in the Data and AI practice at Accenture. He is the Global AWS Partner Lead for Data and AI and the Center for Advanced AI. His roles at Accenture have primarily been focused on the design, development, and delivery of complex data, AI/ML, and most recently Generative AI solutions.
Shuyu Yang is the Generative AI and Large Language Model Delivery Lead and also leads the CoE (Center of Excellence) Accenture AI (AWS DevOps professional) teams.
Richa Gupta is a Technology Architect at Accenture, leading various AI projects. She has more than 18 years of experience in architecting scalable AI and GenAI solutions. Her areas of expertise are AI architecture, cloud solutions, and generative AI. She plays an instrumental role in various presales activities.
Shikhar Kwatra is an AI/ML Specialist Solutions Architect at Amazon Web Services, working with a leading Global System Integrator. He has earned the title of one of the Youngest Indian Master Inventors with over 500 patents in the AI/ML and IoT domains. Shikhar aids in architecting, building, and maintaining cost-efficient, scalable cloud environments for the organization, and supports the GSI partner in building strategic industry solutions on AWS. Shikhar enjoys playing guitar, composing music, and practicing mindfulness in his spare time.
Sachin Thakkar is a Senior Solutions Architect at Amazon Web Services, working with a leading Global System Integrator (GSI). He brings over 23 years of experience as an IT Architect and as a Technology Consultant for large institutions. His focus areas are data, analytics, and generative AI. Sachin provides architectural guidance and supports the GSI partner in building strategic industry solutions on AWS.