In today's fast-paced world, time is of the essence, and even basic tasks like grocery shopping can feel rushed and challenging. Despite our best intentions to plan meals and shop accordingly, we often end up ordering takeout, leaving unused perishable items to spoil in the refrigerator. This seemingly small issue of wasted groceries, paired with the about-to-perish grocery supplies thrown away by grocery stores, contributes significantly to the global food waste problem. In this post, we demonstrate how you can help solve this problem by harnessing the power of generative AI on AWS.
By combining the computer vision capabilities of Amazon Rekognition with the content generation capabilities of foundation models (FMs) available through Amazon Bedrock, we developed a solution that recommends recipes based on what you already have in your refrigerator and an inventory of about-to-expire items in local supermarkets. This helps make sure that food in both your home and your local grocery stores gets used, saving money and reducing waste.
In this post, we walk through how to build the FoodSavr solution (fictitious name used for the purposes of this post) using Amazon Rekognition Custom Labels to detect the ingredients and generate personalized recipes using Anthropic’s Claude 3.0 on Amazon Bedrock. We demonstrate an end-to-end architecture where a user can upload an image of their fridge, and using the ingredients found there (detected by Amazon Rekognition), the solution will give them a list of recipes (generated by Amazon Bedrock). The architecture also recognizes missing ingredients and provides the user with a list of nearby grocery stores.
Solution overview
The following reference architecture shows how you can use Amazon Bedrock, Amazon Rekognition, and other AWS services to implement the FoodSavr solution.
As shown in the preceding figure, the architecture includes the following steps:
For an end-to-end solution, we recommend having a frontend where your users can upload images of items that they want detected and labeled. To learn more about frontend deployment on AWS, see Front-end Web & Mobile on AWS.
The picture taken by the user is stored in an Amazon Simple Storage Service (Amazon S3) bucket. This S3 bucket should be configured with a lifecycle policy that deletes the image after use. To learn more about S3 lifecycle policies, see Managing your storage lifecycle.
This architecture uses several AWS Lambda functions. Lambda is a serverless AWS compute service that runs event-driven code and automatically manages the compute resources. The first Lambda function, DetectIngredients, harnesses the power of Amazon Rekognition by using the Boto3 Python API. Amazon Rekognition is a cutting-edge computer vision service that uses machine learning (ML) models to analyze the uploaded images.
We use Rekognition Custom Labels to train a model with a dataset of ingredients. You can adapt this architecture to use Rekognition Custom Labels for your own use case. With the aid of custom labels trained to recognize various ingredients, Amazon Rekognition identifies the items present in the images.
The detected ingredient names are then securely stored in an Amazon DynamoDB (a fully managed NoSQL database service) table for retrieval and modification. Users are presented with a list of the ingredients that have been detected, along with the option to add other ingredients or delete ingredients that they might not want or that were misidentified.
After the ingredient list is confirmed by the user through the web interface, they can initiate the recipe generation process with a click of a button. This action invokes another Lambda function called GenerateRecipes, which uses the advanced language capabilities of the Amazon Bedrock API (Anthropic’s Claude v3 in this post). This state-of-the-art FM analyzes the confirmed ingredient list retrieved from DynamoDB and generates relevant recipes tailored to those specific ingredients. Additionally, the model provides images to accompany each recipe, providing a visually appealing and inspiring culinary experience.
Amazon Bedrock contains two key FMs that are used for this solution example: Anthropic’s Claude v3 (newer versions have been released since the writing of this post) and Stable Diffusion, used for recipe generation and image generation respectively. For this solution, you can use any combination of FMs that suit your use case. The generated content (recipes as text and recipe images, in this case) can then be displayed to the user on the frontend.
For this use case, you can also set up an optional ordering pipeline, which allows a user to place orders for the ingredients described by the FMs. This would be fronted by a Lambda function, FindGroceryItems, that can look for the recommended grocery items in a database contributed to by local supermarkets. This database would consist of about-to-expire ingredients along with prices for those ingredients.
In the following sections, we dive into how you can set up this architecture on your own account. Step 8 is optional and therefore not covered in this post.
Using Amazon Rekognition to detect ingredients in images
The image recognition is powered by Amazon Rekognition, which offers pre-trained and customizable computer vision capabilities to allow users to obtain information and insights from their images. For customizability, you can use Rekognition Custom Labels to identify scenes and objects in your images that are specific to your business needs. If your images are already labeled, you can begin training a model from the Amazon Rekognition console. Otherwise, you can label them directly from the Amazon Rekognition labeling interface, or use other services such as Amazon SageMaker Ground Truth. The following screenshot shows an example of what the bounding box process would look like on the Amazon Rekognition labeling interface.
To get started with labeling, see Using Amazon Rekognition Custom Labels and Amazon A2I for detecting pizza slices and augmenting predictions. For this architecture, we collected a dataset of up to 70 images of common food items typically found in refrigerators. We recommend that you gather your own relevant images and store them in an S3 bucket to use for training with Amazon Rekognition. You can then use Rekognition Custom Labels to create labels with food names, and assign bounding boxes on the images so the model knows where to look. To get started with training your own custom model, see Training an Amazon Rekognition Custom Labels model.
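If you prefer to start training programmatically after your datasets are labeled, the following is a minimal boto3 sketch. The project name and output bucket are placeholders, and it assumes the training and test datasets have already been created and labeled (for example, through the Amazon Rekognition console), so treat it as an illustration rather than a required step:
import boto3
rekognition = boto3.client("rekognition", region_name="us-east-1")
# Create a Custom Labels project (the name is a placeholder)
project_arn = rekognition.create_project(ProjectName="FoodSavrIngredients")["ProjectArn"]
# Start training a model version; this assumes the training and test datasets
# are already attached to the project (for example, created through the console)
rekognition.create_project_version(
    ProjectArn=project_arn,
    VersionName="v1",
    OutputConfig={
        "S3Bucket": "MY_TRAINING_BUCKET",      # placeholder: bucket for training output
        "S3KeyPrefix": "rekognition-output/"
    }
)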
When model training is complete, you will see all your trained models under Projects on the AWS Management Console for Amazon Rekognition. Here, you can also look at the model performance, measured by the F1 score (shown in the following screenshot).
You can also iterate and modify your existing models to create newer versions. Before using your model, make sure it's in the RUNNING state. To use the model, choose the model you want to use, and on the Use model tab, choose Start.
You also have the option to programmatically start and stop your model (the exact API call can be copied from the Amazon Rekognition console, but the following is provided as an example):
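For example, a minimal boto3 sketch for starting the model and waiting until it's available might look like the following (MODEL_ARN, PROJECT_ARN, and VERSION_NAME are placeholders for your own values):
import time
import boto3
rekognition = boto3.client("rekognition", region_name="us-east-1")
# Start the trained Custom Labels model; MinInferenceUnits controls throughput (and cost)
rekognition.start_project_version(
    ProjectVersionArn="MODEL_ARN",
    MinInferenceUnits=1
)
# Poll DescribeProjectVersions until the model status changes to RUNNING
while True:
    versions = rekognition.describe_project_versions(
        ProjectArn="PROJECT_ARN",        # placeholder: the ARN of your project
        VersionNames=["VERSION_NAME"]    # placeholder: the version you started
    )
    status = versions["ProjectVersionDescriptions"][0]["Status"]
    if status == "RUNNING":
        break
    time.sleep(30)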
Use the following API call (the equivalent boto3 call is made in the Lambda function) to detect groceries in an image using your custom labels and custom model:
aws rekognition detect-custom-labels \
  --project-version-arn "MODEL_ARN" \
  --image '{"S3Object": {"Bucket": "MY_BUCKET", "Name": "PATH_TO_MY_IMAGE"}}' \
  --region us-east-1
To stop incurring costs, you can also stop your model when not in use:
aws rekognition stop-project-version \
  --project-version-arn "MODEL_ARN" \
  --region us-east-1
Because we’re using Python, the boto3 Python package is used to make all AWS API calls mentioned in this post. For more information about Boto3, see the Boto3 documentation.
Starting a model might take a few minutes to complete. To check the current status of the model readiness, check the details page for the project or use DescribeProjectVersions. Wait for the model status to change to RUNNING.
In the meantime, you can explore the different statistics provided by Amazon Rekognition about your model. Some notable ones are the model performance (F1 score), precision, and recall. These statistics are gathered by Amazon Rekognition at both the model level (as seen in the earlier screenshot) and the individual custom label level (as shown in the following screenshot).
For more information on these statistics, see Metrics for evaluating your model.
Be aware that, while Anthropic’s Claude models offer impressive multi-modal capabilities for understanding and generating content based on text and images, we chose to use Amazon Rekognition Custom Labels for ingredient detection in this solution. Amazon Rekognition is a specialized computer vision service optimized for tasks such as object detection and image classification, using state-of-the-art models trained on massive datasets. Additionally, Rekognition Custom Labels allows us to train custom models tailored to recognize specific food items and ingredients, providing a level of customization that might not be as straightforward with a general-purpose language model. Furthermore, as a fully managed service, Amazon Rekognition can scale seamlessly to handle large volumes of images. While a hybrid approach combining Rekognition and Claude’s multi-modal capabilities could be explored, we chose Rekognition Custom Labels for its specialized computer vision capabilities, customizability, and to demonstrate combining FMs on Amazon Bedrock with other AWS services for this specific use case.
Using Amazon Bedrock FMs to generate recipes
To generate the recipes, we use Amazon Bedrock, a fully managed service that offers high-performing FMs. We use the Amazon Bedrock API to query Anthropic’s Claude v3 Sonnet model. We use the following prompt to provide context to the FM:
You are an expert chef, with expertise in diverse cuisines and recipes.
I am currently a novice and I require you to write me recipes based on the ingredients provided below.
The requirements for the recipes are as follows:
– I need 3 recipes from you
– These recipes can only use ingredients listed below, and nothing else
– For each of the recipes, provide detailed step by step methods for cooking. Format it like this:
1. Step 1: <instructions>
2. Step 2: <instructions>
…
n. Step n: <instructions>
Remember, you HAVE to use ONLY the ingredients that are provided to you. DO NOT use any other ingredient.
This is crucial. For example, if you are given ingredients “Bread” and “Butter”, you can ONLY use Bread and Butter,
and no other ingredient can be added on.
An example recipe with these two can be:
Recipe 1: Fried Bread
Ingredients:
– Bread
– Butter
1. Step 1: Heat up the pan until it reaches 40 degrees
2. Step 2: Drop in a knob of butter and melt it
3. Step 3: Once butter is melted, add a piece of bread onto pan
4. Step 4: Cook until the bread is browned and crispy
5. Step 5: Repeat on the other side
6. Step 6: You can repeat this for other breads, too
The following code is the body of the Amazon Bedrock API call:
# bedrock: the Amazon Bedrock runtime client, created with boto3.client('bedrock-runtime')
# user_ingredients_str: ingredients detected in the user's fridge, retrieved from DynamoDB
# master_ingredients_str: grocery store labels retrieved from the DynamoDB table
# prompt: the prompt shown above
content = "Here is a list of ingredients that a person currently has. " + user_ingredients_str + "\n\n And here is a list of ingredients at a local grocery store: " + master_ingredients_str + prompt
body = json.dumps({
    "max_tokens": 2047,
    "messages": [{"role": "user", "content": content}],
    "anthropic_version": "bedrock-2023-05-31"
})
modelId = "anthropic.claude-3-sonnet-20240229-v1:0"
response = bedrock.invoke_model(body=body, modelId=modelId)
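If you want to steer the generation further, you can optionally add inference parameters such as temperature, top_p, and top_k to the same request body. The values below are illustrative only:
# Same request body as above, extended with optional inference parameters (illustrative values)
body = json.dumps({
    "max_tokens": 2047,
    "messages": [{"role": "user", "content": content}],
    "anthropic_version": "bedrock-2023-05-31",
    "temperature": 0.2,  # low temperature for more deterministic, structured recipes
    "top_p": 0.9,        # nucleus sampling: only consider tokens covering 90% of probability mass
    "top_k": 250         # only consider the 250 most probable tokens at each step
})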
Using the combination of the prompt and API call, we generate three recipes from the ingredients retrieved from the DynamoDB table. You can add additional parameters to body, such as temperature, top_p, and top_k (as in the preceding sketch), to further control the generation. For more information on getting responses from the Anthropic's Claude 3 model using the Amazon Bedrock API, see Anthropic Claude Messages API. We recommend setting the temperature to something low (such as 0.1 or 0.2) to help ensure deterministic and structured generation of recipes. We also recommend setting the top_p value (nucleus sampling) to something high (such as 0.9) to limit the FM's predictions to the most probable tokens; in this case, the model considers the most probable tokens that make up 90% of the total probability mass for its next prediction. top_k is another sampling technique that limits the model's predictions to the top_k most probable tokens. For example, if top_k = 10, the model only considers the 10 most probable tokens for its next prediction.
One of the key benefits of using Amazon Bedrock is the ability to use multiple FMs for different tasks within the same solution. In addition to generating textual recipes with Anthropic's Claude 3, we can also dynamically generate visually appealing images to accompany those recipes. For this task, we chose the Stable Diffusion model available on Amazon Bedrock. Amazon Bedrock also offers other powerful image generation models such as Titan, and we provide an example API call for that as well. Similar to using the Amazon Bedrock API to generate a response from Anthropic's Claude 3, we use the following code:
import json
import boto3
# Amazon Bedrock runtime client used to invoke the image generation model
brt = boto3.client(service_name="bedrock-runtime")
modelId = "stability.stable-diffusion-xl-v0"
accept = "application/json"
contentType = "application/json"
body = json.dumps({
    "text_prompts": [
        {
            "text": recipe_name
        }
    ],
    "cfg_scale": 10,
    "seed": 20,
    "steps": 50
})
response = brt.invoke_model(
    body=body,
    modelId=modelId,
    accept=accept,
    contentType=contentType
)
For Titan, you might use something like:
modelId = "amazon.titan-image-generator-v1"
accept = "application/json"
contentType = "application/json"
body = json.dumps({
    "taskType": "TEXT_IMAGE",
    "textToImageParams": {
        "text": prompt  # Required
    },
    "imageGenerationConfig": {
        "numberOfImages": 1,   # Range: 1 to 5
        "quality": "premium",  # Options: standard or premium
        "height": 768,         # Supported height list in the docs
        "width": 1280,         # Supported width list in the docs
        "cfgScale": 7.5,       # Range: 1.0 (exclusive) to 10.0
        "seed": 42             # Range: 0 to 2147483646
    }
})
response = brt.invoke_model(
    body=body,
    modelId=modelId,
    accept=accept,
    contentType=contentType
)
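Both models return the generated image as a base64-encoded string in the response body. The following is a rough sketch of extracting it; the artifacts field applies to the Stability response format, and Titan returns an images list instead:
import base64
import json
response_body = json.loads(response.get("body").read())
# Stability models return images under "artifacts"; Titan models return them under "images"
b64_image = response_body["artifacts"][0]["base64"]
image_bytes = base64.b64decode(b64_image)  # raw image bytes, ready to store or return to the frontend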
The frontend needs to decode this base64-encoded string before it can display the image. For more information about other parameters that you can include in your API calls, see Stability.ai Diffusion 1.0 text to image and Using Amazon Bedrock to generate images with Titan Image Generator models. In the following sections, we walk through the steps to deploy the solution in your AWS account.
Prerequisites
You need an AWS account to deploy this solution. If you don't have an existing account, you can sign up for one. The instructions in this post use the us-east-1 AWS Region; make sure you deploy your resources in a Region where the services used in this post (Amazon Rekognition, Amazon Bedrock, and the FMs you plan to use) are available. For the Lambda functions to run successfully, Lambda requires an AWS Identity and Access Management (IAM) role and policy with the appropriate permissions. Complete the necessary steps from Defining Lambda function permissions with an execution role to create and attach a Lambda execution role that gives the Lambda functions access to the necessary actions for DynamoDB, Amazon Rekognition, and Amazon Bedrock.
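As an illustration only (this is not an exact policy from this post, and it is broader than you should use in production), the following boto3 sketch attaches an inline policy to the execution role covering the actions the Lambda functions call. Scope the Resource entries to your own ARNs, and also attach the AWSLambdaBasicExecutionRole managed policy so the functions can write CloudWatch Logs:
import json
import boto3
iam = boto3.client("iam")
# Hypothetical inline policy; in practice, restrict Resource to your own table, model, and bucket ARNs
policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow",
         "Action": ["dynamodb:Scan", "dynamodb:PutItem", "dynamodb:DeleteItem", "dynamodb:BatchWriteItem"],
         "Resource": "*"},
        {"Effect": "Allow",
         "Action": ["rekognition:DetectCustomLabels", "rekognition:StartProjectVersion",
                    "rekognition:StopProjectVersion", "rekognition:DescribeProjectVersions"],
         "Resource": "*"},
        {"Effect": "Allow",
         "Action": ["bedrock:InvokeModel"],
         "Resource": "*"},
        {"Effect": "Allow",
         "Action": ["s3:GetObject"],
         "Resource": "*"}
    ]
}
# Attach the policy inline to the execution role used by the Lambda functions in this post
iam.put_role_policy(
    RoleName="lambdaDynamoRole",
    PolicyName="FoodSavrLambdaPolicy",
    PolicyDocument=json.dumps(policy_document)
)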
Create the Lambda function to detect ingredients
Complete the following steps to create your first Lambda function (DetectIngredients):
On the Lambda console, choose Functions in the navigation pane.
Choose Create function.
Choose Author from scratch.
Name your function DetectIngredients, select Python 3.12 for Runtime, and choose Create function.
For your Lambda configuration, choose lambdaDynamoRole for Execution role, increase Timeout to 8 seconds, verify the settings, and choose Save.
Replace the text in the Lambda function code with the following sample code and choose Save:
import json
import boto3
import inference
import time

s3 = boto3.client('s3')
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('TestDataTable')
table_name = 'TestDataTable'

def lambda_handler(event, context):
    clearTable()
    labels, label_count = inference.main()

    # The names list will contain all the grocery ingredients detected in the image
    names = []
    for label_dic in labels:
        name = label_dic['Name']
        # Get rid of unnecessary parts of the label string
        if "Food" in name:
            # Remove "Food" from the name
            name = name.replace("Food", "")
        if "In Fridge" in name:
            # Remove "In Fridge" from the name
            name = name.replace("In Fridge", "")
        name = name.strip()
        names.append(name)

    # Loop through the list of grocery ingredients to construct a list called items.
    # The items list is used to batch write up to 25 items at a time (the DynamoDB
    # batch write limit) when batch_write_all is called
    items = []
    for name in names:
        if len(items) < 25:
            items.append({
                'grocery_item': name
            })

    # Remove all duplicates from the list
    seen = set()
    unique_grocery_items = []
    for item in items:
        val = item['grocery_item'].lower().strip()
        if val not in seen:
            unique_grocery_items.append(item)
            seen.add(val)

    batch_write_all(unique_grocery_items)
    # Sentinel item signaling that ingredient detection is complete
    table.put_item(
        Item={
            'grocery_item': "DONE"
        })

def batch_write_all(items):
    batch_write_requests = [{
        'PutRequest': {
            'Item': item
        }
    } for item in items]
    response = dynamodb.batch_write_item(
        RequestItems={
            table_name: batch_write_requests
        }
    )

def clearTable():
    response = table.scan()
    with table.batch_writer() as batch:
        for each in response['Items']:
            batch.delete_item(
                Key={
                    'grocery_item': each['grocery_item']
                }
            )
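The inference module imported above isn't shown in this post. As a purely hypothetical sketch of what it might look like, it calls DetectCustomLabels on the uploaded image and returns the detected labels (the model ARN, bucket, and image key are placeholders):
# inference.py -- hypothetical helper used by the DetectIngredients Lambda function
import boto3

rekognition = boto3.client('rekognition')

MODEL_ARN = "MODEL_ARN"          # placeholder: your Rekognition Custom Labels model ARN
BUCKET = "MY_BUCKET"             # placeholder: the S3 bucket that receives fridge images
IMAGE_KEY = "PATH_TO_MY_IMAGE"   # placeholder: object key of the uploaded image

def main():
    response = rekognition.detect_custom_labels(
        ProjectVersionArn=MODEL_ARN,
        Image={"S3Object": {"Bucket": BUCKET, "Name": IMAGE_KEY}},
        MinConfidence=70  # ignore low-confidence detections
    )
    labels = response["CustomLabels"]  # list of {"Name": ..., "Confidence": ...} dicts
    return labels, len(labels)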
Create a DynamoDB table to store ingredients
Complete the following steps to create your DynamoDB table (a programmatic sketch follows these steps):
On the DynamoDB console, choose Tables in the navigation pane.
Choose Create table.
For Table name, enter MasterGroceryDB.
For Partition key, use grocery_item (string).
Verify that all entries on the page are accurate, leave the rest of the settings as default, and choose Create.
Wait for the table creation to complete and for your table status to change to Active before proceeding to the next step.
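If you prefer to create the table programmatically, the following boto3 sketch is roughly equivalent to these console steps (on-demand billing is an assumption). Note that the Lambda code in this post also reads and writes a table named TestDataTable with the same partition key, so you can create it the same way:
import boto3
dynamodb = boto3.client("dynamodb")
# Create the MasterGroceryDB table with grocery_item as the partition key
dynamodb.create_table(
    TableName="MasterGroceryDB",
    KeySchema=[{"AttributeName": "grocery_item", "KeyType": "HASH"}],
    AttributeDefinitions=[{"AttributeName": "grocery_item", "AttributeType": "S"}],
    BillingMode="PAY_PER_REQUEST"  # on-demand capacity; adjust to your needs
)
# Wait for the table status to change to Active before proceeding
dynamodb.get_waiter("table_exists").wait(TableName="MasterGroceryDB")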
Create the Lambda function to call Amazon Bedrock
Complete the following steps to create another Lambda function that will call the Amazon Bedrock APIs to generate recipes:
On the Lambda console, choose Functions in the navigation pane.
Choose Create function.
Choose Author from scratch.
Name your function GenerateRecipes, choose Python 3.12 for Runtime, and choose Create function.
For your Lambda configuration, choose lambdaDynamoRole for Execution role, increase Timeout to 8 seconds, verify the settings, and choose Save.
Replace the text in the Lambda function code with the following sample code and choose Save:
import json
import boto3
import re
import base64
import image_gen

dynamodb = boto3.resource('dynamodb')
bedrock = boto3.client(service_name='bedrock-runtime')

def get_ingredients(tableName):
    table = dynamodb.Table(tableName)
    response = table.scan()
    data = response['Items']
    # Support for pagination
    while 'LastEvaluatedKey' in response:
        response = table.scan(ExclusiveStartKey=response['LastEvaluatedKey'])
        data.extend(response['Items'])
    data = [g_i for g_i in data if g_i['grocery_item'] != 'DONE']
    return data

# Converts DynamoDB grocery items into a comma-separated string
def convertItemsToString(grocery_dict):
    ingredients_list = []
    for each in grocery_dict:
        ingredients_list.append(each['grocery_item'])
    ingredients_list_str = ", ".join(ingredients_list)
    return ingredients_list_str

def read_prompt():
    with open('Prompt.md', 'r') as f:
        text = f.read()
    return text

# Gets the names of all the recipes generated
def get_recipe_names(response_body):
    recipe_names = []
    for i in range(len(response_body) - 2):
        if response_body[i] == '\n' and response_body[i + 1] == '\n' and response_body[i + 2] == 'R':
            recipe_str = ""
            while i + 2 < len(response_body) and response_body[i + 2] != '\n':
                recipe_str += response_body[i + 2]
                i += 1
            recipe_str = recipe_str.replace("Recipe", "")
            recipe_str = recipe_str.replace(": ", "")
            recipe_str = re.sub(r" \d+", "", recipe_str)
            recipe_names.append(recipe_str)
    return recipe_names

def lambda_handler(event, context):
    # Read the detected and store-provided ingredients from DynamoDB
    user_ingredients_dict = get_ingredients('TestDataTable')
    master_ingredients_dict = get_ingredients('MasterGroceryDB')

    # Convert the ingredients in both tables into comma-separated strings
    user_ingredients_str = convertItemsToString(user_ingredients_dict)
    master_ingredients_str = convertItemsToString(master_ingredients_dict)

    # Read the prompt file
    prompt = read_prompt()

    # Query for recipes using the prompt + ingredients
    content = "Here is a list of ingredients that a person currently has. " + user_ingredients_str + "\n\n And here is a list of ingredients at a local grocery store: " + master_ingredients_str + prompt
    body = json.dumps({
        "max_tokens": 2047,
        "messages": [{"role": "user", "content": content}],
        "anthropic_version": "bedrock-2023-05-31"
    })

    modelId = "anthropic.claude-3-sonnet-20240229-v1:0"
    response = bedrock.invoke_model(body=body, modelId=modelId)
    response_body = json.loads(response.get('body').read())
    response_body_content = response_body.get("content")
    response_body_completion = response_body_content[0]['text']

    recipe_names_list = get_recipe_names(response_body_completion)
    first_image_imgstr = image_gen.image_gen(recipe_names_list[0])
    second_image_imgstr = image_gen.image_gen(recipe_names_list[1])
    third_image_imgstr = image_gen.image_gen(recipe_names_list[2])

    return response_body_completion, first_image_imgstr, second_image_imgstr, third_image_imgstr
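The image_gen module imported above isn't shown in this post. As a hypothetical sketch, it wraps the Stable Diffusion call from earlier in a helper that takes a recipe name and returns the base64-encoded image string:
# image_gen.py -- hypothetical helper used by the GenerateRecipes Lambda function
import json
import boto3

brt = boto3.client(service_name='bedrock-runtime')

def image_gen(recipe_name):
    body = json.dumps({
        "text_prompts": [{"text": recipe_name}],
        "cfg_scale": 10,
        "seed": 20,
        "steps": 50
    })
    response = brt.invoke_model(
        body=body,
        modelId="stability.stable-diffusion-xl-v0",
        accept="application/json",
        contentType="application/json"
    )
    response_body = json.loads(response.get("body").read())
    # Return the base64-encoded image string for the frontend to decode and display
    return response_body["artifacts"][0]["base64"]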
Create an S3 bucket to store the images
Lastly, you create an S3 bucket to store the images you upload, which automatically invokes the DetectIngredients Lambda function after each upload. Complete the following steps to create the bucket and configure the Lambda function (a brief sketch for configuring the lifecycle policy and testing the trigger follows these steps):
On the Amazon S3 console, choose Buckets in the navigation pane.
Choose Create bucket.
Enter a unique bucket name, set the desired Region to us-east-1, and choose Create bucket.
On the Lambda console, navigate to the DetectIngredients function.
On the Configuration tab, choose Add trigger.
Select the trigger type as S3 and choose the bucket you created.
Set Event type to All object create events and choose Add.
On the Amazon S3 console, navigate to the bucket you created.
Under Properties and Event Notifications, choose Create event notification.
Enter an event name (for example, Trigger DetectIngredients) and set the events to All object create events.
For Destination, select Lambda Function and select the DetectIngredients Lambda function.
Choose Save.
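Recall from the solution overview that this bucket should have a lifecycle policy that expires uploaded images after use. The following boto3 sketch configures a one-day expiration (the bucket name and expiration window are placeholders) and then uploads a test image, which invokes DetectIngredients through the trigger you just configured:
import boto3
s3 = boto3.client("s3")
bucket = "my-foodsavr-uploads"  # placeholder: use your unique bucket name
# Expire uploaded fridge images after one day so they aren't retained
s3.put_bucket_lifecycle_configuration(
    Bucket=bucket,
    LifecycleConfiguration={
        "Rules": [{
            "ID": "expire-fridge-images",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},    # apply to all objects in the bucket
            "Expiration": {"Days": 1}
        }]
    }
)
# Uploading an image invokes the DetectIngredients function through the S3 trigger
s3.upload_file("fridge.jpg", bucket, "fridge.jpg")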
Conclusion
In this post, we explored the use of Amazon Rekognition and FMs on Amazon Bedrock with AWS services such as Lambda and DynamoDB to build a comprehensive solution that addresses food waste. Using Rekognition Custom Labels and content generation with models on Amazon Bedrock, this application serves as a working proof of concept for AWS generative AI capabilities.
Stay on the lookout for a follow-up to this post, where we demonstrate using the multi-modal capabilities of FMs such as Anthropic’s Claude v3.1 on Amazon Bedrock to deploy this entire solution end-to-end.
Although we highlighted a food waste use case in this post, we urge you to apply your own use case to this solution. The flexibility of this architecture allows you to adapt these services to multiple scenarios, enabling you to solve a wide range of challenges.
Special thanks to Tommy Xie and Arnav Verma for their contributions to the blog.
About the Authors
Aman Shanbhag is an Associate Specialist Solutions Architect on the ML Frameworks team at Amazon Web Services, where he helps customers and partners with deploying ML training and inference solutions at scale. Before joining AWS, Aman graduated from Rice University with degrees in Computer Science, Mathematics, and Entrepreneurship.
Michael Lue is a Sr. Solution Architect at AWS Canada based out of Toronto. He works with Canadian enterprise customers to accelerate their business through optimization, innovation, and modernization. He is particularly passionate and curious about disruptive technologies like containers and AI/ML. In his spare time, he coaches and plays tennis and enjoys hanging at the beach with his French Bulldog, Marleé.
Vineet Kachhawaha is a Solutions Architect at AWS with expertise in machine learning. He is responsible for helping customers architect scalable, secure, and cost-effective workloads on AWS.