Multimodal embeddings at scale: AI data lake for media and entertainment workloads

This post shows you how to build a scalable multimodal video search system that enables natural language search across large video datasets using Amazon Nova models and Amazon OpenSearch Service. You will learn how to move beyond manual tagging and keyword-based searches to enable semantic search that captures the full richness of video content.

We demonstrate this at scale by processing 792,270 videos from two AWS Open Data Registry datasets: Multimedia Commons (787,479 videos, 37-second average) and MEVA (4,791 videos, 5-minute average). Processing 8,480 hours of video content (30.5M seconds) took 41 hours. First-year total cost: $27,328 (with OpenSearch on-demand) or $23,632 (with OpenSearch Service Reserved Instances). The cost consisted of one-time ingestion ($18,088) and annual Amazon OpenSearch Service ($9,240 on-demand or $5,544 Reserved).

The ingestion breakdown is as follows:

Amazon Elastic Compute Cloud (Amazon EC2) compute (4× c7i.48xlarge spot at $2.57/hour × 41 hours): $421
Amazon Bedrock Nova Multimodal Embeddings (30.5M seconds × $0.00056/second batch pricing): $17,096
Nova Pro tagging (792K videos × 600 tokens (avg.)): $571
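The totals above can be reproduced with a few lines of arithmetic. The following sketch uses only the figures quoted in this post; the variable names are ours, and the tagging cost is taken as the quoted total because the per-token price is not stated here.

```python
# Figures copied from the post; only the arithmetic is new.
EC2_INSTANCES = 4
EC2_SPOT_PER_HOUR = 2.57        # USD per c7i.48xlarge spot hour
WALL_CLOCK_HOURS = 41
VIDEO_HOURS = 8_480
EMBEDDING_PER_SECOND = 0.00056  # USD per second of video, batch pricing
TAGGING_COST = 571              # USD, quoted total (per-token price not given)
OPENSEARCH_ON_DEMAND = 9_240    # USD per year
OPENSEARCH_RESERVED = 5_544     # USD per year

ec2_cost = round(EC2_INSTANCES * EC2_SPOT_PER_HOUR * WALL_CLOCK_HOURS)
embedding_cost = round(VIDEO_HOURS * 3600 * EMBEDDING_PER_SECOND)
ingestion_cost = ec2_cost + embedding_cost + TAGGING_COST

print(f"EC2 compute:       ${ec2_cost:,}")
print(f"Embeddings:        ${embedding_cost:,}")
print(f"One-time ingest:   ${ingestion_cost:,}")
print(f"Year 1, on-demand: ${ingestion_cost + OPENSEARCH_ON_DEMAND:,}")
print(f"Year 1, reserved:  ${ingestion_cost + OPENSEARCH_RESERVED:,}")
```

Embedding generation accounts for roughly 95% of the one-time ingestion cost, so total video duration is the number to estimate first when sizing a similar project.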

The solution generates audio-visual embeddings using AUDIO_VIDEO_COMBINED mode (see Nova Multimodal Embeddings API schema), stores them in OpenSearch Service, and supports text-to-video, video-to-video, and hybrid search.

Solution overview

The architecture consists of two main workflows—ingestion and search—that work together to enable multimodal video search at scale:

Video ingestion pipeline:

The ingestion pipeline uses four Amazon EC2 c7i.48xlarge instances with 600 parallel workers to process 19,400 videos per hour. The async API has a concurrency limit of 30 concurrent jobs per account (see Amazon Bedrock quotas), so the pipeline implements a job queue with polling. Workers submit jobs up to the concurrency limit, poll for completion, and submit new jobs as slots become available. Amazon Nova Multimodal Embeddings handles video processing asynchronously, segmenting videos into 15-second chunks (chosen to capture scene changes while keeping embedding counts manageable) and generating 1024-dimensional embeddings. We chose 1024 dimensions over 3072 because it cuts vector storage by 3x with minimal accuracy impact; embedding generation cost does not depend on the dimension. Amazon Nova Pro adds 10-15 descriptive tags per video from a predefined taxonomy.
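To make the 3x storage claim concrete, the following sketch estimates the raw vector payload for both dimensions. It assumes 32-bit floats and treats the segment count as a lower bound (total seconds divided by 15); the real count is higher because most videos end in a partial segment, and the on-disk index also includes the HNSW graph, replicas, and stored source fields.

```python
import math

TOTAL_VIDEO_SECONDS = 8_480 * 3600  # from the post
SEGMENT_SECONDS = 15                # from the post
BYTES_PER_FLOAT32 = 4               # assumption: vectors stored as 32-bit floats


def raw_vector_bytes(num_vectors: int, dimension: int) -> int:
    """Bytes needed for the vector values alone (no graph, no replicas)."""
    return num_vectors * dimension * BYTES_PER_FLOAT32


# Lower bound on the number of segments across the whole dataset.
min_segments = math.ceil(TOTAL_VIDEO_SECONDS / SEGMENT_SECONDS)

for dim in (1024, 3072):
    gb = raw_vector_bytes(min_segments, dim) / 1e9
    print(f"{dim} dimensions: at least {gb:.1f} GB of raw vectors")
```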

Note: Amazon Nova 2 Lite offers improved accuracy at lower cost for tagging tasks. We recommend that you consider it for new deployments.

The system stores embeddings in an OpenSearch k-NN index for semantic search and metadata tags in a separate text index for keyword matching. For search, you can query videos three ways: convert natural language to embeddings for text-to-video search, compare video embeddings directly for video-to-video search, or combine both approaches in hybrid search.

Types of searches enabled by this solution:

Text-to-video Search – Natural language queries converted to embeddings for semantic similarity matching
Video-to-video Search – Find similar content by comparing video embeddings directly
Hybrid search – Combines vector similarity (70% weight) with keyword matching (30% weight) for maximum accuracy

Video ingestion pipeline

The following diagram illustrates the video ingestion and processing pipeline:

Figure 1: Video ingestion pipeline showing the flow from S3 video storage through Nova Multimodal Embeddings and Nova Pro to dual OpenSearch indexes

The video processing workflow is as follows:

Upload videos to Amazon Simple Storage Service (Amazon S3).
Process videos using Nova Multimodal Embeddings async API, which automatically segments videos and generates embeddings. An orchestrator polls for job completion (async API has a 30 concurrent job limit per account, see Amazon Bedrock quotas) and retrieves results from Amazon S3.
Generate descriptive tags using Nova Pro (or Nova 2 Lite for better accuracy at lower cost) from a predefined taxonomy for enhanced search capabilities.
Index embeddings in OpenSearch k-NN index and tags in text index.

Video search architecture

The following diagram shows the complete search architecture:

Figure 2: Video search architecture demonstrating three search modes – text-to-video, video-to-video, and hybrid search combining k-NN and BM25

The search architecture enables three modes:

Text-to-video – Natural language queries
Video-to-video – Similar content discovery
Hybrid – Combined semantic and keyword matching

Prerequisites

Before you begin, you will need:

An AWS account with access to Amazon Bedrock in us-east-1 (Nova models are enabled by default with appropriate IAM permissions)
Python 3.9 or later installed
AWS Command Line Interface (AWS CLI) configured with appropriate credentials
An Amazon OpenSearch Service domain (r6g.large or larger recommended)
An Amazon S3 bucket for video storage and embedding outputs
AWS Identity and Access Management (IAM) permissions for Amazon Bedrock, OpenSearch Service, and Amazon S3

The solution uses:

Amazon Bedrock with Nova Multimodal Embeddings (amazon.nova-2-multimodal-embeddings-v1:0)
Amazon Bedrock with Nova Pro (us.amazon.nova-pro-v1:0) or Nova 2 Lite (us.amazon.nova-2-lite-v1:0) for tagging
Amazon OpenSearch Service 2.11 or later with k-NN plugin
Amazon S3 for video and embedding storage

Walkthrough

Step 1: Create IAM roles and policies

Create an IAM role with permissions to invoke Amazon Bedrock models, write to OpenSearch indexes, and read/write S3 objects.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:StartAsyncInvoke",
        "bedrock:GetAsyncInvoke",
        "bedrock:ListAsyncInvokes"
      ],
      "Resource": [
        "arn:aws:bedrock:us-east-1::foundation-model/amazon.nova-2-multimodal-embeddings-v1:0",
        "arn:aws:bedrock:us-east-1:ACCOUNT_ID:async-invoke/*",
        "arn:aws:bedrock:us-east-1:ACCOUNT_ID:inference-profile/us.amazon.nova-pro-v1:0",
        "arn:aws:bedrock:*::foundation-model/amazon.nova-pro-v1:0"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "es:ESHttpPost",
        "es:ESHttpPut",
        "es:ESHttpGet"
      ],
      "Resource": "arn:aws:es:us-east-1:ACCOUNT_ID:domain/DOMAIN_NAME/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::amzn-s3-demo-video-bucket/*",
        "arn:aws:s3:::amzn-s3-demo-embedding-bucket/*"
      ]
    }
  ]
}
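If you prefer to create the policy from code, the following is a minimal sketch. The function names `build_policy` and `create_policy` are ours, the Bedrock resource ARNs are passed in by the caller rather than assumed, and the action list is limited to the three Bedrock actions that the walkthrough code calls. `create_policy` expects a boto3 IAM client (for example, `boto3.client("iam")`).

```python
import json


def build_policy(account_id, domain_name, video_bucket, embedding_bucket,
                 bedrock_resources, region="us-east-1"):
    """Return the policy document as a dict with placeholders filled in."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:StartAsyncInvoke",
                    "bedrock:GetAsyncInvoke",
                ],
                "Resource": list(bedrock_resources),
            },
            {
                "Effect": "Allow",
                "Action": ["es:ESHttpPost", "es:ESHttpPut", "es:ESHttpGet"],
                "Resource": f"arn:aws:es:{region}:{account_id}:domain/{domain_name}/*",
            },
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": [
                    f"arn:aws:s3:::{video_bucket}/*",
                    f"arn:aws:s3:::{embedding_bucket}/*",
                ],
            },
        ],
    }


def create_policy(iam_client, policy_name, document):
    """Create a managed policy and return its ARN."""
    response = iam_client.create_policy(
        PolicyName=policy_name,
        PolicyDocument=json.dumps(document),
    )
    return response["Policy"]["Arn"]
```

Building the document in code keeps the account ID, domain name, and bucket names in one place, which helps when the same pipeline is deployed to more than one account.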

Step 2: Set up OpenSearch Service indexes

Create two OpenSearch Service indexes: one for vector embeddings (k-NN) and one for text metadata. This architecture supports semantic search and hybrid queries.

from opensearchpy import OpenSearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth
import boto3

session = boto3.Session()
credentials = session.get_credentials()
awsauth = AWS4Auth(
    credentials.access_key,
    credentials.secret_key,
    session.region_name,
    'es',
    session_token=credentials.token
)

opensearch_client = OpenSearch(
    hosts=[{'host': 'YOUR_OPENSEARCH_ENDPOINT', 'port': 443}],
    http_auth=awsauth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection
)

# Create k-Nearest Neighbors (k-NN) index for embeddings
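knn_index_body = {
    "settings": {
        "index.knn": True,
        "number_of_shards": 2,
        "number_of_replicas": 1
    },
    "mappings": {
        "properties": {
            "video_id": {"type": "keyword"},
            "segment_index": {"type": "integer"},
            "timestamp": {"type": "float"},
            "embedding": {
                "type": "knn_vector",
                "dimension": 1024,
                "method": {
                    "name": "hnsw",
                    # "cosinesimil" is the k-NN plugin's name for cosine similarity.
                    # The faiss engine accepts it only on recent OpenSearch versions;
                    # on older domains, use the "lucene" engine or index normalized
                    # vectors with "innerproduct".
                    "space_type": "cosinesimil",
                    "engine": "faiss"
                }
            },
            "s3_uri": {"type": "keyword"}
        }
    }
}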

opensearch_client.indices.create(
    index="video-embeddings-knn",
    body=knn_index_body
)

# Create text index for metadata
text_index_body = {
    "settings": {
        "number_of_shards": 2,
        "number_of_replicas": 1
    },
    "mappings": {
        "properties": {
            "video_id": {"type": "keyword"},
            "segment_index": {"type": "integer"},
            "tags": {"type": "text", "analyzer": "standard"}
        }
    }
}

opensearch_client.indices.create(
    index="video-embeddings-text",
    body=text_index_body
)
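Calling `indices.create` a second time fails because the index already exists. A small guard makes the setup script safe to re-run. This is a sketch; `ensure_index` is a helper name we introduce here, and it relies only on the `indices.exists` and `indices.create` calls of the opensearch-py client.

```python
def ensure_index(client, name, body):
    """Create the index if it is missing. Returns True when it was created."""
    if client.indices.exists(index=name):
        return False
    client.indices.create(index=name, body=body)
    return True
```

With the client and index bodies defined above, `ensure_index(opensearch_client, "video-embeddings-knn", knn_index_body)` can replace the direct `indices.create` call.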

Step 3: Process videos with Nova Multimodal Embeddings

The Amazon Bedrock async API processes videos and generates embeddings. It segments videos into 15-second chunks and combines audio and visual information.

import boto3
import json
import time

bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')

def generate_video_embeddings(video_s3_uri, output_s3_uri):
    """Generate embeddings for a video using Nova MME async API."""
    
    # Start async job
    response = bedrock.start_async_invoke(
        modelId="amazon.nova-2-multimodal-embeddings-v1:0",
        modelInput={
            "taskType": "SEGMENTED_EMBEDDING",
            "segmentedEmbeddingParams": {
                "embeddingPurpose": "GENERIC_INDEX",
                "embeddingDimension": 1024,
                "video": {
                    "format": "mp4",
                    "embeddingMode": "AUDIO_VIDEO_COMBINED",
                    "source": {"s3Location": {"uri": video_s3_uri}},
                    "segmentationConfig": {"durationSeconds": 15}
                }
            }
        },
        outputDataConfig={"s3OutputDataConfig": {"s3Uri": output_s3_uri}}
    )
    
    # Poll for completion
    invocation_arn = response["invocationArn"]
    while True:
        job = bedrock.get_async_invoke(invocationArn=invocation_arn)
        if job["status"] == "Completed":
            return read_embeddings_from_s3(job["outputDataConfig"]["s3OutputDataConfig"]["s3Uri"])
        elif job["status"] in ["Failed", "Expired"]:
            raise RuntimeError(f"Job failed: {job.get('failureMessage')}")
        time.sleep(10)

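def build_model_input(video_s3_uri):
    """Return the SEGMENTED_EMBEDDING request body used in generate_video_embeddings()."""
    return {
        "taskType": "SEGMENTED_EMBEDDING",
        "segmentedEmbeddingParams": {
            "embeddingPurpose": "GENERIC_INDEX",
            "embeddingDimension": 1024,
            "video": {
                "format": "mp4",
                "embeddingMode": "AUDIO_VIDEO_COMBINED",
                "source": {"s3Location": {"uri": video_s3_uri}},
                "segmentationConfig": {"durationSeconds": 15}
            }
        }
    }

def manage_concurrent_jobs(bedrock_client, video_queue, max_concurrent=30):
    """Keep up to max_concurrent async jobs in flight (the account quota is 30).

    Each video_queue item is a dict. 'output_uri' is the S3 output location;
    'video_uri' is an assumed key that holds the source video's S3 URI.
    """
    active_jobs = {}
    
    while video_queue or active_jobs:
        # Submit new jobs until the concurrency limit is reached
        while len(active_jobs) < max_concurrent and video_queue:
            video_info = video_queue.pop(0)
            response = bedrock_client.start_async_invoke(
                modelId="amazon.nova-2-multimodal-embeddings-v1:0",
                modelInput=build_model_input(video_info['video_uri']),
                outputDataConfig={"s3OutputDataConfig": {"s3Uri": video_info['output_uri']}}
            )
            active_jobs[response["invocationArn"]] = video_info
        
        # Poll all active jobs
        for arn in list(active_jobs.keys()):
            job = bedrock_client.get_async_invoke(invocationArn=arn)
            if job["status"] == "Completed":
                video_info = active_jobs.pop(arn)
                embeddings = read_embeddings_from_s3(job["outputDataConfig"]["s3OutputDataConfig"]["s3Uri"])
                # Process embeddings...
            elif job["status"] in ["Failed", "Expired"]:
                # Free the slot and report the failure instead of dropping it silently
                video_info = active_jobs.pop(arn)
                print(f"Job failed for {video_info['output_uri']}: {job.get('failureMessage')}")
        
        if active_jobs:
            time.sleep(10)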

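def read_embeddings_from_s3(s3_uri):
    """Read JSONL embeddings from S3. Returns list of {startTime, endTime, embedding} dicts."""
    # Assumption: s3_uri is the job's output prefix and the results are stored as
    # one or more .jsonl objects under it. Listing requires s3:ListBucket on the bucket.
    s3 = boto3.client('s3')
    bucket, _, prefix = s3_uri.replace("s3://", "", 1).partition("/")
    records = []
    for page in s3.get_paginator('list_objects_v2').paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get('Contents', []):
            if obj['Key'].endswith('.jsonl'):
                body = s3.get_object(Bucket=bucket, Key=obj['Key'])['Body'].read().decode('utf-8')
                # One JSON document per line
                records.extend(json.loads(line) for line in body.splitlines() if line.strip())
    return records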
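To get a feel for how many embeddings a video produces, the following sketch splits a duration into fixed 15-second windows with a shorter final window. This is an illustration only: the service decides the actual segment boundaries, and `segment_bounds` is a helper name we introduce here.

```python
def segment_bounds(duration_seconds, segment_seconds=15):
    """Return (start, end) pairs covering the video in fixed-size windows."""
    bounds = []
    start = 0.0
    while start < duration_seconds:
        end = min(start + segment_seconds, duration_seconds)
        bounds.append((start, end))
        start = end
    return bounds


# A 37-second clip (the Multimedia Commons average) yields three segments.
print(segment_bounds(37))
# A 5-minute clip (the MEVA average) yields twenty.
print(len(segment_bounds(300)))
```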

Step 4: Generate metadata tags with Nova Pro or Nova 2 Lite

Generate descriptive tags for videos using Nova Pro (or Nova 2 Lite for better accuracy at lower cost) to enable hybrid search that combines semantic and keyword matching.

VALID_TAGS = [
    "person", "vehicle", "animal", "building", "nature", "indoor", "outdoor",
    "walking", "running", "sitting", "standing", "talking", "driving",
    "day", "night", "sunny", "cloudy", "urban", "rural", "beach", "forest",
    "sports", "music", "food", "technology", "crowd", "solo"
]

def generate_tags(video_s3_uri, sample_frame_count=3):
    """Generate descriptive tags using Nova Pro or Nova Lite."""
    
    prompt = f"""Analyze this video and select 10-15 tags from this predefined list that best describe the content:
{', '.join(VALID_TAGS)}

Only return tags from this list as a comma-separated list. Do not invent new tags."""
    
    response = bedrock.converse(
        modelId="us.amazon.nova-pro-v1:0",  # Or use us.amazon.nova-2-lite-v1:0
        messages=[{
            "role": "user",
            "content": [{
                "video": {
                    "format": "mp4",
                    "source": {"s3Location": {"uri": video_s3_uri}}
                }
            }, {
                "text": prompt
            }]
        }]
    )
    
    # Parse tags from response and validate against taxonomy
    tags_text = response['output']['message']['content'][0]['text']
    tags = [tag.strip().lower() for tag in tags_text.split(',')]
    
    # Filter to only valid tags from our taxonomy
    valid_tags = [tag for tag in tags if tag in VALID_TAGS]
    
    return valid_tags
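Model output does not always come back as a clean comma-separated line; it can include line breaks, a trailing period, or repeated tags. The following sketch shows a slightly more forgiving parser. `parse_tags` is a name we introduce here, and it could stand in for the split-and-filter lines at the end of `generate_tags`.

```python
import re


def parse_tags(tags_text, valid_tags):
    """Split model output into taxonomy tags, in order, without duplicates."""
    allowed = set(valid_tags)
    result = []
    # Accept commas or newlines as separators.
    for raw in re.split(r"[,\n]", tags_text):
        tag = raw.strip().strip(".").lower()
        if tag in allowed and tag not in result:
            result.append(tag)
    return result


print(parse_tags("Person, beach.\nunicorn, person", ["person", "beach", "night"]))
```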

Step 5: Index embeddings and tags in OpenSearch Service

Store the generated embeddings and tags in OpenSearch Service using bulk indexing for efficiency.

from opensearchpy import helpers

def index_video_data(video_id, s3_uri, embeddings, tags):
    """Index embeddings and tags in OpenSearch."""
    
    # Prepare bulk actions for k-NN index
    knn_actions = []
    for idx, emb in enumerate(embeddings):
        doc_id = f"{video_id}_{idx}"
        knn_actions.append({
            "_index": "video-embeddings-knn",
            "_id": doc_id,
            "_source": {
                "video_id": video_id,
                "segment_index": idx,
                # Key name matches the read_embeddings_from_s3() docstring
                "timestamp": emb['startTime'],
                "embedding": emb['embedding'],
                "s3_uri": s3_uri
            }
        })
    
    # Bulk index embeddings
    helpers.bulk(opensearch_client, knn_actions)
    
    # Prepare bulk actions for text index
    text_actions = []
    for idx in range(len(embeddings)):
        doc_id = f"{video_id}_{idx}"
        text_actions.append({
            "_index": "video-embeddings-text",
            "_id": doc_id,
            "_source": {
                "video_id": video_id,
                "segment_index": idx,
                "tags": " ".join(tags)
            }
        })
    
    # Bulk index tags
    helpers.bulk(opensearch_client, text_actions)
    
    print(f"Indexed {len(embeddings)} segments for video {video_id}")
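For long videos, building both action lists in memory is unnecessary, because `helpers.bulk` accepts any iterable. The following sketch yields the actions lazily for both indexes. `iter_bulk_actions` is a name we introduce here, and it assumes each embedding record carries `startTime` and `embedding` keys, as described in the `read_embeddings_from_s3` docstring.

```python
def iter_bulk_actions(video_id, s3_uri, embeddings, tags):
    """Yield one k-NN action and one text action per segment."""
    tag_text = " ".join(tags)
    for idx, emb in enumerate(embeddings):
        doc_id = f"{video_id}_{idx}"
        yield {
            "_index": "video-embeddings-knn",
            "_id": doc_id,
            "_source": {
                "video_id": video_id,
                "segment_index": idx,
                "timestamp": emb["startTime"],
                "embedding": emb["embedding"],
                "s3_uri": s3_uri,
            },
        }
        yield {
            "_index": "video-embeddings-text",
            "_id": doc_id,
            "_source": {
                "video_id": video_id,
                "segment_index": idx,
                "tags": tag_text,
            },
        }
```

You can pass the generator straight to the bulk helper, for example `helpers.bulk(opensearch_client, iter_bulk_actions(video_id, s3_uri, embeddings, tags), chunk_size=500)`. The chunk size of 500 is an arbitrary starting point, not a measured optimum.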

Step 6: Implement search functionality

After ingestion completes, search the indexed videos three ways. The implementation targets low-latency queries.

Initialize OpenSearch Service client for search

First, create the OpenSearch Service client for search operations:

from opensearchpy import OpenSearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth
import boto3
import json  # used by the search functions below to build and parse Bedrock requests

def create_opensearch_client():
    """Create OpenSearch client with AWS authentication."""
    session = boto3.Session(region_name='us-east-1')
    credentials = session.get_credentials()
    awsauth = AWS4Auth(
        credentials.access_key,
        credentials.secret_key,
        'us-east-1',
        'es',
        session_token=credentials.token
    )
    
    return OpenSearch(
        hosts=[{'host': 'YOUR_OPENSEARCH_ENDPOINT', 'port': 443}],
        http_auth=awsauth,
        use_ssl=True,
        verify_certs=True,
        connection_class=RequestsHttpConnection,
        timeout=30
    )

# Create client
opensearch_client = create_opensearch_client()

Text-to-video semantic search

Convert natural language queries to embeddings using the sync API, then perform a k-NN similarity search:

def search_text_to_video(query_text, opensearch_client, k=10):
    """Search videos using natural language query converted to embedding."""
    
    bedrock_client = boto3.client('bedrock-runtime', region_name='us-east-1')
    
    # Use SINGLE_EMBEDDING task type for text-to-embedding conversion
    # VIDEO_RETRIEVAL purpose optimizes embeddings for searching video content
    request_body = {
        "taskType": "SINGLE_EMBEDDING",
        "singleEmbeddingParams": {
            "embeddingPurpose": "VIDEO_RETRIEVAL",
            "embeddingDimension": 1024,
            "text": {
                "truncationMode": "END",
                "value": query_text
            }
        }
    }
    
    response = bedrock_client.invoke_model(
        modelId='amazon.nova-2-multimodal-embeddings-v1:0',
        body=json.dumps(request_body),
        accept='application/json',
        contentType='application/json'
    )
    
    response_body = json.loads(response['body'].read())
    # Response structure: {"embeddings": [{"embeddingType": "TEXT", "embedding": [...]}]}
    query_embedding = response_body['embeddings'][0]['embedding']
    
    # Perform k-NN search against video embeddings
    search_body = {
        "query": {
            "knn": {
                "embedding": {
                    "vector": query_embedding,
                    "k": k
                }
            }
        },
        "size": k,
        "_source": ["video_id", "segment_index", "timestamp", "s3_uri"]
    }
    
    response = opensearch_client.search(
        index="video-embeddings-knn",
        body=search_body
    )
    
    # Extract results
    return [{'score': hit['_score'], 
             'video_id': hit['_source']['video_id'],
             'segment_index': hit['_source']['segment_index'],
             'timestamp': hit['_source'].get('timestamp', 0)} 
            for hit in response['hits']['hits']]
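Because the index stores one document per 15-second segment, a single video can occupy several of the top results. If you want one row per video, you can keep only the best-scoring segment for each. The following is a sketch that works on the list returned by `search_text_to_video`; `best_segment_per_video` is a name we introduce here.

```python
def best_segment_per_video(results):
    """Collapse segment-level results to the top-scoring segment of each video."""
    best = {}
    for r in results:
        current = best.get(r["video_id"])
        if current is None or r["score"] > current["score"]:
            best[r["video_id"]] = r
    return sorted(best.values(), key=lambda r: r["score"], reverse=True)


sample = [
    {"score": 0.91, "video_id": "a", "segment_index": 2, "timestamp": 30.0},
    {"score": 0.88, "video_id": "a", "segment_index": 3, "timestamp": 45.0},
    {"score": 0.85, "video_id": "b", "segment_index": 0, "timestamp": 0.0},
]
for r in best_segment_per_video(sample):
    print(r["video_id"], r["segment_index"], r["score"])
```

When you collapse results this way, consider requesting a larger `k` from OpenSearch so that enough distinct videos remain after grouping.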

Text search with BM25 (keyword matching)

Use the OpenSearch BM25 scoring for keyword matching on tags without generating embeddings:

def search_text_bm25(search_term, opensearch_client, k=10):
    """Search videos using BM25 keyword matching on tags field."""
    
    # Search text index using match query on tags
    search_body = {
        "query": {
            "match": {
                "tags": search_term
            }
        },
        "size": k,
        "_source": ["video_id", "segment_index", "tags"]
    }
    
    response = opensearch_client.search(
        index="video-embeddings-text",
        body=search_body
    )
    
    return response['hits']['hits']  # Extract results (same pattern as above)
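By default, a `match` query returns documents that contain any of the query terms. For multi-word tag searches, you might want every term to be present. The following sketch builds the request body either way; `build_tag_query` is a name we introduce here, and the `operator` option is part of the standard `match` query syntax.

```python
def build_tag_query(search_term, k=10, require_all=False):
    """Build a BM25 query on the tags field, optionally requiring every term."""
    match = {"query": search_term}
    if require_all:
        match["operator"] = "and"
    return {
        "query": {"match": {"tags": match}},
        "size": k,
        "_source": ["video_id", "segment_index", "tags"],
    }


print(build_tag_query("beach night", k=5, require_all=True))
```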

Video-to-video search

Retrieve an existing video’s embedding from OpenSearch Service and search for similar content—no Amazon Bedrock API call needed:

def search_video_to_video(query_video_id, query_segment_index, opensearch_client, k=10):
    """Find similar videos using a reference video segment."""
    
    # Get the embedding from the reference video segment
    sample_query = {
        "query": {
            "bool": {
                "must": [
                    {"term": {"video_id": query_video_id}},
                    {"term": {"segment_index": query_segment_index}}
                ]
            }
        },
        "_source": ["video_id", "segment_index", "embedding"]
    }
    
    sample_response = opensearch_client.search(
        index="video-embeddings-knn",
        body=sample_query
    )
    
    if not sample_response['hits']['hits']:
        return []
    
    sample_doc = sample_response['hits']['hits'][0]['_source']
    query_embedding = sample_doc.get('embedding')
    
    # Perform k-NN search with the embedding
    search_body = {
        "query": {
            "knn": {
                "embedding": {
                    "vector": query_embedding,
                    "k": k
                }
            }
        },
        "size": k,
        "_source": ["video_id", "segment_index", "timestamp"]
    }
    
    response = opensearch_client.search(
        index="video-embeddings-knn",
        body=search_body
    )
    
    return response['hits']['hits']  # Extract results as needed
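The reference segment is itself in the index, so it usually comes back as the top hit. The following sketch filters it out of the raw hits returned by `search_video_to_video`; `drop_query_segment` is a name we introduce here.

```python
def drop_query_segment(hits, query_video_id, query_segment_index):
    """Remove the reference segment from a list of raw OpenSearch hits."""
    return [
        hit for hit in hits
        if not (
            hit["_source"]["video_id"] == query_video_id
            and hit["_source"]["segment_index"] == query_segment_index
        )
    ]
```

If you need exactly `k` results after filtering, request `k + 1` from OpenSearch and trim the list afterward.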

Hybrid search

Combine semantic k-NN and BM25 keyword matching by retrieving results from both indexes and merging with weighted scoring:

def search_hybrid(query_text, opensearch_client, k=10, vector_weight=0.7, text_weight=0.3):
    """Hybrid search combining k-NN semantic search and BM25 text matching."""
    
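    # Generate the query embedding (same request as search_text_to_video above)
    bedrock_client = boto3.client('bedrock-runtime', region_name='us-east-1')
    response = bedrock_client.invoke_model(
        modelId='amazon.nova-2-multimodal-embeddings-v1:0',
        body=json.dumps({
            "taskType": "SINGLE_EMBEDDING",
            "singleEmbeddingParams": {
                "embeddingPurpose": "VIDEO_RETRIEVAL",
                "embeddingDimension": 1024,
                "text": {"truncationMode": "END", "value": query_text}
            }
        }),
        accept='application/json',
        contentType='application/json'
    )
    query_embedding = json.loads(response['body'].read())['embeddings'][0]['embedding']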
    
    # Get k-NN results (same query as search_text_to_video)
    knn_response = opensearch_client.search(
        index="video-embeddings-knn",
        body={"query": {"knn": {"embedding": {"vector": query_embedding, "k": 20}}}, "size": 20}
    )
    
    # Get BM25 text results (same query as search_text_bm25)
    text_response = opensearch_client.search(
        index="video-embeddings-text",
        body={"query": {"match": {"tags": query_text}}, "size": 20}
    )
    
    # Combine results with weighted scoring
    knn_hits = knn_response['hits']['hits']
    text_hits = text_response['hits']['hits']
    
    combined = {}
    
    for hit in knn_hits:
        vid = hit['_source']['video_id']
        seg = hit['_source']['segment_index']
        key = f"{vid}_{seg}"
        combined[key] = {
            'video_id': vid,
            'segment_index': seg,
            'tags': hit['_source'].get('tags', ''),
            'vector_score': hit['_score'],
            'text_score': 0,
            'combined_score': hit['_score'] * vector_weight
        }
    
    for hit in text_hits:
        vid = hit['_source']['video_id']
        seg = hit['_source']['segment_index']
        key = f"{vid}_{seg}"
        if key in combined:
            combined[key]['text_score'] = hit['_score']
            combined[key]['combined_score'] += hit['_score'] * text_weight
        else:
            combined[key] = {
                'video_id': vid,
                'segment_index': seg,
                'tags': hit['_source'].get('tags', ''),
                'vector_score': 0,
                'text_score': hit['_score'],
                'combined_score': hit['_score'] * text_weight
            }
    
    # Sort by combined score and return top k
    sorted_results = sorted(combined.values(), key=lambda x: x['combined_score'], reverse=True)[:k]
    
    return sorted_results

# Usage example - search with natural language query
query = "person walking on beach at sunset"
hybrid_results = search_hybrid(query, opensearch_client, k=10)

for r in hybrid_results:
    print(f"Combined: {r['combined_score']:.4f} (Vector: {r['vector_score']:.4f}, Text: {r['text_score']:.4f})")
    print(f"  Video: {r['video_id']}, Segment: {r['segment_index']}")
    print(f"  Tags: {r['tags']}\n")

Search performance at scale

After indexing all 792,218 videos, we measured search performance across all three methods.

The measured query latencies at 792,218 videos are as follows:

Semantic k-NN search: ~76ms (using HNSW logarithmic scaling)
BM25 text search: ~30ms
Hybrid search: ~106ms

After indexing and storing all 792,218 videos and generating embeddings, the storage requirements are as follows:

k-NN index: 28.8 GB for 792K videos
Text index: 1.0 GB for 792K videos
Total: 29.8 GB (manageable on modern OpenSearch clusters)

The Hierarchical Navigable Small World (HNSW) algorithm used for k-NN search provides logarithmic time complexity, which means search times grow slowly as the dataset increases. All three search methods maintain sub-200 ms response times even at 792K video scale, meeting production requirements for interactive search applications.
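This post does not describe how the latencies were collected. If you want to take comparable measurements on your own domain, the following sketch shows one simple approach: time repeated calls and report the median and the maximum. `measure_latency_ms` is a name we introduce here, and the number of runs is arbitrary.

```python
import statistics
import time


def measure_latency_ms(search_fn, runs=20):
    """Call search_fn repeatedly and summarize wall-clock latency in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        search_fn()
        samples.append((time.perf_counter() - start) * 1000)
    return {
        "runs": runs,
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }
```

For example, `measure_latency_ms(lambda: search_text_bm25("beach", opensearch_client))` times the BM25 path end to end, including network round trips from the client.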

Things to know

Performance and cost considerations

Video processing time depends on video length. In our testing, a 45-second video took approximately 70 seconds to process using the async API. The processing includes automatic segmentation, embedding generation for each segment, and output to Amazon S3. Search operations scale efficiently: our testing shows that even at 792K videos, semantic search completes in under 80 ms, text search in about 30 ms, and hybrid search in under 110 ms.

Use 1024-dimensional embeddings instead of 3072 to reduce storage costs while maintaining accuracy. Nova Multimodal Embeddings charges per second of video input ($0.00056/second batch), so video duration, not embedding dimension or segmentation, determines processing cost. The async API is more cost-effective than processing frames individually. For OpenSearch Service, using r6g instances provides better price-performance than earlier instance types, and you can implement tiering to move cold data to Amazon S3 for additional savings.

Scaling to production

For production deployments with large video libraries, consider using AWS Batch to process videos in parallel across multiple compute instances. You can partition your video dataset and assign subsets to different workers. Monitor OpenSearch Service cluster health and scale data nodes as your index grows. The two-index architecture scales well because k-NN and text searches can be optimized independently.
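Partitioning the dataset can be as simple as dealing video keys out to workers in round-robin order, which keeps the shares within one item of each other. The following is a sketch; `partition` is a name we introduce here, and in practice you might also balance by video duration rather than by count.

```python
def partition(items, num_workers):
    """Split items into num_workers round-robin shares."""
    return [items[i::num_workers] for i in range(num_workers)]


videos = [f"s3://amzn-s3-demo-video-bucket/video_{i}.mp4" for i in range(10)]
for worker_id, share in enumerate(partition(videos, 4)):
    print(worker_id, len(share))
```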

Search accuracy tuning

Tune hybrid search weights based on your use case. The default 0.7/0.3 split (vector/text) favors semantic similarity for most scenarios. If you have high-quality metadata tags, increasing the text weight to 0.5 can improve results. We recommend that you test different configurations with your specific content to find a balance.
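One thing to keep in mind when tuning weights: k-NN scores and BM25 scores are on different scales, so a fixed 0.7/0.3 split applied to raw scores can let one side dominate. A common remedy is to normalize each result list before weighting. The following sketch uses min-max normalization; it is one option rather than the method used for the measurements in this post, and `min_max` and `fuse` are names we introduce here.

```python
def min_max(scores):
    """Rescale a {key: score} mapping to the 0..1 range."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {key: 1.0 for key in scores}
    return {key: (value - low) / (high - low) for key, value in scores.items()}


def fuse(vector_scores, text_scores, vector_weight=0.7, text_weight=0.3):
    """Combine two score mappings after normalizing each one independently."""
    v = min_max(vector_scores)
    t = min_max(text_scores)
    fused = {
        key: vector_weight * v.get(key, 0.0) + text_weight * t.get(key, 0.0)
        for key in set(v) | set(t)
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


vector_scores = {"a_0": 0.9, "b_0": 0.5}   # example k-NN scores
text_scores = {"b_0": 12.0, "c_0": 4.0}    # example BM25 scores
for key, score in fuse(vector_scores, text_scores):
    print(key, round(score, 3))
```

The keys here follow the `video_id_segment_index` pattern used by `search_hybrid`, so the two mappings can be built directly from its `knn_hits` and `text_hits` lists.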

Cleanup

To avoid ongoing charges, delete the resources that you created:

Delete the OpenSearch Service domain from the Amazon OpenSearch Service console
Empty and delete the S3 buckets used for videos and embeddings
Delete any IAM roles created specifically for this solution

Note that Amazon Bedrock charges are based on usage, so no cleanup is needed for the Amazon Bedrock models themselves.

Conclusion

This walkthrough covered building a multimodal video search system for natural language queries across video content. The solution uses Amazon Bedrock Nova models to generate embeddings that capture both audio and visual information, stores them efficiently in OpenSearch Service using a two-index architecture, and provides three search modes for different use cases. The async processing approach scales to handle large video libraries, and the hybrid search capability combines semantic and keyword-based matching for maximum accuracy. You can extend this foundation by implementing caching for frequently searched queries or integrating with AWS Batch for parallel processing of large datasets.

To learn more about the technologies used in this solution, see Amazon Nova Multimodal Embeddings and Hybrid Search with Amazon OpenSearch Service.
