Video Search

Runs a semantic video search with a natural language text query (e.g., 'person walking in park' or 'cityscape at sunset') within the dataset identified by dataset_id. The text query is encoded into an embedding by the dataset's configured encoder (e.g., Perception Encoder or another vision-language embedding model for the visual modalities, or Qwen or another text encoder for the audio transcript modality) and is then matched against video content embeddings by vector similarity.

The modality parameter controls which embeddings are searched: 'video' (default) searches video-level embeddings for overall video similarity, 'shot' searches shot-level embeddings to find videos with similar scenes, 'image' searches frame-level embeddings to find videos with similar individual frames, and 'audio_speech_to_text' searches transcript text embeddings for spoken content.

All modalities return one result per video, surfacing the most relevant composite slice (shot or scene) and a preview frame. Results can be narrowed with the optional metadata_filters parameter. Each result includes the composite slice with start/end timestamps, frame numbers, relevance scores, and video metadata.
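
To make the "vector similarity" and "one result per video" behavior concrete, here is a small illustrative sketch. It is not the service's implementation: cosine similarity is an assumed metric (the text above says only "vector similarity"), and the function names, tuple layout, and result keys are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_slice_per_video(query_vec, slices):
    """Keep the highest-scoring slice for each video, then rank the videos.

    `slices` is an iterable of (video_id, start_s, end_s, embedding) tuples;
    this layout is hypothetical and chosen only for the illustration.
    """
    best = {}
    for video_id, start_s, end_s, embedding in slices:
        score = cosine(query_vec, embedding)
        if video_id not in best or score > best[video_id]["score"]:
            best[video_id] = {
                "video_id": video_id,
                "start": start_s,
                "end": end_s,
                "score": score,
            }
    # One entry per video, most similar first.
    return sorted(best.values(), key=lambda r: r["score"], reverse=True)
```

In this sketch a video with many matching slices still contributes a single entry, which mirrors the one-result-per-video behavior described above.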

Authentication

Authorization: Bearer

Bearer authentication of the form Bearer <token>, where token is your auth token.
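
As a minimal sketch of attaching the bearer token, the snippet below builds a request object with Python's standard library without sending it. The URL is a placeholder because this page does not state the endpoint path, and the POST method and JSON body are assumptions based on "This endpoint expects an object"; build_request is a hypothetical helper name.

```python
import json
import urllib.request

# Placeholder only: the real endpoint URL is not given in this reference.
SEARCH_URL = "https://api.example.com/video-search"


def build_request(token, payload, url=SEARCH_URL):
    """Return an unsent request carrying the bearer token and a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",  # assumed HTTP method
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to urllib.request.urlopen would send it; keeping construction separate makes the header easy to inspect first.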

Request

This endpoint expects an object.
dataset_id (string, required, format: "uuid")
The unique identifier for the dataset.
text_query (string, required, at least 2 characters)
The natural language search string.
metadata_filters (object or null, optional)
JSON string containing a list of metadata filters.
offset (integer, optional, defaults to 0)
Starting index of the first item to return.
limit (integer, optional, defaults to 60)
Maximum number of items to return (max 1000).
modality (enum or null, optional)
Modality to search. Valid options: "video", "shot", "image", "audio_speech_to_text", "capped-shot-segment". Defaults to "video".
skip_moderation (boolean, optional, defaults to false)
Skip content moderation if enabled.
moderation_score_type (enum, optional)
Type of moderation scores to return when moderation is enabled.
ignore_keyframes (boolean, optional, defaults to false)
Whether to ignore keyframes.
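
The constraints listed above can be checked on the client before a request is sent. The following is a sketch under stated assumptions: build_search_payload is a hypothetical helper, and it validates only what this page documents (UUID-formatted dataset_id, text_query of at least 2 characters, limit of at most 1000, and the listed modality values).

```python
import uuid

# Values copied from the modality description on this page.
ALLOWED_MODALITIES = {
    "video",
    "shot",
    "image",
    "audio_speech_to_text",
    "capped-shot-segment",
}


def build_search_payload(dataset_id, text_query, modality="video",
                         offset=0, limit=60, metadata_filters=None):
    """Validate the documented constraints and return the request object."""
    uuid.UUID(dataset_id)  # raises ValueError if it cannot be parsed as a UUID
    if len(text_query) < 2:
        raise ValueError("text_query must be at least 2 characters")
    if modality not in ALLOWED_MODALITIES:
        raise ValueError(f"unsupported modality: {modality!r}")
    if limit > 1000:
        raise ValueError("limit must not exceed 1000")

    payload = {
        "dataset_id": dataset_id,
        "text_query": text_query,
        "modality": modality,
        "offset": offset,
        "limit": limit,
    }
    # Optional field: include it only when the caller supplies filters.
    if metadata_filters is not None:
        payload["metadata_filters"] = metadata_filters
    return payload
```

For example, build_search_payload("123e4567-e89b-12d3-a456-426614174000", "person walking in park", modality="shot", limit=10) returns a dictionary ready to be serialized as the request body.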

Response

Successful Response
results (list of objects)
List of video search results
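
Because the request takes offset and limit, a client can walk through the results list page by page. The sketch below assumes that a page shorter than limit marks the end of the results, which this page does not state; fetch_page is a hypothetical callable that performs one search call and returns the parsed response object.

```python
def iter_results(fetch_page, limit=60):
    """Yield every item from `results`, advancing offset by `limit` each call.

    `fetch_page(offset, limit)` is a caller-supplied function (hypothetical)
    that returns a dict containing a "results" list.
    """
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        results = page.get("results", [])
        yield from results
        # Assumption: a short (or empty) page means nothing is left.
        if len(results) < limit:
            break
        offset += limit
```

Writing this as a generator lets the caller stop early without requesting pages it does not need.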

Errors

422
Unprocessable Entity Error
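
A client may want to surface a 422 separately from other failures, since it indicates the request object itself was rejected. The sketch below uses Python's standard library; send is a hypothetical helper, and because the error body's schema is not documented here it is passed through as raw text.

```python
import json
import urllib.error
import urllib.request


def send(req, opener=urllib.request.urlopen):
    """Send a prepared request and return the parsed JSON response.

    A 422 is re-raised as ValueError carrying the raw error body;
    any other HTTP error propagates unchanged.
    """
    try:
        with opener(req) as resp:
            return json.loads(resp.read())
    except urllib.error.HTTPError as err:
        if err.code == 422:
            detail = err.read().decode("utf-8", errors="replace")
            raise ValueError(f"request rejected (422): {detail}") from err
        raise
```

The opener argument exists only so the function can be exercised without network access; by default it uses urllib.request.urlopen.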