Triggers an asynchronous transcript search within the specified dataset (`dataset_id`), restricted to videos whose metadata matches every entry in `metadata_filters` (e.g., `rating = PG`, `genre = documentary`). When `match_mode="semantic"`, the `text_query` is encoded with the dataset's configured audio/transcript encoder (e.g., sentence transformers, Qwen) and ranked against transcript embeddings via vector similarity. When `match_mode="exact"`, no embedding is computed and the raw `text_query` is matched lexically against transcript text. Returns the Databricks `run_id` immediately; use the status and result endpoints to poll for the final ranked list of transcript segments (each with the parent video, audio segment, timestamp, score, coverage score, and optional moderation score).
Authentication
Authorization: Bearer
Bearer authentication of the form Bearer <token>, where token is your auth token.
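As a minimal sketch of the header format described above, the helper below builds the `Authorization` header from a token. The function name `bearer_headers` is hypothetical, and the `Content-Type` value assumes the request object is sent as JSON, which this page does not state explicitly.

```python
def bearer_headers(token: str) -> dict[str, str]:
    """Build request headers carrying the bearer token (hypothetical helper)."""
    token = token.strip()
    if not token:
        raise ValueError("token must be a non-empty string")
    return {
        # Documented form: "Bearer <token>"
        "Authorization": f"Bearer {token}",
        # Assumption: the request object is serialized as JSON.
        "Content-Type": "application/json",
    }
```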
Request
This endpoint expects an object.
dataset_id: string, Required, format: "uuid"
Dataset to search within
text_query: string, Required, >=2 characters
Query text matched against transcripts. In semantic mode it is encoded into an embedding and ranked via vector similarity; in exact mode it is matched lexically without embedding.
limit: integer, Optional, 1-100, Defaults to 40
Max number of results to return
offset: integer, Optional, >=0, Defaults to 0
Number of results to skip before returning
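One way to page through results is to derive `offset` from a zero-based page index and a fixed page size. The sketch below assumes that approach; the helper name `page_params` is hypothetical, and only the bounds (`limit` 1-100, `offset` >= 0, default `limit` of 40) come from the field descriptions above.

```python
def page_params(page: int, page_size: int = 40) -> dict[str, int]:
    """Translate a zero-based page index into limit/offset (hypothetical helper)."""
    if page < 0:
        raise ValueError("page must be >= 0")
    if not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")
    # offset counts results to skip, so page N starts after N full pages.
    return {"limit": page_size, "offset": page * page_size}
```

For example, `page_params(2, 25)` skips the first 50 results and asks for the next 25.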
metadata_filters: list of objects, Optional
List of filter objects applied against asset metadata fields before scoring. All filters are combined with AND.
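The AND semantics can be illustrated locally. This page does not show the schema of a filter object, so the `{"field": ..., "value": ...}` shape and the equality-only comparison below are assumptions made purely for illustration, as is the function name `matches_all`.

```python
from typing import Any, Mapping, Sequence


def matches_all(
    metadata: Mapping[str, Any],
    filters: Sequence[Mapping[str, Any]],
) -> bool:
    """Return True only if every filter matches (AND semantics).

    Assumption: each filter is {"field": <name>, "value": <expected>} and is
    compared by equality; the real filter schema may differ.
    """
    return all(
        f["field"] in metadata and metadata[f["field"]] == f["value"]
        for f in filters
    )


filters = [
    {"field": "rating", "value": "PG"},
    {"field": "genre", "value": "documentary"},
]
```

Under these assumptions, an asset with `rating = PG` but `genre = drama` is excluded before scoring, because one of the two filters fails.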
skip_moderation: boolean, Optional, Defaults to false
When true, moderation scoring is skipped and moderation_score will be null on all results.
match_mode: enum, Optional
semantic: encode query with the dataset audio/transcript encoder; exact: pass text to the job without embedding for literal phrase matching.
Allowed values: semantic, exact
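The request fields above can be checked client-side before submission. The following is a sketch, not the service's own validation: the function name `build_search_payload` is hypothetical, and `match_mode` is omitted from the payload when unset because this page does not document its default.

```python
import uuid
from typing import Any, Optional

ALLOWED_MATCH_MODES = {"semantic", "exact"}


def build_search_payload(
    dataset_id: str,
    text_query: str,
    *,
    limit: int = 40,
    offset: int = 0,
    metadata_filters: Optional[list[dict[str, Any]]] = None,
    skip_moderation: bool = False,
    match_mode: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble the request object, enforcing the documented constraints."""
    # uuid.UUID raises ValueError if the string cannot be parsed as a UUID.
    dataset_id = str(uuid.UUID(dataset_id))
    if len(text_query) < 2:
        raise ValueError("text_query must be at least 2 characters")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if offset < 0:
        raise ValueError("offset must be >= 0")
    if match_mode is not None and match_mode not in ALLOWED_MATCH_MODES:
        raise ValueError("match_mode must be 'semantic' or 'exact'")

    payload: dict[str, Any] = {
        "dataset_id": dataset_id,
        "text_query": text_query,
        "limit": limit,
        "offset": offset,
        "skip_moderation": skip_moderation,
    }
    # Optional fields are left out when unset so server-side defaults apply.
    if metadata_filters:
        payload["metadata_filters"] = metadata_filters
    if match_mode is not None:
        payload["match_mode"] = match_mode
    return payload
```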
Response
Successful Response
run_id: integer
Databricks job run ID to use for status polling and result retrieval
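Because the search is asynchronous, a client typically polls with the returned `run_id` until the run finishes. The sketch below shows one such loop under stated assumptions: the status endpoint is not documented on this page, so it is represented by a caller-supplied `get_status` callable, and the state names `"SUCCESS"` and `"FAILED"` are hypothetical placeholders rather than values confirmed by the API.

```python
import time
from typing import Callable, Iterable


def wait_for_run(
    run_id: int,
    get_status: Callable[[int], str],
    *,
    terminal_states: Iterable[str] = ("SUCCESS", "FAILED"),  # hypothetical names
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll get_status(run_id) until a terminal state or the timeout is reached."""
    terminal = set(terminal_states)
    deadline = clock() + timeout_s
    while True:
        status = get_status(run_id)
        if status in terminal:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"run {run_id} still {status!r} after {timeout_s}s")
        sleep(interval_s)
```

Injecting `get_status`, `sleep`, and `clock` keeps the loop independent of any particular HTTP client and makes it straightforward to test without network access. Once a terminal state is returned, the ranked transcript segments would be fetched from the result endpoint.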