Text to Hybrid

Executes a hybrid search: given a natural language text query (e.g., 'red car in parking lot' or 'John Smith speaking at podium') and a dataset (dataset_id), it returns image assets ranked by a combination of visual similarity and metadata text similarity. The text query is encoded and matched across two modalities: (1) visual content, using the dataset's configured vision-language encoder (e.g., CLIP, SigLIP, or another vision-language embedding model) to match image appearance, and (2) textual metadata, using sentence-transformer embeddings to match captions, descriptions, and other text fields. Results are ranked by a weighted fusion of the per-modality scores and can be filtered by datetime range. Returns image assets with relevance scores and optional content moderation scores.
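The weighted fusion of per-modality scores can be sketched as a simple linear combination. The weights below are illustrative only; the actual weighting used by the service is not specified in this reference.

```python
def fuse_scores(visual_score: float, text_score: float,
                visual_weight: float = 0.5, text_weight: float = 0.5) -> float:
    """Combine per-modality similarity scores into one ranking score.

    Assumption: equal default weights. The service's real weights and
    normalization are not documented here.
    """
    return visual_weight * visual_score + text_weight * text_score

# An asset that matches strongly on visual content but weakly on metadata:
score = fuse_scores(visual_score=0.92, text_score=0.40)  # 0.66 with equal weights
```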

Authentication

Authorization

Bearer authentication of the form Bearer &lt;token&gt;, where token is your auth token.

Request

This endpoint expects an object.
dataset_id (string, required, format: uuid)
The unique identifier for the dataset.

text_query (string, required)
The text query to search for.

limit (integer, optional, 1-200, defaults to 100)
Maximum number of items to return from the hybrid search.

datetime_filter (object or null, optional)
Filter results to only include items with timestamps within this datetime range.
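A request body with these fields can be assembled as below. The base URL and token are placeholders, and the endpoint path is not stated in this reference; substitute the values for your deployment.

```python
import json
from typing import Optional

# Placeholders -- replace with your deployment's base URL and auth token.
BASE_URL = "https://api.example.com"
TOKEN = "YOUR_AUTH_TOKEN"

def build_hybrid_search_request(dataset_id: str, text_query: str,
                                limit: int = 100,
                                datetime_filter: Optional[dict] = None) -> dict:
    """Assemble the JSON body for a text-to-hybrid search request."""
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    body = {"dataset_id": dataset_id, "text_query": text_query, "limit": limit}
    if datetime_filter is not None:
        body["datetime_filter"] = datetime_filter
    return body

headers = {
    "Authorization": f"Bearer {TOKEN}",
    "Content-Type": "application/json",
}
body = build_hybrid_search_request(
    dataset_id="123e4567-e89b-12d3-a456-426614174000",
    text_query="red car in parking lot",
    limit=25,
)
payload = json.dumps(body)
```

Omitting datetime_filter when it is None keeps the body minimal, since the field is nullable and optional.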

Response

Successful Response
data (list of objects)
The results from the hybrid search.
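A successful response can be consumed by iterating over data. The per-item field names used here (asset_id, score) are illustrative assumptions; consult the response schema for the exact shape of each result object.

```python
def top_result(response_body: dict) -> dict:
    """Return the highest-scoring item from a hybrid search response.

    Assumes each item carries a 'score' field; this name is illustrative.
    """
    return max(response_body["data"], key=lambda item: item["score"])

# Example response body with hypothetical result fields:
response_body = {
    "data": [
        {"asset_id": "a1", "score": 0.91},
        {"asset_id": "b2", "score": 0.74},
    ]
}

for item in response_body["data"]:
    print(item["asset_id"], item["score"])
```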

Errors

422
Unprocessable Entity Error