Text to Hybrid
Authentication
Bearer authentication of the form Bearer <token>, where token is your auth token.
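As a minimal sketch of the header format described above, the snippet below builds the `Authorization` header in Python. The helper name `bearer_header` and the placeholder token are illustrative assumptions, not part of the API; only the `Bearer <token>` form comes from this page.

```python
def bearer_header(token: str) -> dict[str, str]:
    """Build an Authorization header of the form 'Bearer <token>'."""
    # Reject empty tokens and tokens with stray surrounding whitespace,
    # since either would produce a malformed header value.
    if not token or token.strip() != token:
        raise ValueError("token must be non-empty with no surrounding whitespace")
    return {"Authorization": f"Bearer {token}"}


# Placeholder value; substitute your real auth token.
headers = bearer_header("YOUR_AUTH_TOKEN")
```

The resulting dictionary can be passed as the headers argument of whichever HTTP client you use to call the endpoint.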
Executes a hybrid search: takes a natural language text query (e.g., ‘red car in parking lot’ or ‘John Smith speaking at podium’) and searches the dataset identified by dataset_id for image assets, ranked by a combination of visual similarity and metadata text similarity. The query is encoded and matched against two modalities: (1) visual content, using the dataset’s configured encoder (e.g., CLIP, SigLIP, or another vision-language embedding model) to match image appearance, and (2) textual metadata, using sentence transformer embeddings to match captions, descriptions, and other text fields. Results are ranked by a weighted fusion of the per-modality scores and can be filtered by datetime range. The response contains image assets with relevance scores and, optionally, content moderation scores.
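To illustrate what a weighted fusion of per-modality scores with a datetime filter might look like, here is a small sketch. It is not the service's actual ranking code: the `Candidate` fields, the `fuse` function, the default weights, and the normalized weighted average are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Candidate:
    # Hypothetical per-asset record; field names are illustrative only.
    asset_id: str
    visual_score: float   # similarity of the query to the image embedding
    text_score: float     # similarity of the query to the metadata embedding
    captured_at: datetime


def fuse(
    candidates: list[Candidate],
    visual_weight: float = 0.5,
    text_weight: float = 0.5,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
) -> list[tuple[str, float]]:
    """Filter by datetime range, then rank by a normalized weighted average."""
    total = visual_weight + text_weight
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    ranked = []
    for c in candidates:
        # Drop assets outside the optional [start, end] window.
        if start is not None and c.captured_at < start:
            continue
        if end is not None and c.captured_at > end:
            continue
        score = (visual_weight * c.visual_score + text_weight * c.text_score) / total
        ranked.append((c.asset_id, score))
    # Highest fused score first.
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


candidates = [
    Candidate("a", 0.9, 0.2, datetime(2024, 1, 10)),
    Candidate("b", 0.5, 0.8, datetime(2024, 2, 1)),
    Candidate("c", 0.95, 0.9, datetime(2023, 6, 1)),
]
balanced = fuse(candidates, start=datetime(2024, 1, 1))
visual_heavy = fuse(candidates, 0.9, 0.1, start=datetime(2024, 1, 1))
```

With equal weights, asset "b" ranks above "a" because its metadata match is stronger. Shifting the weight toward the visual score reverses the order. Asset "c" is excluded in both cases by the datetime filter.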