Image to Keyframes
Authentication
Bearer authentication, sent in the Authorization header in the form Bearer <token>, where <token> is your auth token.
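As a minimal sketch of the header format described above, the following helper builds the request headers for a bearer token. The function name `auth_headers` and the placeholder token are illustrative assumptions, not part of the API.

```python
def auth_headers(token: str) -> dict[str, str]:
    """Build HTTP headers carrying a bearer token.

    `auth_headers` is a hypothetical helper name; only the
    "Bearer <token>" format comes from the documentation.
    """
    if not token:
        raise ValueError("an auth token is required")
    return {"Authorization": f"Bearer {token}"}


# Placeholder value; substitute your real auth token.
headers = auth_headers("YOUR_AUTH_TOKEN")
```

The resulting dictionary can be passed as the headers argument of whichever HTTP client you use.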
Runs a visual similarity search: given an input image (supplied via public_url or upload_id), it searches the dataset identified by dataset_id for visually similar video frames. The input image is encoded into an embedding with the dataset’s configured encoder (e.g., Perception Encoder or another vision-language embedding model), and that embedding is compared against every video frame in the dataset using vector similarity. The response is a list of video frames ranked by similarity score. Each result includes the frame ID, the parent video ID, the similarity score, and, optionally, composite slice information (e.g., which shot or scene the frame belongs to).
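To illustrate the ranking step conceptually, here is a small sketch that scores frame embeddings against a query embedding and returns them in order of similarity. It is not the service's implementation: cosine similarity is an assumed metric (the documentation only says "vector similarity"), and the names `FrameHit`, `cosine`, and `rank_frames`, along with the toy two-dimensional embeddings, are hypothetical.

```python
import math
from typing import NamedTuple, Sequence


class FrameHit(NamedTuple):
    """One ranked result (hypothetical shape mirroring the described fields)."""
    frame_id: str
    video_id: str
    score: float


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_frames(query, frames, top_k=10):
    """Score every (frame_id, video_id, embedding) triple and keep the top_k best."""
    hits = [FrameHit(fid, vid, cosine(query, emb)) for fid, vid, emb in frames]
    hits.sort(key=lambda h: h.score, reverse=True)  # highest similarity first
    return hits[:top_k]


# Toy data: in practice the embeddings come from the dataset's encoder.
frames = [
    ("f1", "v1", [1.0, 0.0]),
    ("f2", "v1", [0.0, 1.0]),
    ("f3", "v2", [1.0, 1.0]),
]
results = rank_frames([1.0, 0.0], frames)
```

A brute-force scan like this is only practical for small collections; a production system would more likely use an approximate nearest-neighbor index, though the documentation does not say which approach is used.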