Audio Sound (with Metadata Filters)
Audio Sound (with Metadata Filters)
Audio Sound (with Metadata Filters)
Triggers an asynchronous search for video segments containing the specified audio_class (e.g., ‘Music’, ‘Dog’, ‘Explosion’) within the specified dataset (dataset_id), restricted to videos whose metadata matches every entry in metadata_filters (e.g., genre = documentary). Audio chunks are scored using the Audio Spectrogram Transformer (AST) model, which predicts confidence scores for 527 AudioSet sound classes across fixed-length audio segments extracted from ingested videos; no text embedding is computed. Returns the Databricks run_id immediately; use the status and result endpoints to poll for the final ranked list of sound segments (each with the parent video, time boundaries, the nearest keyframe, score, and optional moderation score).
Bearer authentication of the form Bearer <token>, where token is your auth token.
AudioSet sound class label to search for (e.g. AudioClass.DOG, AudioClass.MUSIC)
When true, moderation scoring is skipped and moderation_score will be null on all results.