Understanding concept occurrences in images and videos is essential for metadata generation, content categorization, and trend analysis. This tutorial will guide you through SQL queries that help you:
- Retrieve concept occurrences over time to track when a concept appears in videos.
- Identify images labeled with a specific concept and evaluate confidence scores.
- Identify videos labeled with a specific concept and count labeled frames to measure concept frequency.
By the end of this tutorial, you’ll be able to extract structured insights from concept-based metadata using SQL.
Retrieve Concept Occurrences Over Time
This query pinpoints the exact times at which a concept appears in a video. For example, you can track the presence of the “baton” concept in sports footage.
Explanation
- CTE (occurrences): Filters frames where the “baton” concept appears with a probability above 0.5, then groups the data so that each timestamp has one representative image.
- Final Query: Retrieves and orders occurrences by video ID and timestamp, creating a timeline of when the concept appears.
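A minimal sketch of such a query is shown here, run against an in-memory SQLite database so it can be tried as-is. Only the `baton_prob` column name comes from this tutorial. The table name `frame_concepts`, the columns `video_id`, `frame_ts`, and `image_id`, and the sample rows are assumptions made for illustration, and your schema and SQL dialect may differ.

```python
import sqlite3

# Hypothetical schema: one row per labeled frame image.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE frame_concepts ("
    "video_id TEXT, frame_ts REAL, image_id TEXT, baton_prob REAL)"
)
conn.executemany(
    "INSERT INTO frame_concepts VALUES (?, ?, ?, ?)",
    [
        ("v1", 1.0, "img_a", 0.9),
        ("v1", 1.0, "img_b", 0.7),  # same timestamp as img_a
        ("v1", 2.0, "img_c", 0.3),  # below the 0.5 threshold
        ("v1", 3.0, "img_d", 0.8),
        ("v2", 0.5, "img_e", 0.6),
        ("v2", 1.5, "img_f", 0.5),  # not strictly above 0.5
    ],
)

TIMELINE_SQL = """
WITH occurrences AS (
    -- Keep frames where the concept is likely present, and collapse
    -- each (video, timestamp) pair to one representative image.
    SELECT video_id,
           frame_ts,
           MIN(image_id)   AS representative_image,
           MAX(baton_prob) AS baton_prob
    FROM frame_concepts
    WHERE baton_prob > 0.5
    GROUP BY video_id, frame_ts
)
SELECT video_id, frame_ts, representative_image, baton_prob
FROM occurrences
ORDER BY video_id, frame_ts
"""

timeline = conn.execute(TIMELINE_SQL).fetchall()
for row in timeline:
    print(row)
```

Picking `MIN(image_id)` as the representative image is an arbitrary but deterministic choice. Selecting the image with the highest probability would be an equally reasonable alternative.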

Use Cases
- Track the specific moments when an object or action appears in a video.
- Build a chronological view of occurrences within each video.
- Support video editing, metadata creation, and storytelling based on specific concepts.
Identify Images Labeled with a Specific Concept
This query retrieves images labeled with a specific concept, along with their confidence scores.
Explanation
- CTE (filtered_images): Filters images that have the specified concept (baton) and a confidence score (baton_prob) above a configurable threshold (e.g., 0.1).
- Final Query: Retrieves relevant images and sorts them by confidence score.
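The following sketch shows one way this filter could look, again using an in-memory SQLite database. The `baton_prob` column and the example threshold of 0.1 come from the text above. The table name `image_concepts`, the `image_id` column, and the convention that a NULL `baton_prob` means the concept was not detected are assumptions for illustration.

```python
import sqlite3

# Hypothetical schema: one row per image, NULL when "baton" was not detected.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE image_concepts (image_id TEXT, baton_prob REAL)")
conn.executemany(
    "INSERT INTO image_concepts VALUES (?, ?)",
    [
        ("img_a", 0.92),
        ("img_b", 0.05),  # below the threshold
        ("img_c", 0.40),
        ("img_d", None),  # concept not detected
    ],
)

IMAGES_SQL = """
WITH filtered_images AS (
    -- Keep images that carry the concept with enough confidence.
    SELECT image_id, baton_prob
    FROM image_concepts
    WHERE baton_prob IS NOT NULL
      AND baton_prob > :threshold
)
SELECT image_id, baton_prob
FROM filtered_images
ORDER BY baton_prob DESC
"""

# The threshold is bound as a named parameter so it stays configurable.
labeled_images = conn.execute(IMAGES_SQL, {"threshold": 0.1}).fetchall()
for row in labeled_images:
    print(row)
```

Raising the threshold trades recall for precision: a value such as 0.5 would return only `img_a` from the sample rows.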

Use Cases
- Identify high-confidence labels for images.
- Improve metadata tagging and searchability for datasets.
Identify Videos Labeled with a Specific Concept and Count Labeled Frames
This query identifies which videos contain a specific concept and how many frames within each video are labeled with that concept.
Explanation
- CTE (occurrences): Aggregates concept occurrences within each video, counting only frames where the probability of the “baton” concept exceeds 0.1, and records the highest and lowest confidence scores among those frames.
- Final Query: Orders results by occurrence count to identify videos where the concept appears most frequently.
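The aggregation described above might be sketched as follows, using an in-memory SQLite database. As before, only `baton_prob` and the 0.1 threshold come from the text. The `frame_concepts` table, its other columns, the output column names, and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical schema: one row per labeled frame.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE frame_concepts (video_id TEXT, frame_ts REAL, baton_prob REAL)"
)
conn.executemany(
    "INSERT INTO frame_concepts VALUES (?, ?, ?)",
    [
        ("v1", 0.0, 0.9),
        ("v1", 1.0, 0.7),
        ("v1", 2.0, 0.05),  # below the threshold, not counted
        ("v2", 0.0, 0.3),
        ("v3", 0.0, 0.02),  # v3 has no qualifying frame at all
    ],
)

COUNTS_SQL = """
WITH occurrences AS (
    -- One row per video: number of qualifying frames and score range.
    SELECT video_id,
           COUNT(*)        AS labeled_frames,
           MAX(baton_prob) AS max_prob,
           MIN(baton_prob) AS min_prob
    FROM frame_concepts
    WHERE baton_prob > 0.1
    GROUP BY video_id
)
SELECT video_id, labeled_frames, max_prob, min_prob
FROM occurrences
ORDER BY labeled_frames DESC, video_id
"""

video_counts = conn.execute(COUNTS_SQL).fetchall()
for row in video_counts:
    print(row)
```

Because the filter runs before the grouping, videos with no qualifying frames (such as `v3` here) do not appear in the result. The secondary sort on `video_id` is added only to make ties deterministic.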

Use Cases
- Automate concept-based video labeling.
- Understand concept density in long-form content.
- Improve content moderation and compliance monitoring. Knowing how many frames in a video exhibit the concept helps evaluate its significance within the video.