Analyzing Concepts with SQL

Understanding concept occurrences in images and videos is essential for metadata generation, content categorization, and trend analysis. This tutorial will guide you through SQL queries that help you:

  • Retrieve concept occurrences over time to track when a concept appears in videos.
  • Identify images labeled with a specific concept and evaluate confidence scores.
  • Identify videos labeled with a specific concept and count labeled frames to measure concept frequency.

By the end of this tutorial, you’ll be able to extract structured insights from concept-based metadata using SQL.

Retrieve Concept Occurrences Over Time

This query helps pinpoint the exact times when a concept appears in a video. For example, tracking the presence of the “baton” concept in sports footage.

1-- Step 1: Identify occurrences of the specific concept in video keyframes
2WITH occurrences AS (
3 SELECT
4 coactive_video_id, -- Video ID
5 keyframe_time_ms, -- Timestamp in milliseconds
6 baton_prob, -- Probability score for the baton concept
7 MAX(coactive_image_id) AS coactive_image_id -- Select the best image for representation
8 FROM
9 coactive_table_adv
10 WHERE
11 baton_prob > 0.5 -- Only include keyframes with a confidence score above 0.5
12 GROUP BY
13 coactive_video_id, keyframe_time_ms, baton_prob
14)
15
16-- Step 2: Retrieve and order results for concept occurrences
17SELECT
18 o.coactive_video_id, -- Video ID
19 o.keyframe_time_ms, -- Time of occurrence
20 o.baton_prob, -- Concept probability
21 o.coactive_image_id -- Best image for this occurrence
22FROM
23 occurrences o
24ORDER BY
25 o.coactive_video_id, o.keyframe_time_ms ASC;

Explanation

  1. CTE (occurrences): Filters frames where the “baton” concept appears with a probability above 0.5. Group data to ensure each timestamp has a representative image.
  2. Final Query: Retrieves and orders occurrences by video ID and timestamp, creating a timeline of when the concept appears.

Concept occurrences over time

Use Cases

  • Track specific moments when an object or action appears in a video.
  • Provides a chronological view of occurrences within each video.
  • Applications: Supports video editing, metadata creation, and storytelling based on specific concepts.

Identify Images Labeled with a Specific Concept

This query retrieves images that can be labeled with a specific concept, along with their confidence scores.

1-- Step 1: Filter images with the specified concept and sufficient confidence
2WITH filtered_images AS (
3 SELECT
4 coactive_image_id, -- Unique identifier for the image
5 baton_prob AS baton_prob -- Confidence score for the concept
6 FROM
7 coactive_table_adv
8 WHERE
9 baton_prob > 0.1 -- Include only images with a probability above 0.1
10)
11
12-- Step 2: Retrieve the filtered images with their concept and certainty
13SELECT
14 coactive_image_id, -- Image ID
15 baton_prob -- Confidence score for the detected concept
16FROM
17 filtered_images
18ORDER BY
19 baton_prob DESC; -- Order by confidence score in descending order

Explanation

  1. CTE (filtered_images): Filters images that have the specified concept (baton) and a confidence score (baton_prob) above a configurable threshold (e.g., 0.1).
  2. Final Query: Retrieves relevant images and sorts them by confidence score.

Keyframes with concept scores ranked

Use Cases

  • Identify high-confidence labels for images.
  • Improve metadata tagging and searchability for datasets.

Identify Videos Labeled with a Specific Concept and Count Labeled Frames

This query identifies which videos contain a specific concept and how many frames within each video are labeled with that concept.

1-- Step 1: Aggregate occurrences of the concept across videos
2WITH occurrences AS (
3 SELECT
4 coactive_video_id,
5 COUNT(*) AS occurrence_count, -- Count the number of frames with the concept
6 MAX(coactive_image_id) AS coactive_image_id, -- Select a representative image for the video
7 MAX(baton_prob) AS max_probability, -- Highest probability for the concept in the video
8 MIN(baton_prob) AS min_probability -- Lowest probability for the concept in the video
9 FROM
10 coactive_table_adv
11 WHERE
12 baton_prob > 0.1 -- Only include frames with a probability above this threshold
13 GROUP BY
14 coactive_video_id -- Group by video to analyze concept occurrences within each
15)
16
17-- Step 2: Select relevant metadata for each video
18SELECT
19 o.max_probability, -- Maximum probability for the concept
20 o.min_probability, -- Minimum probability for the concept
21 o.occurrence_count, -- Number of frames with the concept
22 o.coactive_video_id, -- Video ID
23 o.coactive_image_id -- Best representative image ID for the video
24FROM
25 occurrences o
26ORDER BY
27 o.occurrence_count DESC; -- Order by the number of frames with the concept

Explanation

  • CTE (occurrences): Aggregates concept occurrences within each video, focusing on frames where the probability of the “baton” concept exceeds 0.1 and records the highest and lowest confidence scores.
  • Final Query: Orders results by occurrence count to identify videos where the concept appears most frequently.

Keyframes with concept scores ranked

Use Cases

  • Automate concept-based video labeling.
  • Understand concept density in long-form content.
  • Improve content moderation and compliance monitoring. Knowing how many frames in a video exhibit the concept helps evaluate its significance within the video.