Metadata Generation for Videos

In this tutorial we will cover metadata generation for videos.

The goal is to generate a table with one row for each video and one column for each tag indicating if that tag is present or not in the video. Coactive already provides the scores for different dynamic tags at the keyframe level in the table so generating metadata is basically an aggregation of those scores at the video level.

It is possible to create metadata from Dynamic Tags and Concepts.

Metadata Generation using Dynamic Tags

The SQL query below generates metadata for videos based on visual dynamic tags scores. It follows the rule to tag every video with K or more keyframes with dynamic tag scores above the percentile P.

If one wants to do the same exercise using audio dynamic tags scores, the same query can be used. The only necessary adaptation is replacing the tables category_<dynamic_tag_name>_visual for the tables category_<dynamic_tag_name>_audio.

1WITH
2-- Step 1
3dt_count AS (
4 SELECT
5 dynamic_tag,
6 COUNT(*) AS dt_count
7 FROM category_<dynamic_tag_name>_visual
8 GROUP BY dynamic_tag
9),
10
11-- Step 2
12percentile_table AS (
13 SELECT
14 a.coactive_image_id,
15 a.dynamic_tag,
16 a.dynamic_tag_score,
17 ROW_NUMBER() OVER (PARTITION BY a.dynamic_tag ORDER BY a.dynamic_tag_score DESC)/b.dt_count AS percentile
18FROM category_<dynamic_tag_name>_visual AS a
19LEFT JOIN dt_count AS b
20 ON a.dynamic_tag = b.dynamic_tag
21),
22
23-- Step 3
24hit_table AS (
25 SELECT *,
26 1 AS hit
27FROM percentile_table
28-- Change here the percentile P of keyframes to be selected
29WHERE percentile < P
30),
31
32-- Step 4
33join_table AS (
34 SELECT
35 a.*,
36 COALESCE(b.hit, 0) AS hit
37FROM category_<dynamic_tag_name>_visual AS a
38LEFT JOIN hit_table AS b
39 ON a.coactive_image_id = b.coactive_image_id
40 AND a.dynamic_tag = b.dynamic_tag
41)
42
43-- Step 5
44SELECT
45 coactive_video_id,
46 dynamic_tag,
47-- Change here the requeired number of keyframes K above the treshold to tag a video
48 CASE WHEN SUM(hit) >= K THEN 1 ELSE 0 END AS num_hits
49FROM join_table
50GROUP BY
51 coactive_video_id,
52 dynamic_tag

Explanation

  1. CTE (dt_count): Count the total number of key frames for each dynamic tag.
  2. CTE (percentile_table): Normalize the dynamic tags scores so that they are between 0 and 1 using the scores distribution for each tag.
  3. CTE (hit_table): Check which keyframes are above a desired threshold P (between 0 and 1). A P of 0.01 means that only the top 1% keyframes of a dynamic tag will be tagged.
  4. CTE (join_table): Join the table with the tags back to the original keyframe table.
  5. Final Query: Aggregate the table at the video level tagging all the videos with K or more tagged keyframes.

Metadata Generation using Concepts

The same exercise can also be done with concepts following the SQL query below. It follows the rule to tag videos with K or more keyframes with the concept score above a threshold T.

1WITH
2
3video_level_tab AS (
4SELECT
5 COACTIVE_VIDEO_ID,
6 SUM(CASE WHEN <concept_name>_prob > T THEN 1 ELSE 0 END) AS NUM_LABELED_KFS,
7FROM
8 coactive_table_adv
9GROUP BY
10 COACTIVE_VIDEO_ID
11ORDER BY 2 DESC;
12)
13
14SELECT
15 *,
16 CASE WHEN NUM_LABELED_KFS > K THEN 1 ELSE 0 END AS <concept>_tag
17FROM
18 video_level_tab
19GROUP BY
20 COACTIVE_VIDEO_ID
21ORDER BY 2 DESC;

Explanation

Define a threshold T to tag every keyframe with a concept score above it. The interpretation is that we are tagging every keyframe with a probability T or bigger of belonging to the concept;

  1. CTE (video_level_tab): count how many keyframes were tagged in a video.
  2. Final Query: Tag the videos with K or more tagged keyframes