The Power of Visual Data | Coactive AI

Concepts are a powerful way to categorize, analyze, and extract insights from your visual data. By leveraging SQL queries, you can efficiently explore, filter, and structure your data based on key concepts detected in images and videos.

By the end of this tutorial, you’ll be able to query and interpret concept data in Coactive, enabling you to turn unstructured visual content into actionable insights.

Retrieve all instances of a concept

Retrieve all rows from the coactive_table where a specific concept, such as sports, has been identified.

-- Retrieve rows where the 'sports' concept is detected
SELECT
  *
FROM
  coactive_table
WHERE
  sports = 1;

Explanation

This query returns all rows from coactive_table where the sports column is set to 1. Because the sports column acts as a binary flag (0 or 1) indicating whether sports appear in the corresponding video or keyframe, the result contains every instance in the dataset where sports-related content was identified.
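To try the flag filter without access to a Coactive dataset, the following minimal sketch runs the same query against an in-memory SQLite table through Python's built-in sqlite3 module. The rows and the coactive_image_id column are assumptions made up for illustration; the real coactive_table schema and SQL dialect may differ.

```python
import sqlite3

# Toy stand-in for coactive_table. The rows and the coactive_image_id column
# are assumptions for illustration, not the real Coactive schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE coactive_table (coactive_image_id TEXT, sports INTEGER)")
conn.executemany(
    "INSERT INTO coactive_table VALUES (?, ?)",
    [("img_a", 1), ("img_b", 0), ("img_c", 1), ("img_d", 0)],
)

# Same filter as the query above: keep rows whose binary flag is set.
sports_rows = conn.execute(
    "SELECT * FROM coactive_table WHERE sports = 1 ORDER BY coactive_image_id"
).fetchall()

# Averaging a 0/1 flag gives the share of rows where the concept was detected.
sports_share = conn.execute("SELECT AVG(sports) FROM coactive_table").fetchone()[0]

print(sports_rows)
print(sports_share)
```

Because the flag is stored as 0 or 1, aggregates such as SUM and AVG double as a count and a share of matching rows.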

Find the minimum and maximum probability for a concept

Identify the range of probabilities for a given concept, which helps to determine potential thresholds.

-- Find the minimum and maximum probabilities for the 'sports' concept
SELECT
  MIN(sports_prob) AS min_probability,
  MAX(sports_prob) AS max_probability
FROM
  coactive_table_adv;

Explanation

This query returns the range of a concept’s probabilities: MIN retrieves the lowest probability, while MAX retrieves the highest. Knowing this range helps you choose a threshold that actually separates results, since a cutoff outside the range would keep either every row or none.
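As a rough illustration of how the range informs a threshold, the sketch below runs the MIN/MAX query on a toy in-memory SQLite table and checks whether a candidate cutoff falls inside the observed range. The probabilities are invented, and is_informative is a hypothetical helper, not part of Coactive.

```python
import sqlite3

# Toy stand-in for coactive_table_adv; the probabilities are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE coactive_table_adv (coactive_image_id TEXT, sports_prob REAL)")
conn.executemany(
    "INSERT INTO coactive_table_adv VALUES (?, ?)",
    [("img_a", 0.92), ("img_b", 0.08), ("img_c", 0.61), ("img_d", 0.35)],
)

# Same aggregate as the query above.
min_probability, max_probability = conn.execute(
    "SELECT MIN(sports_prob), MAX(sports_prob) FROM coactive_table_adv"
).fetchone()


def is_informative(threshold):
    """Hypothetical helper: True if `sports_prob > threshold` keeps some rows
    and drops others. At or above the maximum nothing is returned; below the
    minimum everything is returned."""
    return min_probability <= threshold < max_probability


print(min_probability, max_probability)
print(is_informative(0.5), is_informative(0.95))
```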

Filter keyframes by concept probability threshold

Filter keyframes where a concept’s probability exceeds a specific threshold, sorted by probability.

-- Retrieve keyframes with a 'sports_prob' above a threshold
SELECT
  sports_prob,
  keyframe_time_ms,
  coactive_video_id,
  coactive_image_id
FROM
  coactive_table_adv
WHERE
  sports_prob > 0.5 -- Adjust the threshold as needed
ORDER BY
  sports_prob DESC;

Explanation

This query returns the keyframes whose probability for the sports concept exceeds the 0.5 cutoff, sorted from highest to lowest probability.

Thresholds help refine results by filtering out low-confidence matches. Because different concepts have different probability distributions, no single threshold suits every concept. Experiment with threshold values to find the one that works best for your use case.
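One way to experiment is to count how many keyframes survive each candidate cutoff before settling on one. The sketch below does this on a toy in-memory SQLite table with made-up probabilities; the candidate values are arbitrary, and the real table and SQL dialect may differ.

```python
import sqlite3

# Toy stand-in for coactive_table_adv; the probabilities are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE coactive_table_adv (coactive_image_id TEXT, sports_prob REAL)")
conn.executemany(
    "INSERT INTO coactive_table_adv VALUES (?, ?)",
    [("img_a", 0.92), ("img_b", 0.08), ("img_c", 0.61),
     ("img_d", 0.35), ("img_e", 0.77)],
)

# Count the rows that pass each candidate threshold (arbitrary example values).
counts = {}
for threshold in (0.3, 0.5, 0.7, 0.9):
    (n,) = conn.execute(
        "SELECT COUNT(*) FROM coactive_table_adv WHERE sports_prob > ?",
        (threshold,),
    ).fetchone()
    counts[threshold] = n

print(counts)
```

A sharp drop in the count between two neighboring cutoffs suggests that many keyframes sit in that probability band and are worth inspecting by eye.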

Calculate concept probabilities as percentages

Calculate the probability as a percentage and display it for keyframes with a concept probability above a threshold.

-- Calculate probabilities as percentages for easy interpretation
WITH Concept AS (
  SELECT
    coactive_image_id,
    sports_prob AS concept_probability
  FROM
    coactive_table_adv
)
SELECT
  coactive_image_id,
  concept_probability,
  CONCAT(ROUND(concept_probability * 100), '%') AS concept_percentage_probability
FROM
  Concept
WHERE
  concept_probability > 0.50
ORDER BY
  concept_probability DESC;

Explanation

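The query first aliases sports_prob as concept_probability in a common table expression (CTE) named Concept. It then keeps only the keyframes with a probability above the 0.5 threshold, formats each probability as a rounded percentage string, and orders the results from highest to lowest probability.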
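Percentage formatting can vary by SQL engine. In SQLite, for example, ROUND returns a floating-point value, so concatenating it directly yields strings such as 87.0%. The sketch below, run on a toy in-memory SQLite table with made-up probabilities, casts the rounded value to an integer and uses SQLite's || concatenation operator. Whether Coactive's dialect needs the cast is not stated in this tutorial, so treat it as an assumption.

```python
import sqlite3

# Toy stand-in for coactive_table_adv; the probabilities are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE coactive_table_adv (coactive_image_id TEXT, sports_prob REAL)")
conn.executemany(
    "INSERT INTO coactive_table_adv VALUES (?, ?)",
    [("img_a", 0.87), ("img_b", 0.12), ("img_c", 0.64)],
)

# SQLite variant of the percentage query: the cast drops the trailing ".0"
# that ROUND would otherwise leave in the string.
rows = conn.execute(
    """
    SELECT
      coactive_image_id,
      CAST(ROUND(sports_prob * 100) AS INTEGER) || '%' AS concept_percentage_probability
    FROM coactive_table_adv
    WHERE sports_prob > 0.50
    ORDER BY sports_prob DESC
    """
).fetchall()

print(rows)
```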

Retrieve transcription text from audio segments

This query extracts the transcription text for audio segments from the coactive_table_audio table. Transcriptions are crucial for analyzing speech data and identifying key audio content.

-- Retrieve transcription text from the audio table
SELECT
  coactive_audio_segment_id,
  audio_segment_speech_to_text_transcription AS transcription,
  audio_segment_start_time_ms,
  audio_segment_end_time_ms
FROM
  coactive_table_audio
WHERE
  audio_segment_speech_to_text_transcription IS NOT NULL
ORDER BY
  audio_segment_start_time_ms ASC;

Explanation

The audio_segment_speech_to_text_transcription column contains the transcription for each audio segment.

The query filters out rows without transcriptions (IS NOT NULL) and orders results by the start time for chronological analysis.
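The effect of the NULL filter and the chronological ordering can be seen on a small example. The sketch below builds a toy coactive_table_audio in an in-memory SQLite database with invented segments, inserted out of order and with one missing transcription. It also derives a duration_ms column, which is an illustrative addition and not part of the query above.

```python
import sqlite3

# Toy stand-in for coactive_table_audio; the segments are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE coactive_table_audio (
      coactive_audio_segment_id TEXT,
      audio_segment_speech_to_text_transcription TEXT,
      audio_segment_start_time_ms INTEGER,
      audio_segment_end_time_ms INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO coactive_table_audio VALUES (?, ?, ?, ?)",
    [
        ("seg_2", "and the crowd goes wild", 5000, 9000),
        ("seg_1", "welcome to the game", 0, 5000),
        ("seg_3", None, 9000, 12000),  # segment without a transcription
    ],
)

# Drop segments with no transcription and return the rest in playback order.
# duration_ms is an illustrative extra column.
segments = conn.execute(
    """
    SELECT
      coactive_audio_segment_id,
      audio_segment_speech_to_text_transcription AS transcription,
      audio_segment_end_time_ms - audio_segment_start_time_ms AS duration_ms
    FROM coactive_table_audio
    WHERE audio_segment_speech_to_text_transcription IS NOT NULL
    ORDER BY audio_segment_start_time_ms ASC
    """
).fetchall()

print(segments)
```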