A comprehensive guide to extracting data with Cloudglue
Video contains a wealth of information, but developers often need this information in a structured format that their applications can easily consume. While transcription gives you raw information from a video, entity extraction allows you to get precisely the structured data you need.
With Cloudglue’s Extract API, you can define exactly what information you want to extract and receive it in a format that’s ready for your database or application logic.
Cloudglue allows you to extract entities from both locally uploaded files and
YouTube videos.
For this example, we’ll analyze a political speech about tariffs. While the video below shows the source content, our analysis was performed using a local copy to enable full multimodal understanding:
The example combines a schema definition with a clear prompt to return both video-level and segment-level information. The prompt reads:
Extract the following structured information from C-SPAN videos:
1. SPEAKER: Identify main speakers by name and title.
2. DISCOURSE: Determine the main topic, extract notable key phrases, identify rhetorical techniques with examples, and document stated policy positions.
3. REFERENCES: Record any executive orders, legislation (with names/numbers), or agreements mentioned.
4. VISUAL: Note on-screen text (chyrons), backdrop elements, types of camera shots used, and significant visual symbols.
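A schema to pair with this prompt might look like the sketch below. The field names are hypothetical, chosen to mirror the four categories in the prompt, and the "string" placeholders follow the convention of the other schemas in this guide. The schema actually used for the analysis is not shown here, and support for plain string arrays such as `["string"]` is an assumption.

```javascript
// Hypothetical schema mirroring the four categories in the prompt.
// Field names are illustrative; "string" marks the expected value type.
const speechSchema = {
  speakers: [{ name: "string", title: "string" }],
  discourse: {
    main_topic: "string",
    key_phrases: ["string"],
    rhetorical_techniques: [{ technique: "string", example: "string" }],
    policy_positions: ["string"]
  },
  references: [{ type: "string", name: "string", number: "string" }],
  visual: {
    on_screen_text: ["string"],
    backdrop: "string",
    camera_shots: ["string"],
    symbols: ["string"]
  }
};

console.log(Object.keys(speechSchema));
```

Grouping fields by category keeps the visual fields in one place, which makes them easy to drop when only speech content is available.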
For quick experiments or single extractions, use the direct extract endpoint:
// Define your schema
const schema = {
  products: [
    { name: "string", price: "string", rating: "string" }
  ]
};

// Create an extract job
const extractJob = await client.extract.createExtract(fileUri, {
  schema: schema,
  // Optionally include a prompt to guide the extraction
  prompt: "Extract product details including exact prices and ratings"
});

// Get the results
const result = await client.extract.getExtract(extractJob.job_id);
console.log(result.data);
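Because extraction runs as a job, the results may not be ready the moment the job is created. The sketch below polls until the job finishes. It assumes the object returned by `getExtract` carries a `status` field with values such as "completed" and "failed"; those names are assumptions, not confirmed by this guide. `waitForJob` is a hypothetical helper, not part of the SDK, and accepts any async function that fetches the job.

```javascript
// Hypothetical helper: repeatedly calls fetchJob() until the job reports a
// terminal status. The status values are assumptions; adjust them to whatever
// the API actually returns.
async function waitForJob(fetchJob, { intervalMs = 2000, maxAttempts = 30 } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await fetchJob();
    if (job.status === "completed") return job;
    if (job.status === "failed") throw new Error("Extract job failed");
    // Wait before asking again
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("Timed out waiting for extract job");
}

// Usage with the client and job from the example above:
// const result = await waitForJob(() => client.extract.getExtract(extractJob.job_id));
// console.log(result.data);
```

Passing the fetch function in, rather than the client itself, keeps the helper independent of any particular SDK call.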
At the moment, if you want to extract entities from a YouTube video directly,
we only support extracting from speech content. For full multimodal entity
extraction, download the video and upload it to Cloudglue.
// Extract entities from YouTube video speech
const result = await client.extract.createExtract(
  'https://www.youtube.com/watch?v=VIDEO_ID',
  {
    schema: mySchema,
    prompt: "Extract key information from the speaker's content",
  },
);
When working with YouTube videos, consider adjusting your schema and prompt to focus on audio-centric information. Visual elements like camera_shots, backdrop, or on-screen symbols won’t be available through direct YouTube processing. For applications requiring comprehensive visual analysis, we recommend downloading the video first and using our Files API for complete multimodal understanding.
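One way to adapt a schema for YouTube processing is to derive an audio-only version from the full one by dropping the fields that depend on visual analysis. The sketch below is a minimal illustration; `omitKeys` is a hypothetical helper and the schema field names are assumptions.

```javascript
// Hypothetical helper: returns a shallow copy of a schema without the listed
// top-level keys, e.g. fields that depend on visual analysis.
function omitKeys(schema, keysToDrop) {
  return Object.fromEntries(
    Object.entries(schema).filter(([key]) => !keysToDrop.includes(key))
  );
}

// Illustrative schema; the field names are assumptions.
const fullSchema = {
  speakers: [{ name: "string", title: "string" }],
  key_phrases: ["string"],
  camera_shots: ["string"],
  backdrop: "string"
};

// Keep only the fields that can be answered from speech content
const audioSchema = omitKeys(fullSchema, ["camera_shots", "backdrop"]);
console.log(Object.keys(audioSchema)); // [ 'speakers', 'key_phrases' ]
```

The same full schema can then be reused unchanged for uploaded files, where visual fields are available.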