Video holds a great deal of information, but answering natural language questions about it requires understanding both the spoken content and the visual context. Basic transcription gives you raw text; chat completions with Rich Transcript Collections let you query your video content conversationally.

With Cloudglue’s Chat Completion API and Rich Transcript Collections, you can ask natural language questions about your videos and receive contextually accurate responses grounded in the actual video content, including speech, visual scenes, and on-screen text.

Chat completions work with both locally uploaded videos and YouTube content, with multimodal understanding available for uploaded files.

Understanding Rich Transcript Collections

Rich Transcript Collections are specialized collections that combine multiple layers of video understanding:

  1. Speech transcription: What’s being said in the video
  2. Visual scene descriptions: What’s happening visually
  3. On-screen text: Text and captions visible in the video
  4. Contextual understanding: How all these elements work together

Unlike basic transcription, Rich Transcript Collections create a searchable knowledge base that enables semantic queries across all modalities of your video content.

When to Use Chat Completions

Chat completions with Rich Transcript Collections are ideal when you need to:

  1. Answer specific questions about video content without watching the entire video
  2. Extract insights that span multiple segments or videos
  3. Generate summaries with specific focus areas or perspectives
  4. Find precise information using natural language rather than keyword search
  5. Build conversational interfaces that can discuss video content intelligently

Core Chat Completion Parameters

Essential Parameters

const response = await client.chat.createCompletion({
  messages: [
    {
      role: 'user',
      content: 'What ingredients are needed for the pasta recipe?',
    },
  ],
  model: 'nimbus-001',
  collections: [collection.id],
  force_search: true,
  include_citations: true,
  max_tokens: 1000,
  temperature: 0.3,
});

Messages Array

The messages array follows the standard chat completion format:

  • role: “user”, “assistant”, or “system”
  • content: The message content

For multi-turn conversations, include the full conversation history:

const messages = [
  { role: 'user', content: "What's the main pasta dish in the video?" },
  {
    role: 'assistant',
    content: 'The main dish is a traditional Italian carbonara...',
  },
  { role: 'user', content: 'What makes this carbonara authentic?' },
];

Model: nimbus-001

Cloudglue’s nimbus-001 is a specialized model optimized for:

  • Multimodal understanding: Processes speech, visual, and text content together
  • Grounded responses: Answers are based on actual video content, not training data
  • Citation support: Can provide specific timestamps and sources
  • Conversational context: Maintains context across multiple exchanges

Force Search

The force_search parameter controls whether the system searches your collections:

  • true: Always searches collections before responding (recommended)
  • false: May respond from general knowledge without searching

// Always search collections first
force_search: true,

Include Citations

Citations provide transparency and verifiability:

  • true: Returns timestamps, file references, and content snippets
  • false: Returns only the response text
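
To make the citation timestamps concrete, here is a minimal sketch that turns a citation into a readable [mm:ss - mm:ss] label, similar to the timestamps the model writes in its answers. The CitationLike type and the helper names are assumptions for illustration; they model only the start_time, end_time, and text fields and are not the SDK's own types.

```typescript
// Assumed minimal shape: only the citation fields this sketch reads.
interface CitationLike {
  start_time: number; // seconds
  end_time: number; // seconds
  text: string;
}

// Left-pad a non-negative integer to two digits.
function twoDigits(value: number): string {
  return value < 10 ? `0${value}` : String(value);
}

// Convert seconds to an mm:ss label (fractions of a second are dropped).
function toClock(seconds: number): string {
  const whole = Math.max(0, Math.floor(seconds));
  return `${twoDigits(Math.floor(whole / 60))}:${twoDigits(whole % 60)}`;
}

// Render a citation as "[mm:ss - mm:ss] snippet".
function formatCitation(citation: CitationLike): string {
  return `[${toClock(citation.start_time)} - ${toClock(citation.end_time)}] ${citation.text}`;
}
```

A label produced this way can be shown next to each answer so readers can jump to the cited moment.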

Advanced Search Filters

For precise control over what content is searched, you can use metadata filters to target specific videos in your collection:

const response = await client.chat.createCompletion({
  messages: [{ role: 'user', content: 'Show me pasta preparation techniques' }],
  model: 'nimbus-001',
  collections: [collection.id],
  force_search: true,
  filter: {
    metadata: [
      {
        path: 'metadata.cuisine_type',
        operator: 'Equal',
        valueText: 'Italian',
      },
      {
        path: 'metadata.difficulty_level',
        operator: 'In',
        valueText: 'beginner,intermediate',
      },
    ],
  },
});

Supported Filter Operations

The filter parameter allows you to constrain searches using file metadata:

  • path: JSON path to the metadata field (e.g., "metadata.custom_field" or "video_info.has_audio")
  • operator: Comparison operator to apply
    • Equal / NotEqual: Exact match comparison
    • LessThan / GreaterThan: Numeric comparison
    • In: Matches if the field value is in a comma-separated list (passed in valueText)
    • ContainsAny / ContainsAll: Array operations (use valueTextArray)
  • valueText: Single value for scalar comparisons
  • valueTextArray: Array of values for array operations

Example metadata filter scenarios:

// Filter by custom video categories
filter: {
  metadata: [
    {
      path: 'metadata.video_category',
      operator: 'ContainsAny',
      valueTextArray: ['recipe', 'technique', 'basics'],
    },
  ],
},

// Filter by video duration
filter: {
  metadata: [
    {
      path: 'video_info.duration_seconds',
      operator: 'LessThan',
      valueText: '600', // Videos under 10 minutes
    },
  ],
},
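
One way to avoid mixing up valueText and valueTextArray is to encode the operator rules in TypeScript types. The sketch below is not part of the Cloudglue SDK: the type names and the scalarCondition, inCondition, and arrayCondition helpers are hypothetical, and the metadata paths are the illustrative ones used in this guide.

```typescript
// Operator names as listed in this guide.
type ScalarOperator = "Equal" | "NotEqual" | "LessThan" | "GreaterThan" | "In";
type ArrayOperator = "ContainsAny" | "ContainsAll";

// A condition carries valueText or valueTextArray, depending on its operator.
type MetadataCondition =
  | { path: string; operator: ScalarOperator; valueText: string }
  | { path: string; operator: ArrayOperator; valueTextArray: string[] };

interface MetadataFilter {
  metadata: MetadataCondition[];
}

// Hypothetical helper: scalar comparison; numbers are sent as strings.
function scalarCondition(
  path: string,
  operator: ScalarOperator,
  value: string | number,
): MetadataCondition {
  return { path, operator, valueText: String(value) };
}

// Hypothetical helper: "In" takes a comma-separated list in valueText.
function inCondition(path: string, values: string[]): MetadataCondition {
  return { path, operator: "In", valueText: values.join(",") };
}

// Hypothetical helper: array operators take valueTextArray.
function arrayCondition(
  path: string,
  operator: ArrayOperator,
  values: string[],
): MetadataCondition {
  return { path, operator, valueTextArray: values.slice() };
}

const shortBeginnerVideos: MetadataFilter = {
  metadata: [
    scalarCondition("video_info.duration_seconds", "LessThan", 600),
    inCondition("metadata.difficulty_level", ["beginner", "intermediate"]),
    arrayCondition("metadata.video_category", "ContainsAny", ["recipe", "technique"]),
  ],
};
```

The resulting object has the same shape as the filter values shown above, so it could be passed as the filter field of a chat completion request.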

Practical Example: Cooking Video Chat Bot

Let’s build a comprehensive example using cooking videos to demonstrate multi-turn conversations and advanced features.

Setting Up the Collection

import { CloudGlue } from '@aviaryhq/cloudglue-js';

const client = new CloudGlue();

// Create a rich transcript collection for cooking videos
const collection = await client.collections.createCollection({
  name: 'Italian Cooking Masterclass',
  collection_type: 'rich-transcripts',
  description: 'Collection of Italian cooking videos for recipe chat',
  transcribe_config: {
    enable_summary: true,
    enable_speech: true,
    enable_scene_text: true,
    enable_visual_scene_description: true,
  },
});

// Add cooking videos to the collection
const cookingVideos = [
  'https://www.youtube.com/watch?v=PASTA_VIDEO_1',
  'https://www.youtube.com/watch?v=PASTA_VIDEO_2',
  'https://www.youtube.com/watch?v=PASTA_VIDEO_3',
];

for (const videoUrl of cookingVideos) {
  const result = await client.collections.addYouTubeVideo(
    collection.id,
    videoUrl,
  );
  await client.collections.waitForReady(collection.id, result.file_id);
  console.log(`Added and processed: ${videoUrl}`);
}

Multi-Turn Conversation Example

Now let’s demonstrate a realistic conversation about pasta recipes. This example shows how to build a stateful chatbot that maintains conversation history and can handle follow-up questions. The implementation demonstrates proper conversation state management, error handling, and how to structure questions for optimal results from the nimbus-001 model.

import { type ChatCompletionResponse, CloudGlue } from '@aviaryhq/cloudglue-js';
import * as dotenv from 'dotenv';

// Load environment variables from .env file
dotenv.config();

if (process.argv.length < 4) {
  console.error('Usage: ts-node cooking-chat-bot.ts <collection-id> <query>');
  process.exit(1);
}

const collectionId = process.argv[2];
const userQuery = process.argv[3];

class CookingChatBot {
  private client: CloudGlue;
  private collectionId: string;
  private conversationHistory: {
    role: 'user' | 'system' | 'assistant';
    content: string;
    name?: string | undefined;
  }[] = [];

  constructor(client: CloudGlue, collectionId: string) {
    this.client = client;
    this.collectionId = collectionId;
    this.conversationHistory = [];
  }

  async askQuestion(question: string) {
    // Add user message to conversation history
    this.conversationHistory.push({
      role: 'user',
      content: question,
    });

    // Get response from Cloudglue
    const response = await this.client.chat.createCompletion({
      messages: this.conversationHistory,
      model: 'nimbus-001',
      collections: [this.collectionId],
      force_search: true,
      include_citations: true,
      temperature: 0.4,
    });

    const assistantMessage = response.choices?.[0]?.message;

    // Add assistant response to conversation history
    this.conversationHistory.push({
      role: 'assistant',
      content: assistantMessage?.content || '',
    });

    return {
      response: assistantMessage?.content || '',
      citations: response.choices?.[0]?.citations || [],
    };
  }

  getConversationHistory() {
    return this.conversationHistory;
  }
}

// Example conversation
async function runCookingChat(chatBot: CookingChatBot) {
  console.log('=== Question 1 ===');
  const q1 = await chatBot.askQuestion(
    'What cooking techniques are used for pasta dishes?',
  );
  console.log('Response:', q1.response);
  console.log('Citations:', q1.citations.length, 'sources');

  console.log('\n=== Question 2 ===');
  const q2 = await chatBot.askQuestion(
    'What ingredients are needed for the main recipe?',
  );
  console.log('Response:', q2.response);

  console.log('\n=== Question 3 ===');
  const q3 = await chatBot.askQuestion(
    'Show me when the chef is plating the food',
  );
  console.log('Response:', q3.response);

  console.log('\n=== Question 4 ===');
  const q4 = await chatBot.askQuestion(`
      Can you provide a complete recipe for one of the pasta dishes in markdown format? 
      Include ingredients, equipment, and step-by-step instructions with approximate timing.
    `);
  console.log('Response:', q4.response);
}

async function main() {
  // Initialize the CloudGlue client
  const client = new CloudGlue({
    apiKey: process.env.CLOUDGLUE_API_KEY,
  });

  // Initialize the chat bot
  const chatBot = new CookingChatBot(client, collectionId);
  console.log('Chat bot initialized');

  console.log('Running cooking chat example queries...');
  await runCookingChat(chatBot);

  console.log('\n\nRunning cooking chat with user query...');
  const userResponse = await chatBot.askQuestion(userQuery);
  console.log('User query:', userQuery);
  console.log('User response:', userResponse.response);
  console.log('Citations:', userResponse.citations.length, 'sources');

  console.log('\n\nConversation TL;DR:');
  const userResponse2 = await chatBot.askQuestion(
    'Summarize the conversation in a few sentences. Include the user query and the response.',
  );
  console.log('User response:', userResponse2.response);
}

main().catch(console.error);

Key features demonstrated in this example:

  • Conversation State Management: The chatbot maintains conversation history across multiple questions
  • TypeScript Integration: Full type safety with proper CloudGlue SDK types
  • Multiple Question Types: Shows different query patterns (techniques, ingredients, visual cues, structured output)
  • Citation Handling: Demonstrates how to access and count citation sources
  • Environment Configuration: Proper setup with environment variables and command-line arguments
  • Error Handling: Safe access to response properties with optional chaining
  • Production-Ready Structure: Modular design that can be extended for real applications
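
Optional chaining keeps the example from crashing, but it also lets an empty answer slip into the conversation history. A stricter alternative, sketched below under the assumption that the response has the choices[0].message.content and choices[0].citations shape read in the example, is to fail loudly when no answer text comes back. CompletionLike, ParsedAnswer, and extractAnswer are hypothetical names, not SDK exports.

```typescript
// Assumed minimal response shape, based on the fields the example reads.
interface CompletionLike {
  choices?: Array<{
    message?: { content?: string | null };
    citations?: unknown[];
  }>;
}

interface ParsedAnswer {
  text: string;
  citationCount: number;
}

// Return the first choice's text, or throw when there is no usable answer.
function extractAnswer(response: CompletionLike): ParsedAnswer {
  const first = response.choices?.[0];
  const text = first?.message?.content ?? "";
  if (text.trim() === "") {
    throw new Error("Completion returned no answer text");
  }
  return { text, citationCount: first?.citations?.length ?? 0 };
}
```

Inside askQuestion, a helper like this could run before the assistant message is pushed to the history, so a failed request never becomes part of the conversation.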

Example Conversation Output

Here’s what a realistic conversation might look like:

npm run cooking-chat-bot <your collection id> "Yo what kinds of appliances / kitchenware is used in the video?"

> cloudglue-js-examples@1.0.0 cooking-chat-bot
> dotenv -e .env -- ts-node cooking-chat-bot.ts <your collection id> Yo what kinds of appliances / kitchenware is used in the video?

Chat bot initialized
Running cooking chat example queries...
=== Question 1 ===
Response: The video segments provided showcase various cooking techniques used for pasta dishes, specifically for a traditional Italian ragu.

The initial steps involve preparing the meat, as seen in the segments from 00:20 to 01:00. The chef starts by cutting 500 grams of topside beef and 200 grams of sausages into smaller pieces, seasoning them with salt and pepper. This is followed by browning the meat in a pan, which is an essential step in developing the flavor of the ragu.

The next steps involve cooking the onions and adding bay leaves to the pan, as shown in the segment from 00:40 to 01:00. The chef emphasizes the importance of mixing the ingredients well and breaking the bay leaves to release their flavor.

The segments from 01:00 to 02:00 demonstrate the technique of sealing the meat properly and cooking the onions until they dissolve. The chef also explains the difference between sweating and burning, highlighting the importance of cooking the ingredients slowly to avoid burning.

The addition of wine to the pan is another crucial step, as seen in the segment from 01:20 to 01:40. The chef advises against using cooking wine and instead recommends using a good quality wine that one would drink. The wine is added to the pan and allowed to evaporate, which helps to intensify the flavors.

The segments from 02:40 to 03:40 show the addition of tomatoes and basil to the ragu. The chef dissolves the tomatoes in wine and then adds water to the pan, followed by stirring in some basil. This is an important step in creating a rich and flavorful sauce.

The final steps involve slowly cooking the tomato sauce for about two hours, as shown in the segment from 03:20 to 03:40. The chef then prepares the pasta dish by adding the cooked pasta to the ragu and finishing it with a drizzle of Parmesan cheese and olive oil.

Overall, the cooking techniques used for this pasta dish include browning the meat, cooking the onions, adding aromatics like bay leaves and basil, using wine to intensify the flavors, and slowly cooking the tomato sauce. These techniques come together to create a rich and flavorful ragu that is perfect for serving with pasta.
Citations: 7 sources

=== Question 2 ===
Response: The video segments provided showcase the preparation of a traditional Italian ragu, and the ingredients needed for the main recipe are mentioned throughout the clips.

From the segment [00:20 - 00:40], we can see that the recipe requires 500 grams of topside beef, which can be substituted with brisket. Additionally, 500 grams of spare ribs and 200 grams of sausages are used. The speaker also mentions the importance of seasoning with salt and pepper.

In the segment [00:40 - 01:00], two bay leaves are added to the mixture, and the speaker emphasizes the need to break the leaves before adding them.

The segment [01:00 - 01:20] shows the preparation of onions, with the speaker mentioning the use of one onion. The onions will eventually dissolve and add to the flavor of the dish.

The segment [01:20 - 01:40] highlights the addition of wine to the recipe. The speaker stresses the importance of using a good-quality wine, rather than a cooking wine, which is referred to as "colored water."

The segment [01:40 - 02:00] shows the meat and onions being cooked, with the speaker explaining the process of evaporating the wine and allowing the ingredients to sweat, rather than burn.

The segment [02:40 - 03:00] mentions the addition of tomatoes, which are dissolved in wine and then filled with water.

The segment [03:00 - 03:20] shows the addition of basil, which is stirred into the mixture, and the speaker comments on the lovely aroma.

The segment [03:20 - 03:40] explains that the tomato sauce should be cooked slowly for about two hours.

Finally, the segments [04:00 - 04:40] and [04:40 - 05:00] show the preparation of the pasta dish, with the addition of Parmesan cheese, olive oil, and the final tasting of the dish.

In summary, the ingredients needed for the main recipe are:

- 500 grams of topside beef (or brisket)
- 500 grams of spare ribs
- 200 grams of sausages
- Salt and pepper
- 2 bay leaves
- 1 onion
- Wine (good-quality, not cooking wine)
- Tomatoes
- Water
- Basil
- Parmesan cheese
- Olive oil
- Pasta (not specified in the video segments, but implied as the base of the dish)

These ingredients come together to create a rich and flavorful Italian ragu, which is then served with pasta.

=== Question 3 ===
Response: The video segments provided show the preparation of a traditional Italian ragu by Chef Gennaro Contaldo. To answer your query, "Show me when the chef is plating the food," I found a relevant segment.

The chef is plating the food in the segment [04:00 - 04:20] (video_id: 5d47c409-7c0d-45c4-88e3-da1e51f0e7b6, segment_id: 63d29eba-df60-4c49-a7d8-93db717decb2). In this segment, the chef says, "Look at that. As soon as I put the pasta inside, I start the dancing." Although the chef doesn't explicitly mention plating, this segment shows the chef preparing the pasta dish, which implies that the plating process is about to begin or is already underway.

However, a more specific moment related to plating can be seen in the segment [04:40 - 05:00] (video_id: 5d47c409-7c0d-45c4-88e3-da1e51f0e7b6, segment_id: d821c770-3151-4247-8458-ffe9e884386b), where the chef tastes the pasta and says, "Do you want some? Bellissimo." This suggests that the dish is now ready to be served, and the chef is presenting it to someone, which can be considered as part of the plating process.

While the video segments don't provide a clear shot of the chef plating the food in the classical sense, these two segments show the chef preparing and presenting the dish, which are closely related to the plating process.

=== Question 4 ===
Response: ### Family Ragu with Gennaro Contaldo Recipe
#### Ingredients:
* 500g topside of beef (or brisket)
* 500g spare ribs
* 200g sausages
* Salt
* Pepper
* 2 bay leaves
* 1 onion
* Wine (use a good quality wine, not cooking wine)
* Tomatoes
* Water
* Fresh basil
* Parmesan cheese
* Olive oil
* Pasta (not specified in the video, but typically used with ragu)

#### Equipment:
* Large pot or Dutch oven
* Cutting board
* Knife
* Spoon
* Colander (for draining pasta)

#### Step-by-Step Instructions:
1. **Prepare the meat** (00:20 - 00:40): Cut the beef, spare ribs, and sausages into desired sizes. Season with salt and pepper.
2. **Sear the meat** (00:40 - 01:00): Heat some oil in the pot and sear the meat until browned on all sides.
3. **Add onion and bay leaves** (01:00 - 01:20): Add the chopped onion and 2 bay leaves to the pot. Break the bay leaves to release their flavor.
4. **Add wine** (01:20 - 01:40): Pour in the wine, making sure to use a good quality wine. Let it simmer and evaporate.
5. **Add tomatoes and water** (02:40 - 03:00): Add the tomatoes and some water to the pot. Stir to combine.
6. **Let it cook** (03:20 - 03:40): Let the sauce cook slowly for about 2 hours, stirring occasionally.
7. **Prepare the pasta** (04:00 - 04:20): Cook the pasta according to the package instructions. Drain and set aside.
8. **Combine pasta and ragu** (04:20 - 04:40): Add the cooked pasta to the pot with the ragu sauce. Toss to combine.
9. **Serve** (04:40 - 05:00): Serve the pasta with ragu, topped with Parmesan cheese and a drizzle of olive oil.

#### Approximate Timing:
* Preparation: 30 minutes
* Cooking: 2 hours
* Total: 2 hours 30 minutes

Note: The video segments provided do not include exact measurements for some ingredients, such as the amount of onion or tomatoes. The recipe above is based on the information provided in the video segments and may need to be adjusted according to personal preference.


Running cooking chat with user query...
User query: Yo what kinds of appliances / kitchenware is used in the video?
User response: The video segments provided showcase the preparation of a traditional Italian ragu by Chef Gennaro Contaldo. To answer your query, "Yo what kinds of appliances / kitchenware is used in the video?", I found several relevant segments.

Throughout the video, Chef Contaldo uses various kitchenware and appliances to prepare the dish. At [00:20 - 00:40], he is shown handling a large cutting board and a knife to chop the meat. He also uses a large pot to cook the ragu, which is visible in several segments, including [01:00 - 01:20] and [02:00 - 02:20].

In [01:20 - 01:40], Chef Contaldo is shown using a wooden spoon to stir the meat and onions, and in [02:40 - 03:00], he uses a can opener to open a can of tomatoes. A colander is also used to drain the pasta, as seen in [04:00 - 04:20].

Additionally, Chef Contaldo uses a stovetop to cook the ragu, which is visible throughout the video. He also uses a large serving spoon to serve the pasta, as seen in [04:20 - 04:40].

Some of the specific kitchenware and appliances used in the video include:

* Large cutting board
* Knife
* Large pot
* Wooden spoon
* Can opener
* Colander
* Stovetop
* Serving spoon

These kitchenware and appliances are used to prepare and cook the traditional Italian ragu, and are essential to the dish's preparation.
Citations: 4 sources


Conversation TL;DR:
User response: The conversation started with a query about cooking techniques used for pasta dishes. The user then asked about the ingredients needed for the main recipe, and later inquired about the appliances and kitchenware used in the video. The user also asked to summarize the conversation in a few sentences, including the user query and the response.

To answer the query about cooking techniques, the video segments provided show Chef Gennaro Contaldo preparing a traditional Italian ragu. The techniques used include browning the meat, sweating the onions, and slowly cooking the tomato sauce. The chef also emphasizes the importance of using high-quality ingredients, such as good wine and fresh basil.

Regarding the ingredients, the video segments mention the use of topside of beef, spare ribs, sausages, onions, carrots, celery, bay leaves, tomatoes, and basil. The chef also uses olive oil, salt, and pepper to season the dish.

As for the appliances and kitchenware, the video segments show the chef using a large pot, a cutting board, a knife, and a wooden spoon. The chef also uses a stove to cook the ragu.

To summarize, the conversation covered various aspects of cooking a traditional Italian ragu, including cooking techniques, ingredients, and appliances. The video segments provided a detailed and informative guide to preparing this classic dish.

Advanced Techniques

Using System Messages for Specialized Responses

You can guide the model’s behavior using system messages:

const messages = [
  {
    role: 'system',
    content: `You are a professional chef's assistant specializing in Italian cuisine. 
    Always provide specific timestamps when referencing techniques, include metric measurements, 
    and explain the culinary science behind each method. Format recipes in clear markdown structure.`,
  },
  {
    role: 'user',
    content: 'How should I prepare the guanciale for carbonara?',
  },
];
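
In a multi-turn chat, the system message has to travel with every request. The following sketch shows one possible helper, withSystemPrompt (a hypothetical name, not an SDK function), that places a single system message in front of the running history and drops any earlier system messages. Whether replacing earlier system messages is the right policy depends on your application.

```typescript
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Return a new message list with exactly one system message, placed first.
// The input array is not modified.
function withSystemPrompt(
  systemPrompt: string,
  conversation: ChatMessage[],
): ChatMessage[] {
  const withoutSystem = conversation.filter((message) => message.role !== "system");
  const systemMessage: ChatMessage = { role: "system", content: systemPrompt };
  return [systemMessage, ...withoutSystem];
}
```

The returned array can be used as the messages value of a request, while the stored history stays free of system messages.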

Search Optimization with Filters

For complex queries, use metadata filters to target specific videos in your collection:

// Search only Italian recipes with specific difficulty levels
const italianRecipesResponse = await client.chat.createCompletion({
  messages: [{ role: 'user', content: 'Show me pasta preparation techniques' }],
  model: 'nimbus-001',
  collections: [collection.id],
  force_search: true,
  filter: {
    metadata: [
      {
        path: 'metadata.cuisine_type',
        operator: 'Equal',
        valueText: 'Italian',
      },
      {
        path: 'metadata.difficulty_level',
        operator: 'In',
        valueText: 'beginner,intermediate',
      },
    ],
  },
});

// Search videos by chef or instructor
const specificChefResponse = await client.chat.createCompletion({
  messages: [{ role: 'user', content: 'What techniques does this chef use?' }],
  model: 'nimbus-001',
  collections: [collection.id],
  force_search: true,
  filter: {
    metadata: [
      {
        path: 'metadata.chef_name',
        operator: 'ContainsAny',
        valueTextArray: ['Gordon Ramsay', 'Julia Child', 'Anthony Bourdain'],
      },
    ],
  },
});

Understanding Citations

When you set include_citations: true, the response includes detailed references to the specific video segments that informed the answer. This provides transparency and allows users to verify information or explore the original content.

Example Citation Response

Let’s examine what a real citation response looks like for the question “What cooking techniques are used for pasta dishes?”:

Complete JSON Response with Citations

{
  "id": "13c4c53d-f1b7-48d2-a6b3-b86ced3cb283",
  "object": "chat.completion",
  "created": 1748039761062,
  "model": "nimbus-001",
  "usage": {
    "prompt_tokens": 69,
    "completion_tokens": 640,
    "total_tokens": 1060
  },
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The video segments provided showcase various cooking techniques used to prepare a traditional Italian dish, specifically a family ragu. \n\nFirstly, the chef, Gennaro Contaldo, starts by heating olive oil in a pan, as seen in the segment from 00:00 to 00:20. This is a fundamental technique in many Italian recipes, where olive oil is used as a base for sautéing ingredients.\n\nThe next segment, from 00:20 to 00:40, demonstrates the technique of cutting and preparing meat. The chef cuts 500 grams of topside beef and 200 grams of sausages into smaller pieces, seasoning them with salt and pepper. This technique is essential in ensuring that the meat cooks evenly and absorbs the flavors of the dish.\n\nIn the segment from 00:40 to 01:00, the chef mixes and cooks the meat with bay leaves. The technique of breaking the bay leaves and mixing them with the meat allows the flavors to infuse evenly. The chef also emphasizes the importance of hearing the \"beautiful noise\" of the ingredients cooking together, which suggests that he values the sensory experience of cooking.\n\nThe segment from 01:20 to 01:40 shows the technique of deglazing the pan with wine. The chef adds wine to the pan, stating that one should not use cooking wine, but rather a good-quality wine that they would drink. This technique helps to loosen the browned bits from the bottom of the pan, adding depth and richness to the sauce.\n\nIn the segment from 01:40 to 02:00, the chef cooks the meat and onions, allowing the wine to evaporate and the onions to \"sweat\" rather than burn. This technique is crucial in developing the flavor of the dish, as it prevents the onions from becoming bitter and overpowering.\n\nThe chef explains the difference between \"sweating\" and \"burning\" in the segment from 02:00 to 02:20, using the analogy of jogging and sunburn to illustrate the concept. 
This technique is essential in cooking, as it helps to preserve the natural flavors and textures of the ingredients.\n\nThe segment from 02:40 to 03:00 demonstrates the technique of cooking with tomatoes. The chef dissolves the tomatoes in wine and then adds water to the pan, creating a rich and flavorful sauce. This technique is a staple of Italian cooking, where tomatoes are often used to add acidity and depth to sauces.\n\nFinally, the segments from 03:00 to 03:20 and 03:20 to 03:40 show the technique of slowly cooking the tomato sauce with basil. The chef stirs the sauce and allows it to simmer, creating a rich and flavorful sauce. The technique of cooking the sauce slowly over low heat helps to develop the flavors and textures of the ingredients, resulting in a delicious and authentic Italian dish.\n\nOverall, the video segments showcase a range of cooking techniques, including sautéing, deglazing, sweating, and slow cooking. These techniques are essential in creating a rich and flavorful Italian ragu, and demonstrate the importance of attention to detail and patience in cooking."
      },
      "citations": [
        {
          "collection_id": "d9c40ba5-19be-4180-817b-249927616c84",
          "file_id": "5d47c409-7c0d-45c4-88e3-da1e51f0e7b6",
          "text": "I am Genera Contaldo, and this is my family revel",
          "segment_id": "2b44d4e4-c1e2-4c37-aae0-dd30c17c1bf5",
          "start_time": 0,
          "end_time": 20,
          "visual_scene_description": [],
          "scene_text": [],
          "speech": [
            {
              "speaker": "0",
              "text": "I am Genera Contaldo, and this is my family revel.",
              "start_time": 10.16,
              "end_time": 14.32
            },
            {
              "speaker": "0",
              "text": "It's so good.",
              "start_time": 15.005,
              "end_time": 16.365
            },
            {
              "speaker": "0",
              "text": "Olive oil goes in.",
              "start_time": 17.165,
              "end_time": 18.605
            }
          ]
        },
        {
          "collection_id": "d9c40ba5-19be-4180-817b-249927616c84",
          "file_id": "5d47c409-7c0d-45c4-88e3-da1e51f0e7b6",
          "text": "500 gram topside of beef",
          "segment_id": "855594b8-145e-4ae1-a1ec-23123339f01f",
          "start_time": 20,
          "end_time": 40,
          "visual_scene_description": [],
          "scene_text": [],
          "speech": [
            {
              "speaker": "0",
              "text": "500 gram topside of beef.",
              "start_time": 20.045,
              "end_time": 22.845
            },
            {
              "speaker": "0",
              "text": "You can use brisket as well.",
              "start_time": 23.244999,
              "end_time": 24.925
            },
            {
              "speaker": "0",
              "text": "It's fantastic.",
              "start_time": 24.925,
              "end_time": 25.965
            },
            {
              "speaker": "0",
              "text": "Roughly, this is the size you want.",
              "start_time": 26.045,
              "end_time": 27.805
            },
            {
              "speaker": "0",
              "text": "I have a 500 gram of spare ribs, 200 grams of sausages.",
              "start_time": 28.26,
              "end_time": 33.54
            },
            {
              "speaker": "0",
              "text": "Just cut them in half.",
              "start_time": 33.54,
              "end_time": 35.06
            },
            {
              "speaker": "0",
              "text": "Little salt.",
              "start_time": 35.06,
              "end_time": 36.26
            },
            {
              "speaker": "0",
              "text": "Pepper.",
              "start_time": 37.54,
              "end_time": 38.26
            }
          ]
        }
      ]
    }
  ]
}

Key Citation Fields

Each citation provides detailed information about the source:

  • collection_id: ID of the collection containing the video
  • file_id: Unique identifier for the specific video file
  • segment_id: ID of the specific segment within the video
  • start_time / end_time: Precise timestamps in seconds
  • text: Short text snippet from the cited segment (in the example above, an excerpt of the transcript)
  • speech: Array of transcribed speech with speaker identification and timestamps
  • visual_scene_description: Visual content descriptions (when available)
  • scene_text: On-screen text detected in the segment

Using Citation Data

Citations enable powerful functionality in your applications:

// Process citations to create clickable timestamps
function createTimestampLinks(citations) {
  return citations.map((citation) => ({
    fileId: citation.file_id,
    startTime: citation.start_time,
    endTime: citation.end_time,
    description: citation.text,
    url: `https://your-video-player.com/watch?v=${citation.file_id}&t=${citation.start_time}`,
  }));
}

// Extract specific quotes with timestamps
function extractQuotes(citations) {
  const quotes = [];
  citations.forEach((citation) => {
    citation.speech?.forEach((speechItem) => {
      quotes.push({
        text: speechItem.text,
        speaker: speechItem.speaker,
        timestamp: speechItem.start_time,
        fileId: citation.file_id,
      });
    });
  });
  return quotes;
}

This citation system ensures that every response can be traced back to its original source, providing transparency and enabling users to explore the full context of the information.
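
Consecutive citations often cover back-to-back segments of the same file, as with the 0 to 20 second and 20 to 40 second segments in the response above. If you want one playable range per passage rather than one link per segment, a merge step along these lines may help. CitationLike and CitedRange are assumed shapes that model only the fields used here, and mergeCitedRanges is a hypothetical helper.

```typescript
// Assumed minimal shape: only the citation fields this sketch reads.
interface CitationLike {
  file_id: string;
  start_time: number; // seconds
  end_time: number; // seconds
}

interface CitedRange {
  fileId: string;
  start: number;
  end: number;
}

// Merge touching or overlapping segments of the same file into single ranges.
function mergeCitedRanges(citations: CitationLike[]): CitedRange[] {
  // Sort a copy by file, then by start time, so mergeable segments are adjacent.
  const sorted = citations.slice().sort((a, b) => {
    if (a.file_id !== b.file_id) return a.file_id < b.file_id ? -1 : 1;
    return a.start_time - b.start_time;
  });

  const merged: CitedRange[] = [];
  sorted.forEach((citation) => {
    const last = merged[merged.length - 1];
    if (last && last.fileId === citation.file_id && citation.start_time <= last.end) {
      // Extend the current range instead of starting a new one.
      last.end = Math.max(last.end, citation.end_time);
    } else {
      merged.push({
        fileId: citation.file_id,
        start: citation.start_time,
        end: citation.end_time,
      });
    }
  });
  return merged;
}
```

Segments separated by a gap stay separate, so a range never claims footage that was not cited.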

Best Practices

1. Design Effective Conversations

  • Start broad, then narrow: Begin with overview questions, then drill into specifics
  • Maintain context: Include relevant conversation history for follow-up questions
  • Use specific terminology: Culinary terms, technique names, and ingredient specifics yield better results

2. Optimize Search Parameters

  • Use force_search: true for accuracy when you need video-specific information
  • Apply metadata filters when you need to target specific subsets of your video collection
  • Set appropriate temperature: Lower (0.1-0.3) for factual responses, higher (0.5-0.7) for creative interpretations

3. Structure Complex Requests

// Instead of: "Tell me about pasta"
// Use this structured approach:
const structuredQuery = `
Analyze the pasta preparation techniques shown in these videos:
1. What are the key steps for each recipe?
2. Which techniques are emphasized by multiple chefs?
3. What timing considerations are mentioned?

Please organize your response with clear headings and include specific timestamps.
`;
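
If you build many prompts of this shape, a small template function keeps them consistent. The sketch below is only one possible way to do it; buildStructuredQuery and its parameters are hypothetical names introduced for illustration.

```typescript
// Assemble a task line, numbered sub-questions, and a closing format instruction.
function buildStructuredQuery(
  task: string,
  questions: string[],
  closingInstruction: string,
): string {
  const numbered = questions.map(
    (question, index) => `${index + 1}. ${question.trim()}`,
  );
  return [`${task.trim()}:`, ...numbered, "", closingInstruction.trim()].join("\n");
}

const pastaQuery = buildStructuredQuery(
  "Analyze the pasta preparation techniques shown in these videos",
  [
    "What are the key steps for each recipe?",
    "What timing considerations are mentioned?",
  ],
  "Please organize your response with clear headings and include specific timestamps.",
);
```

The resulting string can be sent as the content of a user message.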

4. Handle Multi-Video Collections

  • Reference specific videos when asking comparative questions
  • Use metadata filters to focus on relevant subsets of your collection
  • Aggregate insights by asking for patterns across multiple sources

5. Leverage Citation Information

  • Verify key facts by checking citation timestamps
  • Cross-reference information across multiple cited segments
  • Guide users to specific video moments for deeper learning
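
Checking citation timestamps can be partly automated. As a rough sketch, the code below finds bracketed labels such as [00:20 - 00:40] in an answer and reports those that do not overlap any cited segment. It assumes the bracketed mm:ss format seen in the sample output; answers that phrase times differently (for example "from 00:20 to 01:00") are not handled. All type and function names are hypothetical.

```typescript
// Assumed minimal shape: only the citation fields this sketch reads.
interface CitationLike {
  start_time: number; // seconds
  end_time: number; // seconds
}

interface MentionedRange {
  label: string;
  start: number;
  end: number;
}

// Find labels of the form "[mm:ss - mm:ss]" in the response text.
function findMentionedRanges(text: string): MentionedRange[] {
  const pattern = /\[(\d{1,3}):(\d{2}) - (\d{1,3}):(\d{2})\]/g;
  const found: MentionedRange[] = [];
  let match = pattern.exec(text);
  while (match !== null) {
    found.push({
      label: match[0],
      start: Number(match[1]) * 60 + Number(match[2]),
      end: Number(match[3]) * 60 + Number(match[4]),
    });
    match = pattern.exec(text);
  }
  return found;
}

// Return the mentioned ranges that do not overlap any cited segment.
function findUnsupportedRanges(
  text: string,
  citations: CitationLike[],
): MentionedRange[] {
  return findMentionedRanges(text).filter(
    (range) =>
      !citations.some(
        (citation) => range.start < citation.end_time && citation.start_time < range.end,
      ),
  );
}
```

An unsupported range is not proof of an error, but it is a reasonable signal to review that part of the answer against the video.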

Try it out

Ready to start building conversational video experiences? Check out our Chat Completion API to get started with Rich Transcript Collections.

Experiment with different question types:

  • Factual queries: “What ingredients are used?”
  • Technique analysis: “How do the chefs differ in their approach?”
  • Structured requests: “Create a recipe in markdown format”
  • Comparative questions: “Which method is more traditional?”

Get started on our platform and create your first Rich Transcript Collection today.

Advanced Integration Patterns

For production applications, consider implementing:

  • Conversation persistence: Store and resume chat sessions
  • Response caching: Cache common queries for better performance
  • Citation indexing: Build searchable citation databases for content discovery
  • Multi-collection queries: Search across different video collections simultaneously
  • Response validation: Implement confidence scoring based on citation quality
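
As an illustration of the response caching idea, here is a minimal in-memory sketch with a time-to-live. It is an assumption-laden example rather than a recommended design: cacheKey and AnswerCache are hypothetical names, the key ignores conversation history (so it only suits single-turn questions), and a production system would likely use a shared store instead of process memory.

```typescript
// Build a cache key from the collection ID and a normalized question
// (trimmed, lower-cased, with runs of whitespace collapsed).
function cacheKey(collectionId: string, question: string): string {
  const normalized = question.trim().toLowerCase().replace(/\s+/g, " ");
  return `${collectionId}::${normalized}`;
}

interface CacheEntry {
  answer: string;
  expiresAt: number; // epoch milliseconds
}

class AnswerCache {
  private entries: Record<string, CacheEntry> = {};
  private ttlMs: number;
  private now: () => number;

  // The clock is injectable so expiry can be tested deterministically.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Return the cached answer, or undefined if it is missing or expired.
  get(key: string): string | undefined {
    const entry = this.entries[key];
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      delete this.entries[key];
      return undefined;
    }
    return entry.answer;
  }

  set(key: string, answer: string): void {
    this.entries[key] = { answer, expiresAt: this.now() + this.ttlMs };
  }
}
```

A caller would check the cache before sending a request and store the answer text afterwards, keyed by collection and question.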