April 28, 2026
v0.7

Cloudglue v0.7 is here

This release introduces the Deep Search API, a new transcript-only narrative segmentation strategy, the nimbus-002-preview model on Responses, programmatic Data Connectors access, TikTok URL support, word-level transcript timestamps, and a wave of describe and segmentation improvements.

New Features

  • Deep Search API — Run multi-step, citation-grounded research queries across a collection, a specific list of files, or your account’s default index, fanning out across speech, visual, OCR, and tag modalities to surface answers with traceable sources. Try it in the playground →
  • nimbus-002-preview Model on Responses — A new preview model for Responses with stronger long-form reasoning over your videos, available alongside the existing default model.
  • Transcript-Only Narrative Segmentation — New transcript option for narrative_config.strategy provides fast, low-cost chapter segmentation derived purely from speech transcripts. Ideal for podcasts, lectures, and dialogue-heavy content. Joins the existing comprehensive (VLM-based) and balanced (multi-modal, default) strategies. Learn more →
  • Data Connectors API — Programmatically list connected data sources and browse their files directly from the API. Manage connections from the dashboard →
  • TikTok URL Support — Process TikTok videos directly by passing the URL to any Cloudglue endpoint, alongside YouTube and other public URL sources.
  • Word-Level Transcript Timestamps — New include_word_timestamps query parameter on transcribe and describe endpoints returns a words array with per-word start_time and end_time for high-precision alignment.
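
As an illustration of what per-word timing makes possible, the sketch below groups a words array into short caption cues. The start_time and end_time fields come from the note above; the word text key, the use of seconds as the unit, and the sample payload are assumptions made for illustration and may not match the actual response.

```python
from typing import Dict, List


def group_words_into_cues(words: List[Dict], max_words: int = 3) -> List[Dict]:
    """Group consecutive words into cues of at most max_words words each."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        cues.append({
            # "word" is an assumed key for the token text.
            "text": " ".join(w["word"] for w in chunk),
            "start_time": chunk[0]["start_time"],
            "end_time": chunk[-1]["end_time"],
        })
    return cues


# Hypothetical sample data, not an actual API response.
sample_words = [
    {"word": "Welcome", "start_time": 0.0, "end_time": 0.4},
    {"word": "to", "start_time": 0.4, "end_time": 0.5},
    {"word": "the", "start_time": 0.5, "end_time": 0.6},
    {"word": "show", "start_time": 0.6, "end_time": 1.0},
]
cues = group_words_into_cues(sample_words)
```

Each cue spans from the start of its first word to the end of its last word, so captions can be aligned without re-estimating timing from segment-level timestamps.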

Improvements

  • Subtitle & Transcript Response Formats — New response_format options for speech describe: speech_srt and speech_vtt for subtitle files, speech_markdown for diarized transcripts, and speech_text for plain timestamped text.
  • Chapters & Shots in Describe Responses — New include_chapters and include_shots query parameters return narrative chapter boundaries (when the segmentation strategy is narrative) or detected shot boundaries (when the strategy is shot-detector) directly on describe and related responses.
  • Thumbnails on Describe, Extract & Segmentations — New include_thumbnails parameter adds file-level and per-segment thumbnail_url fields directly on describe, extract, and segmentation responses.
  • Fill Gaps for Shot-Based Segmentation — New fill_gaps option (default true) ensures complete timeline coverage when using the shot-detector strategy. Set to false to preserve only the raw detected shot boundaries. Learn more →
  • Expanded Speech Format Support — Broader audio/speech format coverage across upload and processing.
  • HLS Stream URLs on Shareable Assets — Shareable asset responses now include an HLS stream URL for adaptive playback in custom players.
  • Build with AI Hub — A new starting point for agent-driven development, bundling Cloudglue Skills (version-locked SDK knowledge for coding agents) and a hosted Docs MCP Server that exposes the documentation as live tools.
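
To make the fill_gaps option above more concrete, here is a client-side sketch of what complete timeline coverage can mean: every stretch of the timeline not covered by a detected shot becomes its own segment. How Cloudglue actually fills gaps (for example, by inserting segments or by extending neighboring ones) is not stated here, so the function below and its (start, end) tuple representation are assumptions, not the service's implementation.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_seconds, end_seconds), an assumed shape


def fill_gaps(shots: List[Segment], duration: float) -> List[Segment]:
    """Return segments covering [0, duration], adding one segment per gap.

    Assumes the input shots do not overlap each other.
    """
    covered: List[Segment] = []
    cursor = 0.0
    for start, end in sorted(shots):
        if start > cursor:
            covered.append((cursor, start))  # uncovered stretch before this shot
        covered.append((start, end))
        cursor = max(cursor, end)
    if cursor < duration:
        covered.append((cursor, duration))  # trailing stretch after the last shot
    return covered


# Hypothetical raw shot boundaries for a 12-second video.
raw_shots = [(2.0, 5.0), (7.5, 10.0)]
full = fill_gaps(raw_shots, duration=12.0)
```

With fill_gaps set to false, the equivalent result would be just the raw_shots list, leaving the uncovered stretches out of the timeline.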

Breaking Changes

  • JavaScript SDK package renamed — The npm package moved from @aviaryhq/cloudglue-js to @cloudglue/cloudglue-js. Update your package.json to install from the new package name; the API surface is unchanged.

SDK Updates

All v0.7 features are available in the latest SDKs.
February 11, 2026
v0.6

Cloudglue v0.6 is here

This release introduces the Responses API, shareable assets, audio file support, transcript-only extraction, and a wave of new endpoints and improvements across the platform.

New Features

  • Responses API — OpenAI Responses-compatible interface for multi-turn conversations with video collections. Supports system instructions, temperature control, and rich citations with timestamps. Try it in the playground → Quickstart guide →
  • Shareable Assets — Create shareable and embeddable links for videos and video segments. Share media programmatically via the API or from the file manager in the web app. Learn more →
  • Audio File Support — Upload and process audio files (mp3, m4a, etc.) alongside video. Full support across describe, extract, collections, search, and shareable assets.
  • Transcript Mode for Extract — New enable_transcript_mode option for extract jobs to operate purely on transcript text, skipping full audio/video processing for faster, cheaper entity extraction.

Improvements

  • Hybrid Search — Combine multiple search modalities (general_content, speech_lexical, ocr_lexical, tag_semantic, tag_lexical) in a single query for more comprehensive results. Results are fused across modalities automatically.
  • Narrative Segmentation for On-Demand & Collections — The narrative segmentation strategy is now available for on-demand operations (describe, extract) and collections, allowing scenes to be organized by narrative chapters for search, chat, and other applications.
  • Modality Filtering for Describe — Filter describe output by specific modalities (visual, speech, OCR, audio) for targeted results.
  • List & Retrieve Chat Completions — New endpoints to list and fetch previous chat completions by ID. Get by ID →
  • Listing Operations for Jobs — New list endpoints for face detection and face match jobs with pagination and filtering support. Face match list →
  • Segment Describe Endpoints — Retrieve describe results at the individual segment level. Get segment describe →
  • Collection Media Upload — New endpoint to add media directly to collections via URL.
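
The Hybrid Search note above does not say how results are fused across modalities. As one common way to merge several ranked lists, the sketch below uses reciprocal rank fusion (RRF). This is a stand-in technique chosen for illustration, not a description of Cloudglue's method, and the segment IDs are made up.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: Dict[str, List[str]], k: int = 60) -> List[str]:
    """Merge per-modality ranked ID lists into one list using RRF scores."""
    scores = defaultdict(float)
    for ranked_ids in rankings.values():
        for rank, item_id in enumerate(ranked_ids, start=1):
            # Items ranked near the top of any list contribute more.
            scores[item_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by ID to keep the output stable.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


# Hypothetical ranked results, keyed by modality names from the release note.
per_modality = {
    "general_content": ["seg_a", "seg_b", "seg_c"],
    "speech_lexical": ["seg_b", "seg_a"],
    "ocr_lexical": ["seg_b", "seg_d"],
}
fused = reciprocal_rank_fusion(per_modality)
```

An item that appears in several modalities (seg_b here) rises above items that rank well in only one, which is the general intuition behind combining modalities in a single query.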

Breaking Changes

  • Video-level and segment-level entity extraction are now mutually exclusive: enable_video_level_entities and enable_segment_level_entities can no longer both be true in the same extract job. Segment-level extraction remains the default.
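
A client-side guard can catch this before a job is submitted. The sketch below checks a plain dictionary of extract options; the two flag names come from the note above, while the function name and the dictionary shape are hypothetical and not part of any SDK.

```python
from typing import Any, Dict


def validate_extract_flags(config: Dict[str, Any]) -> None:
    """Raise ValueError if both entity levels are explicitly enabled."""
    video_level = config.get("enable_video_level_entities") is True
    segment_level = config.get("enable_segment_level_entities") is True
    if video_level and segment_level:
        raise ValueError(
            "enable_video_level_entities and enable_segment_level_entities "
            "cannot both be true in the same extract job"
        )


# Accepted: only one level is enabled.
validate_extract_flags({
    "enable_video_level_entities": True,
    "enable_segment_level_entities": False,
})
```

The check only rejects the case where both flags are explicitly true; how the service treats an omitted flag is left to the API and is not modeled here.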

SDK Updates

All v0.6 features are available in the latest SDKs.
December 22, 2025
v0.5

Cloudglue v0.5 is here

This release introduces user-defined tags, enhanced segment metadata, keyframe thumbnails, and improved job management APIs.

New Features

  • User-Defined Tags - Create and manage custom tags to organize your video content. Tags can be applied to both files and segments, making it easy to categorize and retrieve content based on your own taxonomy. View all tag endpoints →
  • Tag-Based Search - Search your files and segments by tags in addition to semantic search. Filter results by user-defined tags at both the file and segment level. Available as a search modality in the playground. Search files by tags →, Search segments by tags →
  • Segment Metadata - Add and update custom structured metadata on individual video segments. Store additional context, annotations, or custom attributes directly on segments for richer data organization. Segment metadata is also available to filter by in search, providing rich structured querying capabilities to your searches. Get segment details →

Improvements

SDK Updates

All v0.5 features are available in the latest SDKs.
November 26, 2025
v0.4

Cloudglue v0.4 is here

We’re excited to share the latest major release with powerful new capabilities for face analysis, enhanced segmentation, and improved search functionality.

New Features

  • Face Analysis Collection & Face Search - Detect and search for faces in your videos with our new face-analysis collection type. Create collections specifically for face detection, then search across your video library using face matching. Perfect for identifying speakers, tracking individuals, or finding specific people across multiple videos. Try it in our playground →
  • Audio Descriptions - Generate detailed audio descriptions for your videos with the new enable_audio_description option in describe jobs. Get comprehensive descriptions of sounds, music, and audio events alongside visual and speech analysis.
  • New Segment Options - Segment with narrative (comprehensive and balanced) or shot-based strategies, with full advanced options for fine-grained control like min/max parameters. “Narrative” is perfect for generating video chapters. “Shot-based” is perfect for capturing transitional shots.
    • When paired with a search collection for search use cases, you can additionally retrieve moments aligned to your segment strategy.
  • Multi-Modal Search - Search your video content across more modalities: file-level summaries, segment-level content, and now face-based image matching. Enhanced with powerful programmability features including score thresholding to filter results, group by file for organized results, and flexible sort options (by relevance score or item count) for better control over search results.
  • Data Connectors: Google Drive, Zoom, Recall.ai & Gong - Connect your Google Drive, Zoom, Recall.ai, and Gong accounts directly to Cloudglue to process your files and meeting recordings. Skip manual file uploads and seamlessly integrate your cloud storage and recorded calls with all Cloudglue features including transcription, extraction, and search. Learn about Google Drive → Learn about Zoom → Learn about Recall.ai → Learn about Gong →
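
The Multi-Modal Search controls above (score thresholding, grouping by file, and sorting by relevance score or item count) can be pictured with a small client-side sketch. The hit fields file_id and score, the sort_by values, and the sample data are assumptions for illustration; the real request parameters and response shape may differ.

```python
from collections import defaultdict
from typing import Dict, List


def filter_group_sort(hits: List[Dict], min_score: float,
                      sort_by: str = "score") -> List[Dict]:
    """Drop hits below min_score, group the rest by file, then sort groups."""
    grouped = defaultdict(list)
    for hit in hits:
        if hit["score"] >= min_score:  # score thresholding
            grouped[hit["file_id"]].append(hit)
    groups = [
        {
            "file_id": file_id,
            "best_score": max(h["score"] for h in items),
            "item_count": len(items),
        }
        for file_id, items in grouped.items()
    ]
    key = "item_count" if sort_by == "item_count" else "best_score"
    groups.sort(key=lambda g: g[key], reverse=True)
    return groups


# Hypothetical search hits.
hits = [
    {"file_id": "file_1", "score": 0.9},
    {"file_id": "file_1", "score": 0.2},
    {"file_id": "file_2", "score": 0.7},
    {"file_id": "file_2", "score": 0.6},
]
by_score = filter_group_sort(hits, min_score=0.5)
by_count = filter_group_sort(hits, min_score=0.5, sort_by="item_count")
```

Sorting by score puts the file with the single strongest match first, while sorting by item count favors the file with the most matches above the threshold.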

Improvements

  • Time-Based Filtering for Describe Output - Filter describe job results by specific time ranges using start_time_seconds and end_time_seconds parameters. Perfect for analyzing specific portions of longer videos without processing the entire file.
  • Pagination for Extract Jobs - Extract jobs now support pagination with limit and offset parameters, making it easier to retrieve large numbers of extracted entities in manageable chunks.
  • Filter Parameters for Collection Files - List collection files with powerful filtering capabilities. Filter by metadata, video properties (duration, audio presence), and file attributes (filename, size, creation date) using flexible query operators.
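
The limit and offset parameters for extract jobs lend themselves to a simple paging loop. In the sketch below, fetch_page is a hypothetical callable standing in for whatever API call returns one page of extracted entities; only the limit and offset names come from the note above.

```python
from typing import Callable, Dict, Iterator, List


def iter_all(fetch_page: Callable[[int, int], List[Dict]],
             limit: int = 50) -> Iterator[Dict]:
    """Yield every item by requesting pages until a short page is returned."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:  # a short (or empty) page means no more data
            break
        offset += limit


# Stand-in for a real API call: serves slices of an in-memory list.
_FAKE_ENTITIES = [{"id": i} for i in range(7)]


def fake_fetch(limit: int, offset: int) -> List[Dict]:
    return _FAKE_ENTITIES[offset:offset + limit]


all_entities = list(iter_all(fake_fetch, limit=3))
```

Stopping on the first short page avoids needing a total count in the response, at the cost of one extra request when the number of items is an exact multiple of limit.
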
September 24, 2025
v0.3.0

Cloudglue v0.3.0 is here

Welcome to the first official update of the Cloudglue changelog. We’re excited to share the latest features we’ve been cooking up for you all.

New Features

  • Scene Segmentation - Break your videos into meaningful segments automatically using AI-powered shot detection or prompt-driven narrative chapters. Perfect for finding natural boundaries within your videos. Try it in our playground →
  • Video Search - Search your video content directly with natural language queries at both the video and segment level, without going through chat completions. Find specific moments, topics, or conversations across your entire video library with semantic search. Try it in our playground →
  • Data Connectors (AWS S3, Dropbox) - Connect your AWS S3 buckets or Dropbox account directly to Cloudglue and skip manual file uploads entirely. Use your existing S3 URIs and Dropbox files with all our endpoints. Try it in your dashboard →
  • Thumbnails - Generate visual previews for your video segments automatically. Get thumbnail images for key moments in your videos to enhance your applications and user interfaces.