Segment Video
Create intelligent segments for video or audio files based on shot detection or narrative analysis.
Audio File Support:
- Audio files support narrative criteria only (shot detection is not available for audio).
- Audio files default to the ‘balanced’ strategy and may opt into ‘transcript’.
Note: YouTube URLs and audio files are supported for narrative-based segmentation only. Shot-based segmentation requires direct video file access. Use Cloudglue Files, HTTP URLs, or files from data connectors for shot-based segmentation.
Narrative Segmentation Strategies:
- comprehensive (default for non-YouTube/non-audio files): Uses a VLM to deeply analyze logical segments of video. Only available for video files (not YouTube or audio).
- balanced (default for YouTube videos and audio files): Balanced analysis approach using multiple modalities. Supports YouTube URLs and audio files.
- transcript: Cheap and fast speech-transcript-based segmentation. Requires a transcript and returns an error when none is available; use ‘balanced’ for silent or visual-only content (or ‘comprehensive’ for non-YouTube/non-audio video files).
YouTube URLs and Audio Files: Only the ‘balanced’ and ‘transcript’ strategies are accepted. ‘comprehensive’ will be rejected with an error.
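The defaults and restrictions above can be checked on the client before submitting a job. The following is a minimal sketch, not part of the API: the `source_kind` labels ("youtube", "audio", "video") and the function name are hypothetical and exist only for this example, while the strategy names come from the list above.

```python
from typing import Optional

# Strategy names as listed in the documentation above.
ALL_STRATEGIES = {"comprehensive", "balanced", "transcript"}

# Hypothetical labels for this sketch only; they are not API values.
RESTRICTED_SOURCES = {"youtube", "audio"}


def resolve_strategy(source_kind: str, requested: Optional[str] = None) -> str:
    """Return the narrative strategy to send, applying the documented defaults."""
    restricted = source_kind in RESTRICTED_SOURCES
    if requested is None:
        # YouTube and audio default to 'balanced'; other video files to 'comprehensive'.
        return "balanced" if restricted else "comprehensive"
    if requested not in ALL_STRATEGIES:
        raise ValueError(f"unknown strategy: {requested!r}")
    if restricted and requested == "comprehensive":
        # Fail locally instead of waiting for the API to reject the request.
        raise ValueError("'comprehensive' is not accepted for YouTube URLs or audio files")
    return requested
```

For example, `resolve_strategy("audio")` yields the 'balanced' default, while `resolve_strategy("audio", "comprehensive")` raises before any request is made.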
Chapter Count Parameters:
- number_of_chapters: Target number of chapters. If only this is provided, min_chapters and max_chapters are calculated automatically.
- min_chapters: Minimum number of chapters. If provided together with number_of_chapters and max_chapters, the request is validated so that min_chapters ≤ number_of_chapters ≤ max_chapters.
- max_chapters: Maximum number of chapters. If provided together with number_of_chapters and min_chapters, the request is validated so that min_chapters ≤ number_of_chapters ≤ max_chapters.
- If none are provided, chapter counts are calculated automatically based on file duration.
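The ordering rule for the chapter count parameters can be mirrored locally. This is a sketch under assumptions: the helper name is hypothetical, and it checks only the documented case where all three values are supplied. How the API treats other combinations is not stated here, so the sketch passes them through unchanged.

```python
from typing import Dict, Optional


def chapter_params(number_of_chapters: Optional[int] = None,
                   min_chapters: Optional[int] = None,
                   max_chapters: Optional[int] = None) -> Dict[str, int]:
    """Collect the chapter count parameters that were actually provided."""
    given = {
        "number_of_chapters": number_of_chapters,
        "min_chapters": min_chapters,
        "max_chapters": max_chapters,
    }
    # Omit unset values so the service can calculate them automatically.
    params = {name: value for name, value in given.items() if value is not None}
    # Documented validation: min <= number_of_chapters <= max when all three are set.
    if len(params) == 3 and not (min_chapters <= number_of_chapters <= max_chapters):
        raise ValueError(
            "expected min_chapters <= number_of_chapters <= max_chapters"
        )
    return params
```

Calling `chapter_params(number_of_chapters=5)` returns only that one key, leaving the minimum and maximum to be calculated by the service.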
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
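As an illustration of the header form described above, here is a small sketch that builds it. The function name is hypothetical; only the standard `Authorization` header and the `Bearer <token>` format are taken from the text.

```python
from typing import Dict


def auth_headers(token: str) -> Dict[str, str]:
    """Build the Authorization header in the form 'Bearer <token>'."""
    token = token.strip()
    if not token:
        raise ValueError("an auth token is required")
    return {"Authorization": f"Bearer {token}"}
```

The returned dictionary can be passed as the headers argument of whichever HTTP client is in use.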
Body
Segmentation job parameters
Input video or audio file URL. Supports URIs of files uploaded to Cloudglue Files endpoint, public HTTP URLs, YouTube URLs (narrative criteria only), public TikTok video URLs, public Loom share URLs, and files which have been granted access to Cloudglue via data connectors.
⚠️ Important: YouTube URLs and audio files are ONLY supported for narrative-based segmentation. Shot-based segmentation requires direct video file access and does not support YouTube URLs or audio files. For YouTube URLs and audio files with narrative segmentation, only the 'balanced' and 'transcript' strategies are accepted; the 'comprehensive' strategy will be rejected with an error. Public TikTok video URLs will be scraped and stored automatically, which is subject to charges. For files accessed via data connectors, see our documentation on data connectors for setup information.
Segmentation criteria. Allowed values: shot, narrative
- shot: Detect scene changes and shot boundaries using computer vision (not supported for YouTube URLs or audio files).
- narrative: Identify logical narrative segments and chapters using AI analysis (supports YouTube URLs and audio files).
Configuration for shot-based segmentation. Only provide when criteria is 'shot'.
Configuration for narrative-based segmentation. Only provide when criteria is 'narrative'.
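The rule that each configuration object accompanies only its own criteria can be sketched as a request body builder. The field names are not shown in this section, so `url`, `criteria`, `shot_config`, `narrative_config`, and the `strategy` key used below are assumptions for illustration and should be checked against the request schema.

```python
from typing import Any, Dict, Optional


def build_segment_body(url: str, criteria: str,
                       config: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Assemble a request body that carries only the config matching `criteria`."""
    if criteria not in ("shot", "narrative"):
        raise ValueError("criteria must be 'shot' or 'narrative'")
    # 'url' and 'criteria' are assumed field names.
    body: Dict[str, Any] = {"url": url, "criteria": criteria}
    if config is not None:
        # Assumed field names: one config object per criteria value.
        key = "shot_config" if criteria == "shot" else "narrative_config"
        body[key] = dict(config)
    return body


# Example: a narrative job with the 'balanced' strategy ('strategy' is an assumed key).
example_body = build_segment_body(
    "https://example.com/talk.mp4",
    "narrative",
    {"strategy": "balanced", "number_of_chapters": 5},
)
```

Keeping the choice of configuration key inside one function makes it impossible to send a shot configuration with narrative criteria, or the reverse.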
Response
Successful response
Unique identifier for the segment job
ID of the file this segment belongs to
Object type, always 'segments'
Current status of the segment job. Allowed values: pending, processing, completed, failed
Segment criteria used for this job. Allowed values: shot, narrative
Unix timestamp in milliseconds when the job was created
Configuration used for shot-based segmentation (only present when criteria is 'shot')
Configuration used for narrative-based segmentation (only present when criteria is 'narrative')
Total number of segments generated (only present when status is 'completed'). Range: x >= 0
Total number of shots in the original video (only present when criteria is 'shot'). Range: x >= 0
Total number of chapters in the video (only present when criteria is 'narrative'). Range: x >= 0
Array of generated segments (only present when status is 'completed')
Array of shots in the original video (only present when criteria is 'shot')
Array of narrative chapters in the video (only present when criteria is 'narrative')
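Because several response fields appear only once the job has completed, client code usually has to branch on the status first. The sketch below assumes the response is parsed into a dictionary with `status` and `segments` keys; those key names are not shown in this section and are assumptions, while the status values are the documented ones.

```python
from typing import Any, Dict, List

# Status values as documented for the segment job.
KNOWN_STATUSES = ("pending", "processing", "completed", "failed")


def segments_if_ready(job: Dict[str, Any]) -> List[Any]:
    """Return the generated segments once the job has completed, else an empty list."""
    status = job.get("status")  # assumed key name
    if status not in KNOWN_STATUSES:
        raise ValueError(f"unexpected status: {status!r}")
    if status == "failed":
        raise RuntimeError("segment job failed")
    if status != "completed":
        # 'pending' or 'processing': the segments array is not present yet.
        return []
    return list(job.get("segments", []))  # assumed key name
```

A caller would typically retry after a delay while this returns an empty list for a pending or processing job.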