Gemini Video Analysis

AI video analysis via Gemini. Captioning, Q&A, summarization, and content extraction. Supports uploads up to 20GB and YouTube URLs.

POST /v1/media/queue
$0.03/call

Overview

Video understanding via Gemini: captioning, Q&A, summarization, and content extraction from video.

| Property | Value |
| --- | --- |
| Model ID | google/gemini-video-analysis |
| Context Window | 1,000,000 tokens |
| Input Price | $0.20 / 1M tokens |
| Output Price | $0.80 / 1M tokens |
| Tokens per Second | ~300 (default), ~100 (low res) |
| Audio Tokens | 32 tokens/second |
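The figures in the table above can be combined into a rough input-size estimate before a job is submitted. The sketch below is illustrative only: `estimateVideoInput` and its options are hypothetical names, and it assumes that tokens scale linearly with video duration. The table does not say whether the ~300 tokens/second figure already includes audio, so audio is added only on request. The flat per-call price is left out because the page does not state how it combines with token pricing.

```javascript
// Figures copied from the table above; how they combine is an assumption.
const VIDEO_TOKENS_PER_SECOND = { default: 300, low: 100 };
const AUDIO_TOKENS_PER_SECOND = 32;
const INPUT_PRICE_PER_MILLION_USD = 0.2;
const CONTEXT_WINDOW_TOKENS = 1_000_000;

// Hypothetical helper: estimate input tokens and input cost for one video.
// addAudio defaults to false because the table does not state whether the
// video rate already covers the audio track.
function estimateVideoInput(durationSeconds, { resolution = 'default', addAudio = false } = {}) {
  if (!(resolution in VIDEO_TOKENS_PER_SECOND)) {
    throw new Error(`Unknown resolution: ${resolution}`);
  }
  const perSecond =
    VIDEO_TOKENS_PER_SECOND[resolution] + (addAudio ? AUDIO_TOKENS_PER_SECOND : 0);
  const tokens = Math.round(durationSeconds * perSecond);
  return {
    tokens,
    inputCostUsd: (tokens / 1_000_000) * INPUT_PRICE_PER_MILLION_USD,
    fitsContextWindow: tokens <= CONTEXT_WINDOW_TOKENS,
  };
}

// A 10-minute video at the default rate: 600 s * 300 tokens/s
console.log(estimateVideoInput(600).tokens); // 180000
```

Under these assumptions, a one-hour video at the default rate comes to about 1,080,000 tokens, which would exceed the listed context window, while the low-res rate brings it to about 360,000.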

Usage

All media models use the async job queue. Submit a job, then poll for the result.

Step 1: Submit Job

const res = await fetch('https://api.yepapi.com/v1/media/queue', {
  method: 'POST',
  headers: {
    'x-api-key': 'YOUR_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'google/gemini-video-analysis',
    prompt: 'Describe what happens in this video',
    imageData: {
      mimeType: 'video/mp4',
      base64: '<base64-encoded-video>',
    },
  }),
});
const { data } = await res.json();
// data.jobId — use this to poll for results

curl -X POST https://api.yepapi.com/v1/media/queue \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "google/gemini-video-analysis", "prompt": "Describe what happens in this video", "imageData": {"mimeType": "video/mp4", "base64": "<base64>"}}'

Step 2: Poll for Result

const status = await fetch(`https://api.yepapi.com/v1/media/status/${data.jobId}`, {
  headers: { 'x-api-key': 'YOUR_API_KEY' },
});
const { data: job } = await status.json();
// job.status — "pending" | "processing" | "completed" | "failed"
// job.result.text — analysis text when completed

curl https://api.yepapi.com/v1/media/status/JOB_ID \
  -H "x-api-key: YOUR_API_KEY"
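A single status request rarely catches a finished job, so in practice the status call is repeated until the job reaches "completed" or "failed". The following is a minimal sketch of such a loop. `waitForJob` is a hypothetical helper, and the 2-second interval and 60-attempt cap are illustrative defaults rather than documented values. The `fetchImpl` option exists only so the function can be exercised without network access.

```javascript
// Hypothetical helper: poll the status endpoint until the job finishes.
// intervalMs and maxAttempts are illustrative defaults, not documented values.
async function waitForJob(
  jobId,
  apiKey,
  { intervalMs = 2000, maxAttempts = 60, fetchImpl = fetch } = {}
) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await fetchImpl(`https://api.yepapi.com/v1/media/status/${jobId}`, {
      headers: { 'x-api-key': apiKey },
    });
    if (!res.ok) {
      throw new Error(`Status request failed with HTTP ${res.status}`);
    }
    const { data: job } = await res.json();
    if (job.status === 'completed') return job.result.text;
    if (job.status === 'failed') throw new Error(`Job ${jobId} failed`);
    // "pending" or "processing": wait before asking again
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Job ${jobId} did not finish after ${maxAttempts} attempts`);
}
```

With the job ID from Step 1, a call would look like `const text = await waitForJob(data.jobId, 'YOUR_API_KEY');`.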

Request Body

| Parameter | Type | Required | Description | Default |
| --- | --- | --- | --- | --- |
| model | string | Yes | google/gemini-video-analysis | |
| prompt | string | Yes | Analysis instruction or question | |
| imageData.mimeType | string | Yes | MIME type (e.g. video/mp4) | |
| imageData.base64 | string | Yes | Base64-encoded video data | |
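The `imageData` object can be built from raw video bytes with a small helper. The sketch below assumes Node.js (it uses the global `Buffer`). `buildImageData` is a hypothetical name, and the extension-to-MIME mapping is an assumption based on commonly used MIME types, since this page only confirms `video/mp4`. It also assumes that the 10MB inline limit listed under Input Methods counts raw bytes, with 1MB taken as 1024 * 1024 bytes.

```javascript
// Assumed extension-to-MIME mapping; this page only confirms video/mp4.
const MIME_BY_EXTENSION = new Map([
  ['mp4', 'video/mp4'],
  ['avi', 'video/x-msvideo'],
  ['mov', 'video/quicktime'],
  ['mkv', 'video/x-matroska'],
  ['webm', 'video/webm'],
  ['flv', 'video/x-flv'],
  ['mpeg', 'video/mpeg'],
  ['mpg', 'video/mpeg'],
  ['3gp', 'video/3gpp'],
]);

// Assumption: the inline limit applies to raw bytes, not the base64 string.
const MAX_INLINE_BYTES = 10 * 1024 * 1024;

// Hypothetical helper: turn video bytes into the imageData request field.
function buildImageData(bytes, filename) {
  const extension = filename.split('.').pop().toLowerCase();
  const mimeType = MIME_BY_EXTENSION.get(extension);
  if (!mimeType) {
    throw new Error(`Unsupported video extension: ${extension}`);
  }
  if (bytes.length > MAX_INLINE_BYTES) {
    throw new Error(`Video is ${bytes.length} bytes; inline limit is ${MAX_INLINE_BYTES}`);
  }
  return { mimeType, base64: Buffer.from(bytes).toString('base64') };
}
```

With a file on disk, the bytes could come from `fs.readFileSync('clip.mp4')`, and the returned object can be passed as `imageData` in the Step 1 request body.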

Input Methods

  • Inline data: up to 10MB
  • Supported formats: MP4, AVI, MOV, MKV, WebM, FLV, MPEG, 3GPP

Under the Hood

Requests are served directly by Google's Gemini API.
