Gemini Image Analysis
AI image analysis via Gemini. Object detection, segmentation, captioning, classification, and visual Q&A.
POST /v1/media/queue
$0.02/call
Overview
Image understanding via Gemini. Object detection, segmentation, captioning, classification, and visual Q&A.
| Property | Value |
|---|---|
| Model ID | google/gemini-image-analysis |
| Context Window | 1,000,000 tokens |
| Input Price | $0.20 / 1M tokens |
| Output Price | $0.80 / 1M tokens |
| Tokens per Image | ~258 tokens (at 384px or smaller) |
Usage
All media models use the async job queue. Submit a job, then poll for the result.
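Because results arrive asynchronously, a client usually repeats the status call until the job reaches a terminal state. The following is a minimal sketch of that loop, assuming the status values listed under Step 2 ("pending", "processing", "completed", "failed"). The helper names `waitForJob` and `makeStatusFetcher`, the 2-second interval, and the 30-attempt limit are hypothetical choices for illustration, not part of the API.

```javascript
// Hypothetical helper: waits for a media job to reach a terminal state.
// fetchStatus is any async function that resolves to the job object
// returned by the status endpoint; intervalMs and maxAttempts are illustrative.
async function waitForJob(fetchStatus, { intervalMs = 2000, maxAttempts = 30 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const job = await fetchStatus();
    if (job.status === 'completed') return job;
    if (job.status === 'failed') throw new Error('Job failed');
    // "pending" or "processing": wait before asking again.
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Job did not finish after ${maxAttempts} status checks`);
}

// Status fetcher built from the endpoint and header documented on this page.
function makeStatusFetcher(jobId, apiKey) {
  return async () => {
    const res = await fetch(`https://api.yepapi.com/v1/media/status/${jobId}`, {
      headers: { 'x-api-key': apiKey },
    });
    if (!res.ok) throw new Error(`Status request failed: HTTP ${res.status}`);
    const { data } = await res.json();
    return data;
  };
}
```

With a `jobId` from the submit call, `await waitForJob(makeStatusFetcher(jobId, 'YOUR_API_KEY'))` would resolve to the completed job, whose `result.text` holds the analysis. Passing the fetcher in as a function keeps the loop easy to exercise with a stub.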
Step 1: Submit Job
const res = await fetch('https://api.yepapi.com/v1/media/queue', {
  method: 'POST',
  headers: {
    'x-api-key': 'YOUR_API_KEY',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'google/gemini-image-analysis',
    prompt: 'What objects are in this image? Return bounding boxes.',
    imageData: {
      mimeType: 'image/png',
      base64: '<base64-encoded-image>',
    },
  }),
});
const { data } = await res.json();
// data.jobId: use this to poll for results

curl -X POST https://api.yepapi.com/v1/media/queue \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "google/gemini-image-analysis", "prompt": "What objects are in this image?", "imageData": {"mimeType": "image/png", "base64": "<base64>"}}'

Step 2: Poll for Result
const status = await fetch(`https://api.yepapi.com/v1/media/status/${data.jobId}`, {
  headers: { 'x-api-key': 'YOUR_API_KEY' },
});
const { data: job } = await status.json();
// job.status: "pending" | "processing" | "completed" | "failed"
// job.result.text: analysis text when completed

curl https://api.yepapi.com/v1/media/status/JOB_ID \
  -H "x-api-key: YOUR_API_KEY"

Request Body
| Parameter | Type | Required | Description | Default |
|---|---|---|---|---|
| model | string | Yes | google/gemini-image-analysis | none |
| prompt | string | Yes | Analysis instruction or question | none |
| imageData.mimeType | string | Yes | MIME type of the image (e.g. image/png) | none |
| imageData.base64 | string | Yes | Base64-encoded image data | none |
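The parameters above can be assembled from raw image bytes with a small helper. This is a sketch under stated assumptions: it targets Node.js (where `Buffer` is a global), the function name `buildAnalysisBody` is hypothetical, and it assumes the API expects plain Base64 without a `data:` URI prefix, which this page does not state explicitly.

```javascript
// Hypothetical helper: builds the JSON body for POST /v1/media/queue
// from raw image bytes. Assumes Node.js, where Buffer is a global.
function buildAnalysisBody(imageBytes, mimeType, prompt) {
  if (typeof prompt !== 'string' || prompt.length === 0) {
    throw new Error('prompt is required');
  }
  if (typeof mimeType !== 'string' || !mimeType.startsWith('image/')) {
    throw new Error('mimeType must be an image MIME type, e.g. image/png');
  }
  return {
    model: 'google/gemini-image-analysis',
    prompt,
    imageData: {
      mimeType,
      // Plain Base64 with no "data:" prefix (an assumption, see lead-in).
      base64: Buffer.from(imageBytes).toString('base64'),
    },
  };
}
```

The returned object can be passed to `JSON.stringify` and sent as the request body shown in Step 1; the bytes could come from reading a file or from an upload.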
Token Consumption
- 258 tokens per image when both dimensions are 384px or smaller
- Larger images are tiled into 768x768 sections (258 tokens each)
- Up to 3,600 images per request
- Max 10MB inline request size
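The rules above can be turned into a rough pre-flight estimate. The sketch below is an approximation, not the provider's exact accounting: it assumes tiles are counted on the original pixel dimensions with no rescaling or cropping first, reads "10MB" as 10 MiB, and ignores the prompt and JSON overhead in the size check. All function and constant names are hypothetical.

```javascript
const TOKENS_PER_TILE = 258;
const SMALL_IMAGE_MAX_PX = 384;
const TILE_SIZE_PX = 768;
const MAX_INLINE_BYTES = 10 * 1024 * 1024; // "10MB" read as 10 MiB (assumption)

// Rough estimate: small images cost one unit; larger ones cost one unit
// per 768x768 tile needed to cover the original dimensions.
function estimateImageTokens(width, height) {
  if (width <= SMALL_IMAGE_MAX_PX && height <= SMALL_IMAGE_MAX_PX) {
    return TOKENS_PER_TILE;
  }
  const tilesAcross = Math.ceil(width / TILE_SIZE_PX);
  const tilesDown = Math.ceil(height / TILE_SIZE_PX);
  return tilesAcross * tilesDown * TOKENS_PER_TILE;
}

// Base64 encodes every 3 bytes as 4 characters, so the inline payload
// is about a third larger than the raw file.
function base64Length(byteLength) {
  return Math.ceil(byteLength / 3) * 4;
}

// Lower-bound check: only the encoded image is counted, not the rest
// of the request.
function fitsInline(byteLength) {
  return base64Length(byteLength) <= MAX_INLINE_BYTES;
}
```

Under these assumptions a 1536x768 image needs two tiles, so about 516 tokens, and a file of roughly 7.5 MiB is the largest that still fits inline once Base64-encoded.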
Supported Formats
PNG, JPEG, WEBP, HEIC/HEIF.
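When images come from files, the `imageData.mimeType` value can be derived from the file extension. The lookup below is a minimal sketch covering only the formats listed above; the helper name `mimeTypeForFile` is hypothetical, and relying on the extension rather than inspecting file contents is a simplification.

```javascript
// Extension-to-MIME lookup for the supported formats.
const MIME_BY_EXTENSION = new Map([
  ['png', 'image/png'],
  ['jpg', 'image/jpeg'],
  ['jpeg', 'image/jpeg'],
  ['webp', 'image/webp'],
  ['heic', 'image/heic'],
  ['heif', 'image/heif'],
]);

// Returns the MIME type for a filename, or throws if the extension
// is not one of the supported image formats.
function mimeTypeForFile(filename) {
  const dot = filename.lastIndexOf('.');
  const ext = dot === -1 ? '' : filename.slice(dot + 1).toLowerCase();
  const mime = MIME_BY_EXTENSION.get(ext);
  if (!mime) throw new Error(`Unsupported image format: ${filename}`);
  return mime;
}
```

Rejecting unknown extensions before upload avoids queueing a job that would likely fail on the server side.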
Under the Hood
Powered by Google's Gemini API directly.