Qwen3 235B VL Instruct
Multi-modal reasoning with text, images, and video snippets powered by Qwen3 235B Vision-Language intelligence
Backend online
0 characters
•
Input 0.50 /M tokens • Output 2.5 /M tokens
🔌 API Access
Integrate Qwen3 235B VL Instruct into your multi-modal workflows.
🔑 API Keys
Include an active API key with every request. Manage your API keys →
POST
/api/v1/generate; /api/v1/chat/completions
Chat with Qwen3 235B VL
Send text, image, or short video snippets (as extracted frames) inside the messages array. The studio automatically captures representative frames when you upload video clips.
Cost:
0.50 credits / 1M input tokens • 2.50 credits / 1M output tokens
Request (cURL)
Request (Python)
Request (JavaScript/Node.js)
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
model |
string | ✅ Yes | Use qwen3-235b-vl (default) |
messages |
array | ✅ Yes | Conversation turns where each content item can include {"type":"text"} or {"type":"image_url"} entries (video frames can be supplied the same way). |
temperature |
number | Optional | Controls creativity (0–2). Default 0.7. |
max_tokens |
number | Optional | Maximum response tokens (1–8192, default 2048). |
stream |
boolean | Optional | Default false. Set to true for SSE streaming responses. |
attachments |
array | Optional | When using the studio UI, the backend stores S3 references here. Direct API calls can inline HTTPS/base64 media instead. |