Qwen3 235B VL Instruct
Multi-modal reasoning with text, images, and video snippets powered by Qwen3 235B Vision-Language intelligence
🔌 API Access
Integrate Qwen3 235B VL Instruct into your multi-modal workflows.
🔑 API Keys
Include an active API key with every request.
/api/v1/generate; /api/v1/chat/completions
Chat with Qwen3 235B VL
Send text, images, or short video snippets (as extracted frames) inside the messages array. When you upload a video clip in the studio, it automatically captures representative frames.
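As a rough illustration of the message layout described above, the following Python sketch builds (but does not send) a chat request containing one text entry and one image entry. Several details are assumptions that the page does not confirm: the host in `BASE_URL`, the bearer-token `Authorization` header, and the OpenAI-style nesting `{"image_url": {"url": ...}}`. The helper name `build_chat_request` is hypothetical.

```python
import json
import urllib.request

# Hypothetical values: the page gives neither a host nor an auth scheme.
BASE_URL = "https://example.com"
API_KEY = "YOUR_API_KEY"


def build_chat_request(question, image_urls, max_tokens=2048, temperature=0.7):
    """Return an unsent urllib Request for /api/v1/chat/completions."""
    # One text entry followed by one image_url entry per image or video frame.
    content = [{"type": "text", "text": question}]
    for url in image_urls:
        # Assumed OpenAI-style nesting for the image URL.
        content.append({"type": "image_url", "image_url": {"url": url}})

    body = {
        "model": "qwen3-235b-vl",
        "messages": [{"role": "user", "content": content}],
        "temperature": temperature,
        "max_tokens": max_tokens,
        "stream": False,
    }
    return urllib.request.Request(
        BASE_URL + "/api/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed bearer-token scheme.
            "Authorization": "Bearer " + API_KEY,
        },
        method="POST",
    )


request = build_chat_request(
    "What is in this picture?", ["https://example.com/cat.png"]
)
# Sending is left commented out because the host above is a placeholder:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Video frames would be appended the same way, as additional `image_url` entries in the order they should be read.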
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| `model` | string | ✅ Yes | Use `qwen3-235b-vl` (default). |
| `messages` | array | ✅ Yes | Conversation turns where each content item can include `{"type":"text"}` or `{"type":"image_url"}` entries (video frames can be supplied the same way). |
| `temperature` | number | Optional | Controls creativity (0–2). Default 0.7. |
| `max_tokens` | number | Optional | Maximum response tokens (1–8192, default 2048). |
| `stream` | boolean | Optional | Default false. Set to true for SSE streaming responses. |
| `attachments` | array | Optional | When using the studio UI, the backend stores S3 references here. Direct API calls can inline HTTPS/base64 media instead. |
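For direct API calls that inline media as base64, one possible approach is to wrap the raw bytes in a data URI and pass that string wherever an image URL is expected. The sketch below uses only the Python standard library. The helper name `to_data_uri` and the sample filename are hypothetical, and the page does not state which image formats or size limits the service accepts.

```python
import base64
import mimetypes


def to_data_uri(data: bytes, filename: str) -> str:
    """Encode raw bytes as a base64 data URI, guessing the MIME type from the filename."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None:
        # Fall back to a generic type when the extension is unknown.
        mime_type = "application/octet-stream"
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# The first eight bytes of a PNG file stand in for a real video frame here.
frame_uri = to_data_uri(b"\x89PNG\r\n\x1a\n", "frame_001.png")
```

In practice the bytes would come from reading an image file or an extracted frame in binary mode.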
/api/v1/completions
Completions API (Raw Prompt)
Use the completions endpoint when you need full control over the prompt format. This is useful for custom templating or when working with special vision tokens like <|vision_start|><|image_pad|><|vision_end|>.
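A minimal sketch of assembling a raw prompt, assuming the simplest layout: all vision placeholders first, then the question text. The function name `build_completion_body` is hypothetical, and any chat template wrapped around the prompt is left out because the page does not specify one.

```python
VISION_PLACEHOLDER = "<|vision_start|><|image_pad|><|vision_end|>"


def build_completion_body(question, image_urls, max_tokens=2048):
    """Build a JSON-serializable body for /api/v1/completions."""
    # One placeholder per image, in the same order as the images array.
    prompt = VISION_PLACEHOLDER * len(image_urls) + question
    return {
        "model": "qwen3-235b-vl",
        "prompt": prompt,
        "images": list(image_urls),
        "max_tokens": max_tokens,
        "stop": ["<|im_end|>", "<|endoftext|>"],
    }


body = build_completion_body(
    "Compare the two images.",
    ["https://example.com/a.png", "https://example.com/b.png"],
)
```

Placing the placeholders elsewhere in the prompt (for example, interleaved with text) is a templating choice. What matters is that their count and order line up with the `images` array.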
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| `model` | string | ✅ Yes | Use `qwen3-235b-vl`. |
| `prompt` | string | ✅ Yes | Raw prompt string with vision tokens. Each `<\|vision_start\|><\|image_pad\|><\|vision_end\|>` corresponds to one image in the `images` array. |
| `images` | array | Optional | Array of image URLs (HTTP/HTTPS or data URIs). Must match the number of vision tokens in the prompt. |
| `temperature` | number | Optional | Controls creativity (0–2). Default 0.7. |
| `max_tokens` | number | Optional | Maximum response tokens (1–8192, default 2048). |
| `stop` | array or string | Optional | Stop sequences like `["<\|im_end\|>", "<\|endoftext\|>"]`. |
| `stream` | boolean | Optional | Default false. Set to true for SSE streaming responses. |
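The constraints in the table can be checked on the client before a request is sent. The sketch below is one way to do that. It treats both ends of each documented range as inclusive, which is an assumption, and `validate_completion_body` is a hypothetical helper, not part of any SDK.

```python
VISION_PLACEHOLDER = "<|vision_start|><|image_pad|><|vision_end|>"


def validate_completion_body(body):
    """Raise ValueError if the body violates the documented constraints."""
    for field in ("model", "prompt"):
        if not body.get(field):
            raise ValueError(f"missing required field: {field}")

    # Documented ranges; treating both ends as inclusive is an assumption.
    temperature = body.get("temperature", 0.7)
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    max_tokens = body.get("max_tokens", 2048)
    if not 1 <= max_tokens <= 8192:
        raise ValueError("max_tokens must be between 1 and 8192")

    # Each placeholder in the prompt must pair with one entry in images.
    placeholders = body["prompt"].count(VISION_PLACEHOLDER)
    images = body.get("images", [])
    if placeholders != len(images):
        raise ValueError(
            f"prompt has {placeholders} vision placeholders "
            f"but images has {len(images)} entries"
        )
    return body


checked = validate_completion_body({
    "model": "qwen3-235b-vl",
    "prompt": VISION_PLACEHOLDER + "Describe the image.",
    "images": ["https://example.com/cat.png"],
})
```

Failing early on the client gives a clearer error than a rejected request, though the server remains the authority on what it accepts.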
Important Notes
- The number of `<|vision_start|><|image_pad|><|vision_end|>` tokens must match the length of the `images` array.
- Images can be provided as HTTPS URLs or base64 data URIs.
- Duplicate image URLs are allowed if you want to reference the same image multiple times.
- Set `"stream": true` to receive Server-Sent Events (SSE) for real-time token streaming.
Streaming Response Format
When stream is set to true, responses are sent as Server-Sent Events (SSE):
data: {"id":"abc123","object":"text_completion","created":1234567890,"model":"qwen3-235b-vl","choices":[{"index":0,"text":" white","logprobs":null,"finish_reason":null,"matched_stop":null}],"usage":null}
data: {"id":"abc123","object":"text_completion","created":1234567890,"model":"qwen3-235b-vl","choices":[{"index":0,"text":" dragon","logprobs":null,"finish_reason":null,"matched_stop":null}],"usage":null}
data: {"id":"abc123","object":"text_completion","created":1234567890,"model":"qwen3-235b-vl","choices":[{"index":0,"text":"","logprobs":null,"finish_reason":"stop","matched_stop":151645}],"usage":null}
data: [DONE]
Each data: line contains a JSON object with incremental text in choices[0].text. The stream ends with data: [DONE].
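A minimal sketch of consuming this stream in Python: it reads lines, strips the `data:` prefix, stops at `[DONE]`, and joins the `choices[0].text` fragments. The function name `collect_stream_text` is hypothetical, and the sample lines are shortened versions of the events shown above. With a real connection, the iterable would be the decoded lines of the HTTP response.

```python
import json


def collect_stream_text(lines):
    """Concatenate choices[0].text from an iterable of SSE lines."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data field
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        parts.append(chunk["choices"][0]["text"])
    return "".join(parts)


sample = [
    'data: {"choices":[{"index":0,"text":" white","finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"text":" dragon","finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"text":"","finish_reason":"stop"}]}',
    "",
    "data: [DONE]",
]
text = collect_stream_text(sample)
```

To display tokens as they arrive, each fragment can be printed inside the loop as an alternative to collecting the fragments in a list.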