Phi-4 Multimodal

Phi
by Microsoft

Open multimodal model that excels at high-quality reasoning over image and audio inputs.

Free Endpoint | May 23, 2025 | MIT license | 244K API calls / 30d

Specifications

Context Length
8K
8,192 tokens
Input Price
Free
per 1M tokens
Output Price
Free
per 1M tokens
Modalities
3
text→text, image+text→text, audio→text

Capabilities

text generation, chat, vision, audio, reasoning

Tags

speech-recognition, multimodal, vision, audio

API Usage

1Router is fully OpenAI-compatible. Just set the base URL and use this model ID:

microsoft/phi-4-multimodal-instruct
cURL
curl https://api.1router.com/v1/chat/completions \
  -H "Authorization: Bearer $ROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "microsoft/phi-4-multimodal-instruct",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
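
The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names `build_payload` and `chat` are hypothetical, the response is assumed to follow the OpenAI chat-completions shape, and the optional image input assumes the endpoint accepts OpenAI-style `image_url` content parts, which this page does not confirm.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.1router.com/v1"
MODEL_ID = "microsoft/phi-4-multimodal-instruct"


def build_payload(prompt, image_url=None):
    """Build a chat-completions request body (hypothetical helper).

    With image_url set, the user message uses OpenAI-style content parts;
    whether this endpoint accepts them is an assumption.
    """
    if image_url is None:
        content = prompt
    else:
        content = [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ]
    return {"model": MODEL_ID, "messages": [{"role": "user", "content": content}]}


def chat(prompt, image_url=None, timeout=60):
    """POST the payload and return the first choice's text (hypothetical helper)."""
    request = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_payload(prompt, image_url)).encode("utf-8"),
        headers={
            # Same environment variable as the cURL example above.
            "Authorization": f"Bearer {os.environ['ROUTER_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumed response shape: OpenAI chat-completions format.
    return body["choices"][0]["message"]["content"]
```

With `ROUTER_API_KEY` set, calling `chat("Hello!")` would send the same body as the cURL example. Keep the prompt and the expected reply within the 8,192-token context length listed above.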