by Microsoft
Phi-4 Multimodal
Phi
Cutting-edge open multimodal model excelling in high-quality reasoning from image and audio inputs.
Free Endpoint · May 23, 2025 · MIT license · 244K API calls / 30d
Specifications
Context Length
8K
8,192 tokens
Input Price
Free
per 1M tokens
Output Price
Free
per 1M tokens
Modalities
3
text→text, image+text→text, audio→text
Capabilities
text generation, chat, vision, audio, reasoning
Tags
speech-recognition, multimodal, vision, audio
API Usage
1Router is fully OpenAI-compatible. Set the base URL to https://api.1router.com/v1 and use this model ID:
microsoft/phi-4-multimodal-instruct
cURL
curl https://api.1router.com/v1/chat/completions \
  -H "Authorization: Bearer $ROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "microsoft/phi-4-multimodal-instruct",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
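Python
The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names `build_chat_request` and `send_chat_request` are hypothetical, the endpoint URL, model ID, and `ROUTER_API_KEY` variable are taken from the cURL example, and the response is assumed to follow the OpenAI chat completions shape (`choices[0].message.content`), which this page does not show.

```python
import json
import os
import urllib.request

# Values taken from the cURL example; the base URL is the endpoint
# minus the "/chat/completions" path.
BASE_URL = "https://api.1router.com/v1"
MODEL_ID = "microsoft/phi-4-multimodal-instruct"


def build_chat_request(prompt, api_key, base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) a POST request for the chat completions endpoint.

    Hypothetical helper: it mirrors the headers and JSON body of the cURL call.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request, timeout=30):
    """Send the request and return the decoded JSON response as a dict."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    api_key = os.environ.get("ROUTER_API_KEY")
    if api_key:
        reply = send_chat_request(build_chat_request("Hello!", api_key))
        # Assumption: OpenAI-style response layout.
        print(reply["choices"][0]["message"]["content"])
    else:
        print("Set ROUTER_API_KEY to send a live request.")
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body before any network call is made. Prompts plus the generated reply need to fit within the 8,192-token context length listed above.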