# Model Capabilities Reference for Chat UI

**Last Updated:** 2026-04-30  
**Purpose:** Quick reference for which OpenRouter models accept image input (vision) and which are text-only

---

## ✅ VERIFIED VISION-CAPABLE MODELS

| Model ID | Provider | Context | Cost/1M (in/out) | Vision Quality | Coding Quality |
|----------|----------|---------|------------------|----------------|----------------|
| `google/gemma-3-27b-it:free` | Google | 131K | FREE | Good | Good |
| `qwen/qwen3.5-flash-02-23` | Alibaba | 1M | $0.07/$0.26 | Very Good | Very Good |
| `qwen/qwen3-235b-a22b-2507` | Alibaba | 262K | $0.07/$0.10 | Very Good | Excellent |
| `deepseek/deepseek-v4-flash` | DeepSeek | 1M | $0.14/$0.28 | Good | Excellent |
| `google/gemini-2.0-flash-001` | Google | 1M | $0.10/$0.40 | Excellent | Excellent |
| `google/gemini-2.0-flash-lite-001` | Google | 1M | $0.07/$0.30 | Very Good | Very Good |
| `anthropic/claude-sonnet-4` | Anthropic | 1M | $3.00/$15.00 | Outstanding | Outstanding |
| `anthropic/claude-haiku-3.5` | Anthropic | 200K | $1.00/$5.00 | Excellent | Outstanding |
| `openai/gpt-4o` | OpenAI | 128K | $2.50/$10.00 | Outstanding | Outstanding |
| `google/gemini-pro-1.5` | Google | 1M | $2.00/$12.00 | Outstanding | Outstanding |
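
In the Cost/1M column, the first figure is the price per million input tokens and the second is the price per million output tokens. The sketch below shows the arithmetic for comparing two rows. `estimate_cost` is a hypothetical helper written for this document, and the token counts are illustrative, not measured.

```bash
# Estimate the cost in USD of one workload.
# Arguments: input tokens, output tokens,
#            input price per 1M tokens, output price per 1M tokens.
estimate_cost() {
  awk -v in_tok="$1" -v out_tok="$2" -v in_price="$3" -v out_price="$4" \
    'BEGIN { printf "%.4f\n", (in_tok / 1000000) * in_price + (out_tok / 1000000) * out_price }'
}

# Illustrative workload: 200,000 input tokens and 50,000 output tokens.
estimate_cost 200000 50000 0.07 0.26    # prices from the qwen/qwen3.5-flash-02-23 row
estimate_cost 200000 50000 2.50 10.00   # prices from the openai/gpt-4o row
```

For this workload the two rows come to about $0.027 and $1.00 respectively, which is the kind of gap the quality columns have to justify.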

---

## ❌ TEXT-ONLY MODELS (NO VISION)

| Model ID | Provider | Context | Cost/1M (in/out) | Best For |
|----------|----------|---------|------------------|----------|
| `qwen/qwen-turbo` | Alibaba | 131K | $0.03/$0.13 | High-volume text chat |
| `qwen/qwen3.5-9b` | Alibaba | 262K | $0.10/$0.15 | Budget text applications |
| `mistralai/mistral-nemo` | Mistral | 131K | $0.02/$0.03 | Ultra-budget text |
| `mistralai/ministral-3b-2512` | Mistral | 131K | $0.10/$0.10 | Small budget text |
| `google/gemma-3-4b-it` | Google | 131K | $0.04/$0.08 | Lightweight text tasks |

**⚠️ CRITICAL WARNING:** Sending an image to a text-only model fails with:
```
Error: 404 - No endpoints found that support image input
```
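
One way to avoid this error is to check the selected model before the UI sends an image. The sketch below hard-codes the text-only IDs from the table above. `is_text_only_model` is a hypothetical helper, not part of any existing tool, and a model missing from its list is only unknown, not confirmed vision-capable.

```bash
# Hypothetical helper: returns 0 (true) if the model ID appears in the
# text-only table of this document, 1 otherwise.
# The list is copied by hand from that table and must be kept in sync.
is_text_only_model() {
  case "$1" in
    qwen/qwen-turbo|qwen/qwen3.5-9b|mistralai/mistral-nemo|mistralai/ministral-3b-2512|google/gemma-3-4b-it)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Usage: refuse an image upload before any request is sent.
model="qwen/qwen-turbo"
if is_text_only_model "$model"; then
  echo "Model $model is text-only; image upload disabled."
fi
```

A static list is simple but goes stale as models change, so it works best as a first-line guard alongside the live test described in the testing procedure.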

---

## 🧪 TESTING PROCEDURE

Before deploying a UI with a new model, verify capabilities:

```bash
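# Test vision capability. Replace the "model" value with the ID you want to verify.
# qwen/qwen-turbo is text-only, so this exact request reproduces the 404 error.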
curl -X POST "https://openrouter.ai/api/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen-turbo",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/test.jpg"}}
      ]
    }]
  }'

# If vision is supported: the response contains the model's description of the image
# If vision is NOT supported: 404 - No endpoints found that support image input
```
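
The manual check above can be wrapped so that it reports a verdict instead of raw output. This is a minimal sketch under several assumptions: `classify_probe_status` and `probe_vision` are hypothetical names, the API key is read from an `OPENROUTER_API_KEY` variable chosen for this example, and only the 404 status documented in this file is treated as proof that vision is unsupported. Any other failure (bad key, rate limit, unreachable image URL) is reported as inconclusive.

```bash
# Map the HTTP status of the probe request to a verdict.
# 404 is the status this document observed for text-only models; treating
# every other non-200 status as "inconclusive" is an assumption of this sketch.
classify_probe_status() {
  case "$1" in
    200) echo "vision-ok" ;;
    404) echo "no-vision" ;;
    *)   echo "inconclusive" ;;
  esac
}

# Send the same request as the curl example and print a verdict.
# Arguments: model ID, image URL (neither may contain double quotes,
# because no JSON escaping is done here).
probe_vision() {
  local model="$1" image_url="$2" payload status
  payload=$(printf '{"model":"%s","messages":[{"role":"user","content":[{"type":"text","text":"What is in this image?"},{"type":"image_url","image_url":{"url":"%s"}}]}]}' "$model" "$image_url")
  # -o /dev/null discards the body; -w prints only the HTTP status code.
  status=$(curl -s -o /dev/null -w '%{http_code}' \
    -X POST "https://openrouter.ai/api/v1/chat/completions" \
    -H "Authorization: Bearer ${OPENROUTER_API_KEY}" \
    -H "Content-Type: application/json" \
    -d "$payload")
  classify_probe_status "$status"
}

# Example (performs a real request that may be billed):
# probe_vision "google/gemma-3-27b-it:free" "https://example.com/test.jpg"
```

A 200 status only shows that the request was accepted, so it is still worth reading the response body once by hand for each new model.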

---

## 📊 SESSION LESSON: QWEN-TURBO MISMATCH

**Date:** 2026-04-30  
**Issue:** A user switched to `qwen/qwen-turbo` (a great budget choice for text!) and then could not use vision features.

**Error Received:**
```
Error: 404 - No endpoints found that support image input
```

**Root Cause:** The selected model (`qwen/qwen-turbo`) does NOT support vision, but the UI was still configured to send images.

**Solution:**
1. Added a warning to the skill documentation
2. Created this reference document
3. Recommended the models at the top of the vision-capable list

**Recommended Fix for Users:**
```bash
# For budget vision:
hermes config set model.default qwen/qwen3.5-flash-02-23

# Or for free vision testing:
hermes config set model.default google/gemma-3-27b-it:free
```

---

## 🎯 RECOMMENDATIONS BY USE CASE

### For Custom Chat UI Deployment

| Need | Recommended Model | Why |
|------|-------------------|-----|
| **Budget + Vision** | `qwen/qwen3.5-flash-02-23` | 1M context, excellent value, strong vision |
| **Coding + Vision** | `deepseek/deepseek-v4-flash` | Best coding performance at low cost |
| **Production App** | `google/gemini-2.0-flash-001` | Balanced performance, Google's reliable infrastructure |
| **Premium Quality** | `anthropic/claude-sonnet-4` | Best overall capabilities if budget allows |
| **Free Testing** | `google/gemma-3-27b-it:free` | No cost, decent capabilities |

### Avoid for Chat UI

- `qwen/qwen-turbo` - Unless you're 100% sure users won't upload images
- Any model marked "text-only" above
- Models with < 100K context (limits conversation length)

---

## 🔗 RELATED FILES

- `remote-server-deployment.md` - Deploying chat UI to cloud servers
- `cloud-hosting-firewall.md` - Hosting provider firewall configurations

---

## SOURCE

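Model capabilities verified against the OpenRouter API (2026-04-30). The query used filters on `qwen`, `gemini`, and `claude`, so it does not list the DeepSeek, Mistral, OpenAI, or Gemma entries: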
```bash
curl -s "https://openrouter.ai/api/v1/models" | jq '.data[] | select(.id | contains("qwen") or contains("gemini") or contains("claude")) | {id, name, context_length, pricing}'
```
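
The query above prints context length and pricing, which do not by themselves say whether a model accepts images. A possible complement is to filter on the model's declared input modalities. The sketch below assumes the response exposes them as an array at `.architecture.input_modalities`. That field path is an assumption about the schema and should be confirmed against a live response before relying on it. `list_vision_models` is a hypothetical helper name.

```bash
# Print the IDs of all models whose declared input modalities include "image".
# Reads a /api/v1/models response on stdin.
# Assumption: modalities live at .architecture.input_modalities as an array;
# models without that field are treated as not vision-capable.
list_vision_models() {
  jq -r '.data[]
         | select((.architecture.input_modalities // []) | any(. == "image"))
         | .id'
}

# Usage (network request):
# curl -s "https://openrouter.ai/api/v1/models" | list_vision_models
```

Comparing this output with the vision-capable table is a quick way to spot rows that have gone stale.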
