MODELS

60+ models. One picker.

Every model we support, updated as providers ship new releases.
Available on web, iOS and Android.

Chat & reasoning

OpenAI — GPT-5, GPT-4.x family.
Anthropic — Claude Opus 4.7, Sonnet 4.6, Haiku 4.5.
Google — Gemini 3 Pro, Flash.
Meta — Llama 4 family.
Mistral — Large, Medium.
Cohere — Command R+.

Image

FLUX (Pro / Dev / Schnell), SDXL, DALL·E, Ideogram.
Image-to-image, ControlNet, style presets, magic prompt.

Video

Luma Dream Machine, Runway Gen-3, Google Veo, Kling.
Text-to-video and image-to-video, up to 10 seconds at 1080p.

Audio & voice

ElevenLabs (TTS, voice cloning), Suno (music), Whisper (transcription).

The live picker

Inside the studio, the model picker shows real-time pricing, latency, and capability tags for every supported model. Sign in to see the current list — it changes weekly as providers ship updates.

Per-model deep dives

Dedicated landing pages with pricing, capabilities, code examples and FAQs for the most-asked-about models:
GPT-5 on Elevence AI — OpenAI's frontier model, 400K context
Claude Opus & Sonnet — Anthropic's family, 200K context, best code generation
Gemini 3 Pro & Flash — Google's frontier, 1M context, native multimodal
Llama 4 — Meta's open-weight model, strong reasoning at low cost
FLUX — Black Forest Labs image generation, photorealistic outputs
DALL·E 3 — OpenAI's image model, best text rendering in images
Runway Gen-3 — Cinematic text-to-video with motion control
Luma Dream Machine — Cost-effective text-to-video at speed
ElevenLabs — State-of-the-art TTS, voice cloning, music generation