SubQ

The first model built on a fully sub-quadratic sparse-attention architecture, enabling 12M-token reasoning at roughly one-fifth the cost of competing LLMs. Reports 81.8% on SWE-Bench Verified and 95.0% on RULER @ 128K, with OpenAI-compatible endpoints and integrations for Claude Code, Codex, and Cursor.

Context

12000000