APEX TESTING_
Find out which AI coding models actually deliver and which are just hype.
by HauhauCS
Models Tested: 68
Tasks: 70
Total Runs: 6274
Avg Score: 69.0
Capital Spent: $6043.12
Top Models
| # | Model | ELO |
|---|---|---|
| 1 | Claude Opus 4.7 | 1904 |
| 2 | GPT 5.5 | 1867 |
| 3 | GPT 5.4 Mini | 1791 |
| 4 | Claude Opus 4.6 | 1782 |
| 5 | Claude Sonnet 4.6 | 1770 |
Recent Activity
| Model | Task | Score | Duration |
|---|---|---|---|
| Qwen3.6 27b [Q4_K_XL] | Port Python CLI to Rust | 74.6 | 8m 9s |
| Qwen3.6 27b [Q4_K_XL] | Add caching layer to eliminate slow SSR page loads | 80.5 | 4m 55s |
| Qwen3.6 27b [Q4_K_XL] | Debug and fix 6 broken database triggers and constraints | 87.1 | 11m 21s |
| Qwen3.6 27b [Q4_K_XL] | Build terminal UI dashboard | 68.2 | 11m 30s |
| Qwen3.6 27b [Q4_K_XL] | Fix Node.js stream backpressure causing OOM on large files | 81.3 | 3m 19s |