APEX

Claude Opus 4.6

Anthropic

200K context | $15.00/M input | $75.00/M output
1780 (peak 1781)

Avg Score: 84.7
Avg Cost: $1.12
Score/$: 75.3
Runs: 70

[Charts: Win/Loss/Draw, Scoring Dimensions, Score Distribution]

Category ELOs

Category (difficulty) | ELO
multi-language (expert) | 3019
frontend (easy) | 2607
frontend (expert) | 2372
code-review (hard) | 2329
refactoring (expert) | 2298
from-scratch (easy) | 2290
from-scratch (medium) | 2251
frontend (hard) | 2217
backend (easy) | 2185
from-scratch (expert) | 2182
from-scratch (hard) | 2149
code-review (medium) | 2047
refactoring (medium) | 2036
code-review | 1992
from-scratch | 1970
refactoring | 1956
backend (hard) | 1863
frontend | 1840
backend (expert) | 1826
frontend (master) | 1808
full-stack (medium) | 1793
backend (master) | 1788
multi-language (hard) | 1787
backend | 1773
multi-language | 1769
frontend (medium) | 1744
debugging (hard) | 1734
backend (medium) | 1697
full-stack | 1689
debugging (medium) | 1682
debugging | 1643
debugging (expert) | 1640
full-stack (hard) | 1635

All Results

Task | Category | Score
Migrate Express monolith to modular architecture | backend | 90.5
Fix and extend Chrome browser extension | frontend | 83.0
Build 3D browser game with physics and multiplayer sync | frontend | 85.8
Build interactive data visualization dashboard | frontend | 69.6
Build multi-tool LLM agent runtime | backend | 85.1
Fix broken GitHub Actions CI pipeline | debugging | 98.0
Add Redis caching layer to Express API | backend | 77.3
Add Google OAuth2 login to Express app | full-stack | 70.6
Fix Node.js stream backpressure causing OOM on large files | backend | 94.7
Implement Stripe webhook handler | backend | 89.0
Migrate callback-hell Express app to async/await | refactoring | 91.4
Port Python CLI to Rust | multi-language | 84.5
Build materialized view refresh pipeline for analytics | backend | 90.5
Add retry logic and dead letter queue to Python task queue | backend | 77.0
Code review: identify security vulns | code-review | 90.2
Add WebSocket real-time updates | full-stack | 86.0
Build RAG pipeline with vector search | backend | 80.3
Write tests for untested legacy Flask service | code-review | 92.0
Build LLM evaluation harness with structured grading | backend | 78.7
Build terminal UI dashboard | from-scratch | 74.9
Add streaming SSE endpoint for LLM chat | backend | 81.7
Add file upload with S3 presigned URLs | backend | 69.0
Split 1100-line god file into proper modules | refactoring | 88.7
Implement zero-trust API authentication layer | backend | 83.4
Implement multi-tenant row-level security in Postgres | backend | 82.6
Harden insecure Docker setup with 12 vulnerabilities | code-review | 91.5
Add i18n with locale routing to Next.js app | full-stack | 77.0
Convert React app to PWA with offline support | frontend | 83.5
Add caching layer to eliminate slow SSR page loads | full-stack | 90.7
Find and patch all OWASP Top 10 vulnerabilities | debugging | 90.4
Implement JWT auth middleware | backend | 88.0
Dockerize Node.js monorepo | full-stack | 79.8
Write Kubernetes manifests for Node.js microservice | full-stack | 88.6
Replace console.log with structured logging | refactoring | 90.0
Optimize bloated React bundle under 500KB | frontend | 91.2
Remove AI slop and over-engineering from codebase | refactoring | 92.0
Build codebase indexer for LLM context windows | from-scratch | 86.4
Fix broken responsive layout | frontend | 89.8
Zero-downtime schema migration | full-stack | 79.1
Fix flaky test suite | debugging | 78.7
Write integration tests for payment flow | code-review | 85.2
Fix React hydration mismatch | frontend | 83.1
Add rate limiting middleware | backend | 84.1
Fix deadlocking transaction patterns in Flask app | backend | 86.1
Implement transformer inference engine with KV cache | from-scratch | 88.3
Build real-time portfolio risk calculator | backend | 74.4
Build SaaS admin dashboard from scratch | from-scratch | 81.7
Build production website with auth and members area | frontend | 79.3
Debug and fix 6 broken database triggers and constraints | debugging | 80.6
Fix data integrity bugs in denormalized e-commerce schema | debugging | 81.6
Fix race conditions in order matching engine | backend | 87.0
Optimize slow Postgres queries in Flask app | backend | 91.1
Add slash commands and moderation to Discord bot | backend | 84.6
Refactor monolithic handler to CQRS | refactoring | 78.7
Write complex SQL report with window functions | backend | 86.9
Fix 12 WCAG accessibility violations in checkout form | frontend | 90.9
Add virtual scrolling to table rendering 5000 rows | frontend | 75.2
Implement background job scheduler with persistence | backend | 76.2
Fix auth bypass vulnerability | debugging | 93.7
Build CLI tool with subcommands and config | from-scratch | 81.5
Build distributed node cluster with gossip protocol | from-scratch | 81.6
Fix memory leak in event handler | debugging | 80.5
Add GraphQL layer over REST API | multi-language | 78.1
Find and fix 4 hidden backdoors in Flask app | debugging | 93.7
Build MCP server for database management | backend | 82.7
Fix hallucination and context window bugs in RAG agent | backend | 82.2
Add cursor-based pagination to REST API | backend | 87.3
Fix N+1 query in dashboard | backend | 92.8
Debug race condition in worker pool | debugging | 94.3
Build REST API from scratch | from-scratch | 91.8