APEX
Claude Opus 4.5

Anthropic

200K context | $15.00/M input | $75.00/M output
1720 (peak 1721)

Avg Score: 82.3
Avg Cost: $0.88
Score/$: 93.7
Runs: 70

Category ELOs

Category | Difficulty | ELO
multi-language | expert | 2859
frontend | easy | 2564
multi-language | hard | 2457
from-scratch | easy | 2374
frontend | expert | 2372
from-scratch | medium | 2360
frontend | hard | 2275
from-scratch | expert | 2194
backend | easy | 2173
from-scratch | hard | 2079
multi-language | - | 2076
refactoring | medium | 1978
from-scratch | - | 1965
code-review | medium | 1893
full-stack | medium | 1881
refactoring | - | 1843
frontend | - | 1813
frontend | medium | 1760
refactoring | expert | 1733
backend | expert | 1725
debugging | hard | 1724
backend | master | 1722
code-review | - | 1721
full-stack | - | 1710
debugging | medium | 1690
backend | hard | 1688
backend | - | 1673
full-stack | hard | 1626
backend | medium | 1613
debugging | - | 1604
frontend | master | 1588
debugging | expert | 1526
code-review | hard | 1411

All Results

Task | Category | Score
Migrate Express monolith to modular architecture | backend | 86.3
Fix and extend Chrome browser extension | frontend | 74.3
Build 3D browser game with physics and multiplayer sync | frontend | 77.5
Build interactive data visualization dashboard | frontend | 67.7
Build multi-tool LLM agent runtime | backend | 85.3
Port Python CLI to Rust | multi-language | 81.5
Write tests for untested legacy Flask service | code-review | 81.8
Fix Node.js stream backpressure causing OOM on large files | backend | 94.4
Fix React hydration mismatch | frontend | 87.6
Add Redis caching layer to Express API | backend | 74.0
Add WebSocket real-time updates | full-stack | 87.9
Build RAG pipeline with vector search | backend | 84.3
Add retry logic and dead letter queue to Python task queue | backend | 80.5
Add GraphQL layer over REST API | multi-language | 90.4
Code review: identify security vulns | code-review | 83.5
Implement background job scheduler with persistence | backend | 76.2
Migrate callback-hell Express app to async/await | refactoring | 86.5
Implement transformer inference engine with KV cache | from-scratch | 89.3
Build distributed node cluster with gossip protocol | from-scratch | 78.5
Fix broken GitHub Actions CI pipeline | debugging | 95.0
Optimize bloated React bundle under 500KB | frontend | 81.4
Find and patch all OWASP Top 10 vulnerabilities | debugging | 91.5
Add streaming SSE endpoint for LLM chat | backend | 79.0
Add file upload with S3 presigned URLs | backend | 78.1
Implement JWT auth middleware | backend | 86.9
Build codebase indexer for LLM context windows | from-scratch | 78.8
Replace console.log with structured logging | refactoring | 92.8
Add caching layer to eliminate slow SSR page loads | full-stack | 89.2
Harden insecure Docker setup with 12 vulnerabilities | code-review | 95.6
Convert React app to PWA with offline support | frontend | 82.3
Add i18n with locale routing to Next.js app | full-stack | 78.2
Implement zero-trust API authentication layer | backend | 82.6
Remove AI slop and over-engineering from codebase | refactoring | 90.5
Split 1100-line god file into proper modules | refactoring | 86.2
Dockerize Node.js monorepo | full-stack | 84.8
Implement multi-tenant row-level security in Postgres | backend | 78.6
Fix broken responsive layout | frontend | 86.9
Write Kubernetes manifests for Node.js microservice | full-stack | 89.4
Refactor monolithic handler to CQRS | refactoring | 68.7
Fix flaky test suite | debugging | 82.0
Build real-time portfolio risk calculator | backend | 77.5
Zero-downtime schema migration | full-stack | 66.7
Fix 12 WCAG accessibility violations in checkout form | frontend | 91.8
Build production website with auth and members area | frontend | 80.2
Build SaaS admin dashboard from scratch | from-scratch | 82.2
Fix hallucination and context window bugs in RAG agent | backend | 73.2
Build CLI tool with subcommands and config | from-scratch | 81.8
Build MCP server for database management | backend | 83.2
Build LLM evaluation harness with structured grading | backend | 73.8
Fix race conditions in order matching engine | backend | 80.4
Fix data integrity bugs in denormalized e-commerce schema | debugging | 72.8
Build materialized view refresh pipeline for analytics | backend | 75.3
Fix deadlocking transaction patterns in Flask app | backend | 87.9
Debug and fix 6 broken database triggers and constraints | debugging | 78.3
Write complex SQL report with window functions | backend | 84.0
Find and fix 4 hidden backdoors in Flask app | debugging | 93.1
Add Google OAuth2 login to Express app | full-stack | 79.0
Optimize slow Postgres queries in Flask app | backend | 83.2
Add slash commands and moderation to Discord bot | backend | 82.0
Add virtual scrolling to table rendering 5000 rows | frontend | 79.3
Write integration tests for payment flow | code-review | 69.9
Fix auth bypass vulnerability | debugging | 93.7
Add cursor-based pagination to REST API | backend | 48.5
Implement Stripe webhook handler | backend | 77.3
Add rate limiting middleware | backend | 83.2
Build terminal UI dashboard | from-scratch | 77.5
Build REST API from scratch | from-scratch | 93.7
Fix N+1 query in dashboard | backend | 91.5
Fix memory leak in event handler | debugging | 85.2
Debug race condition in worker pool | debugging | 91.9
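
The Runs and Avg Score figures at the top of the page can be cross-checked against the per-task scores listed under All Results. The sketch below assumes Avg Score is the plain arithmetic mean of those scores, which the page itself does not state; the variable names are illustrative only.

```python
from statistics import fmean

# Scores copied from the All Results list, in page order.
scores = [
    86.3, 74.3, 77.5, 67.7, 85.3, 81.5, 81.8, 94.4, 87.6, 74.0,
    87.9, 84.3, 80.5, 90.4, 83.5, 76.2, 86.5, 89.3, 78.5, 95.0,
    81.4, 91.5, 79.0, 78.1, 86.9, 78.8, 92.8, 89.2, 95.6, 82.3,
    78.2, 82.6, 90.5, 86.2, 84.8, 78.6, 86.9, 89.4, 68.7, 82.0,
    77.5, 66.7, 91.8, 80.2, 82.2, 73.2, 81.8, 83.2, 73.8, 80.4,
    72.8, 75.3, 87.9, 78.3, 84.0, 93.1, 79.0, 83.2, 82.0, 79.3,
    69.9, 93.7, 48.5, 77.3, 83.2, 77.5, 93.7, 91.5, 85.2, 91.9,
]

runs = len(scores)          # number of listed results
avg_score = fmean(scores)   # assumed definition of "Avg Score"

print(f"runs={runs} avg={avg_score:.1f} min={min(scores)} max={max(scores)}")
# prints: runs=70 avg=82.3 min=48.5 max=95.6
```

Under that assumption the listed results reproduce both the reported run count (70) and the reported Avg Score (82.3).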