APEX
Claude Haiku 4.5

Anthropic

200K context · $1.00/M input · $5.00/M output
ELO 1522 (peak 1523)

Avg Score: 71.5
Avg Cost: $0.07
Score/$: 979.2
Runs: 65
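
The page does not say how Score/$ is derived. One plausible reading, assumed here, is average score divided by average cost per run. Under that assumption, the sketch below shows why the rounded figures on the page do not reproduce 979.2 exactly: the displayed $0.07 is probably a rounded value. The function name `score_per_dollar` is hypothetical and not part of APEX.

```python
def score_per_dollar(avg_score: float, avg_cost_usd: float) -> float:
    """Assumed definition: average score divided by average cost (USD) per run."""
    if avg_cost_usd <= 0:
        raise ValueError("average cost must be positive")
    return avg_score / avg_cost_usd


# Ratio computed from the rounded values shown on the page.
naive = score_per_dollar(71.5, 0.07)

# Average cost implied by the reported Score/$ of 979.2 (same assumed definition).
implied_cost = 71.5 / 979.2

print(f"from rounded values: {naive:.1f}")
print(f"implied avg cost:    ${implied_cost:.4f}")
```

The implied cost still rounds to $0.07, so the reported ratio is consistent with this definition applied to unrounded inputs. It could equally be a mean of per-run ratios, which the page gives no way to check.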

[Charts: Win/Loss/Draw, Scoring Dimensions, Score Distribution]

Category ELOs

Category | Difficulty | ELO
from-scratch | easy | 2275
backend | easy | 2198
multi-language | expert | 2154
multi-language | hard | 1926
refactoring | expert | 1860
frontend | hard | 1806
refactoring | medium | 1797
code-review | medium | 1795
from-scratch | hard | 1758
refactoring | - | 1748
full-stack | medium | 1746
frontend | easy | 1737
multi-language | - | 1681
code-review | - | 1674
debugging | medium | 1647
full-stack | - | 1598
from-scratch | - | 1575
debugging | hard | 1573
debugging | - | 1517
full-stack | hard | 1506
backend | hard | 1485
debugging | expert | 1470
frontend | - | 1467
frontend | expert | 1467
code-review | hard | 1453
backend | - | 1426
frontend | medium | 1387
backend | medium | 1382
backend | expert | 1311
from-scratch | medium | 1069
from-scratch | expert | 481

All Results

Task | Category | Score
Build SaaS admin dashboard from scratch | from-scratch | 68.1
Add retry logic and dead letter queue to Python task queue | backend | 60.9
Build RAG pipeline with vector search | backend | 74.3
Write tests for untested legacy Flask service | code-review | 82.8
Build distributed node cluster with gossip protocol | from-scratch | 53.5
Add Google OAuth2 login to Express app | full-stack | 66.0
Migrate callback-hell Express app to async/await | refactoring | 84.9
Build terminal UI dashboard | from-scratch | 55.0
Build materialized view refresh pipeline for analytics | backend | 63.9
Zero-downtime schema migration | full-stack | 87.8
Add Redis caching layer to Express API | backend | 82.5
Implement background job scheduler with persistence | backend | 48.2
Port Python CLI to Rust | multi-language | 69.9
Fix broken GitHub Actions CI pipeline | debugging | 95.0
Add rate limiting middleware | backend | 84.9
Add GraphQL layer over REST API | multi-language | 80.2
Code review: identify security vulns | code-review | 79.3
Fix race conditions in order matching engine | backend | 87.2
Build MCP server for database management | backend | 87.3
Add file upload with S3 presigned URLs | backend | 71.8
Replace console.log with structured logging | refactoring | 74.6
Add streaming SSE endpoint for LLM chat | backend | 23.0
Split 1100-line god file into proper modules | refactoring | 81.0
Implement multi-tenant row-level security in Postgres | backend | 5.0
Optimize bloated React bundle under 500KB | frontend | 72.2
Find and patch all OWASP Top 10 vulnerabilities | debugging | 73.7
Add i18n with locale routing to Next.js app | full-stack | 75.0
Implement JWT auth middleware | backend | 83.4
Convert React app to PWA with offline support | frontend | 80.1
Build codebase indexer for LLM context windows | from-scratch | 81.9
Fix broken responsive layout | frontend | 76.9
Add caching layer to eliminate slow SSR page loads | full-stack | 87.5
Write Kubernetes manifests for Node.js microservice | full-stack | 84.6
Harden insecure Docker setup with 12 vulnerabilities | code-review | 92.8
Dockerize Node.js monorepo | full-stack | 83.8
Implement zero-trust API authentication layer | backend | 45.8
Remove AI slop and over-engineering from codebase | refactoring | 88.4
Implement transformer inference engine with KV cache | from-scratch | 53.8
Build production website with auth and members area | frontend | 65.7
Build CLI tool with subcommands and config | from-scratch | 75.0
Write integration tests for payment flow | code-review | 70.3
Build LLM evaluation harness with structured grading | backend | 40.5
Refactor monolithic handler to CQRS | refactoring | 72.8
Fix hallucination and context window bugs in RAG agent | backend | 75.4
Build real-time portfolio risk calculator | backend | 52.5
Fix data integrity bugs in denormalized e-commerce schema | debugging | 50.8
Write complex SQL report with window functions | backend | 78.5
Fix deadlocking transaction patterns in Flask app | backend | 73.9
Debug and fix 6 broken database triggers and constraints | debugging | 84.0
Find and fix 4 hidden backdoors in Flask app | debugging | 92.5
Optimize slow Postgres queries in Flask app | backend | 74.3
Add slash commands and moderation to Discord bot | backend | 54.9
Fix 12 WCAG accessibility violations in checkout form | frontend | 83.2
Add virtual scrolling to table rendering 5000 rows | frontend | 62.8
Fix Node.js stream backpressure causing OOM on large files | backend | 81.0
Fix auth bypass vulnerability | debugging | 92.1
Fix flaky test suite | debugging | 79.5
Implement Stripe webhook handler | backend | 59.4
Add cursor-based pagination to REST API | backend | 62.6
Fix N+1 query in dashboard | backend | 63.8
Add WebSocket real-time updates | full-stack | 60.6
Fix memory leak in event handler | debugging | 60.4
Debug race condition in worker pool | debugging | 87.3
Fix React hydration mismatch | frontend | 56.2
Build REST API from scratch | from-scratch | 90.9