APEX

GPT 5.3 Codex Spark

OpenAI

200K context, $0.75/M input, $4.50/M output
1668 (peak 1670)

Avg Score: 79.3
Avg Cost: $0.32
Score/$: 248.9
Runs: 70
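
The page does not say how Score/$ is derived. A minimal sketch, assuming it is simply the average score divided by the average cost per run; the helper name `score_per_dollar` is hypothetical and not part of the page:

```python
def score_per_dollar(avg_score: float, avg_cost_usd: float) -> float:
    """Hypothetical helper: average score divided by average cost per run (USD).

    Assumption: this mirrors the Score/$ figure; the page does not confirm it.
    """
    if avg_cost_usd <= 0:
        raise ValueError("avg_cost_usd must be positive")
    return avg_score / avg_cost_usd


# Rounded figures taken from the summary above.
ratio = score_per_dollar(79.3, 0.32)
print(f"{ratio:.1f}")  # prints 247.8
```

With the rounded inputs this gives about 247.8, close to but not equal to the displayed 248.9. The gap may come from unrounded inputs or from averaging per-run ratios, but that is a guess.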

[Charts not reproduced: Win/Loss/Draw, Scoring Dimensions, Score Distribution]

Category ELOs

from-scratch (medium): 2467
frontend (expert): 2329
backend (easy): 2318
code-review (hard): 2311
multi-language (hard): 2138
frontend (hard): 2107
from-scratch (expert): 2075
from-scratch (easy): 2023
code-review (medium): 1871
code-review: 1859
frontend (easy): 1848
refactoring (medium): 1833
debugging (medium): 1737
from-scratch: 1726
full-stack (medium): 1722
refactoring: 1707
debugging (expert): 1698
backend (expert): 1687
backend (hard): 1684
from-scratch (hard): 1683
frontend: 1666
full-stack: 1664
backend: 1664
debugging: 1651
backend (medium): 1639
debugging (hard): 1635
full-stack (hard): 1633
frontend (master): 1632
multi-language: 1581
frontend (medium): 1563
backend (master): 1181
multi-language (expert): 1141
refactoring (expert): 1051

All Results

Task | Category | Score
Migrate Express monolith to modular architecture | backend | 62.1
Build interactive data visualization dashboard | frontend | 72.1
Add streaming SSE endpoint for LLM chat | backend | 85.3
Find and fix 4 hidden backdoors in Flask app | debugging | 90.9
Add Redis caching layer to Express API | backend | 83.9
Build codebase indexer for LLM context windows | from-scratch | 33.8
Build real-time portfolio risk calculator | backend | 77.3
Find and patch all OWASP Top 10 vulnerabilities | debugging | 73.1
Build 3D browser game with physics and multiplayer sync | frontend | 80.2
Implement multi-tenant row-level security in Postgres | backend | 80.0
Optimize bloated React bundle under 500KB | frontend | 75.8
Add file upload with S3 presigned URLs | backend | 80.8
Add retry logic and dead letter queue to Python task queue | backend | 83.6
Fix 12 WCAG accessibility violations in checkout form | frontend | 89.0
Harden insecure Docker setup with 12 vulnerabilities | code-review | 93.9
Fix auth bypass vulnerability | debugging | 86.5
Implement Stripe webhook handler | backend | 84.4
Fix flaky test suite | debugging | 86.0
Fix Node.js stream backpressure causing OOM on large files | backend | 88.5
Code review: identify security vulns | code-review | 83.5
Debug race condition in worker pool | debugging | 88.6
Build MCP server for database management | backend | 87.0
Fix memory leak in event handler | debugging | 87.0
Implement JWT auth middleware | backend | 51.6
Fix and extend Chrome browser extension | frontend | 70.1
Add Google OAuth2 login to Express app | full-stack | 74.2
Build LLM evaluation harness with structured grading | backend | 78.9
Add virtual scrolling to table rendering 5000 rows | frontend | 80.8
Debug and fix 6 broken database triggers and constraints | debugging | 86.8
Fix data integrity bugs in denormalized e-commerce schema | debugging | 86.8
Write Kubernetes manifests for Node.js microservice | full-stack | 91.8
Zero-downtime schema migration | full-stack | 75.2
Optimize slow Postgres queries in Flask app | backend | 75.3
Build distributed node cluster with gossip protocol | from-scratch | 82.9
Add WebSocket real-time updates | full-stack | 87.5
Write integration tests for payment flow | code-review | 85.0
Migrate callback-hell Express app to async/await | refactoring | 82.5
Build production website with auth and members area | frontend | 78.1
Convert React app to PWA with offline support | frontend | 82.5
Fix React hydration mismatch | frontend | 70.3
Remove AI slop and over-engineering from codebase | refactoring | 89.4
Build REST API from scratch | from-scratch | 84.9
Split 1100-line god file into proper modules | refactoring | 84.3
Fix race conditions in order matching engine | backend | 79.7
Fix N+1 query in dashboard | backend | 83.5
Fix broken GitHub Actions CI pipeline | debugging | 90.9
Replace console.log with structured logging | refactoring | 79.2
Fix hallucination and context window bugs in RAG agent | backend | 79.2
Fix deadlocking transaction patterns in Flask app | backend | 79.3
Add caching layer to eliminate slow SSR page loads | full-stack | 77.5
Add i18n with locale routing to Next.js app | full-stack | 76.7
Fix broken responsive layout | frontend | 80.0
Write tests for untested legacy Flask service | code-review | 79.5
Build RAG pipeline with vector search | backend | 79.0
Port Python CLI to Rust | multi-language | 45.5
Add slash commands and moderation to Discord bot | backend | 77.7
Add cursor-based pagination to REST API | backend | 80.5
Build terminal UI dashboard | from-scratch | 79.3
Build materialized view refresh pipeline for analytics | backend | 72.1
Add rate limiting middleware | backend | 87.5
Write complex SQL report with window functions | backend | 80.8
Dockerize Node.js monorepo | full-stack | 85.5
Build multi-tool LLM agent runtime | backend | 70.7
Implement transformer inference engine with KV cache | from-scratch | 85.8
Build CLI tool with subcommands and config | from-scratch | 71.0
Implement zero-trust API authentication layer | backend | 76.1
Build SaaS admin dashboard from scratch | from-scratch | 80.9
Add GraphQL layer over REST API | multi-language | 84.5
Implement background job scheduler with persistence | backend | 80.5
Refactor monolithic handler to CQRS | refactoring | 54.5