APEX

Deepseek V4 Flash

OpenRouter

1049K context, $0.13/M input, $0.50/M output
Rating 1666 (peak 1667)

Avg Score: 79.6
Avg Cost: $0.06
Score/$: 1307.6
Runs: 70
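
The Score/$ value is not exactly 79.6 / 0.06, which would be about 1326.7. A likely explanation is that the displayed average cost is rounded to two decimals. The sketch below assumes Score/$ is simply average score divided by average cost; the page does not state its formula, so treat this as a consistency check rather than the site's actual calculation. All variable names are illustrative.

```python
# Figures as displayed on the page.
avg_score = 79.6
avg_cost_displayed = 0.06   # dollars, rounded for display
score_per_dollar = 1307.6

# Ratio computed from the rounded cost (ASSUMED formula: score / cost).
naive_ratio = avg_score / avg_cost_displayed

# Unrounded average cost implied by the displayed Score/$ under the same assumption.
implied_cost = avg_score / score_per_dollar

print(f"ratio from displayed cost: {naive_ratio:.1f}")
print(f"implied average cost: ${implied_cost:.4f}")
print("implied cost rounds to displayed cost:",
      round(implied_cost, 2) == avg_cost_displayed)
```

Under that assumption the implied cost is a little under $0.061 per run, which rounds to the displayed $0.06.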

[Charts not reproduced: Win/Loss/Draw, Scoring Dimensions, Score Distribution]

Category ELOs

Category | Difficulty | ELO
multi-language | hard | 2325
from-scratch | medium | 2268
from-scratch | easy | 2244
backend | easy | 2223
code-review | hard | 2086
frontend | hard | 2040
refactoring | expert | 2004
frontend | expert | 1974
frontend | easy | 1890
from-scratch | hard | 1826
backend | expert | 1821
backend | hard | 1761
from-scratch | - | 1746
frontend | - | 1744
multi-language | - | 1738
frontend | medium | 1725
frontend | master | 1723
refactoring | medium | 1701
refactoring | - | 1696
backend | - | 1689
debugging | expert | 1679
backend | master | 1679
full-stack | hard | 1667
full-stack | - | 1658
full-stack | medium | 1653
multi-language | expert | 1638
code-review | - | 1636
from-scratch | expert | 1625
code-review | medium | 1609
debugging | medium | 1602
debugging | - | 1555
backend | medium | 1526
debugging | hard | 1497
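
The page does not say how its ELO values are computed. Assuming the conventional Elo model (base 10, scale 400), a rating gap maps to an expected head-to-head score as sketched below. The function name `expected_score` and the opponent rating of 1925 are hypothetical, chosen only for illustration.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of A against B under the standard Elo logistic model.

    ASSUMPTION: the site uses the conventional base-10, scale-400 formula.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


# The multi-language (hard) rating of 2325 against a hypothetical
# opponent rated 400 points lower.
print(round(expected_score(2325, 1925), 3))  # 0.909
```

Under this model a 400-point lead corresponds to an expected score of roughly 0.91, and equal ratings give 0.5.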

All Results

Task | Category | Score
Fix 12 WCAG accessibility violations in checkout form | frontend | 87.5
Build CLI tool with subcommands and config | from-scratch | 69.7
Add streaming SSE endpoint for LLM chat | backend | 71.0
Build 3D browser game with physics and multiplayer sync | frontend | 80.2
Fix and extend Chrome browser extension | frontend | 72.2
Find and patch all OWASP Top 10 vulnerabilities | debugging | 84.4
Migrate callback-hell Express app to async/await | refactoring | 86.2
Build SaaS admin dashboard from scratch | from-scratch | 82.3
Add caching layer to eliminate slow SSR page loads | full-stack | 85.8
Implement zero-trust API authentication layer | backend | 78.3
Add GraphQL layer over REST API | multi-language | 87.8
Build materialized view refresh pipeline for analytics | backend | 82.5
Write integration tests for payment flow | code-review | 80.5
Build real-time portfolio risk calculator | backend | 82.4
Build MCP server for database management | backend | 83.8
Remove AI slop and over-engineering from codebase | refactoring | 84.1
Build distributed node cluster with gossip protocol | from-scratch | 73.5
Write complex SQL report with window functions | backend | 88.3
Port Python CLI to Rust | multi-language | 54.4
Add i18n with locale routing to Next.js app | full-stack | 81.1
Fix hallucination and context window bugs in RAG agent | backend | 86.9
Implement background job scheduler with persistence | backend | 79.8
Fix N+1 query in dashboard | backend | 86.2
Fix auth bypass vulnerability | debugging | 80.3
Harden insecure Docker setup with 12 vulnerabilities | code-review | 88.3
Build REST API from scratch | from-scratch | 90.6
Fix deadlocking transaction patterns in Flask app | backend | 84.3
Build codebase indexer for LLM context windows | from-scratch | 73.3
Migrate Express monolith to modular architecture | backend | 84.4
Find and fix 4 hidden backdoors in Flask app | debugging | 90.9
Split 1100-line god file into proper modules | refactoring | 70.8
Build RAG pipeline with vector search | backend | 73.0
Build interactive data visualization dashboard | frontend | 77.8
Build multi-tool LLM agent runtime | backend | 83.4
Fix memory leak in event handler | debugging | 68.0
Build terminal UI dashboard | from-scratch | 75.6
Fix broken responsive layout | frontend | 80.5
Fix data integrity bugs in denormalized e-commerce schema | debugging | 81.0
Fix React hydration mismatch | frontend | 82.4
Fix race conditions in order matching engine | backend | 90.2
Add rate limiting middleware | backend | 85.6
Fix Node.js stream backpressure causing OOM on large files | backend | 87.2
Optimize slow Postgres queries in Flask app | backend | 88.5
Replace console.log with structured logging | refactoring | 79.2
Add Google OAuth2 login to Express app | full-stack | 70.8
Build LLM evaluation harness with structured grading | backend | 82.1
Add cursor-based pagination to REST API | backend | 66.3
Add virtual scrolling to table rendering 5000 rows | frontend | 77.5
Write tests for untested legacy Flask service | code-review | 63.3
Zero-downtime schema migration | full-stack | 83.8
Implement JWT auth middleware | backend | 65.3
Implement transformer inference engine with KV cache | from-scratch | 78.1
Optimize bloated React bundle under 500KB | frontend | 86.6
Refactor monolithic handler to CQRS | refactoring | 74.7
Debug and fix 6 broken database triggers and constraints | debugging | 85.6
Convert React app to PWA with offline support | frontend | 80.2
Implement multi-tenant row-level security in Postgres | backend | 73.3
Build production website with auth and members area | frontend | 72.9
Fix broken GitHub Actions CI pipeline | debugging | 81.5
Add slash commands and moderation to Discord bot | backend | 79.5
Implement Stripe webhook handler | backend | 67.6
Write Kubernetes manifests for Node.js microservice | full-stack | 86.5
Code review: identify security vulns | code-review | 75.6
Debug race condition in worker pool | debugging | 87.2
Add Redis caching layer to Express API | backend | 76.0
Dockerize Node.js monorepo | full-stack | 76.0
Add file upload with S3 presigned URLs | backend | 66.8
Add WebSocket real-time updates | full-stack | 82.4
Add retry logic and dead letter queue to Python task queue | backend | 81.2
Fix flaky test suite | debugging | 87.8
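
The per-task scores can be averaged by category to see where the model is stronger or weaker. The sketch below does this for three categories, using every multi-language, code-review and refactoring row listed in the results. The helper name `mean_by_category` is illustrative, and a plain unweighted mean is an assumption about how one might summarize the data, not something the page specifies.

```python
from collections import defaultdict

# (task, category, score) rows copied from the results listed above.
rows = [
    ("Add GraphQL layer over REST API", "multi-language", 87.8),
    ("Port Python CLI to Rust", "multi-language", 54.4),
    ("Write integration tests for payment flow", "code-review", 80.5),
    ("Harden insecure Docker setup with 12 vulnerabilities", "code-review", 88.3),
    ("Write tests for untested legacy Flask service", "code-review", 63.3),
    ("Code review: identify security vulns", "code-review", 75.6),
    ("Migrate callback-hell Express app to async/await", "refactoring", 86.2),
    ("Remove AI slop and over-engineering from codebase", "refactoring", 84.1),
    ("Split 1100-line god file into proper modules", "refactoring", 70.8),
    ("Replace console.log with structured logging", "refactoring", 79.2),
    ("Refactor monolithic handler to CQRS", "refactoring", 74.7),
]


def mean_by_category(rows):
    """Return {category: unweighted mean score} for (task, category, score) rows."""
    totals = defaultdict(lambda: [0.0, 0])  # category -> [score sum, row count]
    for _task, category, score in rows:
        totals[category][0] += score
        totals[category][1] += 1
    return {category: total / count for category, (total, count) in totals.items()}


averages = mean_by_category(rows)
for category, avg in sorted(averages.items()):
    print(f"{category}: {avg:.1f}")
```

Extending `rows` with the remaining tasks gives the same breakdown for the other categories.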