APEX

GLM 5.1

Z.ai Coding Plan

200K context · $1.00/M input · $3.20/M output
1694

Avg Score: 80.7
Avg Cost: $0.22
Score/$: 370.4
Runs: 70
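
The page does not say how Score/$ is derived. A minimal sketch, assuming it is the average score divided by the average cost per run: the displayed averages give a slightly lower ratio than the reported 370.4, which is consistent with the displayed cost being rounded to two decimals. The variable names are illustrative, and the "ratio of averages" reading is an assumption, not something the page confirms.

```python
# Summary values as displayed on the page (already rounded for display).
avg_score = 80.7
avg_cost_usd = 0.22
reported_score_per_dollar = 370.4

# Ratio computed from the two displayed (rounded) averages.
naive_ratio = avg_score / avg_cost_usd

# ASSUMPTION: Score/$ = avg score / avg cost. Under that reading, this is
# the unrounded average cost implied by the reported ratio.
implied_cost_usd = avg_score / reported_score_per_dollar

print(f"ratio from displayed values: {naive_ratio:.1f}")    # 366.8
print(f"implied average cost: ${implied_cost_usd:.4f}")     # $0.2179
```

An implied cost of about $0.2179 rounds to the displayed $0.22, so the two figures are compatible under this assumption.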

Category ELOs

Category | Difficulty | ELO
refactoring | expert | 2746
multi-language | expert | 2563
frontend | easy | 2549
code-review | hard | 2425
multi-language | hard | 2342
backend | easy | 2303
frontend | expert | 2288
from-scratch | medium | 2251
from-scratch | easy | 2172
frontend | hard | 1956
code-review | medium | 1950
code-review | - | 1938
multi-language | - | 1938
from-scratch | hard | 1906
refactoring | - | 1893
refactoring | medium | 1861
backend | expert | 1794
full-stack | hard | 1791
from-scratch | - | 1773
backend | hard | 1744
frontend | - | 1723
debugging | expert | 1711
backend | master | 1694
backend | - | 1692
frontend | medium | 1679
full-stack | - | 1674
from-scratch | expert | 1611
backend | medium | 1586
full-stack | medium | 1539
debugging | - | 1515
frontend | master | 1491
debugging | medium | 1484
debugging | hard | 1428

All Results

Task | Category | Score
Build 3D browser game with physics and multiplayer sync | frontend | 46.5
Add file upload with S3 presigned URLs | backend | 66.3
Fix auth bypass vulnerability | debugging | 78.8
Migrate Express monolith to modular architecture | backend | 85.7
Build interactive data visualization dashboard | frontend | 60.3
Build REST API from scratch | from-scratch | 87.5
Add streaming SSE endpoint for LLM chat | backend | 70.9
Find and fix 4 hidden backdoors in Flask app | debugging | 87.7
Add rate limiting middleware | backend | 87.3
Optimize slow Postgres queries in Flask app | backend | 83.0
Write tests for untested legacy Flask service | code-review | 86.0
Build SaaS admin dashboard from scratch | from-scratch | 73.5
Build codebase indexer for LLM context windows | from-scratch | 75.0
Split 1100-line god file into proper modules | refactoring | 85.7
Add retry logic and dead letter queue to Python task queue | backend | 77.3
Add caching layer to eliminate slow SSR page loads | full-stack | 78.1
Build CLI tool with subcommands and config | from-scratch | 76.1
Build production website with auth and members area | frontend | 77.5
Build LLM evaluation harness with structured grading | backend | 83.3
Find and patch all OWASP Top 10 vulnerabilities | debugging | 76.1
Debug race condition in worker pool | debugging | 85.6
Fix broken responsive layout | frontend | 87.0
Fix hallucination and context window bugs in RAG agent | backend | 85.0
Fix 12 WCAG accessibility violations in checkout form | frontend | 86.2
Fix broken GitHub Actions CI pipeline | debugging | 76.0
Add i18n with locale routing to Next.js app | full-stack | 81.9
Build multi-tool LLM agent runtime | backend | 84.6
Write Kubernetes manifests for Node.js microservice | full-stack | 83.2
Write complex SQL report with window functions | backend | 86.0
Build materialized view refresh pipeline for analytics | backend | 79.0
Add cursor-based pagination to REST API | backend | 82.7
Build distributed node cluster with gossip protocol | from-scratch | 82.8
Fix memory leak in event handler | debugging | 60.0
Migrate callback-hell Express app to async/await | refactoring | 87.9
Add GraphQL layer over REST API | multi-language | 88.3
Replace console.log with structured logging | refactoring | 79.5
Harden insecure Docker setup with 12 vulnerabilities | code-review | 92.3
Fix flaky test suite | debugging | 85.6
Add WebSocket real-time updates | full-stack | 85.0
Dockerize Node.js monorepo | full-stack | 78.8
Convert React app to PWA with offline support | frontend | 75.3
Optimize bloated React bundle under 500KB | frontend | 80.8
Fix race conditions in order matching engine | backend | 87.5
Port Python CLI to Rust | multi-language | 76.8
Implement transformer inference engine with KV cache | from-scratch | 77.3
Fix and extend Chrome browser extension | frontend | 82.1
Write integration tests for payment flow | code-review | 88.0
Fix React hydration mismatch | frontend | 78.7
Implement JWT auth middleware | backend | 74.7
Debug and fix 6 broken database triggers and constraints | debugging | 87.5
Zero-downtime schema migration | full-stack | 81.4
Fix Node.js stream backpressure causing OOM on large files | backend | 90.5
Implement background job scheduler with persistence | backend | 64.7
Build real-time portfolio risk calculator | backend | 71.8
Build RAG pipeline with vector search | backend | 82.0
Implement zero-trust API authentication layer | backend | 81.3
Implement multi-tenant row-level security in Postgres | backend | 80.9
Code review: identify security vulns | code-review | 87.2
Add Redis caching layer to Express API | backend | 86.8
Fix N+1 query in dashboard | backend | 78.3
Add virtual scrolling to table rendering 5000 rows | frontend | 87.8
Remove AI slop and over-engineering from codebase | refactoring | 87.5
Fix data integrity bugs in denormalized e-commerce schema | debugging | 85.0
Refactor monolithic handler to CQRS | refactoring | 87.3
Add slash commands and moderation to Discord bot | backend | 79.1
Build MCP server for database management | backend | 86.8
Build terminal UI dashboard | from-scratch | 75.1
Add Google OAuth2 login to Express app | full-stack | 83.0
Fix deadlocking transaction patterns in Flask app | backend | 80.1
Implement Stripe webhook handler | backend | 82.9
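
As a cross-check on the results above, the per-task scores can be averaged and compared with the summary figures (Avg Score 80.7, Runs 70). This is a small sketch, assuming Avg Score is the unweighted mean of the Score column; the page itself does not state how it is computed. The scores are copied in table order.

```python
# Per-task scores from the "All Results" table, in table order.
scores = [
    46.5, 66.3, 78.8, 85.7, 60.3, 87.5, 70.9, 87.7, 87.3, 83.0,
    86.0, 73.5, 75.0, 85.7, 77.3, 78.1, 76.1, 77.5, 83.3, 76.1,
    85.6, 87.0, 85.0, 86.2, 76.0, 81.9, 84.6, 83.2, 86.0, 79.0,
    82.7, 82.8, 60.0, 87.9, 88.3, 79.5, 92.3, 85.6, 85.0, 78.8,
    75.3, 80.8, 87.5, 76.8, 77.3, 82.1, 88.0, 78.7, 74.7, 87.5,
    81.4, 90.5, 64.7, 71.8, 82.0, 81.3, 80.9, 87.2, 86.8, 78.3,
    87.8, 87.5, 85.0, 87.3, 79.1, 86.8, 75.1, 83.0, 80.1, 82.9,
]

run_count = len(scores)
# ASSUMPTION: the page's Avg Score is a plain arithmetic mean of these rows.
mean_score = sum(scores) / run_count

print(f"runs: {run_count}")
print(f"mean score: {mean_score:.1f}")
print(f"lowest: {min(scores)}, highest: {max(scores)}")
```

Under that assumption the row count and mean line up with the reported Runs and Avg Score, which suggests the table lists every run behind the summary.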