---
title: "Claude 3.7 Sonnet Benchmarks - RankedAGI"
description: "Detailed benchmark and metadata record for Claude 3.7 Sonnet."
source: "https://rankedagi.com/models/claude-3.7-sonnet"
---

# Claude 3.7 Sonnet

| Field | Value |
| --- | --- |
| Organization | Anthropic |
| License | Proprietary |
| Version | Latest |
| Released | 2025-02-24 |
| Context window | 200K |
| Knowledge cutoff | 2024-10-01 |
| Input cost per million tokens | $3 |
| Output cost per million tokens | $15 |
| Model ID | claude-3-7-sonnet-20250219 |
| Last updated | 2026-05-04T03:43:08.354Z |
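
As a rough illustration of how the two per-token prices above combine, here is a minimal sketch of a cost estimate. The helper name `estimate_cost_usd` is hypothetical, and the calculation assumes plain linear pricing with no caching, batch, or other discounts, none of which are covered in this record.

```python
# Prices copied from the table above (USD per million tokens).
INPUT_USD_PER_MTOK = 3.0
OUTPUT_USD_PER_MTOK = 15.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: linear cost estimate, assuming no discounts apply."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Example: 200,000 input tokens (the full 200K context window) plus 4,000 output tokens.
print(f"${estimate_cost_usd(200_000, 4_000):.2f}")  # prints $0.66
```

Because output tokens cost five times as much as input tokens here, long generations can dominate the bill even when the prompt is much larger.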

## Benchmarks

| Benchmark | Category | Value | Description | Source |
| --- | --- | --- | --- | --- |
| RankedAGI Coding | coding | 54.6% | RankedAGI Coding Score |  |
| SWE-bench Verified | coding | 70.3% | Agentic Coding | https://www.swebench.com/index.html |
| RankedAGI Agentic | agents | 45.9% | RankedAGI Agentic Score |  |
| ChatArena (LMSYS) | coding | 1326 | ChatArena (LMSYS) Coding Elo Score with Style Control |  |
| CyberGym | safety | 14.5% |  |  |
| Aider Polyglot | coding | 60.4% | Aider Polyglot Multi-Language Code Editing Benchmark |  |
| LiveBench Coding Score (old) | coding | 67.5% | LiveBench Coding Score old version |  |
| RankedAGI Reasoning | reasoning | 40.7% | RankedAGI Reasoning Score |  |
| Humanity's Last Exam | reasoning | 8.0% | Multidisciplinary Reasoning (no tools) |  |
| GPQA Diamond | reasoning | 68.0% | Graduate-Level Google-Proof Q&A (Diamond subset), PhD-Level Reasoning |  |
| Text Arena | general | 1365 | ChatArena (LMSYS) Elo Score | https://arena.ai/leaderboard/text |
| AIME 2024 | math | 23.3% | AIME 2024 Competition Math |  |
| NYT Connections Extended | reasoning | 19.2% | NYT Connections Extended Version | https://github.com/lechmazur/nyt-connections/ |
| MMMU | imaging | 71.8% | Multimodal Understanding: College-Level Visual Problem-Solving |  |
| LiveBench Average | general | 65.6% | LiveBench Average Score old version |  |
| Instruction Following Evaluation | instruction-following | 90.8% | Instruction Following Evaluation |  |
| LiveBench Coding 25.4 | coding | 32.4% | LiveBench Coding Score 2025-04 | https://livebench.ai/ |
| LiveBench Data Analysis | general | 63.4% | LiveBench Data Analysis Score | https://livebench.ai/ |
| LiveBench Language | general | 56.8% | LiveBench Language Score | https://livebench.ai/ |
| RankedAGI Math | math | 48.7% | RankedAGI Math Score |  |
| RankedAGI Overall | general | 48.3% | Overall RankedAGI score |  |
| GDPval-AA Elo | agents | 1047 | Office Tasks (Artificial Analysis) |  |
| SvelteBench old | coding | 56.7% | SvelteBench - Benchmark for Svelte (Old v1) | https://khromov.github.io/svelte-bench/v1/v1-benchmark-results-merged.html |
| Code DesignArena | design | 1226 |  | https://www.designarena.ai/leaderboard/code |

## Model Links

| Title | URL |
| --- | --- |
| Release | https://www.anthropic.com/news/claude-3-7-sonnet |