Data Sources & Methodology
RankedAGI brings you the most comprehensive, up-to-date AI model benchmarks in one place, with a focus on clarity, speed, and real-world relevance.
Where the data comes from
Benchmark results are sourced directly from:
- Model Providers: Official release announcements (blogs, videos, X posts) from companies like OpenAI, Anthropic, and xAI.
- Benchmark Creators: Trusted third-party sites such as LiveBench, LMSYS, Aider, and SWE-Bench, updated directly from their official results.
All data is manually collected—no scraping, no APIs—just careful curation from public, primary sources. For each model, the last update timestamp is shown so you know how fresh the info is.
Benchmarks Tracked
Benchmarks are selected based on popularity, availability, and how well they reflect real-life AI usage. Here's what's currently covered:
- Code LiveBench
- Aider Polyglot
- SWE-Bench Verified
- Codeforces ELO
- WebDev Arena
- LiveCodeBench v5
- MMLU Pro
- GPQA Diamond
- Reason LiveBench
- Math LiveBench
- AIME 2025 I & II
- HumanEval(+)
- Halluc. Hughes
- Aidan Bench
- IF LiveBench
- Data LiveBench
- Lang LiveBench
- Avg LiveBench
- IF Evaluation
- And more...
Plus key details like Context Length, Max Output Length, Release Date, and Knowledge Cutoff. I'm always adding new benchmarks—think image models and AI SaaS tools—as the field evolves.
Ranking
Right now, I list raw benchmark scores as provided and calculate a rank for each model within every benchmark—simple and transparent. If a model lacks data for a benchmark, I skip it without penalty.
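The per-benchmark ranking described above can be pictured with a short Python sketch. It is an illustration under stated assumptions, not the site's actual code: the function name `rank_within_benchmarks`, the `lower_is_better` parameter, the tie rule (tied scores share a rank), and the placeholder model and benchmark names are all hypothetical. A score of `None` stands for a benchmark the model has no data for.

```python
from typing import Dict, FrozenSet, Optional

# model name -> benchmark name -> score (None means "no data published")
Scores = Dict[str, Dict[str, Optional[float]]]


def rank_within_benchmarks(
    scores: Scores,
    lower_is_better: FrozenSet[str] = frozenset(),
) -> Dict[str, Dict[str, int]]:
    """Return benchmark -> model -> rank (1 = best).

    Assumptions (not taken from the site): higher scores are better unless
    the benchmark is listed in `lower_is_better`, and tied scores share the
    same rank. Models without a score for a benchmark are left out of that
    benchmark's ranking, so they are neither ranked nor penalized.
    """
    benchmarks = {bench for per_model in scores.values() for bench in per_model}
    ranks: Dict[str, Dict[str, int]] = {}
    for bench in sorted(benchmarks):
        # Keep only the models that actually have a score for this benchmark.
        available = {
            model: per_model[bench]
            for model, per_model in scores.items()
            if per_model.get(bench) is not None
        }
        ordered = sorted(
            available.items(),
            key=lambda item: item[1],
            reverse=bench not in lower_is_better,
        )
        bench_ranks: Dict[str, int] = {}
        prev_score: Optional[float] = None
        prev_rank = 0
        for position, (model, score) in enumerate(ordered, start=1):
            # Tied scores reuse the previous rank; otherwise use the position.
            rank = prev_rank if score == prev_score else position
            bench_ranks[model] = rank
            prev_score, prev_rank = score, rank
        ranks[bench] = bench_ranks
    return ranks


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not real benchmark results.
    example: Scores = {
        "model-a": {"Bench X": 80.0, "Bench Y": 50.0},
        "model-b": {"Bench X": 90.0, "Bench Y": None},
        "model-c": {"Bench X": 80.0},
    }
    print(rank_within_benchmarks(example))
```

In this sketch `model-b` has no "Bench Y" score, so it does not appear in that benchmark's ranking at all, which mirrors the "skip it without penalty" behavior.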
Coming soon: our own RankedAGI Averages (e.g., RankedAGI Code, RankedAGI Reasoning). These will blend multiple benchmarks into category scores with clear formulas and weights, designed to give you a quick, balanced view. Missing data won't skew the results—I'm building this with fairness in mind.
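Since the actual formulas and weights have not been published yet, the following Python sketch only shows one common way a category score could keep missing data from skewing the result: a weighted average that renormalizes over the benchmarks a model actually has. The function name `category_score`, the weights, and the scores are all hypothetical, and the sketch assumes every benchmark is already on a comparable scale (for example 0 to 100).

```python
from typing import Dict, Optional


def category_score(
    model_scores: Dict[str, Optional[float]],
    weights: Dict[str, float],
) -> Optional[float]:
    """Weighted average over the benchmarks that have data.

    Hypothetical approach, not RankedAGI's published formula: benchmarks
    with no score are dropped and the remaining weights are renormalized,
    so a missing result is not treated as a zero.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for bench, weight in weights.items():
        score = model_scores.get(bench)
        if score is None:
            continue  # no data for this benchmark: skip it entirely
        weighted_sum += weight * score
        total_weight += weight
    if total_weight == 0:
        return None  # no benchmark in this category has data
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Made-up weights and scores, for illustration only.
    code_weights = {
        "Code LiveBench": 0.5,
        "Aider Polyglot": 0.25,
        "SWE-Bench Verified": 0.25,
    }
    partial_scores = {"Code LiveBench": 70.0, "Aider Polyglot": 40.0}
    print(category_score(partial_scores, code_weights))
```

One trade-off of renormalizing is that a model reporting only its strongest benchmarks can look better than one reporting all of them, so a real formula might also require a minimum number of benchmarks per category.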
Keeping It Fresh
Data is updated multiple times a day, every day. On big release days, new models hit the site within an hour of their announcement. Each model's page shows its last update date, so you're always working with the latest insights.
Transparency & Community
I strive for accuracy, but I'm human, so typos or source errors can happen. If you spot something off, flag it! I'm open to community feedback on missing or incorrect data.
A Note on Data
RankedAGI doesn't own any benchmark data; I compile and display publicly available results from providers and benchmark creators, respecting their rights. All use is for informational purposes, keeping the site clear of copyright concerns.