Toolcue
  • Catalog
  • Guides
  • Benchmarks
  • News


Get the cue. Pick the right AI tool. · © 2026

Benchmarks

Code

SWE-bench Verified

Full comparison of 20 models: the percentage of real GitHub issues each model resolved autonomously (SWE-bench Verified, 500 tasks).
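As a rough illustration of how a score on a 500-task set reads as a percentage, here is a minimal sketch. The function name `resolved_rate` and the example counts are hypothetical and are not taken from the leaderboard; the official harness may round or aggregate differently.

```python
def resolved_rate(resolved: int, total: int = 500) -> float:
    """Return the share of resolved tasks as a percentage, to one decimal place.

    `total` defaults to 500, the task count stated for SWE-bench Verified.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= resolved <= total:
        raise ValueError("resolved must be between 0 and total")
    return round(100.0 * resolved / total, 1)


if __name__ == "__main__":
    # Hypothetical count, for illustration only.
    print(resolved_rate(250))
```

With 500 tasks, each resolved issue moves the score by 0.2 percentage points, so small gaps between models can correspond to only a handful of tasks.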

Data updated: 06/20/2026

Sources

  • SWE-bench
  • SWE-bench Verified leaderboard
  • Data snapshot

Benchmarks are published by test authors. Methodologies differ; scores do not replace reliability ratings in the catalog. Check the primary source before choosing.