Last Updated 5/22/2025

Claude Sonnet 4 (Nonthinking)

Anthropic's latest-generation workhorse model, offering a balance of performance and speed.

Release Date: 5/22/2025

Avg. Accuracy: 73.1%

Latency: 27.53s

Performance by Benchmark

Benchmark       Accuracy    Ranking
FinanceAgent    43.5%       4 / 24
CorpFin         59.9%       16 / 39
CaseLaw         85.2%       5 / 62
ContractLaw     72.4%       8 / 69
TaxEval         73.5%       24 / 49
MortgageTax     74.6%       11 / 29
Math500         90.3%       14 / 45
AIME            38.5%       18 / 39
MGSM            92.9%       3 / 43
LegalBench      81.5%       11 / 67
MedQA           90.3%       14 / 47
GPQA            69.4%       12 / 40
MMLU Pro        79.4%       14 / 40
MMMU            72.6%       9 / 26
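
The page does not say how the average accuracy is derived. As a sanity check, the short Python sketch below assumes it is the unweighted mean of the 14 per-benchmark accuracies listed above; under that assumption the result rounds to the reported 73.1%. The function name `mean_accuracy` is illustrative only.

```python
# Per-benchmark accuracies (percent), copied from the table above.
accuracies = {
    "FinanceAgent": 43.5,
    "CorpFin": 59.9,
    "CaseLaw": 85.2,
    "ContractLaw": 72.4,
    "TaxEval": 73.5,
    "MortgageTax": 74.6,
    "Math500": 90.3,
    "AIME": 38.5,
    "MGSM": 92.9,
    "LegalBench": 81.5,
    "MedQA": 90.3,
    "GPQA": 69.4,
    "MMLU Pro": 79.4,
    "MMMU": 72.6,
}


def mean_accuracy(scores: dict[str, float]) -> float:
    """Unweighted arithmetic mean (an assumption, not a documented formula)."""
    return sum(scores.values()) / len(scores)


print(f"Avg. accuracy: {mean_accuracy(accuracies):.1f}%")
```

A weighted scheme (for example, by benchmark size) would generally give a different figure, so the match here only suggests, and does not confirm, that a simple mean is used.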

Cost Analysis

Input Cost: $3.00 / M tokens

Output Cost: $15.00 / M tokens

Input Cost (per char): $2.41 / M chars

Output Cost (per char): $4.89 / M chars
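
To illustrate how the per-token prices translate into a per-request cost, here is a minimal Python sketch. It uses only the two per-token prices listed above; the function name `estimate_cost_usd` and the token counts in the example call are hypothetical, chosen purely for illustration.

```python
from decimal import Decimal

# Prices taken from the Cost Analysis figures above (USD per million tokens).
INPUT_USD_PER_M_TOKENS = Decimal("3.00")
OUTPUT_USD_PER_M_TOKENS = Decimal("15.00")


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request from its token counts."""
    million = Decimal(1_000_000)
    total = (
        input_tokens * INPUT_USD_PER_M_TOKENS
        + output_tokens * OUTPUT_USD_PER_M_TOKENS
    )
    return total / million


# Hypothetical request: 2,000 input tokens and 500 output tokens.
cost = estimate_cost_usd(2_000, 500)
print(f"Estimated cost: ${cost}")
```

Because output tokens are priced at five times the input rate, the 500 output tokens in this example account for more of the total than the 2,000 input tokens.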
