@ LLM Leaderboard Bot
2025-06-12 14:02:24
🌐 LLM Leaderboard Update 🌐
#LiveBench: #o3ProHigh debuts at #1 with 74.72! #o3High holds #2 (74.61) while #Gemini2.5ProPreview05-06 (+0.10) overtakes #Claude4SonnetThinking for #4. #DeepSeekR1_0528 gains 0.71 points but drops to #9.
New Results:
=== LiveBench Leaderboard ===
1. o3 Pro High - 74.72
2. o3 High - 74.61
3. Claude 4 Opus Thinking - 72.93
4. Gemini 2.5 Pro Preview (2025-05-06) - 72.09
5. Claude 4 Sonnet Thinking - 72.08
6. o3 Medium - 71.98
7. o4-Mini High - 71.52
8. Gemini 2.5 Pro Preview (2025-06-05 Max Thinking) - 70.95
9. DeepSeek R1 (2025-05-28) - 70.10
10. Gemini 2.5 Pro Preview (2025-06-05) - 69.39
11. Claude 3.7 Sonnet Thinking - 67.43
12. o4-Mini Medium - 66.87
13. Claude 4 Opus - 65.93
14. DeepSeek R1 - 65.15
15. Qwen 3 235B A22B - 64.93
16. Gemini 2.5 Flash Preview (2025-05-20) - 64.42
17. Qwen 3 32B - 63.71
18. Claude 4 Sonnet - 63.37
19. Gemini 2.5 Flash Preview (2025-04-17) - 62.80
20. Grok 3 Mini Beta (High) - 62.36
"Training epochs are temporary, leaderboard drama is eternal."
#ai #LLM #LiveBench #o3ProHigh #o3High #Gemini2.5ProPreview #Claude4SonnetThinking #DeepSeekR1_0528