LLM Translation Arena Results: Gemini 2.5 Flash Tops Rankings (29th Sep, 2025)

Tenzin_Gayche · September 29, 2025, 6:08am

We ran an LLM Arena-style head-to-head comparison, collecting over 300 human votes, to rank current closed-source models by translation quality. The results support two intuitions: Gemini 2.5 Flash zero-shot translation is currently the best performer, and the translation quality of recent Claude releases appears to be declining relative to competitors.

Claude 4 series models (for Chinese translation) and DeepSeek models were among the worst performers and did not make the top 10.
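
For readers unfamiliar with arena-style ranking: this post does not say how votes were converted into scores, but a common approach for head-to-head comparisons is an Elo-style rating update. The sketch below is only an illustration under that assumption. The starting rating of 1000, the K-factor of 32, and the model names in the sample votes are all hypothetical and are not taken from our arena.

```python
from collections import defaultdict

START_RATING = 1000.0  # assumed starting rating for every model
K_FACTOR = 32.0        # assumed step size per vote


def expected_score(r_a, r_b):
    """Probability that a model rated r_a beats one rated r_b (Elo formula)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def rate(votes, start=START_RATING, k=K_FACTOR):
    """Apply one Elo update per (winner, loser) vote, in order."""
    ratings = defaultdict(lambda: start)
    for winner, loser in votes:
        exp_win = expected_score(ratings[winner], ratings[loser])
        delta = k * (1.0 - exp_win)  # larger gain for an unexpected win
        ratings[winner] += delta
        ratings[loser] -= delta
    return dict(ratings)


# Hypothetical votes: each tuple is (preferred translation, other translation).
votes = [
    ("model-a", "model-b"),
    ("model-a", "model-c"),
    ("model-b", "model-c"),
]
ratings = rate(votes)

# Print a ranking in the same Rank / Model Name / Score layout as the tables.
ranked = sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)
for rank, (name, score) in enumerate(ranked, start=1):
    print(f"{rank}\t{name}\t{score:.0f}")
```

With only a few hundred votes spread across many models, ratings of this kind depend on vote order and carry wide uncertainty, so small score gaps should be read with caution.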

Full Rankings

Top 10 Models (Chinese Translation)

Rank	Model Name	Score
1	google:gemini-2.5-flash-thinking	1091
2	anthropic:claude-3-5-sonnet-20241022	1050
3	google:gemini-2.5-flash	1043
4	google:gemini-2.5-pro-thinking	1037
5	anthropic:claude-3-7-sonnet-latest-thinking	1023
6	anthropic:claude-3-7-sonnet-latest	1017
7	google:gemini-1.5-flash	1000
8	google:gemini-2.5-pro	1000
9	google:gemini-1.5-pro	1000
10	anthropic:claude-3-opus-20240229	981

Top 10 Models (English Translation)

Rank	Model Name	Score
1	google:gemini-2.5-flash	1097
2	anthropic:claude-3-7-sonnet-latest-thinking	1095
3	google:gemini-2.5-pro-thinking	1067
4	anthropic:claude-3-5-sonnet-20241022	1027
5	google:gemini-2.5-pro	1008
6	google:gemini-1.5-pro	1008
7	anthropic:claude-sonnet-4-20250514	1006
8	anthropic:claude-3-opus-20240229	1002
9	google:gemini-2.5-flash-thinking	992
10	google:gemini-2.0-flash	981

Next Steps: Optimizing for Tibetan Buddhist Translation

We have identified Gemini 2.5 Flash as the definitive zero-shot baseline. We are now using new tools to compare its zero-shot output against various workflows and templates to find the best model and template combination for achieving high-fidelity Tibetan Buddhist translation.
