Meta-analysis of large language models: benchmarking DeepSeek-R1 against ChatGPT, Gemini, Qwen, and LLaMA

Shafique Ahmed Awan

doi:10.1186/s40537-025-01330-3

2025-12-19

Abstract

The rapid evolution of large language models (LLMs), including GPT-4 Turbo, Google Gemini, Qwen, Meta’s LLaMA 3.1, and DeepSeek-R1, has redefined the landscape of artificial intelligence. In this study, we conduct a hybrid meta-analysis that integrates publicly available benchmarks, model cards, technical reports, and open-source repositories to evaluate LLMs across both performance and operational dimensions. Quantitative data were aggregated from standardized tasks such as MMLU (reasoning), HumanEval (code generation), FLORES-200 (translation), and TyDiQA (multilingual Q&A), complemented by efficiency metrics including FLOPs, GPU hours, inference latency, and subscription costs. A big data–driven KPI framework covering scalability index, data-throughput rate, energy per token, and training cost efficiency was applied to enable normalized cross-model comparison. Results indicate that DeepSeek-R1 demonstrates strong coding and multilingual efficiency, ChatGPT-4 Turbo leads in reasoning accuracy, Gemini Ultra excels in multimodal inference, Qwen is competitive in Chinese-language tasks, and LLaMA 3.1 remains the most adaptable open-source option. Across datasets, DeepSeek-R1 achieved 80.2 ± 1.5% on HumanEval and 78.5 ± 1.8% on MMLU, compared with ChatGPT-4 Turbo’s 86.5 ± 1.9%; these gaps fall within observed heterogeneity (I² = 14.6%). The findings highlight trade-offs among accuracy, scalability, and cost efficiency, emphasizing the need for transparent, sustainable, and multimodal LLM development.

Keywords:

DeepSeek

LLM

ChatGPT

Gemini

Qwen

LLaMA

AR

Ethics
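The abstract reports a heterogeneity statistic (I²) without showing how such a value is obtained. As a minimal sketch, assuming a standard fixed-effect inverse-variance pooling with Cochran's Q (the abstract does not confirm that this is the procedure the study used), I² can be computed from per-source estimates and their standard errors. The function name `heterogeneity` and the input numbers in the usage note are hypothetical and are not taken from the paper.

```python
def heterogeneity(estimates, std_errors):
    """Return (pooled estimate, Cochran's Q, I^2 in percent).

    Assumes fixed-effect inverse-variance weighting; this is an
    illustrative convention, not necessarily the study's method.
    """
    if len(estimates) != len(std_errors) or len(estimates) < 2:
        raise ValueError("need at least two estimates with matching standard errors")
    if any(se <= 0 for se in std_errors):
        raise ValueError("standard errors must be positive")

    # Inverse-variance weights: more precise sources count more.
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)

    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1

    # I^2 = max(0, (Q - df) / Q), expressed as a percentage.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2
```

For example, calling `heterogeneity([80.0, 82.0, 78.0], [1.0, 1.0, 1.0])` on three made-up benchmark scores gives a pooled estimate of 80.0, Q = 8.0, and I² = 75%. By common rules of thumb, values around 25% or lower are usually read as low heterogeneity.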
