A.I. News Move over DeepSeek: Alibaba's Qwen2.5-Max surpasses DeepSeek-V3 in benchmarks

Gandalf_The_Grey · Jan 29, 2025

News headlines over the past week have been dominated by DeepSeek, thanks to the launch of its new reasoning model, R1, which improves responses to queries. DeepSeek's main non-reasoning model, DeepSeek-V3, arrived in December with impressive benchmark scores of its own, but now Chinese firm Alibaba has released Qwen2.5-Max, which surpasses DeepSeek-V3 and, in some tests, GPT-4o-0806 and Claude-3.5-Sonnet-1022.

Like DeepSeek, Qwen2.5-Max is touchy about Chinese political issues and does not answer such questions at all. On Qwen Chat, it just says you've exceeded your quota limit when you try those queries, but it answers fine once you change the topic.

The benchmarks Alibaba used to test its model against the competition included MMLU-Pro, which tests knowledge through college-level problems; LiveCodeBench, which assesses coding capabilities; LiveBench, which comprehensively tests general capabilities; and Arena-Hard, which approximates human preferences.

Move over DeepSeek: Alibaba's Qwen2.5-Max surpasses DeepSeek-V3 in benchmarks

Alibaba has just released its latest model, Qwen2.5-Max. It surpasses DeepSeek-V3 in many benchmarks and can be tested out now on Qwen Chat.

www.neowin.net

mlnevese · Jan 29, 2025

Looks like it's Chinese AI week...

badboy · Jan 30, 2025

I read somewhere that ESET says that it is better not to use this AI, as all data is sent to China and stored there, including voice queries.
