
On August 11, Baichuan Intelligence released Baichuan-M2, an open-source large language model enhanced for medicine.
OpenAI open-sourced two large models on August 6, featuring ultra-low deployment costs and the strongest medical capabilities. Five days later, the smaller Baichuan-M2 surpassed OpenAI's models in medical capability, ranking first in the world among open-source models.
Baichuan-M2 scored 60.1 on HealthBench. With only 32B parameters, it surpassed OpenAI's latest open-source model, gpt-oss-120b (57.6), as well as all other current open-source large models, including Qwen3-235B, DeepSeek-R1, and Kimi K2.
To meet the medical field's need for private deployment that protects user privacy, Baichuan Intelligence has made Baichuan-M2 far more lightweight through quantization. The quantized model is nearly lossless in accuracy and can be deployed on a single RTX 4090 GPU, cutting cost by a factor of 57 compared with a dual-node H20 deployment of DeepSeek-R1. The model has also been adapted to mainstream domestic chips, so most medical institutions can deploy it quickly on hardware they already own.
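A back-of-the-envelope estimate shows why a compressed 32B-parameter model can fit on one consumer GPU. The sketch below is illustrative only: it counts weight memory alone (ignoring KV cache and activations), the bit widths are assumptions because the article does not state the precision used, the 24 GB figure is the RTX 4090's memory capacity, and the function names are hypothetical.

```python
# Rough weight-memory estimate for a 32B-parameter model at several precisions.
# Assumptions (not from the article): the bit widths tried below, and that
# weights dominate memory use (KV cache and activations are ignored).

PARAMS = 32e9        # 32 billion parameters, per the article
GPU_VRAM_GB = 24.0   # RTX 4090 memory capacity


def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Return the memory needed for the weights alone, in GB (1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


def fits_on_gpu(num_params: float, bits_per_param: int, vram_gb: float) -> bool:
    """Return True if the weights alone fit in the given amount of VRAM."""
    return weight_memory_gb(num_params, bits_per_param) < vram_gb


if __name__ == "__main__":
    for bits in (16, 8, 4):
        gb = weight_memory_gb(PARAMS, bits)
        fits = fits_on_gpu(PARAMS, bits, GPU_VRAM_GB)
        print(f"{bits:>2}-bit: {gb:.0f} GB of weights, fits in 24 GB: {fits}")
```

Under these assumptions, only the 4-bit case (about 16 GB of weights) leaves headroom on a 24 GB card; real deployments also need memory for the KV cache, so the usable context length depends on what remains.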
For scenarios that need faster interaction, such as emergency and outpatient clinics, the Baichuan-M2 MTP version, optimized based on the Eagle-3 architecture, generates tokens 74.9% faster in single-user scenarios.
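To put the 74.9% figure in terms of waiting time, a throughput gain can be converted into a reduction in time per token. This is a minimal sketch that assumes the gain applies uniformly to every generated token; the function name is hypothetical.

```python
def latency_reduction(throughput_gain: float) -> float:
    """Fractional drop in time per token for a fractional throughput gain.

    If throughput rises by a factor of (1 + g), the time per token falls to
    1 / (1 + g) of its old value, so the reduction is 1 - 1 / (1 + g).
    """
    if throughput_gain <= -1.0:
        raise ValueError("throughput_gain must be greater than -1")
    return 1.0 - 1.0 / (1.0 + throughput_gain)


if __name__ == "__main__":
    gain = 0.749  # 74.9% faster token generation, per the article
    print(f"Time per token falls by about {latency_reduction(gain):.1%}")
```

By this arithmetic, a 74.9% throughput increase corresponds to roughly 43% less time per generated token, not a 74.9% reduction in waiting time.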
Does strengthening medical capability come at the expense of general ability? Baichuan reports the opposite: high-quality medical data proved to be of significant value in improving the model's general capabilities. M2's core general-purpose performance in areas such as mathematics, instruction following, and writing rose rather than fell, which suggests potential applications beyond healthcare.
On complex medical problems, Baichuan-M2 is comparable to GPT-5 and surpasses several leading closed-source models. When GPT-5 was released, OpenAI highlighted that it was the only model in the world to score above 32 on HealthBench Hard. With a score of 34.7, Baichuan-M2 became the second model in the world to do so.