Lenovo's Qitian WA7785aG3 Server Achieves New High in DeepSeek 671B Large Model Performance: 6708 Tokens/s Throughput

On March 18th, Lenovo announced a performance milestone for its first AMD AI large model training server, the Lenovo Qitian WA7785aG3. Running the full-scale DeepSeek 671B-parameter large language model on a single machine, the server reached a peak throughput of 6708 tokens/s. Lenovo presents the result as a new high for single-server large model deployment and as a marked improvement in training and inference efficiency.

Lenovo attributes the result to its Wanquan heterogeneous computing platform and to optimizations in several areas: memory access optimization for more efficient use of system memory; GPU memory optimization to make fuller use of GPU processing power; a PCIe 5.0 fully interconnected architecture that provides high-speed, stable data transmission channels; and selection of the best-performing operators within the SGLang framework. The Lenovo team tuned the entire large model workflow, from pre-training and post-training through inference, to reach this figure.

Lenovo also reported results for simulated application scenarios. In a simulated question-and-answer scenario (context sequence length 128/1K), the server supports up to 158 concurrent requests, with a TPOT (time per output token) of 93 milliseconds and a TTFT (time to first token) of 2.01 seconds. In a simulated code generation scenario (context sequence length 512/4K), concurrency reaches 140, with a TPOT of 100 milliseconds and a TTFT of 5.53 seconds.
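To make these latency figures easier to interpret, the following minimal Python sketch turns them into two derived estimates: the time one request takes to finish, and the aggregate rate at which the server emits output tokens. It rests on assumptions that the article does not state: that "128/1K" and "512/4K" mean input/output token counts, that "1K" and "4K" are 1024 and 4096 tokens, and that every concurrent request produces one token per TPOT interval. The names `Scenario`, `request_latency_s`, and `decode_rate_tokens_per_s` are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    concurrency: int    # concurrent requests, as reported in the article
    ttft_s: float       # time to first token, in seconds
    tpot_s: float       # time per output token, in seconds
    output_tokens: int  # ASSUMPTION: second number of "128/1K" or "512/4K"


def request_latency_s(s: Scenario) -> float:
    """Estimated end-to-end time for one request.

    The first token arrives after TTFT; each remaining token is assumed
    to take one TPOT interval.
    """
    return s.ttft_s + (s.output_tokens - 1) * s.tpot_s


def decode_rate_tokens_per_s(s: Scenario) -> float:
    """Estimated aggregate output rate while all requests are decoding.

    ASSUMPTION: every in-flight request emits one token per TPOT.
    """
    return s.concurrency / s.tpot_s


# Figures from the article; output lengths are assumed (1K = 1024, 4K = 4096).
QA = Scenario("question answering", 158, 2.01, 0.093, 1024)
CODEGEN = Scenario("code generation", 140, 5.53, 0.100, 4096)

if __name__ == "__main__":
    for s in (QA, CODEGEN):
        print(
            f"{s.name}: about {request_latency_s(s):.1f} s per request, "
            f"about {decode_rate_tokens_per_s(s):.0f} output tokens/s in total"
        )
```

The article does not say under which workload the 6708 tokens/s peak was measured, or whether it counts input tokens as well as output tokens, so these estimates are not expected to reproduce that headline number.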

According to Lenovo's official introduction, a single Lenovo Qitian WA7785aG3 server can meet the normal usage needs of a 1500-person enterprise, which suggests practical value beyond the benchmark result. The work was carried out jointly by Lenovo's China Infrastructure Business Group, Lenovo Research's ICI laboratory, and AMD, whose teams collaborated on both design and optimization.

The 6708 tokens/s throughput is not the end goal of the Lenovo and AMD collaboration. Both teams continue to explore new optimization methods and aim for further gains as the demands of AI computing grow.

Previously, the Lenovo Qitian WA7780G3 server achieved a total throughput exceeding 2500 tokens/s when deploying the full-scale DeepSeek large model on a single machine. The Lenovo Qitian WA7785aG3 more than doubles that single-machine inference figure (6708 tokens/s versus just over 2500 tokens/s). Achieving this leap from the WA7780G3 to the WA7785aG3 in a short period reflects Lenovo's R&D capabilities, its accumulated expertise, and its attention to market demand in the AI server field.

The Lenovo Qitian WA7785aG3 server gives large model applications a more powerful hardware foundation. As AI technology advances and application scenarios expand, demand for high-performance computing is growing rapidly. The server is intended to help meet this demand by giving more enterprises and developers access to greater computing power, accelerating the adoption of AI technology.

This breakthrough also highlights Lenovo's competitiveness in the high-end server market. Lenovo has long been committed to providing customers with advanced IT infrastructure solutions, and the Qitian WA7785aG3 strengthens its position in the AI server field. Lenovo says it will continue to invest in R&D and launch more advanced AI server products to meet evolving market demands.

The success of the Lenovo Qitian WA7785aG3 server also sets a new benchmark for other AI server manufacturers. It not only showcases the potential of single-machine large model inference performance but, more importantly, proves that through continuous technological innovation and fine-grained optimization, the operational efficiency of large models can be significantly improved. This will encourage other manufacturers to increase R&D investment and jointly promote the advancement of AI server technology, contributing to the prosperous development of the AI industry.

The success of the Lenovo Qitian WA7785aG3 server is a perfect combination of technological innovation and industrial application. It not only brings performance improvements but, more importantly, provides a solid hardware guarantee for the widespread application of AI large models, promoting the in-depth application of AI technology in various industries, ultimately benefiting society and driving industrial progress. This is not only Lenovo's success but also the collective progress of the entire AI industry.

