Huawei, USTC and Peking University Achieve Up to 58% Speedup on Ascend A3 for MoE Model Training

According to Beating, researchers from Huawei, the University of Science and Technology of China (USTC), and Peking University have unveiled HyperParallel-MoE, a compiler scheduling framework designed for Ascend A3 chips. The framework reduces latency in MoE expert computation modules by 36% and achieves a 1.49–1.58x data processing speedup (49–58% faster) on 256-node clusters running 671B-parameter DeepSeek-style models. Single-step training speed improved by 8–9%.