"MOE"的搜索結果
2026-03-26
01:51

Meituan Open-Sources LongCat-Next: 3B Parameters Unifying Visual Understanding, Generation, and Speech

Meituan's LongCat team has open-sourced LongCat-Next, a multimodal model built on an MoE architecture that integrates five capabilities, including text, visual understanding, image generation, and speech. Its core design, DiNA, unifies task processing through discrete tokens, while dNaViT on the vision side delivers strong image generation performance. Against comparable models, LongCat-Next posts leading results across benchmark metrics, underscoring its strengths in multimodal understanding and generation.
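As a rough illustration of the discrete-token unification idea, the sketch below maps each modality into a disjoint id range of one shared vocabulary, so a single decoder can consume mixed sequences. All sizes, offsets, and helper names here are hypothetical, not LongCat-Next's actual scheme.

```python
# Toy sketch of a shared discrete-token space for multiple modalities.
# All vocabulary sizes and offsets are invented for illustration only.

TEXT_VOCAB = 50_000    # hypothetical text tokenizer size
IMAGE_VOCAB = 8_192    # hypothetical VQ image-codebook size
AUDIO_VOCAB = 4_096    # hypothetical audio-codec codebook size

# Disjoint id ranges so one transformer sees a single flat vocabulary.
IMAGE_OFFSET = TEXT_VOCAB
AUDIO_OFFSET = TEXT_VOCAB + IMAGE_VOCAB

def to_shared_ids(modality: str, ids: list[int]) -> list[int]:
    """Shift modality-local token ids into the shared vocabulary."""
    offset = {"text": 0, "image": IMAGE_OFFSET, "audio": AUDIO_OFFSET}[modality]
    return [offset + i for i in ids]

# A mixed prompt: text tokens, then image tokens, then audio tokens,
# all fed to one decoder as an ordinary integer sequence.
sequence = (
    to_shared_ids("text", [101, 57, 2044])
    + to_shared_ids("image", [12, 907, 4411])
    + to_shared_ids("audio", [7, 88])
)
print(sequence)  # one flat sequence over a 62,288-id vocabulary
```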
06:27

Cursor Releases Composer 2 Technical Report; Base Model Score Up 70%

On March 25, Cursor released its Composer 2 technical report, disclosing a training recipe built on the Kimi K2.5 model: an MoE architecture with 1.04 trillion parameters. Training proceeds in two stages, using simulations of real-world scenarios for reinforcement learning; the resulting model scores 61.3 on the CursorBench benchmark, a 70% improvement, with inference costs below those of other large-model APIs.
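Since the report centers on an MoE base model, a minimal top-k routing sketch may help; this is a generic NumPy illustration, not Cursor's or Kimi's actual implementation, and every name in it (`gate_w`, the toy experts) is invented.

```python
import numpy as np

def softmax(z: np.ndarray) -> np.ndarray:
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def moe_forward(x: np.ndarray, gate_w: np.ndarray, experts: list, top_k: int = 2) -> np.ndarray:
    """Generic top-k MoE layer: route one token vector x to its top_k
    experts and mix their outputs by renormalized gate probabilities."""
    logits = x @ gate_w                    # one score per expert
    chosen = np.argsort(logits)[-top_k:]   # indices of the top_k experts
    weights = softmax(logits[chosen])      # renormalize over chosen experts
    return sum(w * experts[i](x) for w, i in zip(weights, chosen))

# Tiny usage example: 4 experts, each a random linear map; only 2 run per token.
rng = np.random.default_rng(0)
d, num_experts = 8, 4
experts = [lambda x, W=rng.standard_normal((d, d)): x @ W for _ in range(num_experts)]
gate_w = rng.standard_normal((d, num_experts))
x = rng.standard_normal(d)
print(moe_forward(x, gate_w, experts).shape)  # (8,)
```

Only `top_k` experts run per token, which is how an MoE with 1.04 trillion total parameters can keep per-token inference cost far below its full size.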
02:27

Meituan Open-Sources 560B-Parameter Theorem-Proving Model, Hitting a 97.1% Pass Rate with 72 Inference Attempts and Setting a New Open-Source SOTA

Meituan's LongCat team open-sourced LongCat-Flash-Prover on March 21, an MoE model with 560 billion parameters focused on Lean4 formal theorem proving. The model spans three capabilities: autoformalization, sketch generation, and complete proof generation, combining reasoning tools with the Lean4 compiler for real-time verification. Training uses a Hybrid-Experts Iteration Framework and the HisPO algorithm to prevent reward hacking. On benchmarks, the model sets new records among open-weight models for both autoformalization and theorem proving.
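To make those three capabilities concrete, here is a toy Lean4 example of the kind of target such a prover works toward: an informal statement autoformalized into a theorem, plus a complete proof the Lean4 compiler can verify. The statement and proof are my own illustration (assuming a recent Lean4 toolchain where the `omega` tactic is available), not material from the LongCat-Flash-Prover release.

```lean
-- Informal statement: "the sum of two odd natural numbers is even".
-- Autoformalization produces the checkable theorem below; sketch
-- generation would emit the same statement with `sorry` placeholders,
-- and complete proof generation fills in a proof the compiler verifies.
theorem odd_add_odd (a b : Nat) (ha : a % 2 = 1) (hb : b % 2 = 1) :
    (a + b) % 2 = 0 := by
  omega  -- linear arithmetic over Nat, including mod by a constant
```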