jun chen, 07/26/2025 05:18 PM


Distributed-related topics

From softmax to context parallelism

A brief introduction to sequence parallelism schemes for training ultra-long-context models

DeepSpeed ZeRO-3 sharing session

Updated by jun chen 19 days ago · 3 revisions