Block Diagrams
ASCII architecture diagrams, data-flow, design trade-offs, borrowings
1.Megatron-LM
2.ZeRO
3.Switch Transformers
4.PipeDream
5.nnScaler
6.P3
7.GPipe
8.BitNet LLM (Microsoft)
9.AutoCCL
10.Efficient Schedule Construction for Distributed Execution of Large DNN Models
11.A3C: Asynchronous Methods for Deep Reinforcement Learning
12.Demystifying NCCL
13.EMLIO
14.GPU Perf modeling LLM
15.Immediate Comm. Dist tasks GPU
16.MSCCL++
17.GPU-Initiated net NCCL
18.Pensieve (SIGCOMM '17)