Learning fastai part 2

the last two days i learned: wrote 100% of data parallelism, 28% of multi-node pipeline parallelism


the last three days i learned: wrote 43% of multi-node pipeline parallelism, some basics of MoE, and more about the IOI circuit




the last three days i learned: wrote 67% of multi-node pipeline parallelism





(this is from 6 days ago)

the last three days i learned: 2% of making MoE work in 3D parallelism, and 72% of multi-node pipeline parallelism




the last few days i learned: 78% of multi-node pipeline parallelism, 30% of ZeRO-1 (yes, i’ve been stuck that hard)




(this is from 13 days ago)

the last week i learned: 90% of multi-node pipeline parallelism, and a lot of other stuff

the last 6 days i learned:

the last three days i learned:


69/n the last 6 days i learned (spent half of my time debugging pipeline parallelism)







the last three days i learned: implemented 100% of multi-node pipeline parallelism (but there is a catch) + …




the last three days i learned: hybrid tensor and data parallelism confirmed to work

the last three days i learned: didn’t manage to make much progress, but will try again





the last three days i learned: implemented 2% of sequence parallelism, 1.5% of MoE that works in 3D parallelism, but am still stuck on making ZeRO-1 work with hybrid parallelism




the last three days i learned:





the last three days i learned: fixed a convergence bug in ZeRO-1 (for real this time), 2% of MoE that works in 4D parallelism

the last three days i learned: 4% of MoE that works in 4D parallelism + …











the last three days i learned: implemented 7% of MoE that works in 4D parallelism

the last three days i learned: 15% of MoE that works in 4D parallelism + …



(this is from 9 days ago)

the last three days i learned: didn’t manage to make much progress, but will try again




(this is from 6 days ago)

the last three days i learned: 27% of MoE that works in 4D parallelism