Learning fastai part 2

the last three days i learned: reimplemented 50% of the forward pass and backward pass of pipeline parallelism, reversed 20% of a balanced bracket classifier circuit, reversed 10% of the world model of OthelloGPT



the last three days i learned: reimplemented [95% of the backward pass of pipeline parallelism, 100% of data transfer (supports backward pass, but not multi-node yet) in torchgpipe, and made some more progress on the multi-node notification mechanism in horovod], reversed [20% of the world model of OthelloGPT and 30% of a balanced bracket classifier circuit]








the last four days i learned: reimplemented [100% of the forward pass of the pipeline (but not multi-node or 3D parallelism yet), made a little progress on ParallelContext in OSLO and FSDP], reversed [23% of the world model of OthelloGPT and 45% of the balanced bracket classifier circuit]





the last three days i learned: reimplemented (100% of the backward pass of pipeline parallelism, 60% of initializing parallel groups in 3D parallelism, 5% of CPU offload, and 60% of partitioning model states in FSDP) and learned some more about superposition








the last three days i learned: reimplemented (100% of sharding params in FSDP, 5% of rebuilding parameters in the forward and backward pass in FSDP), and learned more about superposition and transformer circuits







Props for sticking with this since November. Consider me impressed.


i'm not going to stop :wink:

the last four days i learned: reimplemented (70% of communication primitives (support training) & 70% of initializing parallel groups in 3D parallelism, 100% ParallelMLP), reversed (80% of balanced bracket classifier circuit, 27% of the world model of OthelloGPT), 50% of why SoLU works





the last three days i learned: 100% of initializing parallel groups in 3D parallelism, 80% of communication primitives (close to fully parallelizing transformer from scratch), 30% of training tensor parallelism with pipeline parallelism (starting with single node first), and 20% of MLP as key-value memories








This is from 6 days ago; I forgot to post (yes, I'm still extremely consistent).

the last three days i failed :partying_face: : didn't manage to make progress on ZeRO-offload and ZeRO-1 + a few other things (what next? try again.)

This is from yesterday; I forgot to post (yes, I'm still extremely consistent).

the last three days i learned: reimplemented (15% of ZeRO-1, ZeRO-offload scheduler), reversed (30% of IOI circuit, 60% of the world model of OthelloGPT)








the last three days i learned: reimplemented (20% of ZeRO-1, 15% of turning any :hugging: transformers model to 3D parallelism, 30% of fully parallelizing a transformer model), reversed (40% of IOI circuit), and learned more about transformer circuits







the last three days i learned: reimplemented (60% of fully parallelizing a transformer, 10% of multi-node 3D parallelism (support training)), reversed (50% of IOI circuit, 1% of addition circuit)






the last three days i learned: reimplemented (70% of fully parallelizing a transformer, 15% of multi-node 3D parallelism), reversed 10% of modular addition circuit








the last four days i learned: wrote 80% of fully parallelizing a transformer, 20% of multi-node 3D parallelism, and reversed 60% of IOI circuit






the last two days i learned: 88% of fully parallelizing a transformer and 23% of multi-node 3D parallelism + other stuff






the last three days I learned: reversed 85% of balanced bracket classifier, wrote 25% of multi-node 3D parallelism, and 90% of fully parallelizing a transformer







the last three days i learned: wrote 50% of turning any :hugging: transformers model to 3D parallelism, reversed 87% of balanced bracket classifier circuit





the last three days i learned: wrote 80% of turning any :hugging: transformers model to tensor parallelism, more progress on IOI circuit





the last three days i learned: wrote 100% of automatically parallelizing any :hugging: transformers model with tensor parallelism, 20% of multi-node pipeline parallelism, and reversed 90% of IOI circuit