Learning fastai part 2

the last two days I learned: implemented sampling of candidate API calls in Toolformer, and read 2/5 of the GPT-3 paper
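
The candidate-sampling step above can be sketched as follows. This is a hedged toy version, assuming the Toolformer recipe of thresholding the language model's probability of emitting an `<API>` token at each position and keeping the top-k positions; the probabilities here are made-up numbers, not real LM outputs.

```python
# Toy sketch of Toolformer-style candidate sampling: at each position i we
# assume a probability p_i = p_M("<API>" | prefix_i) of starting an API call,
# keep positions where p_i exceeds a threshold tau_s, and take the top-k.

def sample_api_positions(api_probs, tau_s=0.05, top_k=3):
    """Return up to top_k positions whose <API> probability exceeds tau_s."""
    candidates = [(p, i) for i, p in enumerate(api_probs) if p > tau_s]
    candidates.sort(reverse=True)               # highest probability first
    return sorted(i for _, i in candidates[:top_k])

# Toy per-position probabilities of emitting the "<API>" token.
probs = [0.01, 0.20, 0.03, 0.40, 0.07, 0.02]
positions = sample_api_positions(probs)
print(positions)  # positions 1, 3, 4 pass the threshold
```

In the real pipeline these positions are where candidate API calls get sampled from the model for later filtering.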

TIL: wrote half of the filtering of API calls in Toolformer; how to generate responses that align with human preferences without human labelling (Constitutional AI paper); how a language model can answer questions that contain images (MCTR paper); global gradient clipping
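
Global gradient clipping can be sketched in a few lines. A minimal version, assuming the same behavior as `torch.nn.utils.clip_grad_norm_`: compute one L2 norm over *all* gradients, and only if it exceeds `max_norm`, scale every gradient by the same factor so directions are preserved (gradients here are plain lists of floats for illustration).

```python
import math

# Minimal sketch of global gradient clipping: one L2 norm across all
# parameter gradients; if it exceeds max_norm, scale everything uniformly.

def clip_global_norm(grads, max_norm):
    """grads: list of per-parameter gradient lists. Clips in place."""
    global_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    if global_norm > max_norm:
        scale = max_norm / global_norm
        for grad in grads:
            for i in range(len(grad)):
                grad[i] *= scale
    return global_norm

grads = [[3.0, 0.0], [0.0, 4.0]]   # global norm = sqrt(9 + 16) = 5
clip_global_norm(grads, max_norm=1.0)
print(grads)  # every gradient scaled by 1/5
```

This differs from per-parameter clipping in that the relative magnitudes between parameters stay intact.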

the last three days I learned: implemented 90% of the Toolformer paper (next: refactor the code, add support for custom APIs, and benchmark it); how to evaluate a language model's behavior, assess dataset quality, and red-team LMs
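
The filtering step of Toolformer (mentioned in an earlier post) has a simple core criterion that can be sketched. Hedged toy version, assuming the paper's rule: keep a sampled API call only if conditioning on the call *and its result* lowers the LM loss on the following tokens by at least a margin `tau_f` compared to the better of (no call at all, call without result). The losses below are toy numbers, not real LM losses.

```python
# Toy sketch of Toolformer's filtering criterion: an API call is useful only
# if its result reduces the loss on subsequent tokens by at least tau_f
# relative to the best of the two baselines.

def keep_api_call(loss_with_result, loss_with_call_only, loss_without_call,
                  tau_f=1.0):
    baseline = min(loss_with_call_only, loss_without_call)
    return baseline - loss_with_result >= tau_f

kept = keep_api_call(2.0, 3.5, 4.0)      # 3.5 - 2.0 = 1.5 >= 1.0 -> keep
dropped = keep_api_call(3.2, 3.5, 4.0)   # 3.5 - 3.2 = 0.3 <  1.0 -> drop
print(kept, dropped)
```

Only calls that pass this check make it into the fine-tuning dataset.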

the last two days I learned: wrote 1/5 of the support for batching and executing API calls in parallel for the Toolformer paper; read 1/7 of a paper on the superposition of artificial neurons; context distillation in AI alignment; and some basics of JAX
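
Executing a batch of API calls in parallel can be sketched with a thread pool from the standard library. This is an illustrative sketch, not the actual Toolformer code: `fake_api` is a hypothetical stand-in for a real tool (calculator, search, ...).

```python
from concurrent.futures import ThreadPoolExecutor

# Sketch: run a batch of (I/O-bound) API calls concurrently with a thread
# pool. fake_api is a placeholder for a real tool endpoint.

def fake_api(query):
    return f"result({query})"

def execute_batch(queries, max_workers=4):
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_api, queries))  # results keep input order

results = execute_batch(["1+1", "capital of France"])
print(results)
```

Threads (rather than processes) are the natural choice here because API calls spend their time waiting on the network, not on the CPU.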

the last six days I learned: 1/5 of DreamerV3; 1/5 of editing memory in language models; sandwiching experiments in oversight of models

TIL: implemented 10/10 of Toolformer; read 1.5/5 of DreamerV3 and 1/5 of Flamingo

the last three days I learned: added inference to Toolformer (last time I forgot it); what causes catastrophic forgetting in ANNs; how to quantitatively evaluate transfer learning; basics of GNNs; LangChain
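
The inference loop I added can be sketched as follows. Hedged toy version, assuming Toolformer's scheme: decode normally, but when the model emits the `->` marker inside an API call, pause generation, execute the call, splice the result in, and continue. The "model" below is just a scripted token list and the tokens/markers are illustrative, not the paper's exact vocabulary.

```python
# Toy sketch of Toolformer inference: stream tokens; on the "->" marker,
# execute the pending API call and insert its result before resuming.

def toolformer_decode(tokens, api):
    out = []
    for tok in tokens:
        out.append(tok)
        if tok == "->":                          # pause point: call the tool
            call = out[out.index("[API(") + 1]   # argument follows "[API("
            out.append(api(call))                # splice the API result in
    return out

scripted = ["The", "answer", "is", "[API(", "2*3", ")", "->", "]", "."]
result = toolformer_decode(scripted, api=lambda q: str(eval(q)))  # toy calculator
print(result)
```

The key point is that the API result becomes part of the context the model conditions on for the remaining tokens.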

the last two days I learned: implemented 70% of the Prioritized Level Replay (PLR) paper; some basics of the pfrl and LangChain libraries
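
The heart of PLR is its rank-based replay distribution, which can be sketched briefly. Hedged version, assuming the paper's prioritization: levels are ranked by a learning-potential score (e.g. value-loss magnitude), and level i is sampled with probability proportional to (1 / rank_i)^(1/beta); the scores below are toy numbers.

```python
# Toy sketch of PLR's rank-based replay distribution: higher-scoring levels
# get lower ranks and therefore much more sampling mass; beta controls how
# peaked the distribution is (smaller beta -> more peaked).

def plr_replay_distribution(scores, beta=0.1):
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    weights = [(1.0 / r) ** (1.0 / beta) for r in ranks]
    total = sum(weights)
    return [w / total for w in weights]

probs = plr_replay_distribution([0.9, 0.1, 0.5])
print(probs)  # almost all mass on the highest-scoring level
```

Rank-based (rather than score-proportional) weighting makes the distribution insensitive to the scale of the scores.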

the last five days I learned: more of the REPAIRED paper; the Ray framework

the last three days I learned: how to quantitatively measure the semantic similarity of different goal-conditioning embeddings; how Gato works; the world model in DreamerV3; how to train RL agents using only video; open-ended task systems in XLand; hyperparameter tuning using Ray; and torch_geometric
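
Measuring semantic similarity between goal-conditioning embeddings usually comes down to cosine similarity. A minimal sketch with toy vectors (a real comparison would use the embeddings produced by the goal encoder):

```python
import math

# Cosine similarity: dot product of the vectors divided by the product of
# their L2 norms; 1.0 means identical direction, 0.0 means orthogonal.

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

goal_a = [1.0, 0.0, 1.0]
goal_b = [1.0, 0.0, 0.0]
sim = cosine_similarity(goal_a, goal_b)
print(round(sim, 3))  # 1/sqrt(2) ~= 0.707
```

Because it is norm-invariant, cosine similarity compares the *direction* of two goal embeddings, not their magnitude.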

the last five days I learned: reimplemented 0.5/10 of Toy Models of Superposition and 1/10 of FlashAttention; how DreamerV3 represents the latent state of an observation; and the Ray framework
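
The trick FlashAttention builds on is the online (streaming) softmax, which can be sketched on its own. Hedged 1-D sketch, assuming the standard formulation: process attention scores tile by tile, keep a running max `m` and running normalizer `l`, and rescale the already-accumulated mass whenever a new tile raises the max. This gives exactly the same softmax as the one-shot computation without ever holding all scores at once.

```python
import math

# Online softmax over tiles of scores: numerically stable because every
# exponent is taken relative to the running max; previously accumulated
# terms are rescaled by exp(m_old - m_new) when the max increases.

def online_softmax(score_tiles):
    m, l, weighted = float("-inf"), 0.0, []
    for tile in score_tiles:
        new_m = max(m, max(tile))
        scale = math.exp(m - new_m) if l else 0.0      # rescale old mass
        l = l * scale + sum(math.exp(s - new_m) for s in tile)
        weighted = [w * scale for w in weighted] \
            + [math.exp(s - new_m) for s in tile]
        m = new_m
    return [w / l for w in weighted]

scores = [1.0, 2.0, 3.0, 0.5]
tiled = online_softmax([scores[:2], scores[2:]])       # two tiles
full = [math.exp(s - max(scores)) for s in scores]     # one-shot reference
full = [x / sum(full) for x in full]
print(all(abs(a - b) < 1e-12 for a, b in zip(tiled, full)))
```

FlashAttention applies the same running-max/rescale idea per tile of the attention matrix, fused with the value accumulation, which is what lets it avoid materializing the full N x N score matrix.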