0.WTN
- [ ] Effect of DataLoader workers on training and inference https://zhuanlan.zhihu.com/p/673642279
- [ ] train & val & inference
- [ ] gradient accumulation
- [ ] mixed precision
- [ ] datasets (type, data structure)
- [ ] full fine-tuning, fine-tuning, LoRA, pretraining
- [ ] prompt, hallucination, difficulties
- [ ] LLaMA 1/2, ChatGPT
- [ ] BERT, Transformers
- [ ] DeepSpeed, FastChat, Megatron, ColossalAI ...
- [ ] torchrun, torch.distributed.launch, mp.spawn