Papers: Large Model System
Deep learning training can be parallelized in several common ways:
- data parallelism
- model parallelism
- pipeline parallelism
- tensor parallelism
We tend to categorize existing parallel methods into two types:
- inter-operation: model parallelism (MP), pipeline parallelism (PP)
- intra-operation: data parallelism (DP), tensor parallelism (TP)

There may be approaches beyond these four, but each can still be assigned to one of the two categories.
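
To make the inter-/intra-operation distinction concrete, here is a minimal sketch in plain Python. It is a toy under stated assumptions, not how any real framework implements these methods: "devices" are just loop iterations, and the two-layer model, the weights `W1`/`W2`, the input `X`, and all function names are hypothetical. Intra-operation methods split the work of a single operator (by batch rows for DP, by weight columns for TP), while inter-operation methods place whole operators on different stages (MP), optionally feeding them micro-batches (PP).

```python
# Toy illustration with plain Python lists; no real devices are involved.
# All names and values here are hypothetical, chosen only for illustration.

def matmul(x, w):
    """Multiply a (batch x in) matrix by an (in x out) matrix."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)]
            for row in x]

def relu(x):
    return [[max(0.0, v) for v in row] for row in x]

# Two-layer model: y = relu(x @ W1) @ W2
W1 = [[1.0, -1.0], [0.5, 2.0]]
W2 = [[1.0], [-1.0]]
X = [[1.0, 2.0], [3.0, 4.0], [-1.0, 0.5], [0.0, 1.0]]

def reference(x):
    """Single-device forward pass used as the ground truth."""
    return matmul(relu(matmul(x, W1)), W2)

def data_parallel(x, n_devices=2):
    """Intra-operation (DP): every device holds the full model
    and processes its own slice of the batch."""
    shard = len(x) // n_devices
    out = []
    for d in range(n_devices):
        out.extend(reference(x[d * shard:(d + 1) * shard]))
    return out

def tensor_parallel(x):
    """Intra-operation (TP): each device holds one column of W1 and the
    matching row of W2; the partial outputs are summed (an all-reduce)."""
    partials = []
    for d in range(2):
        w1_col = [[row[d]] for row in W1]  # column d of W1
        w2_row = [W2[d]]                   # row d of W2
        partials.append(matmul(relu(matmul(x, w1_col)), w2_row))
    return [[a[0] + b[0]] for a, b in zip(*partials)]

def stage0(x):
    """Inter-operation: the first layer lives on stage 0."""
    return relu(matmul(x, W1))

def stage1(h):
    """Inter-operation: the second layer lives on stage 1."""
    return matmul(h, W2)

def pipeline_parallel(x, n_microbatches=2):
    """Inter-operation (MP/PP): whole layers sit on different stages,
    and the batch is fed through them as micro-batches."""
    size = len(x) // n_microbatches
    out = []
    for m in range(n_microbatches):
        out.extend(stage1(stage0(x[m * size:(m + 1) * size])))
    return out
```

All three variants reproduce `reference(X)`; they differ only in how the computation is partitioned. In a real system the micro-batches in `pipeline_parallel` would overlap in time across stages, which this sequential sketch does not model.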
# Papers
- [HPCA'23] MPress: Democratizing Billion-Scale Model Training on Multi-GPU Servers via Memory-Saving Inter-Operator Parallelism