
feature matrix

Stella Biderman edited this page Feb 7, 2021 · 6 revisions

| | GPT-NeoX | NVIDIA Megatron | DeepSpeed Megatron |
|---------------------|----------|-----------------|--------------------|
| model parallel | ? | ? | |
| data parallel | y | ? | |
| pipeline parallel | y | ? | |
| other optimizations | ZeRO | ? | |
| benchmarks | | | |
