Lory: Fully Differentiable Mixture-of-Experts for Autoregressive Language Model Pre-training
COLM 2024
Zexuan Zhong
zzhong@princeton.edu
I work at xAI, where I have contributed to Grok 2, 3, and 4.
I completed my Ph.D. at Princeton University in 2024, advised by Prof. Danqi Chen. I received an M.S. from UIUC and a B.S. from Peking University.
NAACL 2024
ACL 2023 (Tutorial)