cjyuResearch

Routing, Differentiability, and Pareto in MoE

Published on: 3-31-2025

Difficulties with routing in the SMoE paradigm

Under construction. I pinky promise I'll come back and write this article after I finish my paper.