Modular Manifolds: Co-designing Stability for Large-Scale AI
We explore Jeremy Bernstein's manifold-based approach to AI training stability. Constraining a weight matrix to lie on a Stiefel manifold fixes all of its singular values at one, so the layer behaves like a rotation, neither stretching nor shrinking its inputs, which makes training more predictable. Extending this to modular manifolds, we treat each block as its own manifold with its own norm and compose the blocks so that the constraints stack cleanly, enabling automatic learning-rate budgeting across transformers. Along the way we compare the approach to standard tricks like layer norm and gradient clipping, and discuss the non-Riemannian geometry that may unlock new paths to reliable, scalable AI training.
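For listeners who want to see the Stiefel constraint concretely, here is a minimal sketch, assuming NumPy. It uses a generic SVD-based projection chosen for illustration, which is not necessarily the update rule described in the article, and the helper name `project_to_stiefel` is hypothetical. It shows that a matrix with orthonormal columns has all singular values equal to one and preserves the length of any input vector.

```python
import numpy as np


def project_to_stiefel(w):
    """Return the closest matrix (in Frobenius norm) with orthonormal columns.

    Hypothetical helper for illustration: it drops the singular values of w
    and keeps only the orthogonal factors, which sets every singular value to one.
    """
    u, _, vt = np.linalg.svd(w, full_matrices=False)
    return u @ vt


rng = np.random.default_rng(0)

# A random tall weight matrix (8 outputs, 4 inputs), not yet on the manifold.
w = rng.normal(size=(8, 4))

# Project onto the Stiefel manifold: q.T @ q is the 4x4 identity.
q = project_to_stiefel(w)

# All singular values of q are one (up to floating-point error).
singular_values = np.linalg.svd(q, compute_uv=False)

# The layer acts like a rotation: it preserves the norm of its input.
x = rng.normal(size=4)
norm_in = np.linalg.norm(x)
norm_out = np.linalg.norm(q @ x)

print("singular values:", np.round(singular_values, 6))
print("input norm:", norm_in, "output norm:", norm_out)
```

The unprojected matrix `w` has singular values spread over a range, so it would stretch some input directions more than others. After projection, the input and output norms agree.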
Citation:
Jeremy Bernstein, "Modular Manifolds",
Thinking Machines Lab: Connectionism, Sep 2025.
Note: This podcast was AI-generated, and sometimes AI can make mistakes. Please double-check any critical information.
Sponsored by Embersilk LLC