Aman Choudhri

2025-07-19

Using basic linear algebra to prove that "orthogonalizing" the gradient gives the optimal loss improvement under a norm constraint.

2025-07-12

Framing self-attention in terms of convex combinations and similarity matrices, aiming to precisely ground common intuitive explanations.