His research focuses on discovering interpretable structure in existing machine learning models and on building new architectures that exhibit such properties by design. His work has been published at top-tier conferences and journals (e.g., NeurIPS, ICLR, TPAMI), where he is also a frequent reviewer, receiving the top reviewer award at NeurIPS in 2024.
Previously, he was an Honorary Associate at the University of Wisconsin–Madison, where he worked on scalable methods for expert specialization in language models.