John Hoffman
Staff Data Scientist, FAIR (Meta)
I'm a staff data scientist at Meta's Fundamental AI Research lab, where I drive evaluation and data strategy for large-scale AI systems. I co-led evaluation for Movie Gen and was a core contributor to SAM Audio, leading all human evaluations. Two of the projects I've worked on — SeamlessM4T and NLLB — were published in Nature. I've also worked on understanding and improving the reliability of our large-scale GPU compute infrastructure.
Before FAIR, I earned a PhD in astrophysics from Princeton, where I developed GPU-accelerated methods for astronomical time-series analysis. My thesis library, cuvarbase, was adopted by NASA's TESS pipeline and enabled a discovery of ultracompact binary stars that was published in Nature. I also worked in ML consulting and ad-tech before joining Meta.
Selected Publications
- SAM Audio Judge: A Reference-Free Audio Separation Evaluation Metric
- SAM Audio: Segment Anything in Audio
- Revisiting Reliability in Large-Scale Machine Learning Research Clusters
- Movie Gen: A Cast of Media Foundation Models
- SeamlessM4T: Massively Multilingual & Multimodal Machine Translation
- No Language Left Behind: Scaling Human-Centered Machine Translation