
Olawale (Wale) Salaudeen

AI Center Fellow in Residence, Schmidt Sciences • Postdoctoral Researcher, MIT and the Broad Institute of MIT and Harvard

olawale [at] mit [dot] edu

On the 2025–26 academic job market, seeking tenure-track positions beginning Fall 2026.
CV • Research Statement

I work on AI for society through the science of valid measurement and prediction of AI capabilities and risks, and through methods that make AI systems reliable under real-world conditions. This work lets us anticipate failures before deployment and maintain reliable behavior under real-world distribution shifts, with translational impact in domains such as healthcare.

Olawale (Wale) Salaudeen is an AI Center Fellow in Residence at Schmidt Sciences, a Postdoctoral Researcher at MIT (Healthy ML Lab, led by Prof. Marzyeh Ghassemi), and a Postdoctoral Scholar at the Eric and Wendy Schmidt Center at the Broad Institute of MIT and Harvard. Before his postdoctoral positions, he earned a Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign, conducting his doctoral research in the Stanford Trustworthy AI Research (STAIR) Lab at Stanford University, advised by Prof. Sanmi Koyejo.

He works on AI for society through the science of valid measurement and prediction of AI capabilities and risks, and through methods that make AI systems reliable under real-world conditions. This work lets us anticipate failures before deployment and maintain reliable behavior under real-world distribution shifts, with translational impact in domains such as healthcare.

He has received a Sloan Scholarship, a Beckman Graduate Research Fellowship, a GEM Associate Fellowship, and an NSF Miniature Brain Machinery Traineeship. He was recognized with a Best Paper Award at the NeurIPS 2025 Workshop on Evaluating the Evolving LLM Lifecycle, and was named a Top Area Chair at NeurIPS 2025. He has interned at Sandia National Laboratories (w/ Dr. Eric Goodman), Google Brain (w/ Dr. Alex D’Amour), Cruise LLC, and the Max Planck Institute for Intelligent Systems (w/ Dr. Moritz Hardt).

He received a Bachelor of Science in Mechanical Engineering with minors in Computer Science and Mathematics from Texas A&M University.

Olawale (Wale) Salaudeen is an AI Center Fellow in Residence at Schmidt Sciences, a Postdoctoral Researcher at MIT in the Healthy ML Lab led by Prof. Marzyeh Ghassemi, and a Postdoctoral Scholar at the Eric and Wendy Schmidt Center at the Broad Institute of MIT and Harvard. He works on AI for society through the science of valid measurement and prediction of AI capabilities and risks, and through methods that make AI systems reliable under real-world conditions. This work lets us anticipate failures before deployment and maintain reliable behavior under real-world distribution shifts, with translational impact in domains such as healthcare. He completed his Ph.D. in Computer Science at the University of Illinois at Urbana-Champaign, conducting his doctoral research in the Stanford STAIR Lab, advised by Prof. Sanmi Koyejo. His work has been supported by several honors, including the Sloan Scholarship, the Beckman Graduate Research Fellowship, and the GEM Associate Fellowship. He has interned at Google Brain, Cruise, and the Max Planck Institute for Intelligent Systems, and holds a B.S. in Mechanical Engineering from Texas A&M University.

Olawale (Wale) Salaudeen is an AI Center Fellow in Residence at Schmidt Sciences and a Postdoctoral Researcher at MIT and the Broad Institute of MIT and Harvard. He works on AI for society through the measurement and control of AI capabilities and risks. He earned his Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign, conducting his doctoral research in the Stanford STAIR Lab.

Selected Honors

Schmidt Sciences AI Center Fellowship • Best Paper, NeurIPS 2025 Workshop on Evaluating the Evolving LLM Lifecycle • NeurIPS 2025 Top Area Chair • NYU Tandon Faculty First-Look Fellow • Georgia Tech FOCUS Fellow • Sloan Scholarship • Beckman Graduate Research Fellowship • GEM Associate Fellowship • NSF Miniature Brain Machinery Traineeship • ICML 2022 Top Reviewer (10%)

Selected Research Directions

My research develops the science of understanding, measuring, and improving the reliability of AI systems under real-world change. I work across three interconnected themes:

Valid Measurement and Prediction of AI Capabilities and Risks

AI systems exhibit jagged intelligence: they excel at some tasks yet fail at others that draw on the same underlying human capability. I develop measurements of AI-specific latent traits, capabilities, and risks to enable less jagged, more predictable behavior across real-world settings.

  • AI Construct Lexis
  • Measurement to Meaning: A Validity-Centered Framework for AI Evaluation. In review (preliminary version in the NeurIPS 2025 Workshop on Evaluating the Evolving LLM Lifecycle)
  • Toward an Evaluation Science for Generative AI Systems. The Bridge, NAE 2025
  • ImageNot: A Contrast with ImageNet Preserves Model Rankings. TMLR 2026 (To appear)
  • On Evaluating Methods vs. Evaluating Models. NeurIPS 2025 Workshop on Evaluating the Evolving LLM Lifecycle (Best Paper)

Characterizing and Intervening on Causal and Spurious Mechanisms of AI Behavior

AI systems often rely on spurious correlations, performing well in training environments but failing when conditions shift. My work identifies and disentangles causal mechanisms from spurious ones so that models rely on stable, causal patterns instead.

  • Aggregation Hides OOD Generalization Failures from Spurious Correlations. NeurIPS 2025 (Spotlight)
  • Are Domain Generalization Benchmarks with Accuracy on the Line Misspecified? TMLR 2025 with J2C certification (ICLR 2026 Journal Track)
  • Causally Inspired Regularization Enables Domain General Representations. AISTATS 2024
  • On Domain Generalization Datasets as Proxy Benchmarks for Causal Representation Learning. NeurIPS 2024 CRL Workshop (Oral)

Inference-Time Adaptation of AI Systems

AI behaviors often become unreliable when deployment conditions change. I develop methods that use proxy variables and concept-based representations to adapt model behavior at inference time, without retraining.

  • Adapting to Latent Subgroup Shifts via Concepts and Proxies. AISTATS 2023
  • Proxy Methods for Domain Adaptation. AISTATS 2024
  • Improving Single-round Active Adaptation: A Prediction Variability Perspective. TMLR 2025

See all publications →

Selected Recent News

Fall 2025. paper Our NeurIPS spotlight paper, Aggregation Hides OOD Generalization Failures from Spurious Correlations, was featured in MIT News.
Fall 2025. paper Our work On Evaluating Methods vs. Evaluating Models received a best paper award at the NeurIPS Workshop on Evaluating the Evolving LLM Lifecycle.
Fall 2025. paper Our policy brief on validating claims about AI is now available.
Fall 2025. paper Three papers accepted to NeurIPS 2025 (main track), including one spotlight: (i) Aggregation Hides OOD Generalization Failures from Spurious Correlations (Spotlight), (ii) On Group Sufficiency Under Label Bias, and (iii) Understanding challenges to the interpretation of disaggregated evaluations of algorithmic fairness.
Fall 2025. paper Two papers accepted at the NeurIPS Workshop on Evaluating the Evolving LLM Lifecycle, including one oral: (i) On Evaluating Methods vs. Evaluating Models (Oral) and (ii) Measurement to Meaning: A Validity-Centered Framework for AI Evaluation.
Fall 2025. service I am co-organizing the workshop on The Science of Benchmarking and Evaluating AI at EurIPS 2025 in Copenhagen.
Fall 2025. paper Our paper Improving Single-round Active Adaptation: A Prediction Variability Perspective is accepted at TMLR.
Summer 2025. paper Our paper on the limitations of domain generalization benchmarks — Are Domain Generalization Benchmarks with Accuracy on the Line Misspecified? — is accepted at TMLR with J2C certification (ICLR 2026 Journal Track).
Summer 2025. paper Our preprint Stop Evaluating AI with Human Tests, Develop Principled, AI-specific Tests instead is now available on arXiv.
Summer 2025. paper Our preprint Understanding challenges to the interpretation of disaggregated evaluations of algorithmic fairness is now available on arXiv.
Summer 2025. service I am serving as a program chair for the Machine Learning for Health (ML4H) conference in San Diego.
Summer 2025. position I will spend the next year at Schmidt Sciences in NYC as a Visiting AI Scientist and AI Center Fellow in Residence.
Spring 2025. position I joined the Eric and Wendy Schmidt Center at the Broad Institute of MIT and Harvard as a postdoctoral scholar.
Spring 2025. paper Our paper Toward an Evaluation Science for Generative AI Systems appeared in The Bridge (National Academy of Engineering).
Spring 2025. talk I gave a talk on addressing distribution shifts at the MIT LIDS Postdoc NEXUS meeting.
Older News
Spring 2025. paper Our preprint on domain generalization benchmarks — Are Domain Generalization Benchmarks with Accuracy on the Line Misspecified? — is now available on arXiv.
Winter 2025. service I am co-organizing the AI for Society seminar at MIT.
Winter 2025. paper Our paper What's in a Query: Polarity-Aware Distribution-Based Fair Ranking will appear at WWW 2025.
Winter 2025. honors I was selected as an NYU Tandon Faculty First-Look Fellow.
Winter 2025. service I am co-organizing the 30th Annual Sanjoy K. Mitter LIDS Student Conference at MIT.
Winter 2025. honors I was selected as a Georgia Tech FOCUS Fellow.
Fall 2024. paper Our paper On Domain Generalization Datasets as Proxy Benchmarks for Causal Representation Learning will appear at the NeurIPS 2024 Workshop on Causal Representation Learning as an oral presentation.
Fall 2024. position I joined the Healthy ML Lab, led by Prof. Marzyeh Ghassemi, at MIT as a postdoctoral associate.
Summer 2024. talk I gave a talk on our work on distribution shift at Texas State's Computer Science seminar.
Summer 2024. talk I gave a talk on our work on distribution shift at UT Austin's Institute for Foundations of Machine Learning (IFML).
Summer 2024. honors I successfully defended my PhD dissertation, "Towards Externally Valid Machine Learning: A Spurious Correlations Perspective."
Spring 2024. talk I gave a talk on AI for critical systems at the MobiliT.AI forum.
Spring 2024. talk I gave a talk at UIUC Machine Learning Seminar on our work on the external validity of ImageNet.
Spring 2024. paper Our preprint ImageNot: A contrast with ImageNet preserves model rankings is now available on arXiv.
Winter 2024. paper Two papers on machine learning under distribution shift will appear at AISTATS 2024.
Fall 2023. position I joined the Social Foundations of Computation department at the Max Planck Institute for Intelligent Systems as a Research Intern working with Dr. Moritz Hardt.
Spring 2023. honors I passed my PhD Preliminary Exam.
Spring 2023. position I will join Cruise LLC's Autonomous Vehicles Behaviors team as a Machine Learning Intern.
Fall 2022. position I moved to Stanford University as a visiting student with Prof. Sanmi Koyejo.
Summer 2022. honors I was honored to be selected as a top reviewer (top 10%) at ICML 2022.
Summer 2022. position I joined Google Brain (now Google DeepMind) in Cambridge, MA, as a Research Intern.
Fall 2021. paper Our paper Exploiting Causal Chains for Domain Generalization was accepted at the NeurIPS 2021 Workshop on Distribution Shifts.
Fall 2021. honors Selected as a Miniature Brain Machinery (MBM) NSF Research Trainee.
Summer 2021. honors I was selected to receive an Illinois GEM Associate Fellowship.
Spring 2021. honors I passed my Ph.D. qualifying exam.
Spring 2020. honors I was selected to receive a 2020 Beckman Institute Graduate Fellowship.

Fun Facts

I was born in Nigeria and moved to Dallas, Texas at a young age. I played basketball and water polo in high school. I'm a loyal (if perpetually disappointed) Dallas sports fan, a cinephile, a social dancer (mostly Latin and swing), and a regular at live standup comedy. I'm full of hot takes; I wouldn't die on most of them, but I'm always eager to share and defend them for fun. I also once won an intramural cornhole championship.

© 2025 Olawale Salaudeen · Built with Quarto