Chulhee Yun

Ewon Assistant Professor

Kim Jaechul Graduate School of AI

KAIST

My name is Chulhee (I go by Charlie), and I am an Ewon Assistant Professor at KAIST Kim Jaechul Graduate School of AI (KAIST AI). I direct the Optimization & Machine Learning (OptiML) Laboratory at KAIST AI.

I finished my PhD from the Laboratory for Information and Decision Systems at Massachusetts Institute of Technology, where I was fortunate to study under the joint supervision of Prof. Suvrit Sra and Prof. Ali Jadbabaie. Before MIT, I was a master’s student in Electrical Engineering at Stanford University, where I worked with Prof. John Duchi. I finished my undergraduate program in Electrical Engineering at KAIST.

Email: {firstname}.{lastname}@kaist.ac.kr
Phone: +82-2-958-3919
Office: KAIST Seoul Campus Building #9, 9401

Job opening: As a member of the LLM Innovation Research Center, I am seeking an InnoCORE Fellow (eligible within five years of earning a PhD) to work on both theoretical and practical aspects of large language models (LLMs), with a particular focus on length generalization and reasoning capabilities. Please email me your CV if you’re interested or would like further information.

For prospective students: If you are curious about what kinds of research I do, please see this interview article (in Korean). I look for self-motivated graduate students with strong mathematical backgrounds. If you are an undergraduate student interested in interning at our lab, consider applying for summer/winter KAIST AI Research Internship (KAIRI) programs.

Interests

Deep Learning Theory
Optimization
Machine Learning Theory

Education

PhD in Elec. Eng. & Comp. Sci., 2016–2021

Massachusetts Institute of Technology
MSc in Electrical Engineering, 2014–2016

Stanford University
BSc in Electrical Engineering, 2007–2014

KAIST

News

[Jun 2025] Four papers got accepted to the 3rd Workshop on High-dimensional Learning Dynamics (HiLD) at ICML 2025.

[May 2025] Four papers got accepted to ICML 2025. Congrats to my co-authors!

[Feb 2025] I was awarded the Ewon Assistant Professorship at KAIST, a selective junior faculty award that provides a distinguished title and a three-year research grant to outstanding early-career professors.

[Jan 2025] Four papers got accepted to ICLR 2025. Congrats and thanks to all my co-authors!

[Jan 2025] I had the pleasure of attending IMS Young Mathematical Scientists Forum — Applied Mathematics, hosted by the National University of Singapore, as an invited speaker.

See all

Publications

Implicit Bias and Loss of Plasticity in Matrix Completion: Depth Promotes Low-Rank Solutions
Baekrok Shin, Chulhee Yun
ICML 2025 Workshop on High-dimensional Learning Dynamics (HiLD)

The Cost of Robustness: Tighter Bounds on Parameter Complexity for Robust Memorization in ReLU Nets
Yujun Kim, Chaewon Moon, Chulhee Yun
ICML 2025 Workshop on High-dimensional Learning Dynamics (HiLD)

From Linear to Nonlinear: Provable Weak-to-Strong Generalization through Feature Learning
Junsoo Oh, Jerry Song, Chulhee Yun
ICML 2025 Workshop on High-dimensional Learning Dynamics (HiLD)

Through the River: Understanding the Benefit of Schedule-Free Methods for Language Model Training
Minhak Song, Beomhan Baek, Kwangjun Ahn, Chulhee Yun
ICML 2025 Workshop on High-dimensional Learning Dynamics (HiLD)

Provable Benefit of Random Permutations over Uniform Sampling in Stochastic Coordinate Descent
Donghwa Kim, Jaewook Lee, Chulhee Yun
ICML 2025

Understanding Sharpness Dynamics in NN Training with a Minimalist Example: The Effects of Dataset Difficulty, Depth, Stochasticity, and More
Geonhui Yoo, Minhak Song, Chulhee Yun
ICML 2025

Incremental Gradient Descent with Small Epoch Counts is Surprisingly Slow on Ill-Conditioned Problems
Yujun Kim, Jaeyoung Cha, Chulhee Yun
ICML 2025

Lightweight Dataset Pruning without Full Training via Example Difficulty and Prediction Uncertainty
Yeseul Cho, Baekrok Shin, Changmin Kang, Chulhee Yun
ICML 2025
ICLR 2025 Workshop on Navigating and Addressing Data Problems for Foundation Models

Convergence and Implicit Bias of Gradient Descent on Continual Linear Classification
Hyunji Jung, Hanseul Cho, Chulhee Yun
ICLR 2025
Best Paper Award at KAIA Fall Conference 2024 (JKAIA 2024)

Parameter Expanded Stochastic Gradient Markov Chain Monte Carlo
Hyunsu Kim, Giung Nam, Chulhee Yun, Hongseok Yang, Juho Lee
ICLR 2025

Arithmetic Transformers Can Length-Generalize in Both Operand Length and Count
Hanseul Cho, Jaeyoung Cha, Srinadh Bhojanapalli, Chulhee Yun
ICLR 2025

Does SGD really happen in tiny subspaces?
Minhak Song, Kwangjun Ahn, Chulhee Yun
ICLR 2025
ICML 2024 Workshop on High-dimensional Learning Dynamics 2024: The Emergence of Structure and Reasoning

Stochastic Extragradient with Flip-Flop Shuffling & Anchoring: Provable Improvements
Jiseok Chae, Chulhee Yun, Donghwan Kim
NeurIPS 2024

DASH: Warm-Starting Neural Network Training in Stationary Settings without Loss of Plasticity
Baekrok Shin, Junsoo Oh, Hanseul Cho, Chulhee Yun
NeurIPS 2024
ICML 2024 Workshop on Advancing Neural Network Training: Computational Efficiency, Scalability, and Resource Optimization

Provable Benefit of Cutout and CutMix for Feature Learning
Junsoo Oh, Chulhee Yun
NeurIPS 2024 (Spotlight)
ICML 2024 Workshop on High-dimensional Learning Dynamics 2024: The Emergence of Structure and Reasoning
KT Best Paper Award at KAIA Conference 2024 (CKAIA 2024)

Position Coupling: Improving Length Generalization of Arithmetic Transformers Using Task Structure
Hanseul Cho, Jaeyoung Cha, Pranjal Awasthi, Srinadh Bhojanapalli, Anupam Gupta, Chulhee Yun
NeurIPS 2024
ICML 2024 Workshop on Long-Context Foundation Models

Gradient Descent with Polyak's Momentum Finds Flatter Minima via Large Catapults
Prin Phunyaphibarn, Junghyun Lee, Bohan Wang, Huishuai Zhang, Chulhee Yun
ICML 2024 Workshop on High-dimensional Learning Dynamics 2024: The Emergence of Structure and Reasoning
NeurIPS 2023 Workshop on Mathematics of Modern Machine Learning (Oral)

Fundamental Benefit of Alternating Updates in Minimax Optimization
Jaewook Lee, Hanseul Cho, Chulhee Yun
ICML 2024 (Spotlight)
ICLR 2024 Workshop on Bridging the Gap Between Practice and Theory in Deep Learning

Linear attention is (maybe) all you need (to understand transformer optimization)
Kwangjun Ahn, Xiang Cheng, Minhak Song, Chulhee Yun, Ali Jadbabaie, Suvrit Sra
ICLR 2024
NeurIPS 2023 Workshop on Mathematics of Modern Machine Learning (Oral)

Fair Streaming Principal Component Analysis: Statistical and Algorithmic Viewpoint
Junghyun Lee, Hanseul Cho, Se-Young Yun, Chulhee Yun
NeurIPS 2023

Trajectory Alignment: Understanding the Edge of Stability Phenomenon via Bifurcation Theory
Minhak Song, Chulhee Yun
NeurIPS 2023

PLASTIC: Improving Input and Label Plasticity for Sample Efficient Reinforcement Learning
Hojoon Lee, Hanseul Cho, Hyunseung Kim, Daehoon Gwak, Joonkee Kim, Jaegul Choo, Se-Young Yun, Chulhee Yun
NeurIPS 2023

Practical Sharpness-Aware Minimization Cannot Converge All the Way to Optima
Dongkuk Si, Chulhee Yun
NeurIPS 2023 (Spotlight)
Outstanding Paper Award at KAIA Conference 2023 (CKAIA 2023)

Provable Benefit of Mixup for Finding Optimal Decision Boundaries
Junsoo Oh, Chulhee Yun
ICML 2023

Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond
Jaeyoung Cha, Jaewook Lee, Chulhee Yun
ICML 2023 (Oral)

On the Training Instability of Shuffling SGD with Batch Normalization
David X. Wu, Chulhee Yun, Suvrit Sra
ICML 2023

SGDA with shuffling: faster convergence for nonconvex-PŁ minimax optimization
Hanseul Cho, Chulhee Yun
ICLR 2023
NAVER Outstanding Theory Paper Award at KAIA-NAVER Joint Conference 2022 (JKAIA 2022)

Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and Beyond
Chulhee Yun, Shashank Rajput, Suvrit Sra
ICLR 2022 (Oral)

Open Problem: Can Single-Shuffle SGD be Better than Reshuffling SGD and GD?
Chulhee Yun, Suvrit Sra, Ali Jadbabaie
COLT 2021

Provable Memorization via Deep Neural Networks using Sub-linear Parameters
Sejun Park, Jaeho Lee, Chulhee Yun, Jinwoo Shin
COLT 2021
Presented as part of a contributed talk at DeepMath 2020

A Unifying View on Implicit Bias in Training Linear Neural Networks
Chulhee Yun, Shankar Krishnan, Hossein Mobahi
ICLR 2021
NeurIPS 2020 Workshop on Optimization for Machine Learning (OPT 2020)

Minimum Width for Universal Approximation
Sejun Park, Chulhee Yun, Jaeho Lee, Jinwoo Shin
ICLR 2021 (Spotlight)
Presented as part of a contributed talk at DeepMath 2020

SGD with shuffling: optimal rates without component convexity and large epoch requirements
Kwangjun Ahn, Chulhee Yun, Suvrit Sra
NeurIPS 2020 (Spotlight)

$O(n)$ Connections are Expressive Enough: Universal Approximability of Sparse Transformers
Chulhee Yun, Yin-Wen Chang, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar
NeurIPS 2020

Low-Rank Bottleneck in Multi-head Attention Models
Srinadh Bhojanapalli, Chulhee Yun, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar
ICML 2020

Are Transformers universal approximators of sequence-to-sequence functions?
Chulhee Yun, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar
ICLR 2020
NeurIPS 2019 Workshop on Machine Learning with Guarantees
Honorable Mention at NYAS Machine Learning Symposium 2020 Poster Awards

Are deep ResNets provably better than linear predictors?
Chulhee Yun, Suvrit Sra, Ali Jadbabaie
NeurIPS 2019

Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity
Chulhee Yun, Suvrit Sra, Ali Jadbabaie
NeurIPS 2019 (Spotlight)

Efficiently testing local optimality and escaping saddles for ReLU networks
Chulhee Yun, Suvrit Sra, Ali Jadbabaie
ICLR 2019

Small nonlinearities in activation functions create bad local minima in neural networks
Chulhee Yun, Suvrit Sra, Ali Jadbabaie
ICLR 2019

Minimax Bounds on Stochastic Batched Convex Optimization
John Duchi, Feng Ruan, Chulhee Yun
COLT 2018

Global optimality conditions for deep neural networks
Chulhee Yun, Suvrit Sra, Ali Jadbabaie
ICLR 2018
NIPS 2017 Workshop on Deep Learning: Bridging Theory and Practice

Face detection using Local Hybrid Patterns
Chulhee Yun, Donghoon Lee, Chang D. Yoo
ICASSP 2015

Teaching

AI.70900 Advanced Deep Learning Theory (2024S, 2025S)

AI.61600 Deep Learning Theory (2022S/F, 2023S/F, 2024F)

Research Group

I direct the Optimization & Machine Learning (OptiML) Laboratory at KAIST. I ambitiously pronounce it as the “Optimal Lab”—although my students may disagree!

PhD Students (all students are in KAIST AI)

Jaeyoung Cha
Hanseul Cho
Junghyun Lee (co-advised with Prof. Se-Young Yun)
Junsoo Oh

Master’s Students (all students are in KAIST AI)

Yeseul Cho
Changmin Kang
Donghwa Kim
Yujun Kim (transferring to Integrated MS/PhD starting Fall 2025)
Chaewon Moon
Baekrok Shin (advancing to PhD starting Fall 2025)
Geonhui Yoo

Undergraduate Students/KAIRI Interns

Beomhan Baek (SNU Math/Business)
Jihwan Kim (SNU Math/CS)
Jueun Kim (Yonsei CS/Math)
Myeongkyu Lee (KAIST EE/Math)
Minseop Shin (KAIST Math)
Jerry Song (KAIST EE)
Minhak Song (KAIST Math)

Former Graduate Students/Notable Former Interns

Jaewook Lee (MSc 2025)
Dongkuk Si (MSc 2024)
Hyunji Jung (2023.06–2024.09)
Prin Phunyaphibarn (2023.01–2024.02)

Service

Conference Area Chair

NeurIPS 2023–2025 (Selected as a Notable AC for NeurIPS 2023)
ICLR 2025

Conference/Workshop Reviewer

ICML 2019–2025 (Selected as a Top Reviewer at ICML 2025)
ICLR 2019–2024
COLT 2020–2024
NeurIPS 2018–2020, 2022
AISTATS 2019
CDC 2018
ICML 2025 Workshop on High-dimensional Learning Dynamics
ICLR 2025 Workshop on Will Synthetic Data Finally Solve the Data Access Problem?
ICLR 2024 Workshop on Privacy Regulation and Protection in Machine Learning

Journal Reviewer

Journal of Machine Learning Research
SIAM Journal on Mathematics of Data Science
Annals of Statistics
IEEE Transactions on Neural Networks and Learning Systems
IEEE Transactions on Information Theory
Mathematical Programming
Neural Networks
Stochastic Systems