Chulhee Yun

Assistant Professor

Kim Jaechul Graduate School of AI

KAIST

My name is Chulhee (I go by Charlie), and I am an assistant professor at the KAIST Kim Jaechul Graduate School of AI (KAIST AI). I direct the Optimization & Machine Learning (OptiML) Laboratory at KAIST AI.

I completed my PhD at the Laboratory for Information and Decision Systems at the Massachusetts Institute of Technology, where I was fortunate to study under the joint supervision of Prof. Suvrit Sra and Prof. Ali Jadbabaie. Before MIT, I was a master’s student in Electrical Engineering at Stanford University, where I worked with Prof. John Duchi. I completed my undergraduate studies in Electrical Engineering at KAIST.

Email: {firstname}.{lastname}@kaist.ac.kr
Phone: +82-2-958-3919
Office: KAIST Seoul Campus Building #9, 9401


For prospective students: If you are curious about what kinds of research I do, please see this interview article (in Korean). I look for self-motivated graduate students with strong mathematical backgrounds. If you are an undergraduate student interested in interning at our lab, consider applying for the summer/winter KAIST AI Research Internship (KAIRI) programs.


Interests
  • Deep Learning Theory
  • Optimization
  • Machine Learning Theory
Education
  • PhD in Elec. Eng. & Comp. Sci., 2016–2021

    Massachusetts Institute of Technology

  • MSc in Electrical Engineering, 2014–2016

    Stanford University

  • BSc in Electrical Engineering, 2007–2014

    KAIST

News

[Sep 2024] Four papers got accepted to NeurIPS 2024. Among them, a paper on the theoretical analysis of Cutout/CutMix data augmentation was selected for a spotlight presentation!
[Aug 2024] I will be serving as an Area Chair for ICLR 2025!
[Jun 2024] One paper got accepted to the ICML 2024 Workshop on Long-Context Foundation Models, and another to the ICML 2024 Workshop on Advancing Neural Network Training: Computational Efficiency, Scalability, and Resource Optimization.
[Jun 2024] Three papers got accepted to the ICML 2024 Workshop on High-dimensional Learning Dynamics 2024: The Emergence of Structure and Reasoning.
[May 2024] Our paper on the benefit of alternating updates in minimax optimization got accepted to ICML 2024! (update: the paper was selected as a spotlight paper!)

Publications

Arithmetic Transformers Can Length-Generalize in Both Operand Length and Count  arXiv
arXiv preprint
DASH: Warm-Starting Neural Network Training in Stationary Settings without Loss of Plasticity  Paper arXiv
NeurIPS 2024
ICML 2024 Workshop on Advancing Neural Network Training: Computational Efficiency, Scalability, and Resource Optimization
Provable Benefit of Cutout and CutMix for Feature Learning  Paper arXiv
NeurIPS 2024 (Spotlight)
ICML 2024 Workshop on High-dimensional Learning Dynamics 2024: The Emergence of Structure and Reasoning
KT Best Paper Award at KAIA Conference 2024 (CKAIA 2024)
Position Coupling: Improving Length Generalization of Arithmetic Transformers Using Task Structure  Paper arXiv
NeurIPS 2024
ICML 2024 Workshop on Long-Context Foundation Models
Does SGD really happen in tiny subspaces?  arXiv
ICML 2024 Workshop on High-dimensional Learning Dynamics 2024: The Emergence of Structure and Reasoning
Gradient Descent with Polyak's Momentum Finds Flatter Minima via Large Catapults  arXiv
ICML 2024 Workshop on High-dimensional Learning Dynamics 2024: The Emergence of Structure and Reasoning
NeurIPS 2023 Workshop on Mathematics of Modern Machine Learning (Oral)
Fundamental Benefit of Alternating Updates in Minimax Optimization  Paper arXiv
ICML 2024 (Spotlight)
ICLR 2024 Workshop on Bridging the Gap Between Practice and Theory in Deep Learning
Linear attention is (maybe) all you need (to understand transformer optimization)  Paper arXiv
ICLR 2024
NeurIPS 2023 Workshop on Mathematics of Modern Machine Learning (Oral)
Fair Streaming Principal Component Analysis: Statistical and Algorithmic Viewpoint  Paper arXiv
NeurIPS 2023
PLASTIC: Improving Input and Label Plasticity for Sample Efficient Reinforcement Learning  Paper arXiv
NeurIPS 2023
Practical Sharpness-Aware Minimization Cannot Converge All the Way to Optima  Paper arXiv
NeurIPS 2023 (Spotlight)
Outstanding Paper Award at KAIA Conference 2023 (CKAIA 2023)
Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond  Paper arXiv
ICML 2023 (Oral)
SGDA with shuffling: faster convergence for nonconvex-PŁ minimax optimization  Paper arXiv
ICLR 2023
NAVER Outstanding Theory Paper Award at KAIA-NAVER Joint Conference 2022 (JKAIA 2022)
Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and Beyond  Paper arXiv
ICLR 2022 (Oral)
Provable Memorization via Deep Neural Networks using Sub-linear Parameters  Paper arXiv
COLT 2021
Presented as part of a contributed talk at DeepMath 2020
A Unifying View on Implicit Bias in Training Linear Neural Networks  Paper arXiv
ICLR 2021
NeurIPS 2020 Workshop on Optimization for Machine Learning (OPT 2020)
Minimum Width for Universal Approximation  Paper arXiv
ICLR 2021 (Spotlight)
Presented as part of a contributed talk at DeepMath 2020
$O(n)$ Connections are Expressive Enough: Universal Approximability of Sparse Transformers  Paper arXiv
NeurIPS 2020
Low-Rank Bottleneck in Multi-head Attention Models  Paper arXiv
ICML 2020
Are Transformers universal approximators of sequence-to-sequence functions?  Paper arXiv
ICLR 2020
NeurIPS 2019 Workshop on Machine Learning with Guarantees
Honorable Mention at NYAS Machine Learning Symposium 2020 Poster Awards
Are deep ResNets provably better than linear predictors?  Paper arXiv
NeurIPS 2019
Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity  Paper arXiv
NeurIPS 2019 (Spotlight)
Minimax Bounds on Stochastic Batched Convex Optimization  Paper
COLT 2018
Global optimality conditions for deep neural networks  Paper arXiv
ICLR 2018
NIPS 2017 Workshop on Deep Learning: Bridging Theory and Practice
Face detection using Local Hybrid Patterns  Paper
ICASSP 2015

Teaching

AI709 Advanced Deep Learning Theory (2024S)
AI616 Deep Learning Theory (2022S/F, 2023S/F, 2024F)

Research Group

I direct the Optimization & Machine Learning (OptiML) Laboratory at KAIST. I ambitiously pronounce it as the “Optimal Lab”, although my students may disagree!

PhD Students (all students are in KAIST AI)
Master’s Students (all students are in KAIST AI)
  • Yeseul Cho
  • ChangMin Kang
  • Yujun Kim
  • Jaewook Lee
  • Chaewon Moon
  • Baekrok Shin
  • Geonhui Yoo
Undergraduate Students/KAIRI Interns
  • Beomhan Baek (SNU Math/Business)
  • Yonghoon Dong (Yonsei CS/Math)
  • Donghwa Kim (KAIST Math, joining our lab as a master’s student in Spring 2025)
  • Jihwan Kim (SNU Math/CS)
  • Minseop Shin (KAIST Math)
  • Jerry Song (KAIST EE)
  • Minhak Song (KAIST ISysE/Math)
Former Graduate Students/Notable Former Interns
  • Dongkuk Si (MSc 2024)
  • Prin Phunyaphibarn (2023.01–2024.02)
  • Hyunji Jung (2023.06–2024.09)

Service

Conference Area Chair
  • ICLR 2025
  • NeurIPS 2023–2024 (Selected as a Notable AC for NeurIPS 2023)
Conference/Workshop Reviewer
  • ICLR 2019–2024
  • ICML 2019–2025
  • COLT 2020–2024
  • NeurIPS 2018–2020, 2022
  • AISTATS 2019
  • CDC 2018
  • ICLR 2024 Privacy Regulation and Protection in Machine Learning Workshop
Journal Reviewer
  • Journal of Machine Learning Research
  • SIAM Journal on Mathematics of Data Science
  • Annals of Statistics
  • IEEE Transactions on Neural Networks and Learning Systems
  • IEEE Transactions on Information Theory
  • Mathematical Programming
  • Neural Networks