Provable Benefit of Cutout and CutMix for Feature Learning

Junsoo Oh, Chulhee Yun

Position Coupling: Improving Length Generalization of Arithmetic Transformers Using Task Structure

Hanseul Cho, Jaeyoung Cha, Pranjal Awasthi, Srinadh Bhojanapalli, Anupam Gupta, Chulhee Yun

Fundamental Benefit of Alternating Updates in Minimax Optimization

Jaewook Lee, Hanseul Cho, Chulhee Yun

Linear attention is (maybe) all you need (to understand transformer optimization)

Kwangjun Ahn, Xiang Cheng, Minhak Song, Chulhee Yun, Ali Jadbabaie, Suvrit Sra

Fair Streaming Principal Component Analysis: Statistical and Algorithmic Viewpoint

Junghyun Lee, Hanseul Cho, Se-Young Yun, Chulhee Yun

Trajectory Alignment: Understanding the Edge of Stability Phenomenon via Bifurcation Theory

Minhak Song, Chulhee Yun

PLASTIC: Improving Input and Label Plasticity for Sample Efficient Reinforcement Learning

Hojoon Lee, Hanseul Cho, Hyunseung Kim, Daehoon Gwak, Joonkee Kim, Jaegul Choo, Se-Young Yun, Chulhee Yun

Practical Sharpness-Aware Minimization Cannot Converge All the Way to Optima

Dongkuk Si, Chulhee Yun

Provable Benefit of Mixup for Finding Optimal Decision Boundaries

Junsoo Oh, Chulhee Yun

Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond

Jaeyoung Cha, Jaewook Lee, Chulhee Yun