Chulhee “Charlie” Yun
Convergence and Implicit Bias of Gradient Descent on Continual Linear Classification
Hyunji Jung, Hanseul Cho, Chulhee Yun
Parameter Expanded Stochastic Gradient Markov Chain Monte Carlo
Hyunsu Kim, Giung Nam, Chulhee Yun, Hongseok Yang, Juho Lee
Arithmetic Transformers Can Length-Generalize in Both Operand Length and Count
Hanseul Cho, Jaeyoung Cha, Srinadh Bhojanapalli, Chulhee Yun
Does SGD really happen in tiny subspaces?
Minhak Song, Kwangjun Ahn, Chulhee Yun
Stochastic Extragradient with Flip-Flop Shuffling & Anchoring: Provable Improvements
Jiseok Chae, Chulhee Yun, Donghwan Kim
DASH: Warm-Starting Neural Network Training in Stationary Settings without Loss of Plasticity
Baekrok Shin, Junsoo Oh, Hanseul Cho, Chulhee Yun
Provable Benefit of Cutout and CutMix for Feature Learning
Junsoo Oh, Chulhee Yun
Position Coupling: Improving Length Generalization of Arithmetic Transformers Using Task Structure
Hanseul Cho, Jaeyoung Cha, Pranjal Awasthi, Srinadh Bhojanapalli, Anupam Gupta, Chulhee Yun
Fundamental Benefit of Alternating Updates in Minimax Optimization
Jaewook Lee, Hanseul Cho, Chulhee Yun
Linear attention is (maybe) all you need (to understand transformer optimization)
Kwangjun Ahn, Xiang Cheng, Minhak Song, Chulhee Yun, Ali Jadbabaie, Suvrit Sra