• <u id="saeeq"><wbr id="saeeq"></wbr></u>
  • <s id="saeeq"><div id="saeeq"></div></s>
  • <u id="saeeq"></u>
  • <u id="saeeq"><noscript id="saeeq"></noscript></u>
  • <s id="saeeq"></s>
Optimization, Generalization and Implicit bias of Gradient Methods in Deep Learning

Published: 2022-10-11

Title: Optimization, Generalization and Implicit bias of Gradient Methods in Deep Learning

Speaker: Jian Li (Institute for Interdisciplinary Information Sciences, Tsinghua University)

Time: October 14 (Friday), 10:00-11:30 AM

Venue: Offline: Lecture Hall 334, 3rd floor, Building 5; Online: Tencent Meeting 208849802

Abstract: Deep learning has enjoyed huge empirical success in recent years. Although training a deep neural network is a highly nonconvex optimization problem, simple (stochastic) gradient methods are able to produce good solutions that minimize the training error and, more surprisingly, can generalize well to out-of-sample data, even when the number of parameters is significantly larger than the amount of training data. It is known that the optimization algorithms (various gradient-based methods) contribute greatly to the generalization properties of deep learning. Recently, however, researchers have found that gradient methods (even gradient descent) may not converge to a stationary point: the loss gradually decreases, though not necessarily monotonically, and the sharpness of the loss landscape (i.e., the maximum eigenvalue of the Hessian) may oscillate, entering a regime called the edge of stability. These behaviors are inconsistent with several classical presumptions widely studied in the field of optimization. Moreover, what bias is introduced by gradient-based algorithms in neural network training? What characteristics of the training process ensure good generalization in deep learning? In this talk, we investigate these questions from the perspective of gradient-based optimization methods. In particular, we attempt to explain some behaviors of the optimization trajectory (e.g., the edge of stability), prove new generalization bounds, and investigate the implicit bias of various gradient methods.
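The edge-of-stability regime mentioned in the abstract can be illustrated with a short numerical experiment. Below is a minimal sketch, not code from the talk; the toy tanh network and data are hypothetical. It runs full-batch gradient descent in JAX and tracks sharpness, the top Hessian eigenvalue of the training loss, estimated by power iteration on Hessian-vector products. Classical descent-lemma reasoning predicts that gradient descent with step size η is stable only while sharpness stays below 2/η; the edge-of-stability observation is that sharpness instead rises toward and hovers near this threshold while the loss continues to decrease, though non-monotonically.

```python
# Minimal sketch (hypothetical toy problem, not the speaker's code):
# track sharpness = top Hessian eigenvalue during full-batch gradient descent.
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
X = jax.random.normal(key, (32, 4))   # toy inputs
y = jnp.sin(X @ jnp.ones(4))          # toy regression targets

def unpack(w):
    # Flat parameter vector -> (4x10 hidden-layer weights, 10-dim output weights).
    return w[:40].reshape(4, 10), w[40:]

def loss(w):
    W1, W2 = unpack(w)
    return jnp.mean((jnp.tanh(X @ W1) @ W2 - y) ** 2)

grad_fn = jax.grad(loss)

def hvp(w, v):
    # Hessian-vector product via forward-over-reverse autodiff.
    return jax.jvp(grad_fn, (w,), (v,))[1]

def sharpness(w, n_iter=50):
    # Power iteration for the top Hessian eigenvalue (assumes it dominates).
    v = jnp.ones(w.size) / jnp.sqrt(w.size)
    for _ in range(n_iter):
        hv = hvp(w, v)
        v = hv / jnp.linalg.norm(hv)
    return float(v @ hvp(w, v))

eta = 0.02   # step size; the classical stability threshold is 2/eta = 100
w = 0.5 * jax.random.normal(jax.random.PRNGKey(1), (50,))
for step in range(2001):
    w = w - eta * grad_fn(w)
    if step % 400 == 0:
        print(f"step {step:4d}  loss {loss(w):.4f}  "
              f"sharpness {sharpness(w):7.1f}  (2/eta = {2 / eta:.0f})")
```

Whether sharpness actually settles near 2/η depends on the model, data, and step size; the sketch only provides the instrumentation to observe the kind of trajectory the talk analyzes.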

Bio: Jian Li is currently a tenured associate professor and doctoral advisor at the Institute for Interdisciplinary Information Sciences, Tsinghua University. He received his bachelor's degree from Sun Yat-sen University, his master's degree from Fudan University, and his PhD from the University of Maryland. His research interests include algorithm design and analysis, machine learning, databases, and financial technology. He has published more than 100 papers in major international conferences and journals, and has received the Best Paper Awards at VLDB 2009 and ESA 2010, the ICDT 2017 Best Newcomer Award, support from the Tsinghua "221" Basic Research Young Talents Program and the Ministry of Education's Program for New Century Excellent Talents, and the National Natural Science Foundation of China (NSFC) Excellent Young Scientists Fund. He has led and participated in a number of research projects, including the NSFC Young Scientists Fund, an NSFC General Program, a China-Israel international cooperation project, and the Young Scientists 973 Program, as well as industry collaborations with Ant Financial, Huatai Securities, E Fund, Microsoft, Baidu, and Didi.

  • <u id="saeeq"><wbr id="saeeq"></wbr></u>
  • <s id="saeeq"><div id="saeeq"></div></s>
  • <u id="saeeq"></u>
  • <u id="saeeq"><noscript id="saeeq"></noscript></u>
  • <s id="saeeq"></s>
  • 久久久综合香蕉尹人综合网