Overview

Abstract

Deep neural networks have brought transformative changes in areas such as image recognition and healthcare, yet their underlying theoretical principles remain somewhat elusive. This presentation explores the theoretical foundations of deep neural networks, with a special focus on their approximation capabilities. We begin by quantifying the approximation errors of ReLU networks for various target functions, including (Lipschitz) continuous functions, polynomials, and smooth functions. Using the Vapnik-Chervonenkis (VC) dimension, we show that these error bounds are nearly optimal. To achieve better approximation accuracy, we propose several methods, including introducing new activation functions, fixing certain parameters in advance, and sharing parameters. In particular, we have developed a simple and computable activation function, named EUAF, which enables a fixed-size EUAF network to approximate continuous functions with arbitrary accuracy. Additionally, motivated by the widespread use of ReLU, we investigate its connections to other activation functions and extend our approximation results, initially established for ReLU networks, to a broad family of activation functions such as Sigmoid, ELU, and GELU.
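As a rough, hands-on illustration of the setting the abstract describes (and not the constructions or bounds analyzed in the talk), the hedged sketch below fits a small fixed-width, fixed-depth ReLU network to a 1-Lipschitz target on [-1, 1] and reports an empirical sup-norm error on a dense grid. The target function, architecture, optimizer, and training budget are all arbitrary choices made for illustration.

```python
# Minimal illustration (not the talk's construction): fit a small ReLU MLP
# to a Lipschitz target and estimate its uniform approximation error.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Target: f(x) = sin(x), which is 1-Lipschitz on [-1, 1].
def f(x):
    return torch.sin(x)

# A fixed-width, fixed-depth ReLU network; width and depth are arbitrary here.
width, depth = 32, 3
layers = [nn.Linear(1, width), nn.ReLU()]
for _ in range(depth - 1):
    layers += [nn.Linear(width, width), nn.ReLU()]
layers += [nn.Linear(width, 1)]
net = nn.Sequential(*layers)

# Train on uniformly sampled points from [-1, 1].
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(3000):
    x = 2 * torch.rand(256, 1) - 1
    loss = ((net(x) - f(x)) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Empirical sup-norm error on a dense grid, a crude proxy for the L-infinity
# approximation error that the theoretical results bound.
with torch.no_grad():
    grid = torch.linspace(-1, 1, 10001).unsqueeze(1)
    sup_err = (net(grid) - f(grid)).abs().max().item()
print(f"empirical sup-norm error: {sup_err:.4f}")
```

Note that such a trained network only gives an empirical upper estimate for one target function; the results discussed in the talk concern worst-case approximation rates over whole function classes.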

Brief Biography

Shijun Zhang is a Phillip Griffiths Assistant Research Professor at Duke University. He received his Ph.D. from the National University of Singapore in 2021. His primary research interest lies in contributing to a deeper understanding of deep learning. The majority of his current research focuses on the approximation theory of deep neural networks.

Presenters

Dr. Shijun Zhang, Department of Mathematics, Duke University