PI: Daniel Haider
Duration: Jan 2022 - Dec 2024
Funding: DOC Fellowship Program of the Austrian Academy of Sciences (A-26335)
Advisors and Collaborators: Peter Balazs and Martin Ehler (University of Vienna)
Modern deep learning, which is built on training deep artificial neural networks (DNNs), has been evolving steadily towards “end-to-end” designs. This makes the models less and less accessible and interpretable, and as these black-box models rapidly find their way into our everyday lives, this has become a severe issue. To address it, we have to identify rigorous mathematical formulations of the problems and conduct solid analyses, so that we maintain control of what we casually call "artificial intelligence" in the future.
The project aims to contribute to exactly that by exploiting the potential of a frame-theoretic approach to the analysis of neural networks in full generality.
Assuming a general Hilbert space setting, we may consider a layer of a neural network to be the analysis operator of an associated sequence of elements of our space, followed by a piece-wise non-linearity. Although neural networks used in practice are non-linear operators in finite dimensions, we are convinced that, for the purpose of building a solid theory, it is very beneficial to consider infinite-dimensional settings as well. For example, the Hilbert space of infinite sequences of finite energy is closely tied to the theory of signal processing and is therefore also a very appropriate setting for deep learning.
For audio applications, the layers are often convolutional, so that the linear part of such a layer can be interpreted as a filter bank decomposition. Since frames have a long tradition in audio engineering, a frame-theoretic approach is highly suitable in this setting as well.
In a finite-dimensional setting, the frame associated with a network layer is just the collection of row vectors of the matrix representation of the linear transform. Here, finite frame theory provides efficient notation and perspectives that make rigorous stability analyses and the usage of duality concepts possible.
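As an illustration of this finite-dimensional viewpoint, the following minimal sketch treats the rows of a weight matrix as a frame and computes its optimal frame bounds and canonical dual frame. The matrix `W`, its size, and the helper names `frame_bounds` and `canonical_dual` are hypothetical choices made for this example only; they are not taken from the project.

```python
import numpy as np

def frame_bounds(W):
    """Optimal frame bounds of the rows of W.

    They are the smallest and largest eigenvalues of the frame operator S = W^T W.
    """
    eigvals = np.linalg.eigvalsh(W.T @ W)  # returned in ascending order
    return eigvals[0], eigvals[-1]

def canonical_dual(W):
    """Matrix whose rows are the canonical dual frame vectors S^{-1} w_i."""
    S = W.T @ W
    return np.linalg.solve(S, W.T).T

# Hypothetical layer with 8 neurons acting on 3-dimensional inputs (assumption).
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 3))

A, B = frame_bounds(W)          # A > 0 means the rows form a frame for R^3
W_dual = canonical_dual(W)

x = rng.standard_normal(3)
coeffs = W @ x                  # analysis: inner products of x with the rows of W
x_rec = W_dual.T @ coeffs       # synthesis with the canonical dual frame
```

Under these assumptions, the ratio `B / A` is the condition number of the frame operator, and its square root is the condition number of `W` itself.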
In applications, our approach makes particular sense at the early stage of a neural network, where the input data is assumed to be transformed and prepared for the rest of the learning procedure. At this stage, the dimensionality is usually increased as well; in other words, redundancy is present. By exploiting the properties of a redundant frame, we can investigate the effect of several classes of non-linearities on the frame analysis operator, which yields important stability results and reconstruction methods for injective layers of a neural network. A strong emphasis here lies on the ReLU function.
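To make the role of redundancy concrete, here is a simple sketch of reconstructing an input from the output of a ReLU layer. It relies on one elementary observation: for every neuron with strictly positive output, the pre-activation is known exactly, so if the rows belonging to these active neurons still span the input space, the input can be recovered by least squares. This is an illustration under stated assumptions and not necessarily the reconstruction method developed in the project; the matrix `W`, the bias `b`, and the function names are hypothetical.

```python
import numpy as np

def relu_layer(W, b, x):
    """Affine map followed by the ReLU non-linearity."""
    return np.maximum(W @ x + b, 0.0)

def reconstruct_from_relu(W, b, y):
    """Recover x from y = ReLU(W x + b) using only the active neurons (y_i > 0).

    Raises ValueError if the active rows of W do not span the input space.
    """
    active = y > 0
    n = W.shape[1]
    if active.sum() < n or np.linalg.matrix_rank(W[active]) < n:
        raise ValueError("active rows do not span the input space")
    # On the active set the ReLU acts as the identity: <w_i, x> + b_i = y_i.
    x_rec, *_ = np.linalg.lstsq(W[active], y[active] - b[active], rcond=None)
    return x_rec

# Hypothetical redundant layer: the rows of M together with their negatives.
M = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])
W = np.vstack([M, -M])
b = np.full(W.shape[0], 0.1)

x = np.array([0.5, -2.0, 1.0])
y = relu_layer(W, b, x)
x_rec = reconstruct_from_relu(W, b, y)
```

In this example half of the eight neurons are switched off by the ReLU, yet the four remaining rows still span the three-dimensional input space, so the input is recovered.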
More generally, frame theory offers various algorithms for the efficient computation of condition numbers, approximate and exact duals, and related quantities.
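One classical example of such an algorithm is the iterative frame algorithm, which approximates the action of the dual frame without inverting the frame operator. The sketch below is a minimal version, assuming the frame vectors are the rows of a small hypothetical matrix `W` and that the frame bounds are obtained from an eigenvalue computation; the iteration count is an arbitrary choice.

```python
import numpy as np

def frame_algorithm(W, c, num_iters=50):
    """Approximate x from its frame coefficients c = W x.

    Iterates x <- x + 2/(A+B) * W^T (c - W x), where A and B are the frame
    bounds. The error contracts by a factor of at most (B - A)/(B + A) per step.
    """
    eigvals = np.linalg.eigvalsh(W.T @ W)
    A, B = eigvals[0], eigvals[-1]
    relaxation = 2.0 / (A + B)
    x = np.zeros(W.shape[1])
    for _ in range(num_iters):
        x = x + relaxation * (W.T @ (c - W @ x))
    return x

# Hypothetical frame of three vectors in R^2 with frame bounds A = 1 and B = 3.
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
x_true = np.array([1.0, -2.0])
c = W @ x_true

x_approx = frame_algorithm(W, c)
```

Stopping the iteration early gives an approximate reconstruction, and the contraction factor shows directly why a well-conditioned frame (with `B / A` close to 1) is desirable.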
All in all, the project seeks to deeply connect the theory of frames with the theory of neural networks, so that this mathematical paradigm eventually becomes established in the deep learning community.