Georg Tauböck, Shristi Rajbamshi, and Peter Balazs

This is the accompanying webpage for the article

"Dictionary learning for sparse audio inpainting," in IEEE Journal of Selected Topics in Signal Processing, vol. 15, no. 1, pp. 104–119, 2021.

Abstract: The objective of audio inpainting is to fill a gap in an audio signal. This is ideally done by reconstructing the original signal or, at least, by inferring a meaningful surrogate signal. We propose a novel approach applying sparse modeling in the time-frequency (TF) domain. In particular, we devise a dictionary learning technique which learns the dictionary from reliable parts around the gap with the goal to obtain a signal representation with increased TF sparsity. This is based on a basis optimization technique to deform a given Gabor frame such that the sparsity of the analysis coefficients of the resulting frame is maximized. Furthermore, we modify the SParse Audio INpainter (SPAIN) for both the analysis and the synthesis model such that it is able to exploit the increased TF sparsity and—in turn—benefits from dictionary learning. Our experiments demonstrate that the developed methods achieve significant gains in terms of signal-to-distortion ratio (SDR) and objective difference grade (ODG) compared with several state-of-the-art audio inpainting techniques.
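For readers unfamiliar with the SDR figure quoted in the abstract, the following is a minimal sketch of how such a value is commonly computed in the audio inpainting literature. It assumes the usual definition, 10·log10(‖x‖² / ‖x − x̂‖²), evaluated only on the samples inside the gap. The function names `sdr_db` and `gap_sdr_db` and the toy signal are illustrative assumptions; they are not taken from the article or from the MATLAB simulation scripts.

```python
import numpy as np


def sdr_db(reference, estimate):
    """Signal-to-distortion ratio in dB (assumed definition):
    10 * log10(||reference||^2 / ||reference - estimate||^2)."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    if reference.shape != estimate.shape:
        raise ValueError("reference and estimate must have the same shape")
    distortion_energy = np.sum((reference - estimate) ** 2)
    if distortion_energy == 0.0:
        return float("inf")  # perfect reconstruction
    return float(10.0 * np.log10(np.sum(reference ** 2) / distortion_energy))


def gap_sdr_db(original, inpainted, gap_start, gap_length):
    """SDR restricted to the gap samples (hypothetical helper)."""
    gap = slice(gap_start, gap_start + gap_length)
    return sdr_db(np.asarray(original)[gap], np.asarray(inpainted)[gap])


# Toy example: a sine tone with a 50-sample gap starting at sample 100.
n = np.arange(400)
x = np.sin(2 * np.pi * 0.01 * n)

zero_filled = x.copy()
zero_filled[100:150] = 0.0        # leave the gap empty

scaled = x.copy()
scaled[100:150] *= 0.9            # reconstruction with a 10 % amplitude error

sdr_zero = gap_sdr_db(x, zero_filled, 100, 50)
sdr_scaled = gap_sdr_db(x, scaled, 100, 50)
```

Under this definition, leaving the gap filled with zeros gives an SDR of 0 dB, and a reconstruction whose amplitude is off by 10 % gives roughly 20 dB, so higher values indicate a reconstruction closer to the original signal.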

Manuscript: [pdf] 

Software: MATLAB Simulation Scripts