Dictionary Learning for Sparse Audio Inpainting

Georg Tauböck, Shristi Rajbamshi, and Peter Balazs

This is the accompanying webpage for the article

"Dictionary learning for sparse audio inpainting," in IEEE Journal of Selected Topics in Signal Processing, vol. 15, no. 1, pp. 104–119, 2021.

Abstract: The objective of audio inpainting is to fill a gap in an audio signal. This is ideally done by reconstructing the original signal or, at least, by inferring a meaningful surrogate signal. We propose a novel approach applying sparse modeling in the time-frequency (TF) domain. In particular, we devise a dictionary learning technique which learns the dictionary from reliable parts around the gap with the goal to obtain a signal representation with increased TF sparsity. This is based on a basis optimization technique to deform a given Gabor frame such that the sparsity of the analysis coefficients of the resulting frame is maximized. Furthermore, we modify the SParse Audio INpainter (SPAIN) for both the analysis and the synthesis model such that it is able to exploit the increased TF sparsity and—in turn—benefits from dictionary learning. Our experiments demonstrate that the developed methods achieve significant gains in terms of signal-to-distortion ratio (SDR) and objective difference grade (ODG) compared with several state-of-the-art audio inpainting techniques.
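The abstract reports gains in signal-to-distortion ratio (SDR). As a rough illustration of that metric, the following minimal sketch computes SDR in dB as 10·log10(‖x‖² / ‖x − x̂‖²) on the samples inside a gap. The function name `sdr_db`, the toy sine signal, and the choice to evaluate only the gap samples are assumptions made for this example; the evaluation used in the article and in the MATLAB scripts may differ in detail.

```python
import math


def sdr_db(reference, estimate):
    """Signal-to-distortion ratio in dB between a reference and an estimate.

    Hypothetical helper for illustration; not taken from the article's code.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if error_energy == 0.0:
        return math.inf  # perfect reconstruction
    if signal_energy == 0.0:
        return -math.inf  # silent reference, nonzero error
    return 10.0 * math.log10(signal_energy / error_energy)


# Toy setup (assumed values): a 440 Hz sine at 16 kHz with a 20 ms gap.
fs = 16000
gap_start, gap_len = 8000, 320
gap_reference = [
    math.sin(2.0 * math.pi * 440.0 * n / fs)
    for n in range(gap_start, gap_start + gap_len)
]

# Baseline: leave the gap filled with zeros.
zero_fill = [0.0] * gap_len
sdr_zero = sdr_db(gap_reference, zero_fill)

# A crude "reconstruction" that recovers the waveform with a 10% amplitude error.
scaled_fill = [0.9 * r for r in gap_reference]
sdr_scaled = sdr_db(gap_reference, scaled_fill)
```

Filling the gap with zeros makes the error equal to the signal itself, so the SDR is 0 dB; an inpainting method is useful to the extent that it moves this figure upward.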

Manuscript: [pdf]

Software: MATLAB Simulation Scripts
