We will present a tutorial on Kymatio at the International Society for Music Information Retrieval (ISMIR) Conference, held in Milan on November 5-9, 2023.
Kymatio: Deep Learning meets Wavelet Theory for Music Signal Processing
The machine listening research group in Nantes, France
Despite their clear value for the energy transition, marine renewable energy infrastructures have an impact on local wildlife that remains difficult to quantify. In this context, the PETREL project (Platform for Environmental Tracking of Renewable Energy and wildLife) aims to devise a durable and environmentally responsible solution to the problem of environmental monitoring of installations…
Sound matching algorithms seek to approximate a target waveform by parametric audio synthesis. Deep neural networks have achieved promising results in matching sustained harmonic tones. However, the task is more challenging when targets are nonstationary and inharmonic, e.g., percussion. We attribute this problem to the inadequacy of the loss function. On one hand, mean square error in the parametric domain, known as “P-loss”, is simple and fast but fails to accommodate the differing perceptual significance of each parameter. On the other hand, mean square error in the spectrotemporal domain, known as “spectral loss”, is perceptually motivated and serves in differentiable digital signal processing (DDSP). Yet, spectral loss is a poor predictor of pitch intervals and its gradient may be computationally expensive, which slows convergence. To resolve this conundrum, we present Perceptual-Neural-Physical loss (PNP). PNP is the optimal quadratic approximation of spectral loss while being as fast as P-loss during training. We instantiate PNP with physical modeling synthesis as the decoder and the joint time-frequency scattering transform (JTFS) as the spectral representation. We demonstrate its potential on matching synthetic drum sounds in comparison with other loss functions.
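To make the contrast between the three losses concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not spell out how PNP is constructed, so the code assumes one plausible reading, a Gauss-Newton-style quadratic form built from the Jacobian of the synthesis-plus-representation chain at the target parameters. The damped-sinusoid synthesizer, the FFT magnitude used in place of JTFS, the finite-difference Jacobian, and every function name are illustrative assumptions.

```python
import numpy as np

SR = 8000   # sample rate in Hz (illustrative value)
N = 2048    # number of samples per sound (illustrative value)

def synth(theta):
    """Toy stand-in for a physical model: one exponentially damped sinusoid.
    theta = (frequency in Hz, decay rate in 1/s)."""
    freq, decay = theta
    t = np.arange(N) / SR
    return np.exp(-decay * t) * np.sin(2 * np.pi * freq * t)

def features(theta):
    """Toy stand-in for JTFS: magnitude of the real FFT of the synthesized sound."""
    return np.abs(np.fft.rfft(synth(np.asarray(theta, dtype=float))))

def p_loss(theta_hat, theta):
    """Mean square error in the parameter domain."""
    d = np.asarray(theta_hat, dtype=float) - np.asarray(theta, dtype=float)
    return float(np.mean(d ** 2))

def spectral_loss(theta_hat, theta):
    """Squared Euclidean distance between spectral features
    (equal to the mean square error up to a constant factor)."""
    d = features(theta_hat) - features(theta)
    return float(d @ d)

def jacobian(theta, eps=1e-4):
    """Central finite-difference Jacobian of features() at theta (assumption:
    a real system would more likely use automatic differentiation)."""
    theta = np.asarray(theta, dtype=float)
    cols = []
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = eps
        cols.append((features(theta + step) - features(theta - step)) / (2 * eps))
    return np.stack(cols, axis=1)

def quadratic_loss(theta_hat, theta, M):
    """Quadratic form in the parameter error, weighted by the matrix M."""
    d = np.asarray(theta_hat, dtype=float) - np.asarray(theta, dtype=float)
    return float(d @ M @ d)

theta = np.array([440.0, 20.0])            # target parameters
J = jacobian(theta)
M = J.T @ J                                # computed once per target, before training
theta_hat = theta + np.array([0.5, 0.2])   # a nearby estimate

print("P-loss:        ", p_loss(theta_hat, theta))
print("spectral loss: ", spectral_loss(theta_hat, theta))
print("quadratic loss:", quadratic_loss(theta_hat, theta, M))
```

Under these assumptions, the quadratic loss needs no synthesis or spectral analysis at training time, only a small matrix product per example, which is the sense in which such a loss can be as cheap as P-loss while still weighting each parameter by its effect on the spectrum.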
Welcome to our website. We are the special interest group on Audio at the Laboratoire des Sciences du Numérique de Nantes (France), or Audio@LS2N for short.