“Less is more”, once the foundational motto of minimalist art, is making its way into artificial intelligence. After a maximalist decade (2012–2022) of larger computers training larger neural networks on larger datasets, a countertrend is emerging. What if human-level performance could be achieved with less computing, less memory, and less supervision? In deep learning, the research… Continue reading MuReNN: Multi-Resolution Neural Networks
Author: Vincent Lostanlen
Explainable audio classification of playing techniques with layerwise relevance propagation @ IEEE ICASSP
Deep convolutional networks (convnets) in the time-frequency domain can learn an accurate and fine-grained categorization of sounds. For example, in the context of music signal analysis, this categorization may correspond to a taxonomy of playing techniques: vibrato, tremolo, trill, and so forth. However, convnets lack an explicit connection with the neurophysiological underpinnings of musical timbre perception. In this article, we propose a data-driven approach to explain audio classification in terms of physical attributes in sound production. We borrow from the current literature in “explainable AI” (XAI) to study the predictions of a convnet which achieves an almost perfect score on a challenging task: the classification of five comparable real-world playing techniques from 30 instruments spanning seven octaves. Mapping the signal into the carrier-modulation domain using the scattering transform, we decompose the network’s predictions over this domain with layer-wise relevance propagation. We find that the regions most relevant to the predictions are localized around the physical attributes with which the playing techniques are performed.
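The core mechanism, layer-wise relevance propagation, redistributes a prediction score backwards through the network, layer by layer. A minimal sketch for plain dense layers with the LRP-epsilon rule is given below; the toy two-layer network and the helper `lrp_dense` are hypothetical illustrations, not the paper's convnet or scattering pipeline:

```python
import numpy as np

def lrp_dense(activations, weights, relevance, eps=1e-6):
    """Propagate relevance one dense layer backwards (LRP-epsilon rule).

    activations: (n_in,) input activations a_j of the layer
    weights:     (n_in, n_out) weight matrix w_jk
    relevance:   (n_out,) relevance scores R_k at the layer's output
    """
    z = activations @ weights          # pre-activations z_k = sum_j a_j w_jk
    z = z + eps * np.sign(z)           # small stabilizer avoids division by zero
    s = relevance / z                  # R_k / z_k
    c = weights @ s                    # c_j = sum_k w_jk s_k
    return activations * c             # R_j = a_j c_j

# Toy two-layer network: relevance starts at the predicted class score.
rng = np.random.default_rng(0)
a0 = rng.random(8)                     # stand-in for scattering features
W1, W2 = rng.standard_normal((8, 4)), rng.standard_normal((4, 3))
a1 = np.maximum(a0 @ W1, 0)            # ReLU hidden layer
logits = a1 @ W2
R2 = np.zeros(3)
R2[logits.argmax()] = logits.max()     # relevance of the winning class
R1 = lrp_dense(a1, W2, R2)             # back through the output layer
R0 = lrp_dense(a0, W1, R1)             # back through the hidden layer
# LRP approximately conserves relevance: R0.sum() ≈ logits.max(),
# so R0 decomposes the prediction over the input features.
```

Each input feature thus receives a signed share of the class score, which in the paper is then read off in the carrier-modulation domain.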
Can innovation lead to greater sustainability in recorded music?
A round table on the ecological stakes of recorded music, organized by the Centre national de la musique (CNM) for the 2023 Rencontres de l’innovation dans la musique. This round table coincides with the release of the collection “Musique et données”. With: Moderation: Emily Gonneau – Causa
The Ecology of Digital Music
An article in French in the latest collection from the Centre national de la musique (CNM). Here is its abstract: It is time to abandon the utopia of a music that is fully available, to everyone, everywhere, instantly. On the contrary, the musical audio stream is materialized in its objects and limited in its architectures… Continue reading The Ecology of Digital Music
Announcing Kymatio tutorial @ ISMIR
We will present a tutorial on Kymatio at the International Society for Music Information Retrieval (ISMIR) Conference, held in Milan on November 5-9, 2023.
Kymatio: Deep Learning meets Wavelet Theory for Music Signal Processing
PETREL: Platform for Environmental Tracking of Renewable Energy and wildLife
Despite their clear value for the energy transition, offshore renewable-energy infrastructures have an impact on local wildlife that remains difficult to quantify. In this context, the PETREL project (Platform for Environmental Tracking of Renewable Energy and wildLife) aims to develop a durable and eco-responsible solution to the problem of environmental monitoring of offshore installations… Continue reading PETREL: Platform for Environmental Tracking of Renewable Energy and wildLife
Perceptual–Physical–Sound Matching @ IEEE ICASSP
Sound matching algorithms seek to approximate a target waveform by parametric audio synthesis. Deep neural networks have achieved promising results in matching sustained harmonic tones. However, the task is more challenging when targets are nonstationary and inharmonic, e.g., percussion. We attribute this problem to the inadequacy of the loss function. On one hand, mean square error in the parametric domain, known as “P-loss”, is simple and fast but fails to accommodate the differing perceptual significance of each parameter. On the other hand, mean square error in the spectrotemporal domain, known as “spectral loss”, is perceptually motivated and serves in differentiable digital signal processing (DDSP). Yet, spectral loss is a poor predictor of pitch intervals and its gradient may be computationally expensive; hence slow convergence. Against this conundrum, we present Perceptual-Neural-Physical loss (PNP). PNP is the optimal quadratic approximation of spectral loss while being as fast as P-loss during training. We instantiate PNP with physical modeling synthesis as the decoder and the joint time-frequency scattering transform (JTFS) as the spectral representation. We demonstrate its potential for matching synthetic drum sounds in comparison with other loss functions.
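The idea of a quadratic approximation of spectral loss can be sketched with a toy example: linearize the synthesizer-plus-representation map around the target parameters, precompute the Gauss–Newton metric, and train against a quadratic form in parameter space. Everything below (`spectral_features`, the damped sine, the finite-difference `jacobian`) is a hypothetical stand-in for the paper's physical model and JTFS representation:

```python
import numpy as np

def spectral_features(theta, t=np.linspace(0, 1, 256, endpoint=False)):
    """Toy 'synthesizer + spectral representation': a damped sine whose
    magnitude spectrum stands in for a perceptual transform such as JTFS."""
    freq, decay = theta
    x = np.exp(-decay * t) * np.sin(2 * np.pi * freq * t)
    return np.abs(np.fft.rfft(x))

def jacobian(f, theta, h=1e-5):
    """Central finite-difference Jacobian dPhi/dtheta (illustrative helper)."""
    cols = []
    for i in range(len(theta)):
        d = np.zeros(len(theta))
        d[i] = h
        cols.append((f(theta + d) - f(theta - d)) / (2 * h))
    return np.stack(cols, axis=1)

theta_star = np.array([12.0, 3.0])     # ground-truth parameters (freq, decay)
J = jacobian(spectral_features, theta_star)
M = J.T @ J                            # Gauss-Newton metric, computed once

def pnp_loss(theta):
    """Quadratic form in parameter space: as cheap as P-loss per step."""
    d = theta - theta_star
    return d @ M @ d

def spectral_loss(theta):
    """Reference loss: requires re-synthesizing audio at every evaluation."""
    return np.sum((spectral_features(theta) - spectral_features(theta_star)) ** 2)

# Near theta_star the quadratic form tracks the true spectral loss:
theta = theta_star + np.array([0.01, -0.02])
# pnp_loss(theta) ≈ spectral_loss(theta), without calling the synthesizer
```

The design point is that the expensive Jacobian is evaluated once per training example at the ground-truth parameters, so the per-step cost during training matches P-loss while the loss surface locally matches spectral loss.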
Favet Neptunus eunti
Welcome to our website. We are the special interest group on Audio at the Laboratoire des Sciences du Numérique de Nantes (France), or Audio@LS2N for short.