The Shape of RemiXXXes to Come: Audio Texture Synthesis with Time–frequency Scattering

Conference paper
Vincent Lostanlen, Florian Hecker
Proceedings of the International Conference on Digital Audio Effects (DAFx), 2019
Publication year: 2019

This article explains how to apply time–frequency scattering, a convolutional operator extracting modulations in the time–frequency domain at different rates and scales, to the re-synthesis and manipulation of audio textures.

Fourier and the science of today

Fourier at the heart of computer music: From harmonic sounds to texture

Journal paper
Vincent Lostanlen, Joakim Andén, Mathieu Lagrange
Comptes Rendus Physique, volume 20, issue 5, pp. 461-473
Publication year: 2019

Beyond the scope of thermal conduction, Joseph Fourier’s treatise on the Analytical Theory of Heat (1822) profoundly altered our understanding of acoustic waves. It posits that any function of unit period can be decomposed into a sum of sinusoids, whose respective contributions represent some essential property of the underlying periodic phenomenon. In acoustics, such a decomposition reveals the resonant modes of a freely vibrating string. The introduction of Fourier series thus opened new research avenues on the modeling of musical timbre—a topic that was to become of crucial importance in the 1960s with the advent of computer-generated sounds. This article proposes to revisit the scientific legacy of Joseph Fourier through the lens of computer music research. We first discuss how the Fourier series marked a paradigm shift in our understanding of acoustics, supplanting the theory of consonance of harmonics in the Pythagorean monochord. Then, we highlight the utility of Fourier’s paradigm via three practical problems in analysis–synthesis: the imitation of musical instruments, frequency transposition, and the generation of audio textures. Interestingly, each of these problems involves a different perspective on time–frequency duality, and stimulates a multidisciplinary interplay between research and creation that is still ongoing.
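The decomposition of a unit-period function into sinusoids can be illustrated numerically. The following is a minimal sketch, not taken from the article: the function names, the sawtooth test signal, and the Riemann-sum approximation of the coefficients are all assumptions chosen for illustration.

```python
import numpy as np

def fourier_coefficients(f, n_harmonics, n_samples=4096):
    """Approximate the complex Fourier coefficients c_k of a unit-period
    function f for k = -n_harmonics..n_harmonics (Riemann sum over one period)."""
    t = np.arange(n_samples) / n_samples
    k = np.arange(-n_harmonics, n_harmonics + 1)
    basis = np.exp(-2j * np.pi * np.outer(k, t))
    return k, basis @ f(t) / n_samples

def partial_sum(k, c, t):
    """Evaluate the truncated Fourier series sum_k c_k exp(2*pi*i*k*t) at times t."""
    return np.real(np.exp(2j * np.pi * np.outer(t, k)) @ c)

def sawtooth(t):
    """Hypothetical test signal: a sawtooth of unit period with values in [-1, 1)."""
    return 2.0 * (t % 1.0) - 1.0

k, c = fourier_coefficients(sawtooth, n_harmonics=50)
t = np.linspace(0.1, 0.9, 9)      # stay away from the discontinuity at t = 0
approx = partial_sum(k, c, t)     # close to sawtooth(t) on this interval
```

The magnitudes of the coefficients decay as 1/(pi*|k|) for this signal, which is one way to read "respective contributions" of the sinusoids: slower decay means a brighter timbre.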

Adaptive Time–Frequency Scattering for Periodic Modulation Recognition in Music Signals

Conference paper
Changhong Wang, Vincent Lostanlen, Emmanouil Benetos, Elaine Chew
Proceedings of the International Society for Music Information Retrieval (ISMIR) Conference
Publication year: 2019

Vibratos, tremolos, trills, and flutter-tongue are techniques frequently found in vocal and instrumental music. A common feature of these techniques is the periodic modulation in the time–frequency domain. We propose a representation based on time–frequency scattering to model the interclass variability for fine discrimination of these periodic modulations. Time–frequency scattering is an instance of the scattering transform, an approach for building invariant, stable, and informative signal representations. The proposed representation is calculated around the wavelet subband of maximal acoustic energy, rather than over all the wavelet bands. To demonstrate the feasibility of this approach, we build a system that computes the representation as input to a machine learning classifier. Whereas previously published datasets for playing technique analysis focus primarily on techniques recorded in isolation, for ecological validity, we create a new dataset to evaluate the system. The dataset, named CBF-periDB, contains full-length expert performances on the Chinese bamboo flute that have been thoroughly annotated by the players themselves. We report F-measures of 99% for flutter-tongue, 82% for trill, 69% for vibrato, and 51% for tremolo detection, and provide explanatory visualisations of scattering coefficients for each of these techniques.

Wavelet Scattering on the Pitch Spiral

Conference paper
Vincent Lostanlen, Stéphane Mallat
Proceedings of the International Conference on Digital Audio Effects (DAFx)
Publication year: 2015

We present a new representation of harmonic sounds that linearizes the dynamics of pitch and spectral envelope, while remaining stable to deformations in the time–frequency plane. It is an instance of the scattering transform, a generic operator which cascades wavelet convolutions and modulus nonlinearities. It is derived from the pitch spiral, in that convolutions are successively performed in time, log-frequency, and octave index. We give a closed-form approximation of spiral scattering coefficients for a nonstationary generalization of the harmonic source–filter model.
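The geometry of the pitch spiral can be sketched in a few lines: a log-frequency value splits into an integer octave index and a fractional position within the octave (chroma). The code below only illustrates this change of coordinates. It is not the paper's implementation, and the function names, the 440 Hz reference, and the assumption that the scalogram has a whole number of bins per octave are all hypothetical.

```python
import numpy as np

def spiral_coordinates(freq_hz, ref_hz=440.0):
    """Split log2(freq / ref) into a chroma in [0, 1) and an integer octave index."""
    log2f = np.log2(np.asarray(freq_hz, dtype=float) / ref_hz)
    octave = np.floor(log2f)
    chroma = log2f - octave
    return chroma, octave.astype(int)

def fold_scalogram(scalogram, bins_per_octave):
    """Reshape a (log-frequency, time) array into (octave, chroma, time),
    dropping any incomplete octave at the top of the frequency axis."""
    n_bins, n_frames = scalogram.shape
    n_octaves = n_bins // bins_per_octave
    trimmed = scalogram[: n_octaves * bins_per_octave]
    return trimmed.reshape(n_octaves, bins_per_octave, n_frames)

# Notes one octave apart share a chroma and differ only in octave index.
chroma, octave = spiral_coordinates([220.0, 440.0, 880.0])
```

Once the scalogram is folded this way, filtering along the first axis acts across octaves and filtering along the second axis acts along log-frequency within an octave, which is the sense in which convolutions can be performed "in time, log-frequency, and octave index".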

Transformée en scattering sur la spirale temps–chroma–octave

Conference paper
Vincent Lostanlen, Stéphane Mallat
Actes du colloque GRETSI, 2015
Publication year: 2015

We introduce a scattering representation for the analysis and classification of sounds. It is locally translation-invariant, stable to deformations in time and frequency, and has the ability to capture harmonic structures. The scattering representation can be interpreted as a convolutional neural network which cascades a wavelet transform in time and along a harmonic spiral. We study its application for the analysis of the deformations of the source–filter model.

Joint Time–frequency Scattering for Audio Classification

Conference paper
Joakim Andén, Vincent Lostanlen, Stéphane Mallat
Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing (MLSP)
Publication year: 2015

We introduce the joint time–frequency scattering transform, a time-shift-invariant descriptor of time–frequency structure for audio classification. It is obtained by applying a two-dimensional wavelet transform in time and log-frequency to a time–frequency wavelet scalogram. We show that this descriptor successfully characterizes complex time–frequency phenomena such as time-varying filters and frequency-modulated excitations. State-of-the-art results are achieved for signal reconstruction and phone segment classification on the TIMIT dataset.
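To give a rough idea of what a two-dimensional wavelet applied to a scalogram responds to, here is a minimal sketch using a single Gabor-like filter defined in the Fourier domain. It is an illustration under simplifying assumptions, not the transform from the paper: the filter shape, the bandwidth value, the synthetic "ripple" input, and all function names are hypothetical, and a full transform would use a bank of such filters followed by averaging.

```python
import numpy as np

def gabor_2d_hat(shape, center, bandwidth):
    """Gaussian bump in the 2-D Fourier plane, centred at `center`
    (cycles per log-frequency bin, cycles per time frame)."""
    n_f, n_t = shape
    omega_f = np.fft.fftfreq(n_f)[:, None]
    omega_t = np.fft.fftfreq(n_t)[None, :]
    dist2 = (omega_f - center[0]) ** 2 + (omega_t - center[1]) ** 2
    return np.exp(-dist2 / (2.0 * bandwidth ** 2))

def joint_modulus(scalogram, center, bandwidth=0.02):
    """Circularly convolve the scalogram with one 2-D filter and take the modulus."""
    psi_hat = gabor_2d_hat(scalogram.shape, center, bandwidth)
    return np.abs(np.fft.ifft2(np.fft.fft2(scalogram) * psi_hat))

# Synthetic scalogram: a spectrotemporal ripple drifting in one direction.
log_freq = np.arange(64)[:, None]
time = np.arange(256)[None, :]
ripple = 1.0 + np.cos(2 * np.pi * (0.125 * log_freq + 0.0625 * time))

same_direction = joint_modulus(ripple, (0.125, 0.0625))       # tuned to the ripple
opposite_direction = joint_modulus(ripple, (0.125, -0.0625))  # mirrored in time
```

The filter tuned to the ripple's rate and direction gives a strong, nearly constant response, while the mirrored filter gives almost none. A filter that is not separable in time and log-frequency can therefore tell upward from downward modulations.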