
Interactive sound texture synthesis through semi-automatic user annotations

Diemo Schwarz, Baptiste Caramiaux


Abstract

We present a way to make environmental recordings controllable again through continuous annotations of the high-level semantic parameter one wishes to control, e.g. wind strength or crowd excitation level. A partial annotation can be propagated to cover the entire recording via cross-modal analysis between gesture and sound using canonical time warping (CTW). The annotations then serve as a descriptor for lookup in corpus-based concatenative synthesis, inverting the sound/annotation relationship. The workflow has been evaluated in a preliminary subject test; results from canonical correlation analysis (CCA) show high consistency between annotations, with a small set of audio descriptors correlating well with them. An experiment on the propagation of annotations shows that CTW outperforms CCA with as little as 20 s of annotated material.

Publication details

Published in:

Aramaki Mitsuko, Derrien Olivier, Kronland-Martinet Richard, Ystad Sølvi (eds.) (2014) Sound, music, and motion: 10th international symposium, CMMR 2013, Marseille, France, October 15–18, 2013, revised selected papers. Dordrecht, Springer.

Pages: 372–392

DOI: 10.1007/978-3-319-12976-1_23

Full citation:

Schwarz Diemo, Caramiaux Baptiste (2014) „Interactive sound texture synthesis through semi-automatic user annotations“, In: M. Aramaki, O. Derrien, R. Kronland-Martinet & S. Ystad (eds.), Sound, music, and motion, Dordrecht, Springer, 372–392.