voiceHome-2 corpus

A corpus dedicated to distant-microphone speech processing in domestic environments



by N. Bertin (1), E. Camberlein (1), R. Lebarbenchon (1), E. Vincent (2), S. Sivasankaran (2), I. Illina (2) and F. Bimbot (1)

(1) IRISA - CNRS UMR 6074, Rennes, France

(2) Inria, Villers-lès-Nancy, F-54600, France


Purpose

This corpus includes reverberated, noisy speech signals spoken by 12 native French talkers in 4 houses (3 rooms per house) and recorded by an 8-microphone device at various angles and distances and in various noise conditions.

This corpus stands apart from other corpora in the field by the number of rooms and homes considered, by the diversity of acoustic conditions recorded, and by the fact that it is publicly available at no cost.

Corpus download

The voiceHome-2 corpus and its documentation can be downloaded here.

Baselines download

We provide here baseline software and tools to perform source localization, speech enhancement (via source separation) and automatic speech recognition on the corpus. These baseline scripts make it possible to reproduce the results presented in the publication cited below.

Related software

The baseline scripts use these two related toolboxes, also downloadable from their respective homepages.

Reference

Terms of use

You may use the corpus for non-commercial scientific purposes, provided you mention it in any written work or software derived from its use. Within a published article, paper or report, the corpus must appear in the bibliographical references as:

Speaker consent to distribution of recordings

All participants have given informed, signed consent to the public distribution of the recorded sentences.

Contact

For any questions, please contact: nancy.bertin@irisa.fr