voiceHome-2 corpus

A corpus dedicated to distant-microphone speech processing in domestic environments

by N. Bertin (1), E. Camberlein (1), R. Lebarbenchon (1), E. Vincent (2), S. Sivasankaran (2), I. Illina (2) and F. Bimbot (1)

(1) IRISA - CNRS UMR 6074, Rennes, France

(2) Inria, Villers-lès-Nancy, F-54600, France


This corpus includes reverberated, noisy speech signals spoken by 12 native French talkers in 4 houses (3 rooms per house) and recorded by an 8-microphone device at various angles and distances and in various noise conditions.

This corpus stands apart from other corpora in the field by the number of rooms and homes considered, by the diversity of acoustic conditions recorded, and by the fact that it is publicly available at no cost.

Corpus download

The voiceHome-2 corpus and its documentation can be downloaded here.

Baselines download

We provide here baseline software and tools to perform source localization, speech enhancement (via source separation) and automatic speech recognition on the corpus. These baseline scripts make it possible to reproduce the results presented in the publication cited below.

Related software

The baseline scripts use these two related toolboxes, also downloadable from their respective homepages.


Terms of use

You may use the corpus for non-commercial scientific purposes, provided you mention it in any written work or software derived from its use. In a published article, paper or report, the corpus must appear in the bibliographical references as:

Speaker consent to distribution of recordings

All participants gave informed, signed consent to the public distribution of their recorded sentences.


For any questions, please contact: nancy.bertin@irisa.fr