History: Two-channel mixtures of speech and real-world background noise

Comparison of version 78 with version 89


@@ -Lines: 3-6 changed to +Lines: 3-11 @@

This task aims at evaluating source separation and denoising techniques in the context of speech enhancement by merging two datasets: the [http://sisec2010.wiki.irisa.fr/tiki-index.php?page=Source+separation+in+the+presence+of+real-world+background+noise|SiSEC 2010 noisy speech dataset] and the [http://www.dcs.shef.ac.uk/spandh/chime/PCC/datasets.html|CHiME corpus]. Both datasets consist of two-channel mixtures of one speech source and real-world background noise, so that algorithms applicable to one dataset are applicable to the other without additional effort. The source separation results obtained over the latter dataset will be analyzed in light of the speech recognition results obtained over that dataset as part of the [http://www.dcs.shef.ac.uk/spandh/chime/challenge.html|CHiME Challenge].
+
+
+ !!Results
+
+ __See the results over [http://www.irisa.fr/metiss/SiSEC11/noise/results_test.html|test] and [http://www.irisa.fr/metiss/SiSEC11/noise/results_dev.html|development] data__


@@ -Lines: 56-65 changed to +Lines: 61-72 @@

The data consists of 136 WAV audio files, which can be imported into Matlab using the wavread command, and 10 text files. These files are named as follows:
- * -+test_<env>_<cond>_<take>_src.wav+-: single-channel speech signal
* -+test_<env>_<cond>_<take>_sim.wav+-: two-channel spatial image of the speech source
* -+test_<env>_<cond>_<take>_noi.wav+-: two-channel spatial image of the background noise
* -+test_<env>_<cond>_<take>_mix.wav+-: two-channel mixture signal
* -+test_<env>_<cond>_<take>_DOA.txt+-: DOA of the speech source (see the [http://sisec2010.wiki.irisa.fr/tiki-index.php?page=Source+separation+in+the+presence+of+real-world+background+noise|SiSEC 2010 wiki] for the convention adopted to measure DOA)
+ * -+dev_<env>_<cond>_<take>_src.wav+-: single-channel speech signal
* -+dev_<env>_<cond>_<take>_sim.wav+-: two-channel spatial image of the speech source
* -+dev_<env>_<cond>_<take>_noi.wav+-: two-channel spatial image of the background noise
* -+dev_<env>_<cond>_<take>_mix.wav+-: two-channel mixture signal
* -+dev_<env>_<cond>_<take>_DOA.txt+-: DOA of the speech source (see the [http://sisec2010.wiki.irisa.fr/tiki-index.php?page=Source+separation+in+the+presence+of+real-world+background+noise|SiSEC 2010 wiki] for the convention adopted to measure DOA)
Since the source DOAs were measured geometrically in the -+Su+- and -+Ca+- environments, they might contain a measurement error of up to a few degrees; by contrast, there is no such error in the -+Sq+- environment.
+
+ The mixtures dev_Ca1_Co_A_mix.wav and dev_Ca1_Co_B_mix.wav are identical (this is a mistake that will be corrected in future evaluations).

Entrants wishing to exploit the context of each sentence in the domestic environment database can also __download the corresponding [http://www.irisa.fr/metiss/SiSEC11/noise/dev_embedded.zip|5 min recordings] (86 MB)__ (same nomenclature as above).
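
As an illustration, the nomenclature above can be split into its fields with a short script. This is a minimal Python sketch and not part of the reference software: the names ExcerptFile and parse_name are hypothetical, and the sketch assumes that the <env>, <cond> and <take> fields never contain an underscore.

```python
import re
from typing import NamedTuple, Optional


class ExcerptFile(NamedTuple):
    subset: str  # "dev" or "test"
    env: str     # environment label, e.g. "Ca1"
    cond: str    # condition label, e.g. "Co"
    take: str    # take label, e.g. "A"
    kind: str    # "src", "sim", "noi", "mix" or "DOA"


# Assumption: no field contains an underscore, so "_" is a safe separator.
_PATTERN = re.compile(
    r"^(?P<subset>dev|test)_(?P<env>[^_]+)_(?P<cond>[^_]+)_(?P<take>[^_]+)_"
    r"(?P<kind>src|sim|noi|mix|DOA)\.(?P<ext>wav|txt)$"
)


def parse_name(filename: str) -> Optional[ExcerptFile]:
    """Return the fields of a dataset file name, or None if it does not match."""
    m = _PATTERN.match(filename)
    if m is None:
        return None
    # DOA files are text files; all other kinds are WAV files.
    expected_ext = "txt" if m["kind"] == "DOA" else "wav"
    if m["ext"] != expected_ext:
        return None
    return ExcerptFile(m["subset"], m["env"], m["cond"], m["take"], m["kind"])
```

For example, parse_name("dev_Ca1_Co_A_mix.wav") yields the fields dev, Ca1, Co, A and mix, which makes it easy to group the src, sim, noi, mix and DOA files belonging to the same excerpt.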

@@ -Lines: 78-82 changed to +Lines: 85-92 @@
#__speech and noise spatial image estimation__: decompose the mixture signal into two two-channel signals corresponding to the speech source and the background noise
- Reference software will eventually be provided for each of these tasks.
+ Participants are welcome to use some of the Matlab reference software below to build their own algorithms:
* [http://sisec2008.wiki.irisa.fr/tiki-download_file.php?fileId=1|stft_multi.m]: multichannel STFT
* [http://sisec2008.wiki.irisa.fr/tiki-download_file.php?fileId=9|istft_multi.m]: multichannel inverse STFT
* [http://sisec2011.wiki.irisa.fr/tiki-download_file.php?fileId=3|example_denoising.m]: TDOA estimation by GCC-PHATmax, ML target and noise variance estimation under a diffuse noise model, and multichannel Wiener filtering

Due to the specific construction of the dataset, at least four strategies may be employed to process the domestic environment mixtures:

@@ -Lines: 92-96 changed to +Lines: 102-119 @@
Each participant is asked to submit the results of his/her algorithm for task 2 and/or 3 over all or part of the mixtures in the development dataset and the test dataset. The results for task 1 may also be submitted if possible.
- In addition, each participant is asked to provide basic information about his/her algorithm (bibliographical reference, __employed processing strategy__, etc) and to declare its average running time, expressed in seconds per test excerpt and per GHz of CPU.
+ Each participant should make his/her results available online in the form of a tarball with the following file naming convention:
* -+test_<env>_<cond>_<take>_src.wav+-: single-channel speech signal
* -+test_<env>_<cond>_<take>_sim.wav+-: two-channel spatial image of the speech source
* -+test_<env>_<cond>_<take>_noi.wav+-: two-channel spatial image of the background noise
* -+test_<env>_<cond>_<take>_DOA.txt+-: DOA of the speech source

For the domestic environment dataset, the CHiME file naming convention is also acceptable.

Each participant should then send an email to "araki.shoko (at) lab.ntt.co.jp", "nesta (at) fbk.eu" and "emmanuel.vincent (at) inria.fr" providing:
* contact information (name, affiliation)
* basic information about his/her algorithm, including the __employed processing strategy__ among the four strategies outlined above, its average running time (in seconds per test excerpt and per GHz of CPU) and a bibliographical reference if possible
* the URL of the tarball

The submitted audio files will be made available on a website under the terms of the Licensing section below.


@@ -Lines: 101-105 changed to +Lines: 124-128 @@
The estimated speaker DOAs in task 1 will be evaluated in terms of absolute difference with the true DOAs.
- The estimated speech signals in task 2 will be evaluated via the energy ratio criteria defined in the [http://bass-db.gforge.inria.fr/bss_eval/|BSS_EVAL] toolbox allowing arbitrary filtering between the estimated source and the true source and via the perceptually-motivated criteria in the [http://bass-db.gforge.inria.fr/peass/PEASS-Software.html|PEASS] toolkit.
+ The estimated speech signals in task 2 will be evaluated via the energy ratio criteria defined in the [http://bass-db.gforge.inria.fr/bss_eval/|BSS_EVAL] toolbox allowing arbitrary filtering between the estimated source and the true source.

The estimated speech and noise spatial image signals in task 3 will be evaluated via the energy ratio criteria introduced for the [http://www.irisa.fr/metiss/SASSEC07/?show=criteria|Stereo Audio Source Separation Evaluation Campaign] and via the perceptually-motivated criteria in the [http://bass-db.gforge.inria.fr/peass/PEASS-Software.html|PEASS] toolkit.
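
To make the simplest of these criteria concrete, the sketch below computes the absolute DOA error used for task 1 and a plain signal-to-distortion ratio over a two-channel spatial image. It is a hedged Python illustration only: the function names are hypothetical, DOAs are assumed to be given in degrees, and the distortion is taken to be the raw difference between estimate and reference. The BSS_EVAL and PEASS implementations decompose and weight the distortion further and remain the reference for reported results.

```python
import math
from typing import Sequence, Tuple

Stereo = Sequence[Tuple[float, float]]  # one (left, right) pair per sample


def doa_abs_error(estimated_deg: float, true_deg: float) -> float:
    """Absolute difference between estimated and true DOA (assumed degrees)."""
    return abs(estimated_deg - true_deg)


def image_sdr_db(estimate: Stereo, reference: Stereo) -> float:
    """Energy ratio, in dB, of the reference image to the estimation error.

    Simplification: the error is the sample-wise difference over both
    channels, with no further decomposition of the distortion.
    """
    if len(estimate) != len(reference):
        raise ValueError("estimate and reference must have the same length")
    signal = sum(l * l + r * r for l, r in reference)
    if signal == 0:
        raise ValueError("reference image has zero energy")
    error = sum(
        (el - l) ** 2 + (er - r) ** 2
        for (el, er), (l, r) in zip(estimate, reference)
    )
    if error == 0:
        return math.inf  # perfect estimate
    return 10.0 * math.log10(signal / error)
```

A higher ratio means that the estimated image is closer to the true one; a perfect estimate gives an infinite ratio.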

@@ -Lines: 108-116 changed to +Lines: 131-140 @@

The above performance criteria and benchmarks are respectively implemented in
- * [tiki-download_userfile.php?fileId=2|bss_eval_source_denoising.m]
* [tiki-download_userfile.php?fileId=1|bss_eval_images_nosort.m]
+ * [http://sisec2011.wiki.irisa.fr/tiki-download_file.php?fileId=2|bss_eval_source_denoising.m]
* [http://sisec2011.wiki.irisa.fr/tiki-download_file.php?fileId=1|bss_eval_images_nosort.m]
* [http://bass-db.gforge.inria.fr/peass/PEASS-Software-v1.0.zip|PEASS]
* [http://sisec2008.wiki.irisa.fr/tiki-download_file.php?fileId=15|sep_ibm.m]
* [http://sisec2008.wiki.irisa.fr/tiki-download_file.php?fileId=16|Cochleagram Toolbox]
+ An example use is given in [http://sisec2011.wiki.irisa.fr/tiki-download_file.php?fileId=3|example_denoising.m].


History

Date | User | Version
Wed. 21 Dec. 2011 16:16 CET | admin | 89 (current)
Thu. 17 Nov. 2011 14:31 CET | admin | 88
Wed. 16 Nov. 2011 08:55 CET | admin | 87
Tue. 20 Sept. 2011 04:09 CEST | admin | 86
Fri. 12 Aug. 2011 11:54 CEST | admin | 85
Fri. 12 Aug. 2011 11:43 CEST | admin | 84
Fri. 12 Aug. 2011 11:41 CEST | admin | 83
Fri. 12 Aug. 2011 11:33 CEST | admin | 82
Fri. 12 Aug. 2011 11:32 CEST | admin | 81
Fri. 12 Aug. 2011 11:30 CEST | admin | 80
Fri. 12 Aug. 2011 08:27 CEST | admin | 79
Thu. 11 Aug. 2011 15:17 CEST | admin | 78
