History: Two-channel mixtures of speech and real-world background noise
Comparison of version 83 with version 89
@@ -Lines: 3-6 changed to +Lines: 3-11 @@
This task aims at evaluating source separation and denoising techniques in the context of speech enhancement by merging two datasets: the [http://sisec2010.wiki.irisa.fr/tiki-index.php?page=Source+separation+in+the+presence+of+real-world+background+noise|SiSEC 2010 noisy speech dataset] and the [http://www.dcs.shef.ac.uk/spandh/chime/PCC/datasets.html|CHiME corpus]. Both datasets consist of two-channel mixtures of one speech source and real-world background noise, so that algorithms applicable to one dataset are applicable to the other without additional effort. The source separation results obtained over the latter dataset will be analyzed in light of the speech recognition results obtained over the same dataset as part of the [http://www.dcs.shef.ac.uk/spandh/chime/challenge.html|CHiME Challenge].
+
+
+ !!Results
+
+ __See the results over [http://www.irisa.fr/metiss/SiSEC11/noise/results_test.html|test] and [http://www.irisa.fr/metiss/SiSEC11/noise/results_dev.html|development] data__

@@ -Lines: 62-65 changed to +Lines: 67-72 @@
* -+dev_<env>_<cond>_<take>_DOA.txt+-: DOA of the speech source (see the [http://sisec2010.wiki.irisa.fr/tiki-index.php?page=Source+separation+in+the+presence+of+real-world+background+noise|SiSEC 2010 wiki] for the convention adopted to measure DOA)
Since the source DOAs were measured geometrically in the -+Su+- and -+Ca+- environments, they might contain a measurement error of up to a few degrees; by contrast, there is no such error in the -+Sq+- environment.
+
+ The mixtures dev_Ca1_Co_A_mix.wav and dev_Ca1_Co_B_mix.wav are identical (this is a mistake that will be corrected in future evaluations). Entrants wishing to exploit the context of each sentence in the domestic environment database can also __download the corresponding [http://www.irisa.fr/metiss/SiSEC11/noise/dev_embedded.zip|5 min recordings] (86 MB)__ (same nomenclature as above).

@@ -Lines: 81-85 changed to +Lines: 88-92 @@
* [http://sisec2008.wiki.irisa.fr/tiki-download_file.php?fileId=1|stft_multi.m]: multichannel STFT
* [http://sisec2008.wiki.irisa.fr/tiki-download_file.php?fileId=9|istft_multi.m]: multichannel inverse STFT
- * [http://sisec2011.wiki.irisa.fr/tiki-download_userfile.php?fileId=3|example_denoising.m]: TDOA estimation by GCC-PHATmax, ML target and noise variance estimation under a diffuse noise model, and multichannel Wiener filtering
+ * [http://sisec2011.wiki.irisa.fr/tiki-download_file.php?fileId=3|example_denoising.m]: TDOA estimation by GCC-PHATmax, ML target and noise variance estimation under a diffuse noise model, and multichannel Wiener filtering
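For readers who want a feel for the first step of that pipeline, the sketch below shows a plain GCC-PHAT time-difference-of-arrival estimate between the two channels, written in Python with NumPy. It is only an illustration of the general technique under stated assumptions; the exact variant, windowing and parameter settings of -+example_denoising.m+- may differ, and the function name and arguments below are not part of the challenge material.

{CODE()}
import numpy as np

def gcc_phat_tdoa(x1, x2, fs, max_tau=None):
    """Estimate the time difference of arrival (seconds) between two
    channels by picking the lag that maximizes the phase-transformed
    cross-correlation (illustrative sketch only)."""
    n = len(x1) + len(x2)
    n_fft = int(2 ** np.ceil(np.log2(n)))
    X1 = np.fft.rfft(x1, n_fft)
    X2 = np.fft.rfft(x2, n_fft)
    # Phase transform: keep only the phase of the cross-spectrum
    cross = X1 * np.conj(X2)
    cross /= np.abs(cross) + 1e-12
    cc = np.fft.irfft(cross, n_fft)
    # Re-centre the correlation so that lag 0 sits in the middle
    max_shift = n_fft // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = np.argmax(np.abs(cc)) - max_shift
    return lag / fs
{CODE}

The DOA can then be recovered from the estimated TDOA and the microphone spacing by simple trigonometry.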
Due to the specific construction of the dataset, at least four strategies may be employed to process the domestic environment mixtures:

@@ -Lines: 103-110 changed to +Lines: 110-117 @@
For the domestic environment dataset, the CHiME file naming convention is also acceptable.
- Each participant should then send an email to "shoko (at) lab.ntt.co.jp", "nesta (a) fbk.eu" and "emmanuel.vincent (at) inria.fr" providing:
o contact information (name, affiliation)
o basic information about his/her algorithm, including its average running time (in seconds per test excerpt and per GHz of CPU) and a bibliographical reference if possible
o the URL of the tarball
+ Each participant should then send an email to "araki.shoko (at) lab.ntt.co.jp", "nesta (a) fbk.eu" and "emmanuel.vincent (at) inria.fr" providing:
* contact information (name, affiliation)
* basic information about his/her algorithm, including the __employed processing strategy__ among the four strategies outlined above, its average running time (in seconds per test excerpt and per GHz of CPU) and a bibliographical reference if possible
* the URL of the tarball

The submitted audio files will be made available on a website under the terms of the Licensing section below.

@@ -Lines: 117-121 changed to +Lines: 124-128 @@
The estimated speaker DOAs in task 1 will be evaluated in terms of absolute difference with the true DOAs.
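For illustration only, this criterion amounts to a simple absolute angular difference; a minimal sketch, assuming DOAs expressed in degrees and taking the difference modulo 360°, is given below (the actual angle convention is the one defined on the SiSEC 2010 wiki).

{CODE()}
def doa_error(estimated_deg, true_deg):
    """Absolute DOA estimation error in degrees (illustrative sketch;
    the angle convention follows the SiSEC 2010 wiki)."""
    diff = abs(estimated_deg - true_deg) % 360.0
    return min(diff, 360.0 - diff)
{CODE}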
- The estimated speech signals in task 2 will be evaluated via the energy ratio criteria defined in the [http://bass-db.gforge.inria.fr/bss_eval/|BSS_EVAL] toolbox allowing arbitrary filtering between the estimated source and the true source and via the perceptually-motivated criteria in the [http://bass-db.gforge.inria.fr/peass/PEASS-Software.html|PEASS] toolkit.
+ The estimated speech signals in task 2 will be evaluated via the energy ratio criteria defined in the [http://bass-db.gforge.inria.fr/bss_eval/|BSS_EVAL] toolbox allowing arbitrary filtering between the estimated source and the true source.
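As a rough cross-check outside the official MATLAB tools, the same energy ratio criteria are also available in Python through the mir_eval reimplementation of BSS_EVAL; the sketch below assumes single-channel reference and estimated speech signals of equal length, and is not a substitute for the official evaluation scripts.

{CODE()}
import numpy as np
from mir_eval.separation import bss_eval_sources

def energy_ratio_scores(reference, estimate):
    """Compute SDR/SIR/SAR with the BSS_EVAL criteria (mir_eval port).

    reference, estimate: 1-D NumPy arrays of equal length.
    Illustrative cross-check only; note that with a single target
    source the SIR figure is not informative.
    """
    sdr, sir, sar, _ = bss_eval_sources(
        reference[np.newaxis, :], estimate[np.newaxis, :])
    return sdr[0], sir[0], sar[0]
{CODE}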
The estimated speech and noise spatial image signals in task 3 will be evaluated via the energy ratio criteria introduced for the [http://www.irisa.fr/metiss/SASSEC07/?show=criteria|Stereo Audio Source Separation Evaluation Campaign] and via the perceptually-motivated criteria in the [http://bass-db.gforge.inria.fr/peass/PEASS-Software.html|PEASS] toolkit.

@@ -Lines: 124-133 changed to +Lines: 131-140 @@
The above performance criteria and benchmarks are respectively implemented in
- * [http://sisec2011.wiki.irisa.fr/tiki-download_userfile.php?fileId=2|bss_eval_source_denoising.m]
* [http://sisec2011.wiki.irisa.fr/tiki-download_userfile.php?fileId=1|bss_eval_images_nosort.m]
+ * [http://sisec2011.wiki.irisa.fr/tiki-download_file.php?fileId=2|bss_eval_source_denoising.m]
* [http://sisec2011.wiki.irisa.fr/tiki-download_file.php?fileId=1|bss_eval_images_nosort.m]
* [http://bass-db.gforge.inria.fr/peass/PEASS-Software-v1.0.zip|PEASS]
* [http://sisec2008.wiki.irisa.fr/tiki-download_file.php?fileId=15|sep_ibm.m]
* [http://sisec2008.wiki.irisa.fr/tiki-download_file.php?fileId=16|Cochleagram Toolbox]
- An example use is given in [http://sisec2011.wiki.irisa.fr/tiki-download_userfile.php?fileId=3|example_denoising.m].
+ An example use is given in [http://sisec2011.wiki.irisa.fr/tiki-download_file.php?fileId=3|example_denoising.m].
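The -+sep_ibm.m+- benchmark appears to rely on ideal binary masking over a cochleagram representation (it is listed alongside the Cochleagram Toolbox). A minimal, hypothetical Python sketch of the same oracle idea in the STFT domain is shown below; the 1024-sample window and 0 dB local SNR threshold are assumptions, not the settings of -+sep_ibm.m+-, and the oracle setting requires the true speech and noise signals, which are only available for development data.

{CODE()}
import numpy as np
from scipy.signal import stft, istft

def ideal_binary_mask(speech, noise, fs, threshold_db=0.0):
    """Oracle denoising by ideal binary masking (illustrative sketch).

    The official sep_ibm.m benchmark works on a cochleagram; an STFT
    is used here for simplicity and the 0 dB threshold is an assumption.
    """
    _, _, S = stft(speech, fs, nperseg=1024)
    _, _, N = stft(noise, fs, nperseg=1024)
    # Keep time-frequency bins where the local speech-to-noise ratio
    # exceeds the threshold
    local_snr_db = 20 * np.log10(np.abs(S) + 1e-12) \
                 - 20 * np.log10(np.abs(N) + 1e-12)
    mask = (local_snr_db > threshold_db).astype(float)
    # Apply the mask to the mixture STFT (STFT is linear, so S + N
    # equals the STFT of the noisy mixture)
    _, enhanced = istft(mask * (S + N), fs, nperseg=1024)
    return enhanced
{CODE}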