!::Determined convolutive mixtures under dynamic conditions::
Blind source separation (BSS) in real-world environments is a challenging task even in the simplest, well-determined case, where the number of sources is known in advance and is less than or equal to the number of microphones. For this reason, the experimental evaluation of most algorithms proposed in the literature is conducted in controlled scenarios: the reverberation is not very high, the length of the mixtures is given, and the sources are observed for a relatively long time and do not change their locations. However, such conditions do not reflect a real-world scenario well: the reverberation cannot be neglected, and many sources may move in the environment or produce sounds from random locations (e.g. a meeting where multiple speakers sit at different static locations). Furthermore, the source activity is unknown and different sources overlap at different time instants.
This task is derived from [http://sisec2010.wiki.irisa.fr/tiki-index.php?page=Determined+convolutive+mixtures+under+dynamic+conditions|Determined convolutive mixtures under dynamic conditions] in SiSEC2010. While in SiSEC2010 only two channels were considered, these datasets provide four-channel mixtures. However, we still consider the case where at most two speakers are simultaneously active. Participants can decide whether to use all the available channels or only a subset of them.
!!Results
Results for the test and development datasets: [http://www.irisa.fr/metiss/SiSEC11/dynamic/main.html|results]

!!Description of the datasets
!!!!b) Setup 2:
The two competing sources can be located anywhere in the angular space (-90°; +90°), but never in the same angular direction (see figure 3).
The source mixtures are obtained by summing the individual source components recorded by each microphone. The components are generated by convolving random utterances with measured impulse responses and contaminating them with additive white Gaussian noise (AWGN) at an SNR of 50 dB.
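The generation procedure described above can be sketched as follows. This is a minimal illustration, not the organizers' actual simulation code: the function names and the 50 dB default are taken from the description, while the signal and impulse-response inputs are assumed to be plain 1-D arrays.

```python
import numpy as np

def mix_sources(utterances, rirs, snr_db=50.0, rng=None):
    """Simulate the convolutive mixture at one microphone.

    utterances: list of 1-D source signals
    rirs: list of 1-D measured impulse responses (one per source)
    snr_db: SNR of the additive white Gaussian noise (50 dB here)
    """
    rng = np.random.default_rng(rng)
    # Spatial image of each source: utterance convolved with its RIR.
    images = [np.convolve(s, h) for s, h in zip(utterances, rirs)]
    n = max(len(x) for x in images)
    # Sum the individual source components recorded by the microphone.
    mixture = np.zeros(n)
    for x in images:
        mixture[:len(x)] += x
    # Scale white Gaussian noise to the requested SNR.
    sig_power = np.mean(mixture ** 2)
    noise_power = sig_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), n)
    return mixture + noise, images
```

Returning the clean spatial images alongside the noisy mixture mirrors the datasets, where individual source images are kept for performance evaluation.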
::{img src=http://bssnesta.webatu.com/sisec2011/figure2.jpg width=445}::
::__Figure2__: graphical explanation of dataset1, setup1 (a different color means a different source)::
::{img src=http://bssnesta.webatu.com/sisec2011/figure3.jpg width=445}::
::__Figure3__: graphical explanation of dataset1, setup2 (a different color means a different source)::

!!!Dataset2
The mixtures were obtained by summing the spatial images (responses) of the individual moving source with those of the static source (the latter is simulated as for dataset1). Note that such a mixture is not fully realistic, because in reality all moving objects affect all source-microphone impulse responses. However, individual spatial images are required for a more accurate performance evaluation.
::{img src=http://bssnesta.webatu.com/sisec2011/figure4.jpg width=445}::
::__Figure4__: graphical explanation of dataset2, scenario a) (a different color means a different source)::
::{img src=http://bssnesta.webatu.com/sisec2011/figure5.jpg width=445}::
::__Figure5__: graphical explanation of dataset2, scenario b) (a different color means a different source)::

!!Development datasets
The development datasets are available at:
[http://shine.fbk.eu/sites/shine.fbk.eu/files/nesta/datasets.html|datasets]
It includes the directories/sub-directories:
The test datasets are available at:
[http://shine.fbk.eu/sites/shine.fbk.eu/files/nesta/datasets.html|datasets]
Files are organized as in the dev archive. The segmentation file and the individual source image files are not included. Note that the test data have been simulated with different instances of impulse responses (i.e. at different angular directions).
!!Tasks
We propose the following two tasks:
#__mono source signal estimation__: estimation of the source signals
!!Submission
Note that the submitted audio files will be made available on a website under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 2.0 license.
Each participant is asked to submit the estimation results of his/her algorithm for tasks 1 and/or 2 over all or part of the mixtures in the test datasets.
Each participant should then send an email to "nesta(at) fbk.eu" providing:
-contact information (name, affiliation)
-basic information about his/her algorithm, including its average running time (in seconds per test excerpt and per GHz of CPU) and a bibliographical reference if possible
-the URL of the tarball(s)
Files have to be organized as for the "dev" and "test" datasets, including in the folders the output files with the following syntax:
source 1: y_<array spacing>_src_1.wav (single channel .wav file)
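A small helper can enforce the required naming and single-channel format when writing estimates. This is only a sketch: the `spacing` identifier stands in for the `<array spacing>` placeholder above (the exact identifiers are those used in the dataset folders), and the 16 kHz default sample rate is an assumption not stated in this excerpt.

```python
import wave
import numpy as np

def write_estimate(out_dir, spacing, src_index, signal, fs=16000):
    """Write one estimated source as a single-channel 16-bit PCM .wav
    file named y_<array spacing>_src_<i>.wav, per the submission syntax.

    spacing: array-spacing identifier (hypothetical example: "5cm")
    fs: sample rate in Hz (assumed default, adjust to the dataset)
    """
    path = f"{out_dir}/y_{spacing}_src_{src_index}.wav"
    # Clip to [-1, 1] and convert float samples to 16-bit integers.
    pcm = (np.clip(signal, -1.0, 1.0) * 32767).astype(np.int16)
    with wave.open(path, "wb") as w:
        w.setnchannels(1)   # single channel, as required
        w.setsampwidth(2)   # 16-bit PCM
        w.setframerate(fs)
        w.writeframes(pcm.tobytes())
    return path
```

Calling `write_estimate(folder, "5cm", 1, y1)` produces `y_5cm_src_1.wav` inside the given folder, matching the syntax for source 1.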
!!Evaluation criteria
Based on the evaluation method for source signal estimation in SiSEC2008, we propose to evaluate the estimated source signal(s) (and/or source images) via the criteria defined in the [http://bass-db.gforge.inria.fr/bss_eval/|BSS_EVAL] toolbox. These criteria allow an arbitrary filtering between the estimated source and the true source, and measure interference, distortion and artifacts separately. All source orderings are tested and the ordering leading to the best SIR is selected, which resolves the permutation ambiguity. Several tools for evaluation can be found on the previous SiSEC2008 page.
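The permutation-handling step above can be illustrated with a simplified sketch. This is not the BSS_EVAL toolbox itself: the SIR here uses a plain projection onto the reference (BSS_EVAL additionally allows an arbitrary filtering between estimate and reference), and the function names are ours.

```python
import itertools
import numpy as np

def sir_db(ref, est):
    """Simplified SIR: project the estimate onto the reference and
    count the residual as interference."""
    target = (np.dot(est, ref) / np.dot(ref, ref)) * ref
    interf = est - target
    return 10 * np.log10(np.sum(target ** 2) / np.sum(interf ** 2))

def best_ordering(refs, ests):
    """Test all source orderings and keep the one with the best mean
    SIR, resolving the permutation ambiguity."""
    best, best_perm = -np.inf, None
    for perm in itertools.permutations(range(len(ests))):
        mean_sir = np.mean([sir_db(refs[i], ests[p])
                            for i, p in enumerate(perm)])
        if mean_sir > best:
            best, best_perm = mean_sir, perm
    return best_perm, best
```

Exhaustive search over orderings is affordable here because the number of sources is small (at most two are simultaneously active in these datasets).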
Additional evaluation will be provided through the perceptual evaluation toolkit [http://bass-db.gforge.inria.fr/peass/|PEASS].
!!Potential Participants
J. Malek
Z. Koldovsky