Papers and Book Chapters

List of publications

Z

Schönherr, L., Kohls, K., Zeiler, S., Holz, T., Kolossa, D. (2019). "Adversarial Attacks Against Automatic Speech Recognition Systems via Psychoacoustic Hiding", accepted for publication, NDSS 2019.

Freiwald, J., Karbasi, M., Zeiler, S., Melchior, J., Kompella, V., Wiskott, L., Kolossa, D. (2018). “Utilizing Slow Feature Analysis for Lipreading”, in Speech Communication: 13. ITG-Fachtagung Sprachkommunikation, 10.-12. Oktober 2018 in Oldenburg, ed. Simon Doclo and Peter Jax, 191–95. ITG-Fachbericht 282. Berlin: VDE VERLAG.

Zeiler, S., Meutzner, H., Hussen Abdelaziz, A., Kolossa, D. (2016). "Introducing the Turbo-Twin-HMM for Audio-Visual Speech Enhancement", Proc. INTERSPEECH, San Francisco, USA, September 2016.

Gergen, S., Zeiler, S., Hussen Abdelaziz, A., Kolossa, D. (2016). "New Insights into Turbo-Decoding-Based AVSR with Dynamic Stream Weights", ITG-Fachtagung Sprachkommunikation, Paderborn, Germany, Oct. 2016.

Gergen, S., Zeiler, S., Hussen Abdelaziz, A., Nickel, R., Kolossa, D. (2016). "Dynamic Stream Weighting for Turbo-Decoding-Based Audiovisual ASR", Interspeech 2016, San Francisco, Sept. 2016.

Hussen Abdelaziz, A., Zeiler, S., Kolossa, D. (2014). "A new EM estimation of dynamic stream weights for coupled-HMM-based audio-visual ASR", Proc. ICASSP, Florence, May 2014.

Zeiler, S., Cwiklak, J., Kolossa, D. (2014). "Robust Multimodal Human Machine Interaction using the Kinect Sensor", Proc. ITG Fachtagung Sprachkommunikation, September 2014.

Astudillo, R. F., Kolossa, D., Abad, A., Zeiler, S., Saeidi, R., Mowlaee, P., da Silva Neto, J. P., Martin, R. (2013). "Integration of Beamforming and Uncertainty-of-Observation Techniques for Robust ASR in Multi-Source Environments", Computer Speech and Language, Special Issue on Multisource Environments, vol. 27, no. 3, pp. 837-850, May 2013.

Hussen Abdelaziz, A., Zeiler, S., Kolossa, D. (2013). "Using Twin-HMM-Based Audio-Visual Speech Enhancement as a Front-End for Robust Audio-Visual Speech Recognition", Proc. Interspeech, Lyon, France, August 2013.

Hussen Abdelaziz, A., Zeiler, S., Kolossa, D. (2013). "Twin-HMM-based audio-visual speech enhancement", Proc. ICASSP, Vancouver, Canada, May 2013.

Hussen Abdelaziz, A., Zeiler, S., Kolossa, D., Leutnant, V., Haeb-Umbach, R. (2013). "GMM-based Significance Decoding", Proc. ICASSP, Vancouver, Canada, May 2013.

Kolossa, D., Zeiler, S., Saeidi, R., Astudillo, R. F. (2013). "Noise-Adaptive LDA: A New Approach for Speech Recognition Under Observation Uncertainty", IEEE Signal Processing Letters, vol. 20, no. 11, pp. 1018-1021, 2013.

Meutzner, H., Schlesinger, A., Zeiler, S., Kolossa, D. (2013). "Binaural Signal Processing for Enhanced Speech Recognition Robustness in Complex Listening Environments", Proc. 2nd CHiME Workshop on Machine Listening in Multisource Environments, Vancouver, Canada, June 2013.

Hussen Abdelaziz, A., Zeiler, S., Kolossa, D. (2012). "Audio-Visual Speech Recognition for Uncertain Acoustical Observations", ITG Fachtagung Sprachkommunikation, 2012.

Nickel, R., Astudillo, R. F., Kolossa, D., Zeiler, S., Martin, R. (2012). "Inventory-Style Speech Enhancement with Uncertainty-of-Observation Techniques", ICASSP, pp. 4645-4648, Kyoto, Japan, March 2012.

Kolossa, D., Astudillo, R. F., Abad, A., Zeiler, S., Saeidi, R., Mowlaee, P., da Silva Neto, J. P., Martin, R. (2011). “CHiME Challenge: Approaches to Robustness using Beamforming and Uncertainty-of-Observation Techniques”, to appear in Proc. CHiME 2011 Workshop on Machine Listening in Multisource Environments, Interspeech 2011 satellite event.

Vorwerk, A., Zeiler, S., Kolossa, D., Astudillo, R. F., Lerch, D. (2011). “Use of Missing and Unreliable Data for Audiovisual Speech Recognition”, in: Kolossa, D., Haeb-Umbach, R. (eds.): “Robust Speech Recognition of Uncertain or Missing Data - Theory and Applications”, Springer Verlag, pp. 345-375, July 2011.

Kolossa, D., Astudillo, R. F., Abad, A., Zeiler, S., Saeidi, R., Mowlaee, P., da Silva Neto, J. P., Martin, R. (2011). “CHiME Challenge: Approaches to Robustness using Beamforming and Uncertainty-of-Observation Techniques”, to appear in Proc. CHiME Workshop on Machine Listening in Multisource Environments, Florence, Italy, Sept. 1, 2011.

Kolossa, D., Astudillo, R. F., Zeiler, S., Vorwerk, A., Lerch, D., Chong, J., Orglmeister, R. (2010). “Missing Feature Audiovisual Speech Recognition under Real-Time Constraints”, ITG Fachtagung Sprachkommunikation, paper 22, 4 pages, Bochum, Germany, October 6-8, 2010.

Kolossa, D., Chong, J., Zeiler, S., Keutzer, K. (2010). “Efficient Manycore CHMM Speech Recognition for Audiovisual and Multistream Data”, Proc. Interspeech 2010, pp. 2698–2701, Makuhari, Japan, September 26-30, 2010.

Vorwerk, A., Wang, X., Kolossa, D., Zeiler, S., Orglmeister, R. (2010). "WAPUSK20 - A Database for Robust Audiovisual Speech Recognition", Proc. 7th Int. Conf. on Language Resources and Evaluation (LREC), pp. 3016–3019, 2010.

Kolossa, D., Zeiler, S., Vorwerk, A., Orglmeister, R. (2009). "Audiovisual Speech Recognition with Missing or Unreliable Data", Audiovisual Speech Processing Workshop (AVSP 2009), Brighton, UK, September 10-13, 2009.

Zohourian, M., & Martin, R. (2019). Binaural Direct-to-Reverberant Energy Ratio and Speaker Distance Estimation. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 1. (https://doi.org/10.1109/TASLP.2019.2948730)

Zohourian, M., & Martin, R. (2019). Direct-to-reverberant Energy Ratio Estimation Based on Interaural Coherence and a Joint ITD/ILD Model. In 2019 IEEE International Conference on Acoustics, Speech and Signal Processing: Proceedings : May 12-17, 2019, Brighton Conference Centre, Brighton, United Kingdom (pp. 611–615). IEEE. (https://doi.org/10.1109/ICASSP.2019.8683336)

Zohourian, M., Stinner, J., & Martin, R. (2019). Speaker Distance Estimation using Binaural Hearing Aids and Deep Neural Networks. In M. Ochmann, M. Vorländer, & J. Fels (Eds.), Proceedings of the 23rd International Congress on Acoustics : integrating 4th EAA Euroregio 2019 : 9-13 September 2019 in Aachen, Germany (pp. 3297–3304). Deutsche Gesellschaft für Akustik. (https://doi.org/10.18154/RWTH-CONV-239773)

Zohourian, M., Martin, R. (2018). "GSC-based Binaural Speaker Separation Preserving Spatial Cues", in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Calgary, Canada, April 2018.

Zohourian, M., Enzner, G., Martin, R. (2018). "Binaural Speaker Localization Integrated in an Adaptive Beamformer for Hearing Aids", IEEE/ACM Trans. Audio, Speech, and Language Processing, Vol. 26, no. 3, pp. 515-528, March 2018.

Zohourian, M., Martin, R., Madhu, N. (2017). "New insights into the role of the head radius in model-based binaural speaker localization", in 2017 25th European Signal Processing Conference (EUSIPCO), Aug 2017, pp. 221–225.
