Please use this identifier to cite or link to this item: http://hdl.handle.net/1893/24710
Appears in Collections: Psychology Book Chapters and Sections
Title: A Data Driven Approach to Audiovisual Speech Mapping
Author(s): Abel, Andrew
Marxer, Ricard
Hussain, Amir
Barker, Jon
Watt, Roger
Whitmer, Bill
Derleth, Peter
Contact Email: r.j.watt@stir.ac.uk
Editor(s): Liu, CL
Hussain, A
Luo, B
Tan, KC
Zeng, Y
Zhang, Z
Sponsor: Engineering and Physical Sciences Research Council
Citation: Abel A, Marxer R, Hussain A, Barker J, Watt R, Whitmer B & Derleth P (2016) A Data Driven Approach to Audiovisual Speech Mapping. In: Liu C, Hussain A, Luo B, Tan K, Zeng Y & Zhang Z (eds.) Advances in Brain Inspired Cognitive Systems. Lecture Notes in Computer Science, 10023. BICS 2016: International Conference on Brain Inspired Cognitive Systems, Beijing, China, 28.11.2016-30.11.2016. Cham, Switzerland: Springer, pp. 331-342. https://doi.org/10.1007/978-3-319-49685-6_30
Keywords: Audiovisual
Speech processing
Speech mapping
ANNs
Issue Date: Dec-2016
Date Deposited: 13-Dec-2016
Series/Report no.: Lecture Notes in Computer Science, 10023
Abstract: The concept of using visual information as part of audio speech processing has been of significant recent interest. This paper presents a data driven approach that considers estimating audio speech acoustics using only temporal visual information without considering linguistic features such as phonemes and visemes. Audio (log filterbank) and visual (2D-DCT) features are extracted, and various configurations of MLP and datasets are used to identify optimal results, showing that given a sequence of prior visual frames an equivalent reasonably accurate audio frame estimation can be mapped.
Rights: Publisher policy allows this work to be made available in this repository. Published in Liu CL., Hussain A., Luo B., Tan K., Zeng Y., Zhang Z. (eds) Advances in Brain Inspired Cognitive Systems. BICS 2016. Lecture Notes in Computer Science, vol 10023, published by Springer. The original publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-49685-6_30
DOI Link: 10.1007/978-3-319-49685-6_30
Licence URL(s): http://creativecommons.org/licenses/by-nc-sa/4.0/

Files in This Item:
File: abelBics2016Paper-final-submitted.pdf
Description: Fulltext - Accepted Version
Size: 193.3 kB
Format: Adobe PDF



This item is protected by original copyright



A file in this item is licensed under a Creative Commons License.

Items in the Repository are protected by copyright, with all rights reserved, unless otherwise indicated.

The metadata of the records in the Repository are available under the CC0 public domain dedication: No Rights Reserved https://creativecommons.org/publicdomain/zero/1.0/

If you believe that any material held in STORRE infringes copyright, please contact library@stir.ac.uk providing details and we will remove the Work from public display in STORRE and investigate your claim.