A Data Driven Approach to Audiovisual Speech Mapping

Abel, Andrew; Marxer, Ricard; Hussain, Amir; Barker, Jon; Watt, Roger; Whitmer, Bill; Derleth, Peter

doi:10.1007/978-3-319-49685-6_30

Please use this identifier to cite or link to this item: http://hdl.handle.net/1893/24710

Full metadata record

DC Field	Value	Language
dc.contributor.author	Abel, Andrew	en_UK
dc.contributor.author	Marxer, Ricard	en_UK
dc.contributor.author	Hussain, Amir	en_UK
dc.contributor.author	Barker, Jon	en_UK
dc.contributor.author	Watt, Roger	en_UK
dc.contributor.author	Whitmer, Bill	en_UK
dc.contributor.author	Derleth, Peter	en_UK
dc.contributor.editor	Liu, CL	en_UK
dc.contributor.editor	Hussain, A	en_UK
dc.contributor.editor	Luo, B	en_UK
dc.contributor.editor	Tan, KC	en_UK
dc.contributor.editor	Zeng, Y	en_UK
dc.contributor.editor	Zhang, Z	en_UK
dc.date.accessioned	2017-08-26T07:37:52Z	-
dc.date.available	2017-08-26T07:37:52Z	-
dc.date.issued	2016-12	en_UK
dc.identifier.uri	http://hdl.handle.net/1893/24710	-
dc.description.abstract	The concept of using visual information as part of audio speech processing has been of significant recent interest. This paper presents a data-driven approach that estimates audio speech acoustics using only temporal visual information, without considering linguistic features such as phonemes and visemes. Audio (log filterbank) and visual (2D-DCT) features are extracted, and various MLP configurations and datasets are evaluated to identify the best-performing setup. The results show that, given a sequence of prior visual frames, a reasonably accurate estimate of the corresponding audio frame can be produced.	en_UK
dc.language.iso	en	en_UK
dc.publisher	Springer	en_UK
dc.relation	Abel A, Marxer R, Hussain A, Barker J, Watt R, Whitmer B & Derleth P (2016) A Data Driven Approach to Audiovisual Speech Mapping. In: Liu C, Hussain A, Luo B, Tan K, Zeng Y & Zhang Z (eds.) Advances in Brain Inspired Cognitive Systems. Lecture Notes in Computer Science, 10023. BICS 2016: International Conference on Brain Inspired Cognitive Systems, Beijing, China, 28.11.2016-30.11.2016. Cham, Switzerland: Springer, pp. 331-342. https://doi.org/10.1007/978-3-319-49685-6_30	en_UK
dc.relation.ispartofseries	Lecture Notes in Computer Science, 10023	en_UK
dc.rights	Publisher policy allows this work to be made available in this repository. Published in Liu CL., Hussain A., Luo B., Tan K., Zeng Y., Zhang Z. (eds) Advances in Brain Inspired Cognitive Systems. BICS 2016. Lecture Notes in Computer Science, vol 10023, published by Springer. The original publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-49685-6_30	en_UK
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/4.0/	en_UK
dc.subject	Audiovisual	en_UK
dc.subject	Speech processing	en_UK
dc.subject	Speech mapping	en_UK
dc.subject	ANNs	en_UK
dc.title	A Data Driven Approach to Audiovisual Speech Mapping	en_UK
dc.type	Conference Paper	en_UK
dc.identifier.doi	10.1007/978-3-319-49685-6_30	en_UK
dc.citation.issn	0302-9743	en_UK
dc.citation.spage	331	en_UK
dc.citation.epage	342	en_UK
dc.citation.publicationstatus	Published	en_UK
dc.type.status	AM - Accepted Manuscript	en_UK
dc.contributor.funder	Engineering and Physical Sciences Research Council	en_UK
dc.author.email	r.j.watt@stir.ac.uk	en_UK
dc.citation.btitle	Advances in Brain Inspired Cognitive Systems	en_UK
dc.citation.conferencedates	2016-11-28 - 2016-11-30	en_UK
dc.citation.conferencelocation	Beijing, China	en_UK
dc.citation.conferencename	BICS 2016: International Conference on Brain Inspired Cognitive Systems	en_UK
dc.citation.date	2016-11-30	en_UK
dc.citation.isbn	978-3-319-49685-6	en_UK
dc.publisher.address	Cham, Switzerland	en_UK
dc.contributor.affiliation	Computing Science	en_UK
dc.contributor.affiliation	University of Sheffield	en_UK
dc.contributor.affiliation	Computing Science	en_UK
dc.contributor.affiliation	University of Sheffield	en_UK
dc.contributor.affiliation	Psychology	en_UK
dc.contributor.affiliation	Medical Research Council Institute of Hearing Research	en_UK
dc.contributor.affiliation	Sonova International	en_UK
dc.identifier.scopusid	2-s2.0-84997282854	en_UK
dc.identifier.wtid	542655	en_UK
dc.contributor.orcid	0000-0002-8080-082X	en_UK
dc.contributor.orcid	0000-0001-8660-1875	en_UK
dc.date.accepted	2016-08-24	en_UK
dcterms.dateAccepted	2016-08-24	en_UK
dc.date.filedepositdate	2016-12-13	en_UK
dc.relation.funderproject	Towards visually-driven speech enhancement for cognitively-inspired multi-modal hearing-aid devices	en_UK
dc.relation.funderref	EP/M026981/1	en_UK
rioxxterms.apc	not required	en_UK
rioxxterms.type	Conference Paper/Proceeding/Abstract	en_UK
rioxxterms.version	AM	en_UK
local.rioxx.author	Abel, Andrew|	en_UK
local.rioxx.author	Marxer, Ricard|	en_UK
local.rioxx.author	Hussain, Amir|0000-0002-8080-082X	en_UK
local.rioxx.author	Barker, Jon|	en_UK
local.rioxx.author	Watt, Roger|0000-0001-8660-1875	en_UK
local.rioxx.author	Whitmer, Bill|	en_UK
local.rioxx.author	Derleth, Peter|	en_UK
local.rioxx.project	EP/M026981/1|Engineering and Physical Sciences Research Council|http://dx.doi.org/10.13039/501100000266	en_UK
local.rioxx.contributor	Liu, CL|	en_UK
local.rioxx.contributor	Hussain, A|	en_UK
local.rioxx.contributor	Luo, B|	en_UK
local.rioxx.contributor	Tan, KC|	en_UK
local.rioxx.contributor	Zeng, Y|	en_UK
local.rioxx.contributor	Zhang, Z|	en_UK
local.rioxx.freetoreaddate	2016-12-16	en_UK
local.rioxx.licence	http://creativecommons.org/licenses/by-nc-sa/4.0/|2016-12-16|	en_UK
local.rioxx.filename	abelBics2016Paper-final-submitted.pdf	en_UK
local.rioxx.filecount	1	en_UK
local.rioxx.source	978-3-319-49685-6	en_UK
Appears in Collections:	Psychology Book Chapters and Sections

Files in This Item:

File	Description	Size	Format
abelBics2016Paper-final-submitted.pdf	Fulltext - Accepted Version	193.3 kB	Adobe PDF

This item is protected by original copyright

A file in this item is licensed under a Creative Commons License

STORRE: Stirling Online Research Repository