Multi-modal Speech Processing Methods: An Overview and Future Research Directions Using a MATLAB Based Audio-Visual Toolbox

Abel, Andrew; Hussain, Amir

doi:10.1007/978-3-642-00525-1_12

Please use this identifier to cite or link to this item: http://hdl.handle.net/1893/10874

Full metadata record

DC Field	Value	Language
dc.contributor.author	Abel, Andrew	en_UK
dc.contributor.author	Hussain, Amir	en_UK
dc.contributor.editor	Esposito, A	en_UK
dc.contributor.editor	Hussain, A	en_UK
dc.contributor.editor	Marinaro, M	en_UK
dc.contributor.editor	Martone, R	en_UK
dc.date.accessioned	2014-05-27T23:10:57Z	-
dc.date.available	2014-05-27T23:10:57Z	en_UK
dc.date.issued	2009	en_UK
dc.identifier.uri	http://hdl.handle.net/1893/10874	-
dc.description.abstract	This paper presents an overview of the main multi-modal speech enhancement methods reported to date. In particular, a new MATLAB-based toolbox developed by Barbosa et al. (2007) for processing audio-visual data is reviewed and its performance potential evaluated. It is shown that the tool does not represent a complete and comprehensive speech processing solution, but rather serves as a standardised, yet versatile base to build upon with further research. To demonstrate this versatility, preliminary examples that apply these computational procedures to an audiovisual corpus are presented. Finally, some future research directions in the area of multi-modal speech processing are outlined, including work that the authors aim to carry out with the aid of this newly developed audio-visual MATLAB toolbox, such as toolbox customisation and the processing of noisy speech in real-world environments.	en_UK
dc.language.iso	en	en_UK
dc.publisher	Springer-Verlag	en_UK
dc.relation	Abel A & Hussain A (2009) Multi-modal Speech Processing Methods: An Overview and Future Research Directions Using a MATLAB Based Audio-Visual Toolbox. In: Esposito A, Hussain A, Marinaro M & Martone R (eds.) Multimodal Signals: Cognitive and Algorithmic Issues: COST Action 2102 and euCognition International School Vietri sul Mare, Italy, April 21-26, 2008: Revised, Selected and Invited Papers. Lecture Notes in Computer Science, 5398. Berlin, Germany: Springer-Verlag, pp. 121-129. http://link.springer.com/chapter/10.1007%2F978-3-642-00525-1_12; https://doi.org/10.1007/978-3-642-00525-1_12	en_UK
dc.relation.ispartofseries	Lecture Notes in Computer Science, 5398	en_UK
dc.rights	The publisher does not allow this work to be made publicly available in this Repository. Please use the Request a Copy feature at the foot of the Repository record to request a copy directly from the author. You can only request a copy if you wish to use this work for your own research or private study.	en_UK
dc.rights.uri	http://www.rioxx.net/licenses/under-embargo-all-rights-reserved	en_UK
dc.subject	C	en_UK
dc.subject	DIRECTION	en_UK
dc.subject	Future	en_UK
dc.subject	method	en_UK
dc.subject	methods	en_UK
dc.subject	Research	en_UK
dc.subject	Speech	en_UK
dc.title	Multi-modal Speech Processing Methods: An Overview and Future Research Directions Using a MATLAB Based Audio-Visual Toolbox	en_UK
dc.type	Part of book or chapter of book	en_UK
dc.rights.embargodate	3000-12-01	en_UK
dc.rights.embargoreason	[Abel_2009_Multi-modal_Speech_Processing_Methods.pdf] The publisher does not allow this work to be made publicly available in this Repository therefore there is an embargo on the full text of the work.	en_UK
dc.identifier.doi	10.1007/978-3-642-00525-1_12	en_UK
dc.citation.issn	0302-9743	en_UK
dc.citation.spage	121	en_UK
dc.citation.epage	129	en_UK
dc.citation.publicationstatus	Published	en_UK
dc.type.status	VoR - Version of Record	en_UK
dc.identifier.url	http://link.springer.com/chapter/10.1007%2F978-3-642-00525-1_12	en_UK
dc.author.email	aka@cs.stir.ac.uk	en_UK
dc.citation.btitle	Multimodal Signals: Cognitive and Algorithmic Issues: COST Action 2102 and euCognition International School Vietri sul Mare, Italy, April 21-26, 2008: Revised, Selected and Invited Papers	en_UK
dc.citation.isbn	978-3642005244	en_UK
dc.publisher.address	Berlin, Germany	en_UK
dc.contributor.affiliation	Computing Science	en_UK
dc.contributor.affiliation	Computing Science	en_UK
dc.identifier.wtid	793662	en_UK
dc.contributor.orcid	0000-0002-8080-082X	en_UK
dcterms.dateAccepted	2009-12-31	en_UK
dc.date.filedepositdate	2013-02-06	en_UK
rioxxterms.type	Book chapter	en_UK
rioxxterms.version	VoR	en_UK
local.rioxx.author	Abel, Andrew\|	en_UK
local.rioxx.author	Hussain, Amir\|0000-0002-8080-082X	en_UK
local.rioxx.project	Internal Project\|University of Stirling\|https://isni.org/isni/0000000122484331	en_UK
local.rioxx.contributor	Esposito, A\|	en_UK
local.rioxx.contributor	Hussain, A\|	en_UK
local.rioxx.contributor	Marinaro, M\|	en_UK
local.rioxx.contributor	Martone, R\|	en_UK
local.rioxx.freetoreaddate	3000-12-01	en_UK
local.rioxx.licence	http://www.rioxx.net/licenses/under-embargo-all-rights-reserved\|\|	en_UK
local.rioxx.filename	Abel_2009_Multi-modal_Speech_Processing_Methods.pdf	en_UK
local.rioxx.filecount	1	en_UK
local.rioxx.source	978-3642005244	en_UK
Appears in Collections:	Computing Science and Mathematics Book Chapters and Sections

Files in This Item:

File	Description	Size	Format
Abel_2009_Multi-modal_Speech_Processing_Methods.pdf	Fulltext - Published Version	694.02 kB	Adobe PDF	Under Embargo until 3000-12-01

This item is protected by original copyright

Items in the Repository are protected by copyright, with all rights reserved, unless otherwise indicated.

The metadata of the records in the Repository are available under the CC0 public domain dedication: No Rights Reserved https://creativecommons.org/publicdomain/zero/1.0/

If you believe that any material held in STORRE infringes copyright, please contact library@stir.ac.uk providing details and we will remove the Work from public display in STORRE and investigate your claim.

STORRE: Stirling Online Research Repository