|Appears in Collections:||Computing Science and Mathematics Conference Papers and Proceedings|
|Title:||Formalization and preliminary evaluation of a pipeline for text extraction from infographics|
|Citation:||Böschen F & Scherp A (2015) Formalization and preliminary evaluation of a pipeline for text extraction from infographics. In: Görg S, Bergmann R & Müller G (eds.) Proceedings of the LWA 2015 Workshops: KDML, FGWM, IR, and FGDB, volume 1458. CEUR Workshop Proceedings, 1458. LWA 2015 Workshops: KDML, FGWM, IR, FGDB, Trier, Germany, 07.10.2015-09.10.2015. Aachen, Germany: CEUR Workshop Proceedings, pp. 20-31. http://ceur-ws.org/Vol-1458/D03_CRC13_Boeschen.pdf|
|Series/Report no.:||CEUR Workshop Proceedings, 1458|
|Conference Name:||LWA 2015 Workshops: KDML, FGWM, IR, FGDB|
|Conference Dates:||2015-10-07 - 2015-10-09|
|Conference Location:||Trier, Germany|
|Abstract:||We propose a pipeline for text extraction from infographics that makes use of a novel combination of data mining and computer vision techniques. The pipeline defines a sequence of steps to identify characters, cluster them into text lines, determine their rotation angle, and apply state-of-the-art OCR to recognise the text. In this paper, we formally define the pipeline and present its current implementation. In addition, we have conducted preliminary evaluations over a data corpus of 121 manually annotated infographics from a broad range of illustration types such as bar charts, pie charts, line charts, maps, and others. We assess the results of our text extraction pipeline by comparing them with two baselines. Finally, we sketch an outline for future work and possibilities for improving the pipeline.|
|Status:||VoR - Version of Record|
|Rights:||The copyright is owned by default by the authors. Copying is permitted only for private and academic purposes. The permission for academic use implies an attribution obligation, i.e., you must properly cite the items that you use in your own published work. Modification is not permitted unless a suitable license is granted by its copyright owners. Copying or use for commercial purposes is forbidden unless an explicit permission is acquired from the copyright owners.|
|Böschen-Scherp-2015.pdf||Fulltext - Published Version||433.5 kB||Adobe PDF|
This item is protected by original copyright