Please use this identifier to cite or link to this item:
http://hdl.handle.net/1893/28051
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Böschen, Falk | en_UK |
dc.contributor.author | Scherp, Ansgar | en_UK |
dc.contributor.editor | Görg, S | en_UK |
dc.contributor.editor | Bergmann, R | en_UK |
dc.contributor.editor | Müller, G | en_UK |
dc.date.accessioned | 2018-11-06T14:29:50Z | - |
dc.date.available | 2018-11-06T14:29:50Z | - |
dc.date.issued | 2015-12-31 | en_UK |
dc.identifier.uri | http://hdl.handle.net/1893/28051 | - |
dc.description.abstract | We propose a pipeline for text extraction from infographics that makes use of a novel combination of data mining and computer vision techniques. The pipeline defines a sequence of steps to identify characters, cluster them into text lines, determine their rotation angle, and apply state-of-the-art OCR to recognise the text. In this paper, we formally define the pipeline and present its current implementation. In addition, we have conducted preliminary evaluations over a data corpus of 121 manually annotated infographics from a broad range of illustration types such as bar charts, pie charts, line charts, maps, and others. We assess the results of our text extraction pipeline by comparing it with two baselines. Finally, we sketch an outline for future work and possibilities for improving the pipeline. | en_UK |
dc.publisher | CEUR Workshop Proceedings | en_UK |
dc.relation | Böschen F & Scherp A (2015) Formalization and preliminary evaluation of a pipeline for text extraction from infographics. In: Görg S, Bergmann R & Müller G (eds.) Proceedings of the LWA 2015 Workshops: KDML, FGWM, IR, and FGDB, volume 1458. CEUR Workshop Proceedings, 1458. LWA 2015 Workshops: KDML, FGWM, IR, FGDB, Trier, Germany, 07.10.2015-09.10.2015. Aachen, Germany: CEUR Workshop Proceedings, pp. 20-31. http://ceur-ws.org/Vol-1458/D03_CRC13_Boeschen.pdf | en_UK |
dc.relation.ispartofseries | CEUR Workshop Proceedings, 1458 | en_UK |
dc.rights | The copyright is owned by default by the authors. Copying is permitted only for private and academic purposes. The permission for academic use implies an attribution obligation, i.e., you must properly cite the items that you use in your own published work. Modification is not permitted unless a suitable license is granted by its copyright owners. Copying or use for commercial purposes is forbidden unless an explicit permission is acquired from the copyright owners. | en_UK |
dc.subject | Infographics | en_UK |
dc.subject | OCR | en_UK |
dc.subject | multi-oriented text extraction | en_UK |
dc.subject | formalization | en_UK |
dc.title | Formalization and preliminary evaluation of a pipeline for text extraction from infographics | en_UK |
dc.type | Conference Paper | en_UK |
dc.citation.jtitle | CEUR Workshop Proceedings | en_UK |
dc.citation.issn | 1613-0073 | en_UK |
dc.citation.volume | 1458 | en_UK |
dc.citation.spage | 20 | en_UK |
dc.citation.epage | 31 | en_UK |
dc.citation.publicationstatus | Published | en_UK |
dc.type.status | VoR - Version of Record | en_UK |
dc.identifier.url | http://ceur-ws.org/Vol-1458/D03_CRC13_Boeschen.pdf | en_UK |
dc.citation.btitle | Proceedings of the LWA 2015 Workshops: KDML, FGWM, IR, and FGDB | en_UK |
dc.citation.conferencedates | 2015-10-07 - 2015-10-09 | en_UK |
dc.citation.conferencelocation | Trier, Germany | en_UK |
dc.citation.conferencename | LWA 2015 Workshops: KDML, FGWM, IR, FGDB | en_UK |
dc.citation.isbn | N/A | en_UK |
dc.publisher.address | Aachen, Germany | en_UK |
dc.contributor.affiliation | University of Kiel | en_UK |
dc.contributor.affiliation | Leibniz Information Centre for Economics - ZBW | en_UK |
dc.identifier.scopusid | 2-s2.0-84944322158 | en_UK |
dc.identifier.wtid | 1007296 | en_UK |
dc.contributor.orcid | 0000-0002-2653-9245 | en_UK |
dc.date.accepted | 2015-08-17 | en_UK |
dcterms.dateAccepted | 2015-08-17 | en_UK |
dc.date.filedepositdate | 2018-10-22 | en_UK |
rioxxterms.apc | not required | en_UK |
rioxxterms.type | Conference Paper/Proceeding/Abstract | en_UK |
rioxxterms.version | VoR | en_UK |
local.rioxx.author | Böschen, Falk| | en_UK |
local.rioxx.author | Scherp, Ansgar|0000-0002-2653-9245 | en_UK |
local.rioxx.project | Internal Project|University of Stirling|https://isni.org/isni/0000000122484331 | en_UK |
local.rioxx.contributor | Görg, S| | en_UK |
local.rioxx.contributor | Bergmann, R| | en_UK |
local.rioxx.contributor | Müller, G| | en_UK |
local.rioxx.freetoreaddate | 2018-10-22 | en_UK |
local.rioxx.licence | http://www.rioxx.net/licenses/all-rights-reserved|2018-10-22| | en_UK |
local.rioxx.filename | Böschen-Scherp-2015.pdf | en_UK |
local.rioxx.filecount | 1 | en_UK |
local.rioxx.source | N/A | en_UK |
Appears in Collections: Computing Science and Mathematics Conference Papers and Proceedings
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Böschen-Scherp-2015.pdf | Fulltext - Published Version | 433.5 kB | Adobe PDF |
This item is protected by original copyright
Items in the Repository are protected by copyright, with all rights reserved, unless otherwise indicated.
The metadata of the records in the Repository are available under the CC0 public domain dedication: No Rights Reserved https://creativecommons.org/publicdomain/zero/1.0/
If you believe that any material held in STORRE infringes copyright, please contact library@stir.ac.uk with details; we will remove the Work from public display in STORRE and investigate your claim.