Text Data Augmentation Using Generative Adversarial Networks – A Systematic Review

Silva, Kanishka; Can, Burcu; Sarwar, Raheem; Blain, Frederic; Mitkov, Ruslan

doi:10.33919/JCAL.23.1.1

Please use this identifier to cite or link to this item: http://hdl.handle.net/1893/36157

Appears in Collections:	Computing Science and Mathematics Journal Articles
Peer Review Status:	Refereed
Title:	Text Data Augmentation Using Generative Adversarial Networks – A Systematic Review
Author(s):	Silva, Kanishka; Can, Burcu; Sarwar, Raheem; Blain, Frederic; Mitkov, Ruslan
Contact Email:	burcu.can@stir.ac.uk
Keywords:	Text Data Augmentation; Generative Adversarial Networks; Adversarial Training; Text Generation
Issue Date:	1-Jun-2023
Date Deposited:	30-Jul-2024
Citation:	Silva K, Can B, Sarwar R, Blain F & Mitkov R (2023) Text Data Augmentation Using Generative Adversarial Networks – A Systematic Review. Journal of Computational and Applied Linguistics (JCAL), 1, pp. 6-38. https://ojs.nbu.bg/index.php/JCAL; https://doi.org/10.33919/JCAL.23.1.1
Abstract:	Insufficient data is one of the main drawbacks in natural language processing tasks, and the most prevalent solution is to collect a decent amount of data that will be enough for the optimisation of the model. However, recent research directions are strategically moving towards increasing training examples due to the nature of the data-hungry neural models. Data augmentation is an emerging area that aims to ensure the diversity of data without attempting to collect new data exclusively to boost a model's performance. Limitations in data augmentation, especially for textual data, are mainly due to the nature of language data, which is precisely discrete. Generative Adversarial Networks (GANs) were initially introduced for computer vision applications, aiming to generate highly realistic images by learning the image representations. Recent research has focused on using GANs for text generation and augmentation. This systematic review aims to present the theoretical background of GANs and their use for text augmentation alongside a systematic review of recent textual data augmentation applications such as sentiment analysis, low resource language generation, hate speech detection and fraud review analysis. Further, a notion of challenges in current research and future directions of GAN-based text augmentation are discussed in this paper to pave the way for researchers especially working on low-text resources.
URL:	https://ojs.nbu.bg/index.php/JCAL
DOI Link:	10.33919/JCAL.23.1.1
Rights:	As far as we can ascertain there are no restrictions to prevent this work being made publicly available in this repository. If you are aware of any restrictions please contact us (repository.librarian@stir.ac.uk) and we will immediately remove the work from public view.
Licence URL(s):	https://storre.stir.ac.uk/STORREEndUserLicence.pdf

Files in This Item:

File	Description	Size	Format
1_33_compressed_(1).pdf	Fulltext - Published Version	260.22 kB	Adobe PDF

This item is protected by original copyright
