A Systematic Literature Review of BERT-based Models for Natural Language Processing Tasks
Abstract
Research in the natural language processing (NLP) domain has made major advances in recent years. The Bidirectional Encoder Representations from Transformers (BERT) model and its derivatives have been at the vanguard, gaining notice for their exceptional performance across a variety of NLP applications. In this context, this study conducts a systematic literature review of current research on BERT-based models in order to describe their characteristic variations on three frequently demanded NLP tasks, i.e. text classification, question answering, and text summarization. This study employed the systematic literature review method prescribed by Kitchenham. We collected 4,120 papers from publications indexed by Scopus and Google Scholar, of which 42 complied with our defined review criteria and were selected for further analysis. Our review yielded three conclusions. First, in order to select an appropriate model for a particular NLP task, three primary concerns should be considered: i) the type of NLP problem to be resolved (i.e. the NLP task to be served), ii) the specific domain to be handled (such as financial, medical, law/legal, or others), and iii) the intended language (such as English or others). Second, learning rate, batch size, and the type of optimizer were the three hyperparameters most often tuned in model training. Third, the most widely used metrics for text classification tasks were F1-score, accuracy, precision, and sensitivity (recall), while question answering and text summarization tasks mostly used Exact Match and ROUGE, respectively.
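As a minimal illustration of the classification metrics highlighted above (accuracy, precision, recall/sensitivity, and F1-score), the sketch below computes them from binary predictions in pure Python; it is a generic example, not tied to any specific model or paper in the review:

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (0/1)."""
    # Confusion-matrix counts for the positive class
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # also called sensitivity
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Example usage with toy labels
m = classification_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
```

F1 is the harmonic mean of precision and recall, which is why papers that report it typically report precision and recall alongside it.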
Article Details
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.