Indonesian news classification application with named entity recognition approach

Nurchim Nurchim
Nurmalitasari Nurmalitasari
Zalizah Awang Long

Abstract

Many readers now find news through search engines, which return vast amounts of information. Because the set of news articles that appears changes quickly and dynamically, it is increasingly difficult to pick out the relevant information. News content therefore needs to be processed with information extraction so that its core information can be displayed. The problem is especially pronounced in Indonesian, which has varied noun-phrase entity structures that are difficult to capture with shallow parsing or grammatical induction. Named Entity Recognition (NER) can address this because it extracts news entities in depth, starting from proper nouns in text documents, and it supports tasks such as information retrieval, machine translation, question answering, and automatic summarization. This study aims to apply NER to Indonesian-language news classification. It follows a Design-Based Research process consisting of (1) pre-implementation, (2) design, (3) implementation and revision, and (4) reflection and evaluation. The application was developed in Python using the Streamlit, BeautifulSoup, gnews, and spaCy libraries. In accuracy testing, the application achieved an F1-score of 89.69% across all entity types: place, figure, day, date, and organization.
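
As an illustration of how an entity-level F1-score such as the one reported above can be computed, the following is a minimal sketch in plain Python. It is not the authors' evaluation procedure: the function name `entity_f1`, the label names, the example sentence, and the annotations are all hypothetical, and the sketch assumes exact-match scoring, where a predicted entity counts as correct only if its character span and label both match the gold annotation.

```python
from typing import Set, Tuple

# An entity is (start_char, end_char, label); the label names are hypothetical.
Entity = Tuple[int, int, str]


def entity_f1(gold: Set[Entity], predicted: Set[Entity]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact span-and-label matching."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations for the sentence:
# "Presiden Joko Widodo tiba di Jakarta pada Senin."
gold = {(9, 20, "FIGURE"), (29, 36, "PLACE"), (42, 47, "DAY")}
# The assumed model output mislabels "Senin" as a date instead of a day.
predicted = {(9, 20, "FIGURE"), (29, 36, "PLACE"), (42, 47, "DATE")}

precision, recall, f1 = entity_f1(gold, predicted)
print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

Exact matching is a strict choice: a prediction with the right span but the wrong label counts as both a false positive and a false negative. Other schemes, such as partial-overlap or per-label scoring, would give different values for the same predictions.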

Article Details

How to Cite
[1]
N. Nurchim, N. Nurmalitasari, and Z. Long, “Indonesian news classification application with named entity recognition approach”, INFOTEL, vol. 15, no. 2, pp. 130-134, May 2023.
Section
Informatics