Question classification is considered one of the most significant phases in a question answering system: it helps the system find or construct an accurate answer, which improves the overall quality of the system. In this work, we propose classifying questions into a two-layer taxonomy called the Coarse-Fine taxonomy. This is the first work on Indonesian question classification into a Coarse-Fine taxonomy. We employ feature selection and machine learning classification using the Support Vector Machine algorithm. In feature selection, we found that Unigram+TFIDF+Word Shape is the best feature combination for the Coarse category, reaching the highest accuracy of 92.9%. For the Fine category, the combination of Unigram+TFIDF+WH word features performs best, with 79.3% accuracy.
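To make the feature combinations named in the abstract concrete, the following is a minimal sketch of how unigram TF-IDF weights, word-shape indicators, and WH-word indicators might be computed for one question. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the tokenizer, the TF-IDF variant, the word-shape scheme, or the WH-word list, so `WH_WORDS`, `word_shape`, `idf_table`, and `extract_features` are all hypothetical.

```python
import math
import re
from collections import Counter

# Assumption: a small illustrative list of Indonesian question words.
# The paper's actual WH-word list is not given in the abstract.
WH_WORDS = {"apa", "siapa", "kapan", "mana", "mengapa", "bagaimana", "berapa"}


def tokenize(question):
    """Split a question into word tokens, preserving case for word shape."""
    return re.findall(r"\w+", question)


def word_shape(token):
    """Collapse a token to a coarse shape: 'Jakarta' -> 'Xx', '1945' -> 'd'."""
    shape = []
    for ch in token:
        if ch.isdigit():
            symbol = "d"
        elif ch.isupper():
            symbol = "X"
        elif ch.islower():
            symbol = "x"
        else:
            symbol = "-"
        # Merge runs of the same symbol so shapes stay short.
        if not shape or shape[-1] != symbol:
            shape.append(symbol)
    return "".join(shape)


def idf_table(questions):
    """Compute idf(t) = ln(N / df(t)) + 1 over lowercased unigrams (assumed variant)."""
    n = len(questions)
    df = Counter()
    for q in questions:
        df.update({t.lower() for t in tokenize(q)})
    return {term: math.log(n / count) + 1.0 for term, count in df.items()}


def extract_features(question, idf):
    """Return a sparse feature dict: unigram TF-IDF, word shapes, WH words."""
    tokens = tokenize(question)
    counts = Counter(t.lower() for t in tokens)
    features = {}
    for term, tf in counts.items():
        if term in idf:  # terms unseen in the training questions are skipped
            features["uni=" + term] = tf * idf[term]
    for t in tokens:
        features["shape=" + word_shape(t)] = 1.0
    for term in counts:
        if term in WH_WORDS:
            features["wh=" + term] = 1.0
    return features


# Two made-up example questions, used only to exercise the functions.
questions = ["Siapa presiden pertama Indonesia?", "Kapan Indonesia merdeka?"]
idf = idf_table(questions)
features = extract_features(questions[0], idf)
```

Feature dictionaries of this kind could then be vectorized and passed to an SVM classifier, for example with scikit-learn's `DictVectorizer` and `LinearSVC`, trained once on Coarse labels and once on Fine labels. The kernel and hyperparameters used in the paper are not stated in the abstract.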
Indonesian question classification using feature extraction and selection approach on coarse and fine taxonomy
Irfandy Thalib, Widyawan, Indah Soesanti; Indonesian question classification using feature extraction and selection approach on coarse and fine taxonomy. AIP Conf. Proc. 14 February 2023; 2654 (1): 020007. https://doi.org/10.1063/5.0114188