Code link: Google Colab
Introduction
Text classification is a fundamental task in natural language processing (NLP): assigning each piece of text to one of a set of predefined categories. In this project, I used the IndoLU dataset, converted the text to TF-IDF vectors, and trained several machine learning algorithms on those vectors to classify the text.
Project Description
The aim of this project is to build a text classification model that accurately categorizes Indonesian text. I used the IndoLU (Indonesian Language Understanding) dataset and applied TF-IDF (Term Frequency-Inverse Document Frequency) to turn each text into a numeric vector. I then trained several machine learning models on these vectors and evaluated them to find the best-performing one.
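The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the code from the Colab notebook. It assumes scikit-learn, and the three classifiers, the accuracy metric, the toy sentences, and the helper name compare_models are all assumptions made for the example; the real project would load the IndoLU texts and labels instead.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy data standing in for the real dataset.
texts = [
    "makanannya enak sekali",
    "pelayanan sangat ramah",
    "saya suka tempat ini",
    "produk bagus dan murah",
    "makanannya tidak enak",
    "pelayanan sangat lambat",
    "saya kecewa dengan produk ini",
    "harga mahal dan kualitas buruk",
]
labels = ["positive"] * 4 + ["negative"] * 4


def compare_models(texts, labels, test_size=0.25, seed=42):
    """Train several TF-IDF based classifiers and return test accuracy per model."""
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=test_size, random_state=seed, stratify=labels
    )
    # Assumed candidate models; the original post does not name them.
    candidates = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "naive_bayes": MultinomialNB(),
        "linear_svm": LinearSVC(),
    }
    scores = {}
    for name, classifier in candidates.items():
        # The vectorizer sits inside the pipeline, so it is fitted
        # on the training split only.
        model = make_pipeline(TfidfVectorizer(), classifier)
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


scores = compare_models(texts, labels)
best_model = max(scores, key=scores.get)
print(scores)
print("Best model:", best_model)
```

Keeping the vectorizer inside each pipeline means the TF-IDF vocabulary and weights are learned from the training texts alone, so the test score is not inflated by information from the held-out texts. On a toy sample this small the scores themselves mean little; the point is the structure of the comparison.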
Dataset
The IndoLU dataset is a collection of Indonesian texts, each labeled with a category. These labels make it suitable for training and evaluating supervised text classification models.