Sentiment Prediction Accuracy of Amazon Fine Food Review using TFIDF and LightGBM models


  • Tanzilal Mustaqim State University of Semarang
  • Aprilia Dewi Ardiyanti State University of Malang


Keywords: Amazon Review, LightGBM, TF-IDF, Sentiment Analysis.


Abstract. As society moves from conventional to digital ways of meeting its needs, service providers have to shift their business practices toward digital buying and selling transactions. Providers that serve consumers digitally must also keep their service quality at an optimal level. One way to do so is to monitor community responses to the service, both positive and negative, and these responses can be examined with sentiment analysis. This study focuses on the accuracy of sentiment prediction on the Amazon Fine Food Review dataset, from which 20,000 samples were drawn. The analysis was carried out in several stages: dataset collection, data preprocessing, TF-IDF feature extraction, and classification with LightGBM. The experiments used TF-IDF with an n-gram range of 1 to 2 and LightGBM with a max_depth of 50, num_leaves of 40, and a learning rate of 0.1 on the 20,000-sample subset. The resulting sentiment prediction accuracy was 93.2%.
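The feature-extraction step described in the abstract can be illustrated with a minimal sketch. It assumes whitespace tokenization, raw term counts, a smoothed inverse document frequency of ln((1 + N) / (1 + df)) + 1, and L2 normalization; the abstract does not state which TF-IDF variant or tokenizer the authors used, so these choices, the helper names `ngrams` and `tfidf`, and the toy reviews are all hypothetical. Only the 1-to-2-gram range and the three LightGBM settings come from the abstract.

```python
import math
from collections import Counter


def ngrams(tokens, n_min=1, n_max=2):
    """Return contiguous n-grams (space-joined) for every n in [n_min, n_max]."""
    out = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            out.append(" ".join(tokens[i:i + n]))
    return out


def tfidf(docs, n_min=1, n_max=2):
    """Compute one sparse TF-IDF vector (dict of term -> weight) per document.

    Assumed variant: tf is the raw count, idf = ln((1 + N) / (1 + df)) + 1,
    and each vector is scaled to unit L2 length.
    """
    term_lists = [ngrams(d.lower().split(), n_min, n_max) for d in docs]
    n_docs = len(docs)

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for terms in term_lists:
        df.update(set(terms))

    vectors = []
    for terms in term_lists:
        counts = Counter(terms)
        weights = {
            t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1)
            for t, c in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({t: w / norm for t, w in weights.items()})
    return vectors


# LightGBM settings reported in the abstract. The keys match LightGBM's
# parameter names; everything else would be left at library defaults
# (an assumption, since the abstract lists only these three).
LGBM_PARAMS = {"max_depth": 50, "num_leaves": 40, "learning_rate": 0.1}

# Toy, hypothetical reviews; the study itself used 20,000 Amazon reviews.
reviews = ["great taste great price", "bad taste"]
vectors = tfidf(reviews)
```

In this toy example the term "taste" occurs in both reviews and so receives a lower weight in the second review than "bad", which occurs in only one. In a full pipeline the resulting vectors would be passed to a LightGBM classifier configured with `LGBM_PARAMS`.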








How to Cite

Mustaqim, T., & Ardiyanti, A. D. (2021). Sentiment Prediction Accuracy of Amazon Fine Food Review using TFIDF and LightGBM models. Proceeding International Conference on Science and Engineering, 4, 216–219. Retrieved from