Comparative Analysis of the Accuracy of Multiple Linear Regression Method and Ridge Regression Method in Predicting Dengue Fever Cases in South Tangerang City
Abstract
Dengue hemorrhagic fever (DHF; demam berdarah dengue, DBD, in Indonesian) is one of the main health issues in South Tangerang City. This study compares the accuracy of Multiple Linear Regression and Ridge Regression in predicting the number of DHF cases from weather data such as temperature, humidity, and average rainfall. The data are monthly records from South Tangerang City. The analysis comprises preprocessing, splitting the dataset into training and testing sets, and fitting both regression methods. Prediction error is measured with the Mean Absolute Percentage Error (MAPE). The results indicate that Ridge Regression performs better on datasets with high multicollinearity, yielding a MAPE of 20.12%, while Multiple Linear Regression is more effective on datasets with low feature correlation, with a MAPE of 44.6%. These findings offer guidance on selecting a predictive technique based on the characteristics of the dataset being analyzed. It is hoped that choosing the appropriate approach will improve mitigation and planning for DHF cases in South Tangerang City.
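The workflow summarized in the abstract (preprocessing, a train/test split, two regression fits, and MAPE scoring) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the synthetic weather and case data, the 48/12 chronological split, the penalty value `alpha = 1.0`, and the closed-form solver. The printed MAPE values therefore have no relation to the figures reported above.

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error in percent; y_true must not contain zeros."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)

def fit_linear(X, y, alpha=0.0):
    """Closed-form least squares with an optional L2 penalty.

    alpha = 0 gives ordinary multiple linear regression; alpha > 0 gives ridge.
    The intercept is left unpenalized by centering X and y first.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    gram = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    coef = np.linalg.solve(gram, Xc.T @ yc)
    intercept = y_mean - x_mean @ coef
    return coef, intercept

def predict(X, coef, intercept):
    return np.asarray(X, dtype=float) @ coef + intercept

# Hypothetical monthly data (60 months); NOT the South Tangerang dataset.
rng = np.random.default_rng(0)
n = 60
temperature = 27 + rng.normal(0, 1, n)
humidity = 80 + rng.normal(0, 5, n)
# Rainfall is made to depend on humidity so the features are correlated.
rainfall = 200 + 3 * (humidity - 80) + rng.normal(0, 40, n)
cases = 40 + 0.1 * rainfall + 0.5 * (humidity - 80) + rng.normal(0, 5, n)
cases = np.clip(cases, 1, None)  # keep targets positive so MAPE is defined

X = np.column_stack([temperature, humidity, rainfall])

# Chronological split: first 48 months for training, last 12 for testing.
X_train, X_test = X[:48], X[48:]
y_train, y_test = cases[:48], cases[48:]

# Preprocessing: standardize features using training statistics only.
mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
X_train_s = (X_train - mu) / sigma
X_test_s = (X_test - mu) / sigma

coef_ols, b_ols = fit_linear(X_train_s, y_train, alpha=0.0)
coef_ridge, b_ridge = fit_linear(X_train_s, y_train, alpha=1.0)  # alpha is an assumed value

mape_ols = mape(y_test, predict(X_test_s, coef_ols, b_ols))
mape_ridge = mape(y_test, predict(X_test_s, coef_ridge, b_ridge))
print(f"MLR   MAPE: {mape_ols:.2f}%")
print(f"Ridge MAPE: {mape_ridge:.2f}%")
```

In practice the ridge penalty would be tuned, for example by cross-validation on the training months, and which method scores lower depends on how strongly the weather features are correlated.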
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.