Nowcasting Population using Support Vector Regression (SVR) and Multi-Output Support Vector Regression (M-SVR)
Riyan Zulmaniar Vinahari (a*,b), Hidayatul Khusna (a), Heri Kuswanto (a)

a) Department of Statistics, Institut Teknologi Sepuluh Nopember, Kampus ITS - Sukolilo, Surabaya 60111, Indonesia
*riyanzv[at]bps.go.id
b) Badan Pusat Statistik Kabupaten Kendal, Kendal 51351, Indonesia


Abstract

Yearly population figures for DKI Jakarta, West Java, Central Java, and East Java are important data for planning and evaluating medium- and long-term national development. These data cannot be provided through the population registration system. In addition, Badan Pusat Statistik (BPS-Statistics Indonesia) provides population data on a regular basis only at five-year intervals. Population projection is therefore required to provide the total population on a yearly basis. The common method for population projection is the cohort component method (CCM), which is widely used in many countries, including Indonesia. BPS, as an official government institution in Indonesia, also uses CCM to estimate the Indonesian population. Unfortunately, this method has several drawbacks, including lower accuracy, so a more accurate method is required. One promising method is nowcasting, which predicts the current value from a variety of socioeconomic and macroeconomic variables with high-frequency data over a yearly period. Owing to technological advancements, time series analysis can now be conducted using not only classical statistical methods but also machine learning (ML). In this work, two machine learning-based nowcasting methods were evaluated: Support Vector Regression (SVR) and Multi-output Support Vector Regression (MSVR) were applied and compared for producing yearly population nowcasts for the provinces of DKI Jakarta, West Java, Central Java, and East Java. Their performance was compared using the Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE). The data were obtained from BPS, with the output variable being the population of four provinces on Java Island: DKI Jakarta, West Java, Central Java, and East Java. The results show that, based on the simulation, MSVR outperforms SVR, as shown by smaller RMSE and MAPE values. This indicates that MSVR can be a powerful machine learning-based nowcasting method for population projection in DKI Jakarta, West Java, Central Java, and East Java.
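
The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `rmse` and `mape`, the labels `model_a` and `model_b`, and all numbers below are hypothetical and chosen only to show how a smaller RMSE and MAPE would identify the better-performing model. The standard textbook definitions of both metrics are assumed.

```python
import math

def rmse(actual, predicted):
    """Root Mean Square Error: square root of the mean squared difference."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(actual))

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent; actual values must be nonzero."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(actual)

# Hypothetical values for illustration only (NOT data or results from the study).
actual = [100.0, 200.0, 400.0]
model_a = [110.0, 190.0, 380.0]  # stand-in for one model's nowcasts
model_b = [105.0, 195.0, 390.0]  # stand-in for a competing model's nowcasts

for name, pred in [("model_a", model_a), ("model_b", model_b)]:
    print(f"{name}: RMSE={rmse(actual, pred):.3f}, MAPE={mape(actual, pred):.3f}%")
```

In this toy comparison, `model_b` has the smaller RMSE and MAPE, which is the criterion the abstract uses to rank MSVR above SVR.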

Keywords: Machine Learning, Multioutput Support Vector Regression, Nowcasting, Population, Support Vector Regression

Topic: Mathematics and Statistics

ICoSMEE 2023 Conference | Conference Management System