Optimizing Groundwater Conditioning Parameters for Groundwater Potential Mapping Using Machine Learning Approaches in Klang and Langat River Basin
Syarifah Raihana Syed Zabidi (a), Sharifah Norashikin Bohari (a*), Rizauddin Saian (b), Rohayu Haron Narashid (a)

(a) Faculty of Built Environment, Surveying Science and Geomatics Studies, Universiti Teknologi MARA, Perlis Branch, Arau Campus, Malaysia
*ashikin10[at]uitm.edu.my
(b) Faculty of Computer and Mathematical Sciences, Surveying Science and Geomatics Studies, Universiti Teknologi MARA, Perlis Branch, Arau Campus, Malaysia


Abstract

Groundwater potential (GWP) studies rely heavily on the appropriate selection of conditioning parameters. Past studies have considered factors such as topography, hydrology, geology, land cover and climate change; however, not all variables contribute equally to groundwater occurrence. Proper parameter selection is therefore essential to ensure the accuracy and reliability of GWP prediction. This study aims to optimize 20 GWP conditioning parameters using three statistical approaches: a correlation matrix, a multicollinearity test and chi-square tests. These parameters represent various factors, including topography, hydrogeology, land cover and climate change. The correlation analysis and multicollinearity test results indicate that all parameters fall within the required thresholds, showing minimal redundancy and no multicollinearity issues. In contrast, the chi-square test results indicate that only nine parameters (lineament density, elevation, geology, soil, slope, distance to fault, LULC, NDVI and drainage density) contribute significantly (p-value < 0.05), and these are therefore retained for GWP prediction. The optimized parameters were then applied to predict GWP areas in the Klang and Langat River basins using the random forest (RF) machine learning technique. A total of 564 tubewell points were divided into 70% for training and 30% for testing. The results show that the highest groundwater potential areas, accounting for 14.02%, were located in the central part of the basins, whereas the lowest groundwater potential areas, accounting for 20.58%, were located in the northern and northeastern parts. The evaluation indicates that the model performed strongly, achieving an area under the curve (AUC) value of 0.927 for training and 0.860 for testing. The findings of this study can enhance the accuracy and reliability of groundwater potential mapping and support future sustainable groundwater management.
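
As a minimal illustration of the chi-square screening and AUC evaluation described above, the following Python sketch uses only the standard library. It is not the authors' implementation: the contingency counts, parameter names, scores and helper functions are hypothetical. Only the 0.05 significance level comes from the abstract, and significance is checked here against tabulated critical values rather than by computing a p-value.

```python
# Illustrative sketch only: toy counts and scores, not data from the study.

# Chi-square critical values at alpha = 0.05, keyed by degrees of freedom.
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def chi_square_stat(table):
    """Return (Pearson chi-square statistic, degrees of freedom) for a table."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    total = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def is_significant(table):
    """True when the statistic exceeds the 0.05 critical value (p < 0.05)."""
    stat, dof = chi_square_stat(table)
    return stat > CHI2_CRIT_05[dof]


def auc(labels, scores):
    """AUC as the chance a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical tables: rows = well present / well absent,
# columns = classes of a conditioning parameter.
toy_tables = {
    "slope_class": [[40, 25, 10], [15, 25, 35]],
    "toy_uniform_param": [[25, 26, 24], [25, 24, 26]],
}

# Keep only the parameters whose association with well presence is significant.
retained = [name for name, t in toy_tables.items() if is_significant(t)]
print(retained)

# Hypothetical test labels (1 = productive well) and predicted probabilities.
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))
```

In practice the same screening would be run once per conditioning parameter, and the AUC would be computed separately on the training and testing partitions of the well inventory.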

Keywords: Groundwater potential identification; Optimization; Machine learning; Random Forest

Topic: Topic D: Geospatial Data Integration

ACRS 2025 Conference