Mapping the Potential Distribution of Raccoon Dog Habitats: Spatial Statistics and Optimized Deep Learning Approaches

Introduction

The raccoon dog, Nyctereutes procyonoides, is a member of the Canidae family with a widespread distribution across various regions, including Europe (Schwemmer et al., 2021), China (Diao et al., 2022), Vietnam (Van Pham et al., 2023), Japan (Okabe & Agetsuma, 2007), and South Korea (Jeong et al., 2017). The animal is omnivorous and consumes a wide range of food sources, including small mammals, birds, reptiles, amphibians, insects, fruits, berries, and carrion (Sidorovich et al., 2008; Sutor et al., 2010). Globally, six subspecies of raccoon dog have been identified (Ellerman & Morrison-Scott, 1951), two of which occur on the Korean Peninsula. Nyctereutes procyonoides koreensis is predominantly found in the southern part of the Korean Peninsula (Hong et al., 2013, 2018; Yang et al., 2017), while a smaller number of Nyctereutes procyonoides ussuriensis can be found in the southern and northern Hamgyong-do regions (Won, 1967).

Geospatial technologies, such as geographic information systems (GIS), are highly advantageous for understanding and investigating the distribution of habitats and their connections with environmental factors (Choi et al., 2011a; Lee & Rezaie, 2021). GIS approaches can be classified as statistical, heuristic or index-based, and artificial intelligence (AI). Statistical approaches that support data analysis include bivariate analysis and multivariate analysis. Both examine relationships between variables, but they have distinct characteristics and serve different purposes. The frequency ratio (FR) (Asmare, 2023), spatial principal component analysis (Ruymgaart, 1981), and logistic regression (Choi et al., 2011a; Lee & Sambath, 2006) represent both bivariate and multivariate approaches. In a previous study, FR was applied for habitat mapping of two polychaeta species, Prionospio japonica and Prionospio pulchra, showing prediction accuracies of 77.71% and 74.87%, respectively (Choi et al., 2011b). FR can consider spatial correlation between variables, which can be beneficial when analyzing spatial data (Lee & Pradhan, 2007). The FR approach is very effective in identifying phenomena. The allocation of weights to parameters and decision alternatives for generating distribution maps is accomplished through methods that include heuristic or index-based approaches (Huang et al., 2020; Yalcin et al., 2011) and AI (Lee & Rezaie, 2021).

The heuristic or index-based approaches have been applied in various scenarios to assist in decision-making processes. Their application involves the integration of various criteria or variables to evaluate and identify disasters (Quarantelli et al., 2007), landslide susceptibility (Althuwaynee et al., 2016), ground subsidence hazard mapping (Park et al., 2012), habitat suitability mapping (Ahmed et al., 2021), and forest management (Henderson & Hoganson, 2021). These methods employ a predetermined set of decision rules or criteria to allocate weights and scores to various environmental variables. The heuristic or index-based approaches have been used to integrate and prioritize criteria in multi-criteria decision-making for flood susceptibility (Khosravi et al., 2019), the analytical hierarchy process for groundwater potential mapping (Arabameri et al., 2018), and fuzzy logic for analyzing the suitability of nesting habitat (Zabihi et al., 2017). This enabled the development of maps through weighted combinations of factors. Fuzzy algorithms are beneficial for dealing with various real-world issues (Murmu & Biswas, 2015). However, the generalization capability of fuzzy logic is considered inadequate due to its reliance on heuristic algorithms for defuzzification, rule evolution, and antecedent processing (Singh et al., 2012).

Machine learning provides the benefit of adaptability based on specific requirements, including feature selection (Jie et al., 2018), spatial data handling (Du et al., 2020), model interpretability (Li et al., 2022), and model evaluation (Reich & Barai, 1999). The selection of a specific algorithm is based on various factors, including the characteristics of the data, the pattern complexity of the input data, and the model’s intended application or level of accuracy. The following algorithms have been frequently used for habitat mapping: random forest, support vector machines (Oh et al., 2019), artificial neural networks (ANN) (Lee et al., 2013), gradient boosting (Cai et al., 2014), and clustering algorithms (Barve et al., 2023). In a previous study, the ANN method was applied to estimate the potential habitat distribution of macrobenthos (Macrophthalmus dilatatus, Cerithideopsilla cingulata, and Armandia lanceolata) (Lee et al., 2013). The habitat distribution model was generated with various high-resolution factors, such as the intertidal digital elevation model, slope, aspect, exposure time, channel distance, channel density, sediment distribution, and IKONOS band 4. The validation results indicated that the average prediction accuracies for M. dilatatus, C. cingulata, and A. lanceolata were 74.9%, 78.32%, and 73.27%, respectively (Lee et al., 2013). ANN has emerged as a powerful approach for a wide range of applications, including habitat mapping. However, a key challenge with ANN is determining an appropriate size and optimal structure for the neural network (Singh et al., 2012).

Deep learning is an influential AI approach that has made remarkable advances in comprehending and processing complex data. Deep learning has been widely recognized as an effective machine learning technique that is controlled, time-efficient, and cost-effective (Dargan et al., 2020). Nevertheless, achieving optimal results with these models typically entails significant computational resources, extensive labeled datasets, and meticulous model selection and optimization. The most frequently employed deep learning algorithms are recurrent neural networks (Shi et al., 2018), long short-term memory (LSTM) networks, and convolutional neural networks (CNNs) (Lee & Rezaie, 2021). Lee and Rezaie (2021) used CNN and LSTM for mapping potential habitats for Siberian roe deer. The results indicated that the predictive performance of both models was similar, although the LSTM model had higher prediction potential, with a prediction accuracy of 76% for training data and 73% for test data (Lee & Rezaie, 2021). Overfitting is a recognized limitation of machine learning algorithms, especially in complex models with substantial capacity (Ookura & Mori, 2020), and it hinders the algorithm’s capacity to generate precise predictions (Alzubaidi et al., 2021).

Metaheuristic algorithms have made substantial advancements in problem-solving across various fields by offering effective solutions to complex optimization problems that can be challenging for machine and deep learning approaches to address (Rezaie et al., 2022a; Zhang et al., 2022). Sabzi et al. (2021) conducted a comparison of the effectiveness of the harmonic search algorithm and the imperialism competitive algorithm (ICA) in optimizing the hyperparameters of ANN. The findings confirm the superior performance of ICA in enhancing the accuracy and reliability of the outcomes.

The present study establishes a novel approach to habitat distribution modeling designed to reveal the intricate relationships between the distribution of raccoon dog habitat in South Korea and various factors using the FR approach. The main purpose is to establish an accurate distribution map of potential raccoon dog habitat in South Korea by employing the group method of data handling (GMDH), CNN, and LSTM algorithms. Integrating ICA as an optimization algorithm into GMDH, CNN, and LSTM approaches can contribute to optimizing the accuracy and robustness of the model in habitat mapping for raccoon dogs. These maps can be employed to support biodiversity conservation and make progress toward safeguarding the delicate balance between species preservation and human activities.

Materials and Methods

Study area

South Korea is bordered by North Korea to the north, the Yellow Sea to the west, the East Sea (Sea of Japan) to the east, and the Korea Strait to the south. It has a diverse geographical landscape, encompassing a range of topographical features, including mountains, hills, plains, and coastal regions. The western and southern coasts are characterized by numerous bays and inlets, while the eastern coast is more rugged and features steep cliffs. Approximately 70% of the nation’s territory is characterized by the presence of mountains and hills. The Taebaek Mountains run through the eastern borders of the nation, while the Sobaek Mountains are situated in the south. The western and central regions of South Korea exhibit a predominantly flat topography, characterized by fertile landscapes well suited to agricultural activities.

According to the National Geography Information Institute of South Korea, the terrain of Korea is elevated mainly along the eastern coastline, while lower elevations dominate along the western coastline. Consequently, the majority of the rivers in the region flow into the Yellow Sea and the South Sea. The eastern coastline is uniform and continuous, and the rivers that flow into the East Sea are relatively short and steep. The west coast of South Korea exhibits a complex shoreline characterized by indentations, offshore islands, and deltas. South Korea has five main rivers, namely Nakdonggang, Hangang, Geumgang, Seomjingang, and Yeongsangang. Several rivers that flow towards the western and southern coasts are characterized by significant length, gentle slopes, and wide-ranging basins, which contribute to substantial discharge volumes.

Over the last 30 years, South Korea’s climate has shown an ongoing increase in temperature. Based on climograph analysis, the average monthly temperatures during the 30-year period from 1981 to 2010 were higher compared with the preceding 30-year period spanning 1971-2000. South Korea’s yearly precipitation increased by 50 mm on average between 1981 and 2010. Precipitation was higher in the summer and lower in the spring and fall in most regions because of the East Asian Monsoon, which caused the summers to become hot and humid and the winters to become cold and dry (Zhisheng et al., 2015).

Dataset for spatial modelling

The National Institute of Ecology (NIE) has conducted a comprehensive survey of the raccoon dog distribution since 2018, with the mission of keeping track of animal populations. Using a handheld global positioning system, the survey recorded the specific positions of raccoon dog habitat. The observations conducted by NIE identified 2,238 locations of raccoon dog habitat that are illustrated in Fig. 1. The development of a machine learning model and the validation of its performance require both data obtained from raccoon dog habitats and data collected from non-habitat locations. Therefore, 2,238 points have been randomly selected in areas that have very low potential for raccoon dog habitat.

The habitat and non-habitat locations are randomly divided into training and testing datasets. Specifically, 70% of the data (1,566 points of raccoon dog habitat and 1,566 points of non-raccoon dog habitat) is allocated for training, while the remaining 30% (671 points of raccoon dog habitat and 671 points of non-raccoon dog habitat) is designated for the validation step to compare the predictive ability of the developed models. Point attributes are extracted by overlaying the training and testing datasets with the habitat influencing factors.

Factors influencing habitat selection

Mapping the distribution of raccoon dog habitat involves considering various environmental factors that describe the habitat. In previous studies, the raccoon dog population increased when the terrain ruggedness index (TRI) was 0-0.4148 m. TRI values of 0-80 m indicate that the terrain used by raccoon dogs is predominantly flat, with minimal changes in elevation over a given distance. The TRI provides a quantitative measure of topographic variation (Riley et al., 1999). Topographic ruggedness has been widely used in ecological studies as a variable to characterize habitat preference (Beasom et al., 1983; Dilts et al., 2023). Mountainous terrain provides raccoon dogs with natural shelters in the form of rock outcrops, crevices, and caves, which they may use as den sites. The presence of mountains was shown to play a significant role in determining the distribution of raccoon dog habitats (Hong et al., 2018). The topographic wetness index (TWI) represents soil moisture and the equilibrium between catchment water supply and local drainage (Kopecký et al., 2021). In ecological studies, water supply is intricately connected to wildlife habitats, playing an essential role in shaping their environments and influencing behavior and survival strategies (Fernald et al., 2012).
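As an illustration of how a ruggedness index can be derived from an elevation grid, the following minimal sketch uses one common formulation attributed to Riley et al. (1999): the square root of the summed squared elevation differences between a cell and its eight neighbors. The function name, the edge-padding choice, and the toy elevation grid are assumptions made for this example and are not taken from the study's processing chain.

```python
import numpy as np

def terrain_ruggedness_index(dem):
    """TRI per cell: sqrt of summed squared differences to the 8 neighbors."""
    dem = np.asarray(dem, dtype=float)
    rows, cols = dem.shape
    # Assumption: border cells reuse their own edge values as missing neighbors.
    padded = np.pad(dem, 1, mode="edge")
    squared_sum = np.zeros_like(dem)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the center cell itself
            neighbor = padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]
            squared_sum += (neighbor - dem) ** 2
    return np.sqrt(squared_sum)

# Hypothetical 3x3 elevation grids (metres): a flat surface and a single bump.
flat = np.full((3, 3), 50.0)
bump = np.zeros((3, 3))
bump[1, 1] = 1.0

tri_flat = terrain_ruggedness_index(flat)
tri_bump = terrain_ruggedness_index(bump)
```

Flat terrain yields a TRI of zero everywhere, which matches the interpretation of low TRI values as flat ground.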

Valley depth data can be used to derive insights into the habitat preferences and distribution patterns of wildlife species. According to Marino and Rodríguez (2022) and Traba et al. (2017), deeper valleys are favored because they provide enhanced production and a more concentrated availability of preferred food sources. Raccoon dogs prefer habitats located at elevations lower than 300 m above sea level. This elevation preference is influenced by the availability of essential resources such as water, food, and suitable shelter in these lower-altitude areas (Melis et al., 2007). Slope and slope height influence habitat aspects in the context of livestock-wildlife issues (Marino & Rodríguez, 2022). These factors represent morphometric parameters such as drainage morphometry (Sreedevi et al., 2013) and microclimates (Burnett et al., 2008).

Morphometric characteristics play an essential role in characterizing landscapes, providing significant insights into hydrological processes and environmental conditions (Wilson, 2018). Peaks, ridges, passes, plains, channels, and pits, which are derived from morphometric characteristics, provide essential components for geological investigations, hydrological assessments, and environmental analysis (Wang et al., 2010). Surface area data is used in assessing wildlife habitats and has implications for the movement of wildlife between them. Habitats with suitable surface area allow wildlife to navigate through varied terrains, encouraging genetic diversity and species survival (Liu et al., 2018).

The presence of water is an essential aspect that significantly influences the distributions and population sizes of animal and plant species. Water availability is crucial in the formation of habitats and the maintenance of the overall well-being and variety of species (Xie et al., 2018). Therefore, the normalized difference vegetation index (NDVI) and normalized difference water index (NDWI) are obtained from Sentinel-2 satellite imagery. The green and near-infrared bands are used to generate the NDWI map, while the red and near-infrared bands are used to generate the NDVI map, both with a 30-m spatial resolution. NDWI exhibits high sensitivity to changes in hydrological conditions (Talukdar & Pal, 2019) and has been used as an indicator of climate variables that can serve as environmental variables within species distribution models (Teng et al., 2021). NDVI has been used to assess the greenness and health of vegetation (Kusuma et al., 2019; Mohanasundaram et al., 2022).
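Both indices are normalized band differences, so they can be sketched in a few lines. The example below assumes the standard definitions NDVI = (NIR - Red) / (NIR + Red) and NDWI = (Green - NIR) / (Green + NIR); the reflectance values are hypothetical and the helper name is chosen only for this illustration.

```python
import numpy as np

def normalized_difference(a, b):
    """Return (a - b) / (a + b), using 0 where the denominator is 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = a + b
    out = np.zeros_like(denom)
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out

# Hypothetical 2x2 surface reflectance patches (not data from the study).
green = np.array([[0.10, 0.08], [0.12, 0.05]])
red = np.array([[0.08, 0.06], [0.10, 0.04]])
nir = np.array([[0.40, 0.30], [0.05, 0.04]])

ndvi = normalized_difference(nir, red)    # high over dense vegetation
ndwi = normalized_difference(green, nir)  # positive over open water
```

In this toy patch the top row behaves like vegetation (high NDVI, negative NDWI), while the lower-left pixel behaves like water (positive NDWI).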

The Ministry of Environment provided a land use/land cover map that is used to generate drainage density, distance to drainage, and distance to roads maps. Červinka et al. (2015) showed the impact of roads on the distribution of raccoon habitats. In wildlife habitats across the world, particularly in regions exposed to hunting, roads, and high traffic volumes, significant changes in animal spatial behavior and distribution occur (Bonnot et al., 2013). Changes in habitat caused by drainage, especially in agricultural or urban areas, can have indirect effects on raccoon dogs and other wildlife species. The food resources and shelter options for raccoon dogs can be influenced by changes in water availability, water quality, and the surrounding vegetation resulting from drainage conditions (Lemly, 1994). Furthermore, the installation of drainage systems in agricultural landscapes can lead to transformations in land use and land cover, consequently affecting the accessibility of appropriate habitats (Ahearn et al., 2005). Based on previous studies and data availability, 14 variables, including elevation, slope, valley depth, TWI, TRI, slope height, surface area, LS factor, NDVI, NDWI, distance to drainage, distance to roads, drainage density, and morphometric features, are chosen for mapping potential habitats for raccoon dogs (Fig. 2).

Methods

Four methods, namely FR, GMDH-ICA, CNN-ICA, and LSTM-ICA, are applied in this study to identify the distribution of potential raccoon dog habitat. The accuracy of model prediction is assessed using the area under the receiver operating characteristic (ROC) curve (AUC). Fig. 3 illustrates the methodological processes employed in this study.

Frequency ratio

Frequency ratio is a bivariate statistical technique used for detecting potential statistical associations between a phenomenon and each associated variable. The FR values for each category or range of factors are obtained based on their association with the phenomenon (Lee & Talib, 2005), which in this study is raccoon dog habitat distribution. In terms of correlation analysis, the FR refers to the proportion of the area where raccoon dogs live in the study area. The FR is calculated by dividing the proportion of raccoon dog habitat within a specific habitat variable subclass by the proportion of raccoon dog habitat within the total study area, as shown in Equation 1 (Huang et al., 2020).

(1)

FR = \frac{N_{(R_{i})} / N_{(F_{i})}}{N_{(R)} / N_{(A)}}

where N_(Ri) represents the number of raccoon dog habitat pixels in subclass i of the influencing factor; N_(Fi) represents the total number of pixels in subclass i; N_(R) represents the total number of raccoon dog habitat pixels; and N_(A) represents the total number of pixels in the study area.

A value of 1 represents an average correlation. A value exceeding 1 indicates a strong correlation between raccoon dog habitat potential and a habitat variable, while a value less than 1 indicates a weak correlation.
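The calculation in Equation 1 can be illustrated with a minimal sketch. The function name, the toy class raster, and the habitat mask below are hypothetical and serve only to show the arithmetic; the sketch also assumes that at least one habitat pixel exists, so that N_(R) is not zero.

```python
import numpy as np

def frequency_ratio(class_map, habitat_mask):
    """FR per subclass: (N_Ri / N_Fi) / (N_R / N_A), following Equation 1."""
    class_map = np.asarray(class_map)
    habitat_mask = np.asarray(habitat_mask, dtype=bool)
    n_a = class_map.size       # N_(A): all pixels in the study area
    n_r = habitat_mask.sum()   # N_(R): all habitat pixels
    fr = {}
    for c in np.unique(class_map):
        in_class = class_map == c
        n_fi = in_class.sum()                    # N_(Fi): pixels in subclass i
        n_ri = (in_class & habitat_mask).sum()   # N_(Ri): habitat pixels in subclass i
        fr[int(c)] = float((n_ri / n_fi) / (n_r / n_a))
    return fr

# Hypothetical raster of 10 pixels with two subclasses of one factor.
classes = np.array([1, 1, 1, 1, 2, 2, 2, 2, 2, 2])
habitat = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 1])

fr = frequency_ratio(classes, habitat)
```

Here subclass 1 holds 3 of the 4 habitat pixels in only 4 of 10 pixels, giving FR = 0.75 / 0.4 = 1.875 (above 1), while subclass 2 gives FR of about 0.417 (below 1).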

Group method of data handling

Group method of data handling was developed by Alexey G. Ivakhnenko (1970) in the 1970s and has found applications in various fields, including engineering (Dodangeh et al., 2020), economics (Zhang et al., 2013), and data analysis (Mulashani et al., 2022). The GMDH algorithm implements a self-organization principle to identify the optimal model complexity by systematically evaluating numerous models that fulfil the specified criteria (Ivakhnenko, 1978). The GMDH algorithm consists of multiple functions that effectively handle several issues and enhance the precision of problem-solving outcomes. The functions include linear, polynomial, and ratio-polynomial variations (Ivakhnenko & Ivakhnenko, 2000). The relationship between input and output variables can be described by a complex discrete form of the Volterra functional series, commonly referred to as the Kolmogorov-Gabor polynomial (Farlow, 1984). The model’s input and output variables are linked, as illustrated in Equation 2:

(2)

y = a_{0} + \sum_{i = 1}^{m} a_{i} x_{i} + \sum_{i = 1}^{m} \sum_{j = 1}^{m} a_{ij} x_{i} x_{j} + \sum_{i = 1}^{m} \sum_{j = 1}^{m} \sum_{k = 1}^{m} a_{ijk} x_{i} x_{j} x_{k} + \dots

where a denotes the coefficients calculated using the least squares error approach (Mohebbian et al., 2020); x denotes the input factors; m represents the number of input factors (Tran et al., 2023); and y represents the expected result.
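To make the self-organizing idea concrete, the following minimal sketch builds one GMDH-style layer: every pair of inputs is fitted with a second-order Kolmogorov-Gabor polynomial by least squares, and the pairs are ranked by their error on held-out data. The function names, the synthetic data, and the choice of mean squared error as the external criterion are assumptions for illustration and do not reproduce the configuration used in this study.

```python
import itertools
import numpy as np

def design(xi, xj):
    """Quadratic terms of a two-input polynomial: 1, xi, xj, xi*xj, xi^2, xj^2."""
    return np.column_stack([np.ones_like(xi), xi, xj, xi * xj, xi**2, xj**2])

def gmdh_layer(X_train, y_train, X_val, y_val, keep=2):
    """Fit one polynomial neuron per input pair; keep the best on validation data."""
    candidates = []
    for i, j in itertools.combinations(range(X_train.shape[1]), 2):
        # Coefficients by ordinary least squares on the training split.
        coef, *_ = np.linalg.lstsq(design(X_train[:, i], X_train[:, j]),
                                   y_train, rcond=None)
        # External criterion: mean squared error on the validation split.
        pred = design(X_val[:, i], X_val[:, j]) @ coef
        candidates.append((float(np.mean((pred - y_val) ** 2)), (i, j), coef))
    candidates.sort(key=lambda c: c[0])
    return candidates[:keep]

# Synthetic data: the target depends only on inputs 0 and 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 0] * X[:, 2]

best = gmdh_layer(X[:140], y[:140], X[140:], y[140:])
best_error, best_pair, best_coef = best[0]
```

In a full GMDH network, the outputs of the retained neurons would become the inputs of the next layer, and layers would be added until the validation error stops improving.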

Convolutional neural network

Convolutional neural network is a deep learning algorithm that belongs to the broader category of machine learning approaches. Deep learning is a specialized area within the field of machine learning with an emphasis on the utilization of ANNs; a CNN is an ANN that includes layers dedicated to the convolution operation. The fundamental architecture of the CNN model consists of convolution, pooling, and fully connected layers (Lecun et al., 1998; Yamashita et al., 2018).

The role of the input layer is to receive the raw data and transform it into a numerical format, typically represented as a multi-dimensional array (tensor). The convolutional layer is the core part of the CNN. It performs convolution operations on the input data using a collection of learnable filters, commonly referred to as kernels (Thi Ngo et al., 2021). Following the convolution process, an activation function is applied element-wise to the output of the convolutional layer. The activation function introduces non-linearity into the model, allowing the network to learn complex relationships in the data (Zhang & Wu, 2019).

Pooling helps to reduce computational complexity and improve translation invariance. The pooling layer reduces the spatial dimensions of the data by downsampling the feature maps generated by the convolutional layers (Rezaie et al., 2022b; Zafar et al., 2022). Common pooling operations include max pooling, which selects the maximum value within a small region, and average pooling, which calculates the average value within the region (Yu et al., 2014).
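The convolution, activation, and pooling steps described above can be sketched with plain NumPy. The input array, the kernel, the ReLU activation, and the 2 x 2 pooling window are hypothetical choices for illustration; as in most deep learning libraries, the "convolution" here is computed as a sliding dot product without flipping the kernel.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for r in range(oh):
        for c in range(ow):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

def relu(x):
    """Element-wise non-linearity applied after the convolution."""
    return np.maximum(x, 0.0)

def pool2d(fmap, size=2, mode="max"):
    """Downsample by taking the max or mean of non-overlapping blocks."""
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    blocks = fmap[:h * size, :w * size].reshape(h, size, w, size)
    return blocks.max(axis=(1, 3)) if mode == "max" else blocks.mean(axis=(1, 3))

# Hypothetical 5x5 input where pixel (r, c) has value r * c.
image = np.outer(np.arange(5.0), np.arange(5.0))
kernel = np.array([[-1.0, 0.0], [0.0, 1.0]])  # diagonal difference filter

features = relu(conv2d_valid(image, kernel))  # 4x4 feature map
max_pooled = pool2d(features, mode="max")     # 2x2 after max pooling
avg_pooled = pool2d(features, mode="mean")    # 2x2 after average pooling
```

For this input the feature map at position (r, c) equals r + c + 1, so max pooling returns [[3, 5], [5, 7]] and average pooling returns [[2, 4], [4, 6]].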

The fully connected layer assists in making predictions based on the high-level features extracted from the previous layers (Panahi et al., 2021). The output layer of a CNN, which generates the final output, typically consists of neurons that correspond to each category.

Long short-term memory

Long short-term memory is another type of deep neural network algorithm, in which the output of the network is fed back into the network as the subsequent input (Kong et al., 2019). The architectural design of LSTM models demonstrates exceptional proficiency in capturing complex spatial patterns and temporal dynamics within various environments. The LSTM architecture is based on the idea of introducing special memory cells with gating mechanisms, allowing the model to retain and update information over long sequences without losing important information. The key components of an LSTM cell include input gates (i_t), forget gates (f_t), candidate cell states ( ${\tilde{C}}_{t}$ ), output gates (o_t), and cell states (C_t) (Graves, 2012). The input gate (Equation 3) determines how much of the new information (input data) should be added to the cell state. The forget gate (f_t) is computed from the previous hidden state (h_t–1) and the current input (x_t), as shown in Equation 4, and determines how much of the previous cell state is retained. The candidate cell state ( ${\tilde{C}}_{t}$ ) is represented in Equation 6 as the new information that could be added to the cell state (C_t) in Equation 5.

(3)

i_{t} = σ (W_{i} [h_{t - 1}, x_{t}] + b_{i})

(4)

f_{t} = σ (W_{f} [h_{t - 1}, x_{t}] + b_{f})

(5)

C_{t} = f_{t} ⨀ C_{t - 1} + i_{t} ⨀ {\tilde{C}}_{t}

(6)

{\tilde{C}}_{t} = \tanh (W_{c} [h_{t - 1}, x_{t}] + b_{c})

The output gate (o_t) determines how much of the updated cell state (C_t) should be exposed as the output of the current time step. Given the cell state (C_t), the hidden state (h_t), which is the output of the cell, can be computed using Equation 8, as follows:

(7)

o_{t} = σ (W_{o} [h_{t - 1}, x_{t}] + b_{o})

(8)

h_{t} = o_{t} ⨀ \tanh (C_{t})

where W (W_f, W_i, W_c, W_o) is a weight matrix; b (b_f, b_i, b_c, b_o) represents a bias vector; σ is a sigmoid function; x_t is the input to the memory cell layer at time t; h_t–1 is the hidden state at the previous time step; ⨀ is the operation of element-wise multiplication; and tanh is a hyperbolic tangent function.
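Equations 3-8 translate directly into a single forward step of an LSTM cell. The sketch below is a minimal NumPy illustration; the layer sizes, the random weights, and the dictionary layout used to hold W and b are assumptions made for this example only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step following Equations 3-8."""
    z = np.concatenate([h_prev, x_t])        # [h_{t-1}, x_t]
    i_t = sigmoid(W["i"] @ z + b["i"])       # Eq. 3: input gate
    f_t = sigmoid(W["f"] @ z + b["f"])       # Eq. 4: forget gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])   # Eq. 6: candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde       # Eq. 5: updated cell state
    o_t = sigmoid(W["o"] @ z + b["o"])       # Eq. 7: output gate
    h_t = o_t * np.tanh(c_t)                 # Eq. 8: hidden state
    return h_t, c_t

# Hypothetical sizes and randomly initialized parameters.
rng = np.random.default_rng(0)
n_in, n_hidden = 3, 4
W = {k: rng.normal(scale=0.1, size=(n_hidden, n_hidden + n_in)) for k in "ifco"}
b = {k: np.zeros(n_hidden) for k in "ifco"}

# Run the cell over a short hypothetical sequence of five inputs.
h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.normal(size=(5, n_in)):
    h, c = lstm_step(x_t, h, c, W, b)
```

Because h_t is the product of a sigmoid output and a tanh output, every element of the hidden state stays strictly between -1 and 1.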

Imperialism competitive algorithm

Imperialism competitive algorithm is a metaheuristic optimization algorithm inspired by imperialistic competition and the socio-political evolution of empires, in particular the way empires compete for resources and dominance. In the algorithm, candidate solutions are represented as countries, and the optimization process simulates the interactions between these countries based on their "imperialistic" power and resources (Atashpaz-Gargari & Lucas, 2007). The ICA method has been applied to optimization problems, specifically in solving non-linear equations, for which the optimization technique is more robust and effective (Abdollahi et al., 2013). The countries are grouped into empires, each country represents a potential solution to the problem, and the empires evolve toward the optimum outcome for the specific issue. The steps of ICA are generating initial empires, assimilation, revolution, estimating the total cost of all empires, empire competition, and convergence (Wang et al., 2019). Moreover, the ICA has been effectively used to identify optimal results in certain applications, such as evaluating the quality of fruits and vegetables (Sabzi et al., 2021).
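The steps listed above can be illustrated with a deliberately simplified sketch. It is not the implementation used in this study: colonies are dealt out round-robin rather than in proportion to imperialist power, the weakest empire always cedes a colony to the strongest one rather than by a probabilistic draw, and the sphere function stands in for a real objective such as a validation loss over model hyperparameters. All function names and parameter values are assumptions.

```python
import numpy as np

def ica_minimize(cost, dim, bounds, n_countries=30, n_imperialists=3,
                 n_iter=100, beta=2.0, revolution_rate=0.1, xi=0.1, seed=0):
    """Simplified imperialist-competition search for a minimum of `cost`."""
    rng = np.random.default_rng(seed)
    low, high = bounds
    # Generate initial countries and sort them from best to worst.
    countries = rng.uniform(low, high, size=(n_countries, dim))
    costs = np.array([cost(c) for c in countries])
    order = np.argsort(costs)
    countries, costs = countries[order], costs[order]
    # The best countries rule the empires; the rest are dealt out as colonies.
    imp = list(range(n_imperialists))                # imp[e]: ruler of empire e
    owner = np.arange(n_countries) % n_imperialists  # owner[k]: empire of country k
    best_x, best_cost = countries[0].copy(), float(costs[0])
    history = [best_cost]
    for _ in range(n_iter):
        for k in range(n_countries):
            e = owner[k]
            if k == imp[e]:
                continue  # imperialists stay in place
            # Assimilation: move the colony toward its imperialist.
            step = beta * rng.uniform(size=dim) * (countries[imp[e]] - countries[k])
            countries[k] = np.clip(countries[k] + step, low, high)
            # Revolution: occasionally jump to a random position.
            if rng.uniform() < revolution_rate:
                countries[k] = rng.uniform(low, high, size=dim)
            costs[k] = cost(countries[k])
            if costs[k] < best_cost:
                best_x, best_cost = countries[k].copy(), float(costs[k])
            # A colony that beats its imperialist takes over the empire.
            if costs[k] < costs[imp[e]]:
                imp[e] = k
        # Total cost of each surviving empire, then competition between empires.
        alive = sorted(set(owner.tolist()))
        if len(alive) > 1:
            total = {}
            for e in alive:
                members = np.flatnonzero(owner == e)
                colonies = members[members != imp[e]]
                mean_colony = costs[colonies].mean() if colonies.size else 0.0
                total[e] = costs[imp[e]] + xi * mean_colony
            weakest = max(total, key=total.get)
            strongest = min(total, key=total.get)
            members = np.flatnonzero(owner == weakest)
            colonies = members[members != imp[weakest]]
            # Cede the weakest colony, or the ruler itself once no colony is left.
            loser = colonies[np.argmax(costs[colonies])] if colonies.size else imp[weakest]
            owner[loser] = strongest
        history.append(best_cost)
    return best_x, best_cost, history

def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return float(np.sum(x ** 2))

best_x, best_cost, history = ica_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
```

When ICA is coupled with GMDH, CNN, or LSTM, each country would encode one hyperparameter setting and `cost` would train and score a model, which makes every evaluation far more expensive than in this toy problem.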

Model evaluation

Area under the curve serves as an effectiveness indicator for evaluating the predictive performance of machine and deep learning algorithms (Bradley, 1997). The AUC values are determined by generating the ROC curve and subsequently calculating the area under the curve. Models with higher AUC values are regarded as having higher predictive accuracy. The model’s predictive performance is evaluated across five ranges of AUC value: fail (0.5-0.6), poor (0.6-0.7), fair (0.7-0.8), good (0.8-0.9), and excellent (0.9-1.0) (Akay, 2021; Zzaman et al., 2021). A value below 0.5 is considered to indicate inconsistency (Swets, 1988). The calculation of AUC is applied to the training and testing datasets, and the results are referred to as the success rate and prediction rate, respectively (Arora et al., 2021). The success rate represents the model’s capacity to accurately represent what is observed, while the prediction rate demonstrates the model’s effectiveness in generating accurate predictions (Arabameri et al., 2019; Chen et al., 2019).
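The AUC calculation and the five-level interpretation can be sketched as follows. The labels and scores are hypothetical, the area is obtained with the trapezoidal rule over the ROC points, and the treatment of boundary values (lower bound inclusive) is an assumption, since the ranges quoted above share their end points.

```python
import numpy as np

def roc_auc(labels, scores):
    """Area under the ROC curve via the trapezoidal rule."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    tpr, fpr = [0.0], [0.0]
    # Sweep thresholds from the highest score to the lowest.
    for t in np.unique(scores)[::-1]:
        predicted = scores >= t
        tpr.append((predicted & labels).sum() / labels.sum())
        fpr.append((predicted & ~labels).sum() / (~labels).sum())
    tpr, fpr = np.array(tpr), np.array(fpr)
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

def auc_category(auc):
    """Map an AUC value to the qualitative classes listed in the text."""
    if auc < 0.5:
        return "inconsistent"
    for upper, name in [(0.6, "fail"), (0.7, "poor"), (0.8, "fair"), (0.9, "good")]:
        if auc < upper:
            return name
    return "excellent"

# Hypothetical habitat (1) and non-habitat (0) points with model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

auc = roc_auc(labels, scores)
```

In this toy example, 8 of the 9 habitat/non-habitat pairs are ranked correctly, so the AUC is 8/9 (about 0.889), which falls in the "good" range.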

Results

Effects of habitat characteristics on raccoon dog habitat distribution

The FR model is used to assess the correlation between raccoon dog habitat and habitat characteristics and to identify the role of significant habitat variables. As shown in Table 1, TRI has a significant impact on the movement and foraging behavior of raccoon dogs. These animals have been identified to have a wide home range, and rugged terrain might influence their movement patterns. For the TRI factor, classes 0 and 0.01-3.36 have the greatest influence on raccoon dog habitat, with FR values of 1.979 and 1.627, respectively. Raccoon dog habitat is favored within the slope classes of 0-0.31 and 0.32-9.47 degrees, as shown by FR values higher than 1. For surface area, the class of 900 showed the highest FR value of 1.975. The relationship between the distribution of raccoon dog habitat and TWI can be observed in the class of 12.32-27.22, with an FR value of 2.035. For slope height and elevation, the highest FR value is associated with the first class. The highest FR value for valley depth is found in the last class of 122.99-712.71 m. Drainage density has a significant influence on raccoon dog habitat, with the highest FR value of 2.176 in the class of 8.37-101.45. The influence of the LS factor on raccoon dog habitat is shown by an FR value of 1.923 in the class of 0. In terms of distance to roads, the highest FR value is associated with the first class, that is, the category of very close distances to the road. For distance to drainage, raccoon dog habitat is significantly influenced by the class of 0-0.01, as indicated by the FR value of 2.397. The vegetation factor, represented by NDVI, influences raccoon dog habitat with an FR value of 2.093 for the class of 0.34-0.67. The highest FR value for NDWI is 2.155 in the class of 0.22-0.47. The highest FR value for the morphometric features factor, which represents the landscape characteristics, is associated with the ridge class, with an FR value of 4.312.

Map of potential raccoon dog habitat

The generation of a potential raccoon dog habitat map is performed using the FR method, utilizing the following formula:

(9)

\text{Potential map of raccoon dog habitat} = \sum FR_{i}; \quad i = 1, 2, 3, \dots, N

where FR_i refers to the FR value of the class of factor i at a given location, N represents the total number of influencing variables, and i denotes each factor selected to develop the model.

The map of potential raccoon dog habitat is also generated using GMDH-ICA, CNN-ICA, and LSTM-ICA, and divided into five classes (i.e., very low, low, moderate, high, and very high) using the quantile method (Rezaie et al., 2023). The quantile technique is able to effectively represent the location, variation, and skewness of the distribution of a dataset (Lodder & Hieftje, 1988). Fig. 4 presents the distribution map of raccoon dog habitat using FR, GMDH-ICA, CNN-ICA, and LSTM-ICA. Moreover, Fig. 5 illustrates the percentages of raccoon dog habitat in each class of the models.
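The FR summation and the five-class quantile split can be sketched together. The two small FR rasters below are hypothetical stand-ins for the 14 factor layers, in which each pixel already holds the FR value of the class it belongs to, and the function names are chosen for illustration only.

```python
import numpy as np

def habitat_potential(fr_layers):
    """Per-pixel sum of the FR values of all factors."""
    return np.sum(np.stack(fr_layers), axis=0)

def quantile_classes(index_map, n_classes=5):
    """Assign classes 1..n_classes so each holds about the same number of pixels."""
    inner_quantiles = np.linspace(0, 1, n_classes + 1)[1:-1]  # 0.2, 0.4, 0.6, 0.8
    edges = np.quantile(index_map, inner_quantiles)
    return np.digitize(index_map, edges, right=True) + 1

# Hypothetical FR rasters for two factors on a 2x5 grid.
fr_a = np.array([[0.5, 0.6, 0.7, 0.8, 0.9],
                 [1.0, 1.1, 1.2, 1.3, 1.4]])
fr_b = fr_a.copy()

potential = habitat_potential([fr_a, fr_b])
classes = quantile_classes(potential)  # 1 = very low ... 5 = very high
```

With ten distinct index values, each of the five classes receives exactly two pixels, which is the equal-count behavior that distinguishes the quantile method from equal-interval classification.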

The evaluation of the predictive accuracy of models is determined by AUC analysis. In the training step, the AUC values for the FR, CNN-ICA, LSTM-ICA, and GMDH-ICA models are 0.775, 0.763, 0.759, and 0.729, respectively (Fig. 6). During the validation step, the FR, CNN-ICA, LSTM-ICA, and GMDH-ICA models achieve AUC values of 0.762, 0.757, 0.754, and 0.727, respectively.

Discussion

The maps of predicted raccoon dog habitat distribution, based on the FR, GMDH-ICA, CNN-ICA, and LSTM-ICA models, all reveal a similar pattern. The distribution of raccoon dog habitat is sparser on the west coast of South Korea. The expansion of their range is primarily attributed to their movement out of their established territory caused by insufficient food resources and disruptions from human activity (Jeong et al., 2017). Raccoon dogs spend a lot of time in wetland areas (during spring and summer) and on the mainland near the sea (Dahl & Åhlén, 2019; Melis et al., 2015). Moreover, raccoon dog habitat is correlated with vegetation density, as represented by the NDVI class of 0.34-0.67. NDVI values between 0.25 and 0.55 indicate prairie, grassland, and farmland, while values >0.55 may represent forests and woodland areas (Ghebrezgabher et al., 2020). Prior studies have shown that raccoon dogs residing in rural regions exhibit a preference for forests and grasslands as their primary habitats (Jeong et al., 2017). They tend to be common in wide-open spaces, farmland, along lakeshores, and at low altitudes (<300 m); however, individuals have been found as high as 800 m (Melis et al., 2007). The drainage distance class of 0 m had a strong relationship with raccoon dog habitat. The existence of drainage ditches indicates that water bodies are a significant feature of the raccoon dogs’ natural environment (Süld et al., 2017; Sutor & Schwarz, 2012). Moreover, these medium-sized canids have been observed to prefer smaller home ranges near human settlements (Saeki et al., 2007).

The deep learning approach is based on the principle that a data-driven model can be constructed from a variety of interconnected layers of computation (Sarker, 2021). The selection of GMDH, CNN, and LSTM for modeling raccoon dog habitat distribution, coupled with ICA for hyperparameter optimization, enables us to capture nuanced patterns in the habitat distribution data (Tien Bui et al., 2018). These methods offer a well-rounded approach, enhancing accuracy and enabling more precise raccoon dog habitat mapping. During the validation step, the CNN-ICA, LSTM-ICA, and GMDH-ICA models achieve AUC values of 0.757, 0.754, and 0.727, respectively. The higher AUC values of CNN-ICA and LSTM-ICA relative to GMDH-ICA can be attributed to their greater capacity to capture intricate patterns within the data, which yields improved discriminatory power and higher predictive accuracy (Lee & Rezaie, 2021).

The combination of GMDH, CNN, and LSTM with ICA provides several advantages and addresses challenges in mapping the distribution of raccoon dog habitats. One advantage is the effective use of machine learning and deep learning techniques, which enhances the prediction accuracy of the distribution of raccoon dog habitats. Nevertheless, the algorithms have critical drawbacks. One limitation is their reliance on an additional approach for hyperparameter tuning. Future research can explore and compare different hyperparameter tuning methods, such as the gravitational search algorithm, shuffled frog leaping algorithm, or differential evolution, to identify the most efficient approach for optimizing model hyperparameters. Moreover, the selection of suitable factors that influence habitat potential is hindered by the lack of an integrated framework of recommendations. Thus, it is important to devote attention to the factor selection process, because poorly chosen factors can have an undue impact and lead to erroneous outcomes. Furthermore, the inclusion of additional spatial data, such as land use and land cover changes, climate data, and human activities, can provide a more comprehensive understanding of raccoon dog habitat preferences and distribution patterns.
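
To make the suggestion about alternative tuners concrete, the Python sketch below shows what a differential evolution search over two hyperparameters could look like. It is a minimal DE/rand/1/bin loop written for illustration, not the ICA procedure used in this study. The objective `toy_validation_loss`, the two hyperparameters (learning rate and dropout), their search bounds, and the location of the optimum are all hypothetical stand-ins for a real validation loss such as 1 - AUC.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, generations=100,
                           f=0.8, cr=0.9, seed=0):
    """Minimal DE/rand/1/bin minimizer (illustrative, not the study's optimizer)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fitness = [objective(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for d in range(dim):
                if rng.random() < cr or d == forced:
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the search range
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_fitness = objective(trial)
            if trial_fitness <= fitness[i]:  # greedy selection
                pop[i], fitness[i] = trial, trial_fitness
    best = min(range(pop_size), key=lambda k: fitness[k])
    return pop[best], fitness[best]


def toy_validation_loss(params):
    """Hypothetical stand-in for 1 - AUC; minimum at (0.01, 0.3) by construction."""
    learning_rate, dropout = params
    return (learning_rate - 0.01) ** 2 + (dropout - 0.3) ** 2


best_params, best_loss = differential_evolution(
    toy_validation_loss, bounds=[(0.0001, 0.1), (0.0, 0.8)])
print(best_params, best_loss)
```

In practice, the objective would train a model with the candidate hyperparameters and return its validation error, so each evaluation is expensive; comparing tuners therefore comes down to how few evaluations each needs to reach a given AUC.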

Conclusion

The current study advances our understanding of raccoon dog habitat mapping by using the FR method and optimized machine learning and deep learning approaches (i.e., GMDH-ICA, CNN-ICA, and LSTM-ICA). The models’ predictive ability and robustness were evaluated using the AUC, which confirmed their effectiveness. During the validation step, CNN-ICA achieves the highest AUC among the ICA-optimized models (0.757), while the FR model reaches 0.762. The integration of geospatial technologies allowed the investigation of complex relationships between raccoon dog habitat and 14 ecological factors. This approach presents a comprehensive perspective on the influence of ecological factors on habitat selection, thereby contributing helpful information for the development of conservation strategies.

Acknowledgement

This research was supported by the Basic Research Project of the Korea Institute of Geoscience and Mineral Resources (KIGAM) and the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2023R1A2C1003095).

Conflict of interest

The authors declare that they have no competing interests.

References

Abdollahi, M., Isazadeh, A., & Abdollahi, D. (2013) Imperialist competitive algorithm for solving systems of nonlinear equations. Computers & Mathematics with Applications, 65, 1894-1908 .

Ahearn, D.S., Sheibley, R.W., Dahlgren, R.A., Anderson, M., Johnson, J., & Tate, K.W. (2005) Land use and land cover influence on water quality in the last free-flowing river draining the western Sierra Nevada, California Journal of Hydrology, 313, 234-247 .

Ahmed, R., Kumar, P., & Rani, M. (2021) Introduction to challenges and future directions in remote sensing and GIScience. In: Kumar, P., Sajjad, H., Chaudhary, B.S., Rawat, J.S., & Rani, M. (Eds.), Remote Sensing and GIScience. Springer, pp. 3-7.

Akay, H. (2021) Flood hazards susceptibility mapping using statistical, fuzzy logic, and MCDM methods Soft Computing, 25, 9325-9346 .

Althuwaynee, O.F., Pradhan, B., & Lee, S. (2016) A novel integrated model for assessing landslide susceptibility mapping using CHAID and AHP pair-wise comparison International Journal of Remote Sensing, 37, 1190-1209 .

Alzubaidi, L., Zhang, J., Humaidi, A.J., Al-Dujaili, A., Duan, Y., Al-Shamma, O., et al. (2021) .

Arabameri, A., Rezaei, K., Cerda, A., Lombardo, L., & Rodrigo-Comino, J. (2019) GIS-based groundwater potential mapping in Shahroud plain, Iran. A comparison among statistical (bivariate and multivariate), data mining and MCDM approaches Science of The Total Environment, 658, 160-177 .

Arabameri, A., Rezaei, K., Pourghasemi, H.R., Lee, S., & Yamani, M. (2018) .

Arora, A., Arabameri, A., Pandey, M., Siddiqui, M.A., Shukla, U.K., Bui, D.T., et al. (2021) Optimization of state-of-the-art fuzzy-metaheuristic ANFIS-based machine learning models for flood susceptibility prediction mapping in the Middle Ganga Plain, India Science of The Total Environment, 750, 141565 .

Asmare, D. (2023) Application and validation of AHP and FR methods for landslide susceptibility mapping around Choke Mountain, Northwestern Ethiopia Scientific African, 19, e01470 .

Atashpaz-Gargari, E., & Lucas, C. (2007) Imperialist competitive algorithm: an algorithm for optimization inspired by imperialistic competition. Paper presented at 2007 IEEE Congress on Evolutionary Computation, Singapore.

Barve, S., Webster, J.M., & Chandra, R. (2023) Reef-insight: a framework for reef habitat mapping with clustering methods using remote sensing Information, 14, 373 .

Beasom, S.L., Wiggers, E.P., & Giardino, J.R. (1983) A technique for assessing land surface ruggedness The Journal of Wildlife Management, 47, 1163-1166 .

Bonnot, N., Morellet, N., Verheyden, H., Cargnelutti, B., Lourtet, B., Klein, F., et al. (2013) Habitat use under predation risk: hunting, roads and human dwellings influence the spatial behaviour of roe deer. European Journal of Wildlife Research, 185-193 .

Bradley, A.P. (1997) The use of the area under the ROC curve in the evaluation of machine learning algorithms Pattern Recognition, 30, 1145-1159 .

Burnett, B.N., Meyer, G.A., & McFadden, L.D. (2008) Aspect-related microclimatic influences on slope forms and processes, northeastern Arizona Journal of Geophysical Research, 113, F03002 .

Cai, T., Huettmann, F., & Guo, Y. (2014) Using stochastic gradient boosting to infer stopover habitat selection and distribution of Hooded Cranes Grus monacha during spring migration in Lindian, Northeast China PLoS One, 9, e89913 .

Červinka, J., Riegert, J., Grill, S., & Šálek, M. (2015) Large-scale evaluation of carnivore road mortality: the effect of landscape and local scale characteristics Mammal Research, 60, 233-243 .

Chen, W., Panahi, M., Tsangaratos, P., Shahabi, H., Ilia, I., Panahi, S., et al. (2019) Applying population-based evolutionary algorithms and a neuro-fuzzy system for modeling landslide susceptibility Catena, 172, 212-231 .

Choi, J.K., Oh, H.J., Koo, B.J., Ryu, J.H., & Lee, S. (2011a) Crustacean habitat potential mapping in a tidal flat using remote sensing and GIS Ecological Modelling, 222, 1522-1533 .

Choi, J.K., Oh, H.J., Koo, B.J., Ryu, J.H., & Lee, S. (2011b) Spatial polychaeta habitat potential mapping using probabilistic models Estuarine, Coastal and Shelf Science, 93, 98-105 .

Dahl, F., & Åhlén, P.A. (2019) Nest predation by raccoon dog Nyctereutes procyonoides in the archipelago of northern Sweden Biological Invasions, 21, 743-755 .

Dargan, S., Kumar, M., Ayyagari, M.R., & Kumar, G. (2020) A survey of deep learning and its applications: a new paradigm to machine learning Archives of Computational Methods in Engineering, 27, 1071-1092 .

Diao, Y., Zhao, Q., Weng, Y., Huang, Z., Wu, Y., Gu, B., et al. (2022) Predicting current and future species distribution of the raccoon dog (Nyctereutes procyonoides) in Shanghai, China Landscape and Urban Planning, 228, 104581 .

Dilts, T.E., Blum, M.E., Shoemaker, K.T., Weisberg, P.J., & Stewart, K.M. (2023) Improved topographic ruggedness indices more accurately model fine-scale ecological patterns Landscape Ecology, 38, 1395-1410 .

Dodangeh, E., Panahi, M., Rezaie, F., Lee, S., Bui, D.T., & Lee, C.W. (2020) Novel hybrid intelligence models for flood-susceptibility prediction: meta optimization of the GMDH and SVR models with the genetic algorithm and harmony search Journal of Hydrology, 590, 125423 .

Du, P., Bai, X., Tan, K., Xue, Z., Samat, A., Xia, J., et al. (2020) .

Ellerman, J.R., & Morrison-Scott, T.C.S. (1951) Checklist of Palaearctic and Indian Mammals, 1758-1946. Trustees of the British Museum.

Farlow, S.J. (1984) Self-Organizing Methods in Modeling: GMDH-Type Algorithms. Marcel Dekker.

Fernald, A., Tidwell, V., Rivera, J., Rodríguez, S., Guldan, S., Steele, C., et al. (2012) Modeling sustainability of water, environment, livelihood, and culture in traditional irrigation communities and their linked watersheds Sustainability, 4, 2998-3022 .

Ghebrezgabher, M.G., Yang, T., Yang, X., & Sereke, T.E. (2020) Assessment of NDVI variations in responses to climate change in the Horn of Africa The Egyptian Journal of Remote Sensing and Space Science, 23, 249-261 .

Graves, A. (2012) Long short-term memory. In: Graves, A. (Ed.), Supervised Sequence Labelling with Recurrent Neural Networks. Springer, pp. 37-45.

Henderson, E.B., & Hoganson, H.M. (2021) A learning heuristic for integrating spatial and temporal detail in forest planning Natural Resource Modeling, 34, e12299 .

Hong, Y., Kim, K.S., Lee, H., & Min, M.S. (2013) Population genetic study of the raccoon dog (Nyctereutes procyonoides) in South Korea using newly developed 12 microsatellite markers Genes & genetic systems, 88, 69-76 .

Hong, Y.J., Kim, K.S., Min, M.S., & Lee, H. (2018) Population structure of the raccoon dog (Nyctereutes procyonoides) using microsatellite loci analysis in South Korea: implications for disease management The Journal of Veterinary Medical Science, 80, 1631-1638 .

Huang, F., Cao, Z., Guo, J., Jiang, S.H., Li, S., & Guo, Z. (2020) Comparisons of heuristic, general statistical and machine learning models for landslide susceptibility prediction and mapping Catena, 191, 104580 .

Ismail, A.A., Wood, T., & Bravo, H.C. (2018) Improving long-horizon forecasts with expectation-biased LSTM networks arXiv, 1804.06776 .

Ivakhnenko, A.G., & Ivakhnenko, G.A. (2000) Problems of further development of the group method of data handling algorithms. Part I Pattern Recognition and Image Analysis, 10, 187-194 .

Ivakhnenko, A.G. (1970) Heuristic self-organization in problems of engineering cybernetics Automatica, 6, 207-219 .

Ivakhnenko, A.G. (1978) The group method of data handling in long-range forecasting Technological Forecasting and Social Change, 12, 213-227 .

Jeong, W., Kim, D.H., Yoon, H., Kim, H.J., Kang, Y.M., Moon, O.K., et al. (2017) Home range differences by habitat type of raccoon dogs Nyctereutes procyonoides (Carnivora: Canidae) Journal of Asia-Pacific Biodiversity, 10, 349-354 .

Jie, C., Jiawei, L., Shulin, W., & Sheng, Y. (2018) Feature selection in machine learning: a new perspective Neurocomputing, 300, 70-79 https://doi.org/10.1016/j.neucom.2017.11.077.

Khosravi, K., Shahabi, H., Pham, B.T., Adamowski, J., Shirzadi, A., Pradhan, B., et al. (2019) A comparative assessment of flood susceptibility modeling using Multi-Criteria Decision-Making Analysis and machine learning methods Journal of Hydrology, 573, 311-323 https://doi.org/10.1016/j.jhydrol.2019.03.073.

Kong, W., Dong, Z.Y., Jia, Y., Hill, D.J., Xu, Y., & Zhang, Y. (2019) Short-term residential load forecasting based on LSTM recurrent neural network IEEE Transactions on Smart Grid, 10, 841-851 https://doi.org/10.1109/TSG.2017.2753802.

Kopecký, M., Macek, M., & Wild, J. (2021) Topographic Wetness Index calculation guidelines based on measured soil moisture and plant species composition Science of The Total Environment, 757, 143785 https://doi.org/10.1016/j.scitotenv.2020.143785.

Kusuma, W.L., Chih-Da, W., Yu-Ting, Z., Hapsari, H.H., & Muhamad, J.L. (2019) PM2.5 pollutant in Asia-a comparison of metropolis cities in Indonesia and Taiwan International Journal of Environmental Research and Public Health, 16, 4924 .

Lecun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998) Gradient-based learning applied to document recognition Proceedings of the IEEE, 86, 2278-2324 .

Lee, S., & Pradhan, B. (2007) Landslide hazard mapping at Selangor, Malaysia using frequency ratio and logistic regression models Landslides, 4, 33-41 .

Lee, S., & Rezaie, F. (2021) Application of statistical and machine learning techniques for habitat potential mapping of Siberian roe deer in South Korea Proceedings of the National Institute of Ecology of the Republic of Korea, 2, 1-14 .

Lee, S., & Sambath, T. (2006) Landslide susceptibility mapping in the Damrei Romel area, Cambodia using frequency ratio and logistic regression models Environmental Geology, 50, 847-855 .

Lee, S., & Talib, J.A. (2005) Probabilistic landslide susceptibility and factor effect analysis Environmental Geology, 47, 982-990 .

Lee, S., Park, I., Koo, B.J., Ryu, J.H., Choi, J.K., & Woo, H.J. (2013) Macrobenthos habitat potential mapping using GIS-based artificial neural network models Marine Pollution Bulletin, 67, 177-186 .

Lemly, A.D. (1994) Agriculture and wildlife: ecological implications of subsurface irrigation drainage Journal of Arid Environments, 28, 85-94 .

Li, X., Xiong, H., Li, X., Wu, X., Zhang, X., Liu, J., et al. (2022) Interpretable deep learning: interpretation, interpretability, trustworthiness, and beyond Knowledge and Information Systems, 64, 3197-3234 .

Liu, C., Newell, G., White, M., & Bennett, A.F. (2018) Identifying wildlife corridors for the restoration of regional habitat connectivity: a multispecies approach and comparison of resistance surfaces PLoS One, 13, e0206071 .

Lodder, R.A., & Hieftje, G.M. (1988) Quantile analysis: a method for characterizing data distributions Applied Spectroscopy, 42, 1512-1520 .

Marino, A., & Rodríguez, V. (2022) Competitive exclusion and herbivore management in a context of livestock-wildlife conflict Austral Ecology, 47, 1208-1221 .

Melis, C., Herfindal, I., Dahl, F., & Åhlén, P.A. (2015) Individual and temporal variation in habitat association of an alien carnivore at its invasion front PLoS One, 10, e0122492 .

Melis, C., Nordgård, H., Herfindal, I., Kauhala, K., Åhlén, P.A., Strann, K.B., et al. (2007) Raccoon Dogs in Norway - Potential Expansion Rate, Distribution Area and Management Implications. Norges Teknisk-Naturvitenskapelige Universitet.

Mohanasundaram, S., Baghel, T., Thakur, V., Udmale, P., & Shrestha, S. (2022) .

Mohebbian, M.R., Dinh, A., Wahid, K., & Alam, M.S. (2020) Blind, cuff-less, calibration-free and continuous blood pressure estimation using optimized inductive group method of data handling Biomedical Signal Processing and Control, 57, 101682 .

Mulashani, A.K., Shen, C., Nkurlu, B.M., Mkono, C.N., & Kawamala, M. (2022) Enhanced group method of data handling (GMDH) for permeability prediction based on the modified Levenberg Marquardt technique from well log data Energy, 239, 121915 .

Murmu, S., & Biswas, S. (2015) Application of fuzzy logic and neural network in crop classification: a review Aquatic Procedia, 4, 1203-1210 .

Oh, H.J., Syifa, M., Lee, C.W., & Lee, S. (2019) Ruditapes philippinarum habitat mapping potential using SVM and Naïve Bayes Journal of Coastal Research, 90, 41-48 .

Okabe, F., & Agetsuma, N. (2007) Habitat use by introduced raccoons and native raccoon dogs in a deciduous forest of Japan Journal of Mammalogy, 88, 1090-1097 .

Ookura, S., & Mori, H. (2020) An efficient method for wind power generation forecasting by LSTM in consideration of overfitting prevention IFAC-PapersOnLine, 53, 12169-12174 .

Panahi, M., Khosravi, K., Ahmad, S., Panahi, S., Heddam, S., Melesse, A.M., et al. (2021) Cumulative infiltration and infiltration rate prediction using optimized deep learning algorithms: a study in Western Iran. Journal of Hydrology: Regional Studies, 35, 100825 .

Park, I., Choi, J., Lee, M.J., & Lee, S. (2012) Application of an adaptive neuro-fuzzy inference system to ground subsidence hazard mapping Computers & Geosciences, 48, 228-238 .

Quarantelli, E.L., Lagadec, P., & Boin, A. (2007) A heuristic approach to future disasters and crises: new, old, and in-between types. In: Rodríguez, H., Quarantelli, E.L., & Dynes, R.R. (Eds.), Handbook of Disaster Research. Springer, pp. 16-41.

Reich, Y., & Barai, S.V. (1999) Evaluating machine learning models for engineering problems Artificial Intelligence in Engineering, 13, 257-272 .

Rezaie, F., Panahi, M., Bateni, S.M., Jun, C., Neale, C.M.U., & Lee, S. (2022a) Novel hybrid models by coupling support vector regression (SVR) with meta-heuristic algorithms (WOA and GWO) for flood susceptibility mapping Natural Hazards, 114, 1247-1283 .

Rezaie, F., Panahi, M., Bateni, S.M., Kim, S., Lee, J., Lee, J., et al. (2023) Spatial modeling of geogenic indoor radon distribution in Chungcheongnam-do, South Korea using enhanced machine learning algorithms Environment international, 171, 107724 .

Rezaie, F., Panahi, M., Lee, J., Lee, J., Kim, S., Yoo, J., et al. (2022b) Radon potential mapping in Jangsu-gun, South Korea using probabilistic and deep learning algorithms Environmental pollution (Barking, Essex : 1987), 292(Pt B), 118385 .

Riley, S.J., DeGloria, S.D., & Elliot, R. (1999) A terrain ruggedness index that quantifies topographic heterogeneity Intermountain Journal of Sciences, 5, 23-27 .

Ruymgaart, F.H. (1981) A robust principal component analysis Journal of Multivariate Analysis, 11, 485-497 .

Sabzi, S., Pourdarbani, R., Rohban, M.H., Fuentes-Penna, A., Hernández-Hernández, J.L., & Hernández-Hernández, M. (2021) Classification of cucumber leaves based on nitrogen content using the hyperspectral imaging technique and majority voting Plants (Basel, Switzerland), 10, 898 .

Saeki, M., Johnson, P.J., & Macdonald, D.W. (2007) Movements and habitat selection of raccoon dogs (Nyctereutes procyonoides) in a mosaic landscape Journal of Mammalogy, 88, 1098-1111 .

Sarker, I.H. (2021) .

Schwemmer, P., Weiel, S., & Garthe, S. (2021) .

Shi, H., Xu, M., & Li, R. (2018) Deep learning for household load forecasting - a novel pooling deep RNN IEEE Transactions on Smart Grid, 9, 5271-5280 .

Sidorovich, V.E., Solovej, I.A., Sidorovich, A.A., & Dyman, A.A. (2008) Seasonal and annual variation in the diet of the raccoon dog Nyctereutes procyonoides in northern Belarus: the role of habitat type and family group Acta Theriologica, 53, 27-38 .

Singh, R., Kainthola, A., & Singh, T.N. (2012) Estimation of elastic constant of rocks using an ANFIS approach Applied Soft Computing, 12, 40-45 .

Sreedevi, P.D., Sreekanth, P.D., Khan, H.H., & Ahmed, S. (2013) Drainage morphometry and its influence on hydrology in an semi arid region: using SRTM data and GIS Environmental Earth Sciences, 70, 839-848 .

Süld, K., Saarma, U., & Valdmann, H. (2017) Home ranges of raccoon dogs in managed and natural areas PLoS One, 12, e0171805 .

Sutor, A., & Schwarz, S. (2012) Home ranges of raccoon dogs (Nyctereutes procyonoides, Gray, 1834) in Southern Brandenburg, Germany European Journal of Wildlife Research, 69, 85-97 .

Sutor, A., Kauhala, K., & Ansorge, H. (2010) Diet of the raccoon dog Nyctereutes procyonoides - a canid with an opportunistic foraging strategy Acta Theriologica, 55, 165-176 .

Swets, J.A. (1988) Measuring the accuracy of diagnostic systems Science (New York, N.Y.), 240, 1285-1293 .

Talukdar, S., & Pal, S. (2019) Effects of damming on the hydrological regime of Punarbhaba river basin wetlands Ecological Engineering, 135, 61-74 .

Teng, J., Xia, S., Liu, Y., Yu, X., Duan, H., Xiao, H., et al. (2021) Assessing habitat suitability for wintering geese by using Normalized Difference Water Index (NDWI) in a large floodplain wetland, China Ecological Indicators, 122, 107260 .

Thi Ngo, P.T., Panahi, M., Khosravi, K., Ghorbanzadeh, O., Kariminejad, N., Cerda, A., et al. (2021) Evaluation of deep learning algorithms for national scale landslide susceptibility mapping of Iran Geoscience Frontiers, 12, 505-519 .

Tien Bui, D., Shahabi, H., Shirzadi, A., Chapi, K., Hoang, N.D., Pham, B.T., et al. (2018) A novel integrated approach of relevance vector machine optimized by imperialist competitive algorithm for spatial modeling of shallow landslides Remote Sensing, 10, 1538 .

Traba, J., Iranzo, E.C., Carmona, C.P., & Malo, J.E. (2017) Realised niche changes in a native herbivore assemblage associated with the presence of livestock Oikos, 126, 1400-1409 .

Tran, T.T.K., Bateni, S.M., Rezaie, F., Panahi, M., Jun, C., Trauernicht, C., et al. (2023) Enhancing predictive ability of optimized group method of data handling (GMDH) method for wildfire susceptibility mapping Agricultural and Forest Meteorology, 339, 109587 .

Van Pham, T., Trinh, M.T., Gray, R.J., Cao, L.N., Van Nguyen, T., Van Nguyen, M., et al. (2023) .

Wang, D., Laffan, S.W., Liu, Y., & Wu, L. (2010) Morphometric characterisation of landform from DEMs International Journal of Geographical Information Science, 24, 305-326 .

Wang, Y., Hong, H., Chen, W., Li, S., Panahi, M., Khosravi, K., et al. (2019) Flood susceptibility mapping in Dingnan County (China) using adaptive neuro-fuzzy inference system with biogeography based optimization and imperialistic competitive algorithm Journal of environmental management, 247, 712-729 .

Wilson, J.P. (2018) Environmental Applications of Digital Terrain Modeling John Wiley & Sons

Won, P.H. (1967) Illustrated Encyclopedia of Fauna and Flora of Korea, Volume 7 Mammals: Samhwabook .

Xie, Y., Yu, X., Ng, N.C., Li, K., & Fang, L. (2018) Exploring the dynamic correlation of landscape composition and habitat fragmentation with surface water quality in the Shenzhen river and deep bay cross-border watershed, China Ecological Indicators, 90, 231-246 .

Yalcin, A., Reis, S., Aydinoglu, A.C., & Yomralioglu, T. (2011) A GIS-based comparative study of frequency ratio, analytical hierarchy process, bivariate statistics and logistics regression methods for landslide susceptibility mapping in Trabzon, NE Turkey Catena, 85, 274-287 .

Yamashita, R., Nishio, M., Do, R.K.G., & Togashi, K. (2018) Convolutional neural networks: an overview and application in radiology Insights into Imaging, 9, 611-629 .

Yang, D.K., Lee, S.H., Kim, H.H., Kim, J.T., Ahn, S., & Cho, I.S. (2017) Detection of viral infections in wild Korean raccoon dogs (Nyctereutes procyonoides koreensis) Korean Journal of Veterinary Research, 57, 209-214 .

Yu, D., Wang, H., Chen, P., & Wei, Z. (2014) Mixed pooling for convolutional neural networks. In: Miao, D., Pedrycz, W., Ślȩzak, D., Peters, G., Hu, Q., & Wang, R. (Eds.), Rough Sets and Knowledge Technology. Cham: Springer, pp. 364-374.

Zabihi, K., Paige, G.B., Hild, A.L., Miller, S.N., Wuenschel, A., & Holloran, M.J. (2017) A fuzzy logic approach to analyse the suitability of nesting habitat for greater sage-grouse in western Wyoming, Journal of Spatial Science, 62, 215-234 .

Zafar, A., Aamir, M., Mohd Nawi, N., Arshad, A., Riaz, S., Alruban, A., et al. (2022) A comparison of pooling methods for convolutional neural networks Applied Sciences, 12, 8643 .

Zhang, C.L., & Wu, J. (2019) Improving CNN linear layers with power mean non-linearity Pattern Recognition, 89, 12-21 .

Zhang, M., He, C., Gu, X., Liatsis, P., & Zhu, B. (2013) D-GMDH: a novel inductive modelling approach in the forecasting of the industrial economy Economic Modelling, 30, 514-520 .

Zhang, Z., Gong, J., Liu, J., & Chen, F. (2022) A fast two-stage hybrid meta-heuristic algorithm for robust corridor allocation problem Advanced Engineering Informatics, 53, 101700 .

Zhisheng, A., Guoxiong, W., Jianping, L., Youbin, S., Yimin, L., Weijian, Z., et al. (2015) Global monsoon dynamics and climate change Annual review of earth and planetary sciences, 43, 29-77 .

Zzaman, R.U., Nowreen, S., Billah, M., & Islam, A.S. (2021) Flood hazard mapping of Sangu River basin in Bangladesh using multi-criteria analysis of hydro-geomorphological factors Journal of Flood Risk Management, 14, e12715 .

Figures and Tables

Fig. 1

Study area and data on the distribution of raccoon dog habitats.

Fig. 2

Influencing factors considered for mapping potential raccoon dog habitat: (A) elevation, (B) slope, (C) valley depth, (D) TWI, (E) TRI, (F) slope height, (G) surface area, (H) LS factor, (I) NDVI, (J) NDWI, (K) distance to drainage, (L) drainage density, (M) distance to roads, (N) morphometric features. TWI, topographic wetness index; TRI, terrain ruggedness index; LS factor, slope length and steepness factor; NDVI, normalized difference vegetation index; NDWI, normalized difference water index.

Fig. 3

Flowchart of the methodology developed for detecting potential raccoon dog habitat. TWI, topographic wetness index; TRI, terrain ruggedness index; LS factor, slope length and steepness factor; NDVI, normalized difference vegetation index; NDWI, normalized difference water index; FR, frequency ratio; GMDH, group method of data handling; CNN, convolutional neural network; LSTM, long short-term memory; ICA, imperialist competitive algorithm; AUC, area under the curve.

Fig. 4

Raccoon dog habitat potential maps produced using the (A) FR, (B) CNN-ICA, (C) LSTM-ICA, and (D) GMDH-ICA models. FR, frequency ratio; CNN, convolutional neural network; ICA, imperialist competitive algorithm; LSTM, long short-term memory; GMDH, group method of data handling.

Fig. 5

Comparing the predictive performance of models using (A) training and (B) testing datasets. FR, frequency ratio; AUC, area under the curve; CNN, convolutional neural network; ICA, imperialist competitive algorithm; LSTM, long short-term memory; GMDH, group method of data handling.

Fig. 6

Percentage of raccoon dog habitat locations within each class of habitat potential map. FR, frequency ratio; GMDH, group method of data handling; ICA, imperialist competitive algorithm; CNN, convolutional neural network; LSTM, long short-term memory.

Table 1

Spatial relationship between influencing factors and habitat distributions using FR model

Factor	Classes	Percentage of pixels	Percentage of habitats	FR
Elevation (m)	0-56	0.201	0.350	1.737
	56.1-135	0.201	0.226	1.123
	135.1-246	0.200	0.213	1.062
	246.1-433	0.199	0.133	0.667
	433.1-1,900	0.198	0.079	0.399
Slope (degree)	0-0.31	0.217	0.429	1.975
	0.32-9.47	0.197	0.322	1.634
	9.48-17.11	0.195	0.175	0.895
	17.12-24.44	0.196	0.051	0.260
	24.45-77.89	0.194	0.023	0.118
Valley depth (m)	0-13.97	0.192	0.090	0.468
	13.98-33.54	0.211	0.088	0.419
	33.55-64.28	0.212	0.194	0.915
	64.29-122.98	0.189	0.255	1.352
	122.99-712.71	0.198	0.373	1.891
TWI	1.87-4.95	0.171	0.020	0.115
	4.96-5.84	0.208	0.075	0.358
	5.85-7.34	0.220	0.167	0.757
	7.35-12.31	0.202	0.334	1.657
	12.32-27.22	0.199	0.405	2.035
TRI	0	0.216	0.426	1.979
	0.01-3.36	0.197	0.320	1.627
	3.37-5.97	0.197	0.169	0.860
	5.98-8.95	0.195	0.057	0.291
	8.96-95.12	0.195	0.027	0.137
Slope height (m)	0-2.81	0.143	0.335	2.334
	2.82-8.43	0.223	0.337	1.513
	8.44-19.67	0.286	0.242	0.847
	19.68-50.59	0.185	0.056	0.304
	50.60-716.71	0.163	0.029	0.181
Surface area	900	0.217	0.429	1.975
	900.01-926.59	0.197	0.322	1.634
	926.60-953.18	0.195	0.175	0.895
	953.19-1,006.36	0.196	0.051	0.260
	1,006.37-4,290.24	0.194	0.023	0.118
LS factor	0	0.197	0.379	1.923
	0.01-7.22	0.236	0.334	1.419
	7.23-14.44	0.199	0.133	0.667
	14.45-22.86	0.193	0.078	0.403
	22.87-306.77	0.175	0.076	0.433
NDVI	–1-0.33	0.196	0.284	1.449
	0.34-0.67	0.199	0.416	2.093
	0.68-0.76	0.181	0.162	0.895
	0.77-0.81	0.186	0.074	0.398
	0.82-1	0.239	0.065	0.271
NDWI	–1-0.21	0.193	0.264	1.363
	0.22-0.47	0.196	0.423	2.155
	0.48-0.54	0.177	0.175	0.985
	0.55-0.62	0.194	0.062	0.321
	0.63-1	0.239	0.077	0.320
Distance to drainage (m)	0-0.01	0.174	0.418	2.397
	0.02-65.55	0.208	0.267	1.283
	65.56-131.11	0.212	0.165	0.776
	131.12-262.21	0.204	0.090	0.439
	262.22-16,716.06	0.201	0.061	0.303
Distance to roads (m)	0-0.01	0.209	0.462	2.213
	0.02-28.45	0.244	0.373	1.526
	28.46-99.58	0.192	0.101	0.524
	99.59-241.83	0.179	0.044	0.244
	241.84-3,627.52	0.175	0.020	0.116
Drainage density	0-0.01	0.191	0.056	0.295
	0.02-2.79	0.213	0.103	0.482
	2.80-5.17	0.215	0.192	0.892
	5.18-8.36	0.200	0.255	1.275
	8.37-101.45	0.181	0.394	2.176
Morphometric features	Peak	0.225	0.334	1.483
	Ridge	0.001	0.006	4.312
	Pass	0.357	0.443	1.240
	Plane	0.004	0.005	1.339
	Channel	0.411	0.211	0.513
	Pit	0.0009	0.0007	0.6937

FR, frequency ratio; TWI, topographic wetness index; TRI, terrain ruggedness index; LS factor, slope length and steepness factor; NDVI, normalized difference vegetation index; NDWI, normalized difference water index.
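
The FR column in Table 1 appears to follow the usual frequency ratio definition, that is, the share of habitat locations falling in a class divided by the share of pixels in that class. As a minimal sketch under that assumption, the Python snippet below recomputes FR for the elevation classes from the rounded shares listed in the table. The function and variable names are illustrative, and small differences from the published values are expected because the table inputs are rounded to three decimals.

```python
def frequency_ratio(habitat_share, pixel_share):
    """FR of a class: share of habitat points divided by share of pixels."""
    if pixel_share <= 0:
        raise ValueError("pixel_share must be positive")
    return habitat_share / pixel_share


# Elevation rows of Table 1:
# (class, share of pixels, share of habitats, published FR).
elevation_rows = [
    ("0-56", 0.201, 0.350, 1.737),
    ("56.1-135", 0.201, 0.226, 1.123),
    ("135.1-246", 0.200, 0.213, 1.062),
    ("246.1-433", 0.199, 0.133, 0.667),
    ("433.1-1,900", 0.198, 0.079, 0.399),
]

recomputed = {name: frequency_ratio(habitats, pixels)
              for name, pixels, habitats, _ in elevation_rows}

for name, pixels, habitats, published in elevation_rows:
    print(f"{name:>12}  recomputed={recomputed[name]:.3f}  published={published:.3f}")
```

An FR above 1 means habitat locations are over-represented in a class relative to its area, and an FR below 1 means they are under-represented; for elevation, the ratio falls steadily from the lowest class to the highest.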