Background Aedes albopictus is the dominant mosquito species in residential areas of Shanghai. The many types of small water-holding containers found in residential areas provide abundant breeding sites for Aedes albopictus and increase the transmission risk of mosquito-borne diseases.
Objective To use random forest to predict the breeding of Aedes mosquitoes in small aquatic container habitats in two concentrated reconstruction communities in rural areas of Shanghai, and to understand the influence of associated environmental factors on the breeding of Aedes mosquitoes during urbanization.
Methods Small-scale habitat surveys of Aedes mosquitoes were carried out in two suburban concentrated reconstruction communities (Community A and Community B) in Shanghai, and the environment surrounding each habitat was recorded and analyzed in both communities. A habitat in which eggs, larvae, or pupae were found was recorded as positive. A spatial weight matrix was constructed on a household basis, and the global Moran's I index was used to carry out spatial autocorrelation analysis of the small-scale habitats and positive habitats in the two communities. A Moran's I greater than 0 indicates positive spatial correlation; a Moran's I less than 0 indicates negative spatial correlation; a Moran's I of 0 indicates a random spatial distribution. The P and Z values were considered together to explore the spatial distribution characteristics of small-scale habitats and positive habitats in the community environment. The random forest machine learning algorithm was used to classify and rank environment-related factors and to predict the breeding of Aedes mosquitoes in small aquatic habitats; the receiver operating characteristic (ROC) curve was used to evaluate model fit.
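The global Moran's I calculation described in the Methods can be illustrated with a minimal sketch. It assumes a binary, symmetric neighbor matrix and a Z score derived under the normality assumption with a two-sided P value; the abstract does not state which weighting scheme or inference method was actually used, so the function name `morans_i`, the toy weight matrix, and the toy counts are all hypothetical.

```python
import math

def morans_i(values, weights):
    """Global Moran's I with Z score (normality assumption) and two-sided P.

    values  : one observation per household (e.g. habitat count)
    weights : n x n spatial weight matrix, weights[i][j] > 0 if i and j are neighbors
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    denom = sum(d * d for d in dev)
    i_stat = (n / s0) * cross / denom

    # Expected value and variance of I under the null of no spatial autocorrelation
    expected = -1.0 / (n - 1)
    s1 = 0.5 * sum((weights[i][j] + weights[j][i]) ** 2
                   for i in range(n) for j in range(n))
    s2 = sum((sum(weights[i]) + sum(weights[j][i] for j in range(n))) ** 2
             for i in range(n))
    var = ((n * n * s1 - n * s2 + 3 * s0 * s0)
           / ((n * n - 1) * s0 * s0)) - expected ** 2

    z = (i_stat - expected) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided P from the standard normal
    return i_stat, z, p

# Hypothetical example: four households in a row, each adjacent to its neighbors
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
i_stat, z, p = morans_i([1, 2, 3, 4], chain)
print(f"I={i_stat:.3f}, Z={z:.2f}, P={p:.3f}")
```

With this reading, |Z| below 1.65 together with P above 0.1 would be taken as no evidence against a random spatial distribution.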
Results The environmental factors building location (χ²=23.35, P<0.001), open space (χ²=8.83, P=0.003), and presence of trees (χ²=11.02, P=0.001) were significantly associated with the positive rate of small-scale habitats. The spatial analysis showed that the global Moran's I index of small-scale habitats was −0.092 (Z=−1.09, P=0.274) in Community A and 0.034 (Z=0.52, P=0.602) in Community B, and the global Moran's I index of positive habitats was −0.092 (Z=−1.14, P=0.255) in Community A and 0.070 (Z=0.95, P=0.342) in Community B. Because the P values for both communities were greater than 0.1 and the Z values were between −1.65 and 1.65, both small-scale habitats and positive habitats were randomly distributed in space, and no significant spatial aggregation was found. For the random forest classification model fitted with the 10 most important features, the area under the curve (AUC) was 0.95, indicating a satisfactory fit. The importance ranking indicated that the household counts of small-scale habitats and of positive habitats were the most important factors for breeding.
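As an illustration of how an AUC value such as the one reported above is obtained, the sketch below computes the area under the ROC curve directly from class labels and predicted scores, using the equivalence between AUC and the probability that a randomly chosen positive habitat receives a higher score than a randomly chosen negative one. The function name `roc_auc` and the toy labels and scores are hypothetical; in practice the scores would be the positive-class probabilities produced by the fitted random forest.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5).

    labels : 1 for a positive habitat, 0 for a negative habitat
    scores : predicted probability (or any score) of being positive
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical example: two negative and two positive habitats
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
print(roc_auc(labels, scores))  # 0.75
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of positive and negative habitats.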
Conclusion The random forest model built from environmental factor indicators can be used to predict the breeding of Aedes mosquitoes in small-scale aquatic habitats, and provides a basis for the scientific prevention and control of mosquito breeding in the target area.