王亚, 应佳丽, 杨琛, 林涛, 金克峙. 应用聚类分析识别上海浦东新区道路交通事故模式[J]. 环境与职业医学, 2018, 35(12): 1106-1113. DOI: 10.13213/j.cnki.jeom.2018.18412
WANG Ya, YING Jia-li, YANG Chen, LIN Tao, JIN Ke-zhi. Application of cluster analysis in pattern recognition of road traffic crashes in Pudong New Area of Shanghai[J]. Journal of Environmental and Occupational Medicine, 2018, 35(12): 1106-1113. DOI: 10.13213/j.cnki.jeom.2018.18412

应用聚类分析识别上海浦东新区道路交通事故模式

Application of cluster analysis in pattern recognition of road traffic crashes in Pudong New Area of Shanghai

  • 摘要: 目的 应用聚类分析方法对道路交通事故进行分类,识别不同事故的发生模式,为制定适应不同事故模式的干预措施提供依据。

    方法 从上海市浦东新区交警事故处理记录数据库中调取自2010年1月1日—2016年12月31日期间共3 135起事故主要责任人信息。选取年龄、性别、时间、季度、天气、道路类型、路口路段、交通方式、事故原因9个变量作为分析变量,分别采取潜类别分析与系统聚类两种方法对交通事故进行聚类,分析聚类结果与伤害结局。

    结果 潜类别分析可识别更多事故模式类别,优于传统系统聚类方法,潜类别分析聚类结果将事故发生模式分为6类,分别命名为“青中年机动车公路组”“青中年客车一般道路组”“青中年夜间摩托客车无证酒驾组”“中老年电动车自行车组”“中老年早晚步行组”“青中年深夜机动车组”。各类别间伤害结局存在统计学差异(χ2=1 492.492,P < 0.05),且伤害结局与事故分类具有相关关系(r=0.568,P < 0.05)。各类别对健康结局的贡献以“中老年早晚步行组”最大,“中老年电动车自行车组”次之,“青中年客车一般道路组”最小。对比原始数据logistic回归模型与各类别logistic回归模型结果,发现事故被分类后,增加了新的伤害危险因素信息,且同一个自变量值在不同的事故模式中对伤害结局的贡献不同。

    结论 对于本研究所用特定数据库潜类别分析在道路交通事故发生模式识别的结果优于传统聚类分析。中老年步行、骑行电动自行车违规横过机动车道以及青中年夜间驾驶机动车为该地区伤害高风险变量组合。

     

    Abstract: Objective To classify road traffic crashes (RTCs) by cluster analysis, identify their occurrence patterns, and provide a basis for interventions targeted at specific RTC patterns.

    Methods Records of the primary responsible persons (PRPs) in 3 135 crashes occurring from January 1, 2010 to December 31, 2016 were retrieved from the traffic police accident-handling database of Pudong New Area, Shanghai. Nine variables were selected for analysis: PRP age and gender, and the crash's time of occurrence, season, weather condition, road category, road section (intersection or road segment), transport means, and recorded cause. The crashes were clustered by latent class analysis and by hierarchical clustering (termed system clustering here) separately, and the resulting clusters were analyzed against injury outcomes.
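The abstract does not state the software or estimation settings used for the latent class analysis. As an illustration only, the following minimal sketch fits a latent class model to integer-coded categorical variables with the EM algorithm. The function name `fit_lca`, the toy data, and the small smoothing constant are assumptions made for this example, not details from the study.

```python
import math
import random


def fit_lca(data, n_classes, n_levels, n_iter=50, seed=0):
    """Fit a latent class model to categorical data by EM (illustrative sketch).

    data     : list of rows, each a list of 0-based integer category codes
    n_levels : number of categories of each variable
    Returns (weights, probs, resp, trace), where resp holds the posterior
    class memberships from the last E-step and trace the log-likelihoods.
    """
    rng = random.Random(seed)
    n, m = len(data), len(n_levels)
    weights = [1.0 / n_classes] * n_classes
    # probs[k][j][c] = P(variable j = category c | class k), random start
    probs = []
    for _ in range(n_classes):
        per_class = []
        for levels in n_levels:
            raw = [rng.random() + 0.5 for _ in range(levels)]
            total = sum(raw)
            per_class.append([v / total for v in raw])
        probs.append(per_class)

    trace, resp = [], []
    for _ in range(n_iter):
        # E-step: posterior probability of each class for each row
        resp, loglik = [], 0.0
        for row in data:
            joint = []
            for k in range(n_classes):
                p = weights[k]
                for j in range(m):
                    p *= probs[k][j][row[j]]
                joint.append(p)
            total = sum(joint)
            loglik += math.log(total)
            resp.append([p / total for p in joint])
        trace.append(loglik)

        # M-step: re-estimate class weights and conditional probabilities
        for k in range(n_classes):
            weights[k] = sum(r[k] for r in resp) / n
            for j in range(m):
                # tiny smoothing (an assumption) keeps probabilities non-zero
                counts = [1e-6] * n_levels[j]
                for row, r in zip(data, resp):
                    counts[row[j]] += r[k]
                total = sum(counts)
                probs[k][j] = [c / total for c in counts]
    return weights, probs, resp, trace


if __name__ == "__main__":
    # Toy data: two clearly different crash profiles over three binary variables
    toy = [[0, 0, 0]] * 20 + [[1, 1, 1]] * 20
    weights, probs, resp, trace = fit_lca(toy, n_classes=2, n_levels=[2, 2, 2])
    labels = [max(range(2), key=lambda k: r[k]) for r in resp]
    print(weights, labels)
```

In practice the number of classes is usually chosen by comparing fit indices across candidate models; the abstract reports only the final six-class solution.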

    Results Latent class analysis identified more crash patterns than conventional hierarchical (system) clustering and outperformed it on this dataset. It classified the crashes into six occurrence patterns, named "young and middle-aged PRP+motor vehicle+highway group", "young and middle-aged PRP+passenger vehicle+general road group", "young and middle-aged PRP+night+motorcycle and passenger vehicle+drunk driving without license group", "middle- and old-aged PRP+electric vehicle and bicycle group", "middle- and old-aged PRP+morning and evening+pedestrian group", and "young and middle-aged PRP+late night+motor vehicle group". Injury outcomes differed significantly among the six categories (χ²=1 492.492, P < 0.05), and injury outcome was correlated with crash category (r=0.568, P < 0.05). The "middle- and old-aged PRP+morning and evening+pedestrian group" contributed most to health outcomes, followed by the "middle- and old-aged PRP+electric vehicle and bicycle group"; the "young and middle-aged PRP+passenger vehicle+general road group" contributed least. Comparing logistic regression models fitted to the original data with models fitted within each category showed that classification revealed additional injury risk factors, and that the same value of an independent variable contributed differently to injury outcomes in different RTC patterns.
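The abstract does not say which coefficient r denotes. One candidate is Pearson's contingency coefficient, C = sqrt(χ² / (χ² + n)), which reproduces the reported value if all 3 135 crashes entered the test. That sample size is an assumption, so the sketch below suggests this reading but does not confirm it. The helper names and the 2×2 toy table are hypothetical.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic and sample size for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, n


def contingency_coefficient(chi2, n):
    """Pearson's contingency coefficient C = sqrt(chi2 / (chi2 + n))."""
    return math.sqrt(chi2 / (chi2 + n))


if __name__ == "__main__":
    # Toy table (rows: cluster, columns: injured yes/no); not data from the study
    stat, n = chi_square([[10, 20], [20, 10]])
    print(stat, contingency_coefficient(stat, n))

    # Reported chi-square with the assumed n = 3135 gives about 0.568
    print(round(contingency_coefficient(1492.492, 3135), 3))
```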

    Conclusion For the specific dataset used in this study, latent class analysis recognized RTC patterns better than conventional hierarchical (system) clustering. Middle-aged and older pedestrians and electric bicycle riders illegally crossing motor vehicle lanes, and young and middle-aged drivers of motor vehicles at night, are the high-risk variable combinations for RTC-related injury in this area.

     
