Evaluation of database balancing techniques for road accident severity classification employing Artificial Neural Network

Authors

DOI:

https://doi.org/10.14295/transportes.v28i5.2271

Keywords:

Imbalanced data. Accident severity. Classification. Artificial Neural Networks.

Abstract

An inherent feature of road accident databases is the imbalance between the number of observations of accidents with fatal or non-fatal injury victims and the number of observations of accidents without victims. This particularity has led to the adoption of balancing techniques, which can resample classes and attributes and thereby help prevent overfitting in classification problems. This study investigates the influence of different balancing methods (undersampling, oversampling, and the Synthetic Minority Over-sampling Technique, SMOTE) on the classification of road accident severity using an Artificial Neural Network. The results indicate that all methods were able to effectively adjust the balance between the minority and majority classes. Balancing improves classifier performance, as shown by a better fit of the model to the data and by gains in the quality and accuracy of the classification, especially with sampling techniques such as SMOTE.
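The three balancing strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, which this page does not describe. It uses only the Python standard library, a synthetic two-feature dataset, and hypothetical class labels (`no_victims`, `victims`). The function `smote_like` is a simplified SMOTE-style interpolation written for illustration, not a reference implementation of SMOTE, and no neural network is trained here.

```python
import random
from collections import Counter


def split_by_class(X, y):
    """Group feature rows by their class label."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return groups


def flatten(groups):
    """Turn {label: rows} back into parallel X, y lists."""
    X_out, y_out = [], []
    for label, rows in groups.items():
        X_out.extend(rows)
        y_out.extend([label] * len(rows))
    return X_out, y_out


def random_undersample(X, y, rng):
    """Randomly keep only as many rows per class as the smallest class has."""
    groups = split_by_class(X, y)
    target = min(len(rows) for rows in groups.values())
    return flatten({lab: rng.sample(rows, target) for lab, rows in groups.items()})


def random_oversample(X, y, rng):
    """Duplicate randomly chosen rows until every class matches the largest one."""
    groups = split_by_class(X, y)
    target = max(len(rows) for rows in groups.values())
    return flatten({
        lab: rows + [rng.choice(rows) for _ in range(target - len(rows))]
        for lab, rows in groups.items()
    })


def smote_like(X, y, rng, k=3):
    """SMOTE-style sketch: create synthetic rows by interpolating between a
    row and one of its k nearest neighbours from the same class."""
    groups = split_by_class(X, y)
    target = max(len(rows) for rows in groups.values())
    balanced = {}
    for lab, rows in groups.items():
        synthetic = []
        while len(rows) + len(synthetic) < target:
            i = rng.randrange(len(rows))
            base = rows[i]
            others = [r for j, r in enumerate(rows) if j != i]
            if not others:  # single-row class: fall back to duplication
                synthetic.append(list(base))
                continue
            # Sort same-class rows by squared Euclidean distance to the base row.
            others.sort(key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)))
            mate = rng.choice(others[:k])
            gap = rng.random()  # position along the segment base -> mate
            synthetic.append([a + gap * (b - a) for a, b in zip(base, mate)])
        balanced[lab] = rows + synthetic
    return flatten(balanced)


# Synthetic, illustrative data: 90 majority rows and 10 minority rows.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(90)]
X += [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(10)]
y = ["no_victims"] * 90 + ["victims"] * 10

X_under, y_under = random_undersample(X, y, rng)
X_over, y_over = random_oversample(X, y, rng)
X_smote, y_smote = smote_like(X, y, rng)

print("original     ", Counter(y))
print("undersampled ", Counter(y_under))
print("oversampled  ", Counter(y_over))
print("SMOTE-style  ", Counter(y_smote))
```

Undersampling discards majority-class observations, oversampling repeats minority-class observations, and the SMOTE-style variant adds new points that lie between existing minority observations. In practice, resampling is normally applied to the training split only, so that the evaluation data keeps the original class distribution.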

Author Biographies

Maria Lígia Chuerubim, Faculty of Civil Engineering, Federal University of Uberlândia, Brazil.

She holds a degree in Cartographic Engineering from Universidade Estadual Paulista Júlio de Mesquita Filho (2006) and a Master's degree from the Postgraduate Program in Cartographic Sciences at UNESP (2007-2009). She is a Professor at the Faculty of Civil Engineering (FECIV) at the Federal University of Uberlândia (UFU), where she works in the field of Geosciences with emphasis on Geodesy and Geoprocessing applied to Urban Planning, Environmental and Transport Technology. Currently, she is a PhD student at the São Carlos Engineering School (EESC) of the University of São Paulo (USP), developing her thesis on Geoprocessing techniques and road safety analysis.

Leonardo N. Ferreira, Associate Laboratory of Computation and Applied Mathematics, National Institute for Space Research (INPE), Brazil.

I am a postdoctoral researcher at the National Institute for Space Research (INPE) in Brazil and a visiting scholar at the Humboldt-Universität zu Berlin and the Potsdam Institute for Climate Impact Research (PIK). I received MSc and Ph.D. degrees in computer science and computational mathematics from the University of São Paulo (USP). During my Ph.D., I worked as a visiting predoctoral fellow at Northwestern University (USA). My current research involves the analysis of global and regional wildfires using pattern recognition tools such as machine learning and complex network theory. This research is part of an international collaborative project entitled Dynamical Phenomena in Complex Networks.

Alan D.B. Valejo, Institute of Mathematical and Computer Sciences, School of Engineering of São Carlos, University of São Paulo, Brazil.

He is a PhD student at the Institute of Mathematical and Computer Sciences of the University of São Paulo (ICMC-USP), São Carlos. He holds a degree in computer science with an emphasis on Web Applications Engineering from ICMC-USP. He obtained a Master's degree in Computer Science and Computational Mathematics at ICMC-USP in 2014. His research is in the field of Artificial Intelligence and Machine Learning, and he is interested in issues related to the analysis and mining of complex networks.

Bárbara Stolte Bezerra, Faculty of Civil Engineering, São Paulo State University (UNESP), Brazil.

She holds a degree in Civil Engineering from Universidade de São Paulo (1995), a Master's degree in Architecture and Urbanism from Universidade Federal da Bahia (2001), and a PhD in Transport Engineering from Universidade de São Paulo (2007). Her research areas include traffic engineering, road safety analysis, and modeling.

Giuliano Sant'Anna Marotta, Faculty of Civil Engineering, Federal University of Uberlândia, Brazil.

He holds a Bachelor's degree in Surveying Engineering from Universidade Federal de Viçosa (2005), a Master's degree in Civil Engineering (Spatial Information) from Universidade Federal de Viçosa (2008), and a doctorate in Geosciences (Geophysics) from Universidade de Brasília (2013). He has experience in Geosciences, focusing on Geodesy and Geophysics.

Irineu da Silva, Department of Transport Engineering, School of Engineering of São Carlos, University of São Paulo, Brazil.

He holds a degree in Civil Engineering from the University of São Paulo (1980), a Master's degree in Transport Engineering from the University of São Paulo (1986), and a PhD in Geosciences from the Ecole Polytechnique Fédérale de Lausanne (1990). He is currently an Associate Professor at the University of São Paulo. He has experience in the area of Geosciences, with emphasis on Quality Control in Geodetic Measurements and Structural Monitoring.

Published

2020-12-15

How to Cite

Chuerubim, M. L., Ferreira, L. N., Valejo, A. D., Bezerra, B. S., Marotta, G. S., & da Silva, I. (2020). Evaluation of database balancing techniques for road accident severity classification employing Artificial Neural Network. TRANSPORTES, 28(5), 252–266. https://doi.org/10.14295/transportes.v28i5.2271

Section

Articles