Journal of Loss Prevention in the Process Industries, Vol. 57, 47-54, 2019
Narrative texts-based anomaly detection using accident report documents: The case of chemical process safety
To detect anomalous accident conditions, previous studies have usually relied on numeric data such as physical signals and conditions. However, the invaluable text information contained in accident report documents, such as accident descriptions and situations, has yet to be used to analyze anomalous conditions. This study therefore proposes a text mining-based local outlier factor (LOF) approach to detecting anomalous conditions from accident report documents, focusing on their text information. Here, anomalous conditions are defined as previously unexperienced accidents that occur under unusual conditions. The unusual conditions are identified in terms of qualitative variables in the accident narrative texts, such as locations, processes, and work types. Text mining is applied to systematically investigate these unusual contexts through the text of accident report documents, and the LOF algorithm is used to identify anomalous accidents in terms of local density. LOF is an established density-based anomaly detection algorithm that identifies outliers by comparing the local density of each data point with that of its neighbors. As a result, four major types of anomalous accidents in chemical processes are derived: filling-related, detection-related, ventilation-related, and waste-related accidents. In addition, risk keywords for the anomalous accidents of each type are extracted and compared with the keywords of normal accidents to understand the anomalous conditions in detail. By extracting and prioritizing anomalous conditions from text information rather than from numeric values, the proposed approach enables safety managers to monitor natural language-based risk factors and the causes of infrequent, anomalous, and critical accidents.
Keywords: Narrative texts; Anomaly detection; Accident documents; Local outlier factor (LOF); Text mining; Process safety
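The general idea in the abstract, vectorizing accident narratives and then scoring each one by LOF, can be illustrated with a minimal sketch. This is not the authors' implementation: the six narratives are hypothetical toy examples, and the whitespace tokenizer, smoothed TF-IDF weighting, Euclidean distance, neighborhood size `k = 2`, and the helper names `tfidf_vectors` and `lof_scores` are all assumptions made for illustration. The LOF computation follows the standard definition (k-distance, reachability distance, local reachability density), simplified to use exactly `k` neighbors per point.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Convert whitespace-tokenized documents into L2-normalized TF-IDF vectors."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed IDF: an assumed weighting, not taken from the paper.
    idf = {tok: math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1 for tok in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vec = [counts[tok] * idf[tok] for tok in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vectors


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def lof_scores(points, k):
    """Compute a local outlier factor per point (simplified: exactly k neighbors)."""
    n = len(points)
    dist = [[euclidean(points[i], points[j]) for j in range(n)] for i in range(n)]

    neighbors, k_dist = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbors.append(order[:k])
        k_dist.append(dist[i][order[k - 1]])  # distance to the k-th nearest neighbor

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbors[i]]
        lrd.append(1.0 / (sum(reach) / k + 1e-10))  # epsilon guards duplicates

    # LOF: mean density of the neighbors relative to the point's own density.
    return [sum(lrd[j] for j in neighbors[i]) / (k * lrd[i]) for i in range(n)]


# Hypothetical narratives: five similar leak reports and one unusual report.
docs = [
    "solvent leak during pump transfer from storage tank",
    "solvent leak during valve transfer from storage tank",
    "solvent leak during hose transfer from storage tank",
    "solvent leak during pipe transfer from storage tank",
    "solvent leak during flange transfer from storage tank",
    "waste drum ignited near ventilation duct after cleaning",
]

scores = lof_scores(tfidf_vectors(docs), k=2)
for doc, score in zip(docs, scores):
    print(f"{score:.3f}  {doc}")
```

Scores close to 1 indicate a narrative whose local density matches that of its neighbors, while scores well above 1 flag candidate anomalies. In this toy set, the last narrative shares no vocabulary with the others and receives the highest score. A real corpus would need proper preprocessing (tokenization, stop-word removal) and a tuned `k`.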