Document Type


Publication Date



Suicidal and self-injurious incidents in correctional settings deplete institutional and healthcare resources and create disorder and stress for staff and other inmates. Traditional statistical analyses provide some guidance, but they can only be applied to structured data, which are often difficult to collect, and their recommendations are often expensive to act upon. This study aims to extract information from medical and mental health progress notes using AI algorithms in order to make actionable predictions of suicidal and self-injurious events, improve the efficiency of triage for health care services, and prevent such events at California's Orange County Jails. The results showed that the notes data contain more information about suicidal or self-injurious behaviors than the structured data available in the EHR database at the Orange County Jails. Using the notes data alone (under-sampled to 50%) in a Transformer Encoder model produced an AUC-ROC of 0.862, a Sensitivity of 0.816, and a Specificity of 0.738. Incorporating the information extracted from the notes data into traditional Machine Learning models as a feature alongside structured data (under-sampled to 50%) yielded higher Sensitivity (AUC-ROC: 0.77, Sensitivity: 0.89, Specificity: 0.65). Under-sampling also proved to be an effective approach to mitigating the impact of the extremely imbalanced classes.
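As a rough illustration of two ideas the abstract relies on, the sketch below shows random under-sampling of the majority class and the computation of Sensitivity and Specificity from binary predictions. It is not the authors' pipeline. It assumes that "under-sampled to 50%" means the positive class makes up half of the resulting sample, which the abstract does not spell out, and the function names `undersample` and `sensitivity_specificity` are hypothetical.

```python
import random


def undersample(labels, positive_share=0.5, seed=0):
    """Keep every positive (label 1) and a random subset of negatives (label 0)
    so that positives make up about `positive_share` of the returned indices.

    Assumption: this is one common reading of "under-sampled to 50%".
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    # Number of negatives needed to reach the target positive share.
    n_neg = min(len(neg), round(len(pos) * (1 - positive_share) / positive_share))
    return sorted(pos + rng.sample(neg, n_neg))


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); Specificity = TN / (TN + FP).

    Both classes must be present in y_true, otherwise a division by zero occurs.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data: 5 positive records among 100, i.e. heavily imbalanced classes.
labels = [1] * 5 + [0] * 95
kept = undersample(labels)
balanced = [labels[i] for i in kept]
```

In practice the under-sampling would be applied to the training split only, so that Sensitivity and Specificity are still measured on data with the original class balance.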


This article was originally published in Journal of Psychiatric Research, volume 160, in 2023.

Peer Reviewed



The authors

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.


