Document Type

Article

Publication Date

5-6-2026

Abstract

Computational prediction of allosteric binding sites in protein structures remains a persistent challenge, as these regulatory pockets evade detection by both sequence-based and structure-based algorithms. Both the computational and the physical origins of this predictive asymmetry remain insufficiently understood. In this study, we systematically examine the determinants of binding site predictability using a dual framework that integrates a fine-tuned protein language model and the structure-based method P2Rank as complementary tools probing a diverse data set of 453 human kinases, together with a physics-based interpretability layer derived from energy landscape frustration analysis. Both predictors exhibit a sharp and reproducible dichotomy on protein kinases, in which orthosteric ATP-binding sites can be identified with high precision, whereas allosteric sites are detected with substantially lower confidence across kinase structures and distinct conformational states. To decode this divergence, we deploy an energy landscape-based explainable AI approach that integrates local frustration analysis as an independent physical interpretability layer, mapping predictive behavior to the underlying energetic organization of protein structures. This analysis reveals that predictive success is governed by the local energetic embedding of binding sites within the protein energy landscape. Orthosteric pockets are located in minimally frustrated basins that generate strong evolutionary and structural signatures, whereas allosteric pockets occupy predominantly neutrally frustrated zones associated with conformational plasticity and reduced evolutionary constraint. By integrating the prediction results with energy landscape analysis, our framework converts predictive performance into physically interpretable descriptors of binding site organization in protein kinases.
These results establish energy landscape frustration as a potentially important determinant of algorithmic visibility, and as an interpretability layer that provides a feasible strategy for diagnosing the limits of current prediction methods.
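
As a rough illustration of how a binding pocket might be summarized by its frustration classes, the following minimal sketch computes the fraction of minimally, neutrally, and highly frustrated contacts from a list of per-contact frustration indices. It is not the authors' pipeline: the function names and the example values are hypothetical, and the cutoffs (above 0.78 for minimally frustrated, below -1 for highly frustrated) are the thresholds conventionally used in the local frustration literature, assumed here rather than taken from the abstract.

```python
from collections import Counter

# Assumed cutoffs, following the common convention in local frustration
# analysis; the abstract itself does not state which thresholds were used.
MINIMAL_CUTOFF = 0.78   # index above this: minimally frustrated
HIGH_CUTOFF = -1.0      # index below this: highly frustrated

LABELS = ("minimal", "neutral", "high")


def classify_contact(index: float) -> str:
    """Assign one frustration class to a single contact's frustration index."""
    if index > MINIMAL_CUTOFF:
        return "minimal"
    if index < HIGH_CUTOFF:
        return "high"
    return "neutral"


def frustration_profile(indices: list[float]) -> dict[str, float]:
    """Return the fraction of contacts in each frustration class."""
    if not indices:
        raise ValueError("at least one frustration index is required")
    counts = Counter(classify_contact(value) for value in indices)
    total = len(indices)
    return {label: counts.get(label, 0) / total for label in LABELS}


# Hypothetical per-contact indices for two pockets (illustrative values only).
orthosteric_pocket = [1.2, 0.9, 1.5, 0.3, 0.85]
allosteric_pocket = [0.1, -0.4, 0.5, 0.9, -1.3]

orthosteric_profile = frustration_profile(orthosteric_pocket)
allosteric_profile = frustration_profile(allosteric_pocket)
```

Under these assumptions, a pocket dominated by the "minimal" class would correspond to the minimally frustrated basins described for orthosteric sites, while a pocket dominated by the "neutral" class would correspond to the neutrally frustrated zones described for allosteric sites.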

Comments

This article was originally published in Journal, volume number, issue number, in year. https://doi.org/

ct6c00427_si_002.pdf (1996 kB)
Supporting Information

Peer Reviewed

1

Copyright

The authors

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.
