Accurately predicting ligand-binding residues in proteins is crucial for understanding molecular interaction mechanisms, and it helps identify potential drug targets and narrow down drug candidates during drug discovery. Recently, machine learning-based models have been applied to protein-ligand binding residue prediction, saving time and cost compared with traditional experimental approaches. These prediction methods can be roughly categorized as sequence-based or structure-based. Although structure-based methods have shown better performance than sequence-based methods, they rely on protein structure data, which makes their application to large-scale protein datasets both time- and resource-intensive. Because of this limitation, sequence-based methods have recently gained attention. However, most sequence-based methods exclude ligand information, even though the ligand-binding residues of a protein are determined by its interactions with specific ligands. Therefore, to predict protein-ligand binding residues more accurately, we propose a novel prediction model that constructs enhanced residue embeddings by combining a protein language model with a 1D-CNN and a BiLSTM, and integrates these with atom-level ligand embeddings. The proposed model outperformed existing methods across all evaluation metrics. Furthermore, visualization experiments showed close agreement between the predicted and actual binding residues, which supports the model's reliability in identifying ligand-binding residues.
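As a rough illustration of the kind of architecture described above, the following PyTorch sketch passes precomputed protein language model embeddings through a 1D-CNN and a BiLSTM, then fuses the result with atom-level ligand embeddings to score each residue. It is not the authors' implementation: the class name `BindingResidueSketch`, the layer sizes, the mean pooling over ligand atoms, and the concatenation-based fusion are all assumptions made for this example, and both embedding inputs are assumed to be computed elsewhere.

```python
import torch
import torch.nn as nn


class BindingResidueSketch(nn.Module):
    """Hypothetical sketch; sizes and the fusion scheme are assumptions."""

    def __init__(self, plm_dim=1024, atom_dim=64, hidden=128, kernel_size=5):
        super().__init__()
        # 1D-CNN over the residue axis captures local sequence context.
        # An odd kernel_size with this padding keeps the sequence length.
        self.conv = nn.Conv1d(plm_dim, hidden, kernel_size,
                              padding=kernel_size // 2)
        # BiLSTM captures longer-range context in both directions.
        self.bilstm = nn.LSTM(hidden, hidden, batch_first=True,
                              bidirectional=True)
        # Project atom-level ligand features to the residue feature width.
        self.ligand_proj = nn.Linear(atom_dim, 2 * hidden)
        # Per-residue classifier on [residue features ; pooled ligand features].
        self.classifier = nn.Sequential(
            nn.Linear(4 * hidden, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, residue_emb, atom_emb):
        # residue_emb: (batch, n_residues, plm_dim), from a protein language model
        # atom_emb:    (batch, n_atoms, atom_dim), atom-level ligand embeddings
        x = self.conv(residue_emb.transpose(1, 2)).relu().transpose(1, 2)
        x, _ = self.bilstm(x)                          # (batch, n_residues, 2*hidden)
        lig = self.ligand_proj(atom_emb).mean(dim=1)   # mean-pool over atoms (assumed)
        lig = lig.unsqueeze(1).expand(-1, x.size(1), -1)
        logits = self.classifier(torch.cat([x, lig], dim=-1)).squeeze(-1)
        return torch.sigmoid(logits)                   # per-residue binding probability


# Usage with random stand-ins for the two embedding inputs.
model = BindingResidueSketch()
probs = model(torch.randn(2, 50, 1024), torch.randn(2, 12, 64))
```

Here `probs` has one value per residue for each protein in the batch. Concatenating a pooled ligand vector to every residue is only the simplest way to condition on the ligand; other fusion schemes, such as cross-attention between residues and atoms, would fit the same description.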