Efficient, accurate drug-target interaction prediction built on dual-representation fusion and interpretable modeling, to accelerate drug discovery
Drug-Target Interaction (DTI) prediction is a core step in drug discovery. Computational methods estimate how likely a small-molecule drug is to bind a biological target (such as a protein), which can significantly reduce experimental costs and shorten the research and development cycle.
The ProstEDTI system combines drug and protein embeddings with ensemble machine learning to build a prediction model that is both accurate and interpretable. Beyond the interaction prediction itself, the system shows which features drive each prediction, offering clues to the underlying biological mechanisms and a reference for drug design and optimization.
Four core technical modules work together to improve both prediction accuracy and interpretability
Integrates 64-dimensional molecular embeddings from Mol2vec and 1024-dimensional protein embeddings from ProstT5, comprehensively capturing drug molecular structure features and target sequence information to lay the foundation for accurate prediction.
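As a rough illustration of the fusion step, the sketch below joins one drug embedding and one protein embedding into a single feature vector. It assumes fusion is plain concatenation, which the description above does not confirm, and it uses random vectors as stand-ins for real Mol2vec and ProstT5 outputs. The names `fuse_pair`, `DRUG_DIM`, and `PROTEIN_DIM` are hypothetical.

```python
import numpy as np

DRUG_DIM = 64       # Mol2vec embedding size stated above
PROTEIN_DIM = 1024  # ProstT5 embedding size stated above


def fuse_pair(drug_vec: np.ndarray, protein_vec: np.ndarray) -> np.ndarray:
    """Concatenate one drug embedding with one protein embedding.

    Assumption: fusion is simple concatenation; the real system may differ.
    """
    if drug_vec.shape != (DRUG_DIM,):
        raise ValueError(f"expected drug shape ({DRUG_DIM},), got {drug_vec.shape}")
    if protein_vec.shape != (PROTEIN_DIM,):
        raise ValueError(f"expected protein shape ({PROTEIN_DIM},), got {protein_vec.shape}")
    return np.concatenate([drug_vec, protein_vec])


# Random stand-ins for real embeddings (illustration only).
rng = np.random.default_rng(0)
drug = rng.normal(size=DRUG_DIM)
protein = rng.normal(size=PROTEIN_DIM)

pair = fuse_pair(drug, protein)
print(pair.shape)  # (1088,)
```

Under this assumption each drug-target pair becomes a 1088-dimensional row, which is the input the later sampling, selection, and ensemble steps would operate on.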
Removes noisy samples with Edited Nearest Neighbours (ENN) sampling, which drops samples whose labels disagree with their neighborhood, and keeps the feature dimensions with the highest SHAP contributions, improving model generalization and computational efficiency.
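The two cleaning steps can be sketched in a few lines of NumPy. This is a minimal from-scratch version under stated assumptions: binary 0/1 labels, Wilson's majority-vote rule applied to every class, and SHAP values already computed elsewhere as an array of shape (samples, features). The function names `enn_keep_mask` and `top_features_by_shap` are hypothetical, and the actual system may use a library implementation with different settings.

```python
import numpy as np


def enn_keep_mask(X: np.ndarray, y: np.ndarray, k: int = 3) -> np.ndarray:
    """Boolean mask of samples whose label matches the majority of their
    k nearest neighbors (binary 0/1 labels, odd k to avoid ties)."""
    diffs = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diffs, diffs)  # squared Euclidean, O(n^2) memory
    np.fill_diagonal(dist, np.inf)                 # a sample is not its own neighbor
    neighbors = np.argsort(dist, axis=1)[:, :k]
    majority = (y[neighbors].mean(axis=1) > 0.5).astype(y.dtype)
    return majority == y


def top_features_by_shap(shap_values: np.ndarray, n_keep: int) -> np.ndarray:
    """Indices of the n_keep features with the largest mean |SHAP| value."""
    importance = np.abs(shap_values).mean(axis=0)
    return np.sort(np.argsort(importance)[::-1][:n_keep])


# Toy data: two clean clusters plus one mislabeled point inside cluster 0.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [10, 10], [10, 11], [11, 10], [11, 11],
              [0.5, 0.5]], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1])
keep = enn_keep_mask(X, y)            # only the last sample is flagged as noise
X_clean, y_clean = X[keep], y[keep]

# Made-up SHAP values for 2 samples x 3 features (illustration only).
shap_values = np.array([[0.1, -2.0, 0.0],
                        [0.2, 1.0, 0.05]])
selected = top_features_by_shap(shap_values, n_keep=2)
```

The pairwise distance matrix makes this version suitable only for small datasets; a neighbor index would be needed at scale.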
Builds an XGBoost-LightGBM ensemble model that captures non-linear feature interactions while remaining computationally efficient, balancing prediction accuracy and running speed.
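The description does not say how the two models are combined. One common option is soft voting, sketched here as a weighted average of the predicted interaction probabilities. The equal weighting, the 0.5 decision threshold, and the probability values are all assumptions made for illustration, and `soft_vote` is a hypothetical helper.

```python
import numpy as np


def soft_vote(prob_a, prob_b, weight_a: float = 0.5) -> np.ndarray:
    """Weighted average of two models' positive-class probabilities."""
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    prob_a = np.asarray(prob_a, dtype=float)
    prob_b = np.asarray(prob_b, dtype=float)
    if prob_a.shape != prob_b.shape:
        raise ValueError("probability arrays must have the same shape")
    return weight_a * prob_a + (1.0 - weight_a) * prob_b


# Made-up probabilities standing in for the two models' outputs.
xgb_prob = np.array([0.9, 0.4, 0.2])
lgbm_prob = np.array([0.7, 0.7, 0.1])

fused = soft_vote(xgb_prob, lgbm_prob)
labels = (fused >= 0.5).astype(int)  # assumed decision threshold of 0.5
```

In practice the two input arrays would come from the positive-class column of each fitted classifier's `predict_proba` output. Stacking with a meta-learner is another plausible design that this sketch does not cover.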
Integrates LIME to visualize explanations that range from global feature importance to the local decision logic behind individual predictions, so that each result comes with supporting evidence.
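To show the idea behind a local explanation, the sketch below implements a simplified LIME-style surrogate from scratch: perturb the instance, query the model, weight the perturbed points by proximity, and fit a weighted linear model whose coefficients act as local feature contributions. It is not the `lime` library and omits details such as feature discretization and regularization. The function `lime_style_weights`, its parameters, and the toy `black_box` model are hypothetical.

```python
import numpy as np


def lime_style_weights(predict_fn, x, n_samples=500, scale=0.5,
                       kernel_width=1.0, seed=0):
    """Fit a proximity-weighted linear surrogate around x and return
    one coefficient per feature (the local explanation)."""
    rng = np.random.default_rng(seed)
    # 1. Sample perturbed points around the instance.
    Z = x + rng.normal(scale=scale, size=(n_samples, x.size))
    # 2. Query the black-box model on the perturbed points.
    preds = predict_fn(Z)
    # 3. Weight each point by its closeness to x (exponential kernel).
    d2 = ((Z - x) ** 2).sum(axis=1)
    w = np.exp(-d2 / kernel_width ** 2)
    # 4. Weighted least squares with an intercept column.
    A = np.hstack([np.ones((n_samples, 1)), Z - x])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], preds * sw, rcond=None)
    return coef[1:]  # drop the intercept


def black_box(Z):
    """Toy stand-in for a trained model: linear in features 0 and 1."""
    return 3.0 * Z[:, 0] - 2.0 * Z[:, 1]


instance = np.array([1.0, 2.0, 3.0])
weights = lime_style_weights(black_box, instance)
```

Because the toy model is exactly linear, the surrogate recovers its coefficients and assigns roughly zero weight to the unused third feature. For a non-linear model the coefficients would only describe behavior near the chosen instance.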