Protein-Ligand Empirical Interaction Components for Virtual Screening

Yuna Yan, Weijun Wang, Zhaoxi Sun, John Zhang, Changge Ji

Research output: Contribution to journal (Article)

Abstract

A major shortcoming of empirical scoring functions is that they often fail to predict binding affinity accurately. Removing false positives from docking results is one of the most challenging tasks in structure-based virtual screening. Postdocking filters that make use of experimental structure and activity information may help to solve this problem. We describe a new method based on detailed protein-ligand interaction decomposition and machine learning. Protein-ligand empirical interaction components (PLEIC) are used as descriptors to train a support vector machine classification model (PLEIC-SVM) that discriminates false positives from true positives. Experimentally derived activity information is used for model training. An extensive benchmark study on 36 diverse data sets from the DUD-E database was performed to evaluate the method. The results show that the new method performs much better than standard empirical scoring functions in structure-based virtual screening. The trained PLEIC-SVM model captures important interaction patterns between the ligand and protein residues for a specific target, which helps to discard false positives in postdocking filtering.
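
The postdocking-filter idea in the abstract can be illustrated with a minimal sketch. It is not the authors' PLEIC-SVM implementation: the abstract does not specify the descriptor layout, the kernel, or the SVM software, so the sketch assumes a linear soft-margin SVM trained by batch subgradient descent, and every number, residue label, and function name in it (ACTIVES, DECOYS, train_linear_svm, keep_pose) is hypothetical. Each docked pose is represented by a short vector of per-residue interaction energies, and the model learns which interaction pattern separates known actives from decoys for one target.

```python
# Hypothetical per-residue interaction energies (kcal/mol) for docked poses.
# Columns (assumed for illustration): H-bond with residue A,
# van der Waals with residue B, electrostatics with residue C.
ACTIVES = [[-3.0, -1.0, -0.5], [-2.8, -1.4, -0.2],
           [-3.2, -0.8, -0.6], [-2.5, -1.2, -0.1]]
DECOYS = [[-0.5, -1.1, -0.4], [-0.2, -1.5, -0.3],
          [-0.8, -0.9, -0.6], [-0.4, -1.3, -0.2]]


def decision(w, b, x):
    """Signed score of the linear model; positive means 'predicted active'."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.01, lr=0.01, n_iter=20000):
    """Batch subgradient descent on the L2-regularised hinge loss.

    Returns the (w, b) pair with the lowest objective seen during training.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    best = (float("inf"), list(w), b)
    for _ in range(n_iter):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        objective = 0.5 * lam * sum(wj * wj for wj in w)
        for xi, yi in zip(X, y):
            margin = yi * decision(w, b, xi)
            if margin < 1.0:  # pose violates the margin: hinge term is active
                objective += (1.0 - margin) / n
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        if objective < best[0]:
            best = (objective, list(w), b)
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return best[1], best[2]


# Labels: +1 for experimentally confirmed actives, -1 for decoys.
X = ACTIVES + DECOYS
y = [1] * len(ACTIVES) + [-1] * len(DECOYS)
w, b = train_linear_svm(X, y)


def keep_pose(x):
    """Postdocking filter: keep a pose only if the model scores it as active."""
    return decision(w, b, x) > 0
```

In this toy data the actives differ from the decoys mainly in the first column, so the trained weights put most of their magnitude on that residue. A production model would use many more residues and interaction types, a held-out test set, and possibly a nonlinear kernel.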

Original language: English (US)
Pages (from-to): 1793-1806
Number of pages: 14
Journal: Journal of Chemical Information and Modeling
Volume: 57
Issue number: 8
DOI: 10.1021/acs.jcim.7b00017
State: Published - Aug 28 2017

Fingerprint

Screening
Ligands
Proteins
interaction
Learning systems
interaction pattern
learning
Support vector machines
Decomposition
performance

ASJC Scopus subject areas

  • Chemistry(all)
  • Chemical Engineering(all)
  • Computer Science Applications
  • Library and Information Sciences

Cite this

Protein-Ligand Empirical Interaction Components for Virtual Screening. / Yan, Yuna; Wang, Weijun; Sun, Zhaoxi; Zhang, John; Ji, Changge.

In: Journal of Chemical Information and Modeling, Vol. 57, No. 8, 28.08.2017, p. 1793-1806.

@article{60407e4b8f414d608dbbd5eeb5c4b3ac,
title = "Protein-Ligand Empirical Interaction Components for Virtual Screening",
abstract = "A major shortcoming of empirical scoring functions is that they often fail to predict binding affinity properly. Removing false positives of docking results is one of the most challenging works in structure-based virtual screening. Postdocking filters, making use of all kinds of experimental structure and activity information, may help in solving the issue. We describe a new method based on detailed protein-ligand interaction decomposition and machine learning. Protein-ligand empirical interaction components (PLEIC) are used as descriptors for support vector machine learning to develop a classification model (PLEIC-SVM) to discriminate false positives from true positives. Experimentally derived activity information is used for model training. An extensive benchmark study on 36 diverse data sets from the DUD-E database has been performed to evaluate the performance of the new method. The results show that the new method performs much better than standard empirical scoring functions in structure-based virtual screening. The trained PLEIC-SVM model is able to capture important interaction patterns between ligand and protein residues for one specific target, which is helpful in discarding false positives in postdocking filtering.",
author = "Yuna Yan and Weijun Wang and Zhaoxi Sun and John Zhang and Changge Ji",
year = "2017",
month = "8",
day = "28",
doi = "10.1021/acs.jcim.7b00017",
language = "English (US)",
volume = "57",
pages = "1793--1806",
journal = "Journal of Chemical Information and Modeling",
issn = "1549-9596",
publisher = "American Chemical Society",
number = "8",

}

TY - JOUR

T1 - Protein-Ligand Empirical Interaction Components for Virtual Screening

AU - Yan, Yuna

AU - Wang, Weijun

AU - Sun, Zhaoxi

AU - Zhang, John

AU - Ji, Changge

PY - 2017/8/28

Y1 - 2017/8/28

AB - A major shortcoming of empirical scoring functions is that they often fail to predict binding affinity properly. Removing false positives of docking results is one of the most challenging works in structure-based virtual screening. Postdocking filters, making use of all kinds of experimental structure and activity information, may help in solving the issue. We describe a new method based on detailed protein-ligand interaction decomposition and machine learning. Protein-ligand empirical interaction components (PLEIC) are used as descriptors for support vector machine learning to develop a classification model (PLEIC-SVM) to discriminate false positives from true positives. Experimentally derived activity information is used for model training. An extensive benchmark study on 36 diverse data sets from the DUD-E database has been performed to evaluate the performance of the new method. The results show that the new method performs much better than standard empirical scoring functions in structure-based virtual screening. The trained PLEIC-SVM model is able to capture important interaction patterns between ligand and protein residues for one specific target, which is helpful in discarding false positives in postdocking filtering.

UR - http://www.scopus.com/inward/record.url?scp=85028543259&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85028543259&partnerID=8YFLogxK

U2 - 10.1021/acs.jcim.7b00017

DO - 10.1021/acs.jcim.7b00017

M3 - Article

VL - 57

SP - 1793

EP - 1806

JO - Journal of Chemical Information and Modeling

JF - Journal of Chemical Information and Modeling

SN - 1549-9596

IS - 8

ER -