Title | Community benchmarks for virtual screening |
---|---|
Published in | Perspectives in Drug Discovery and Design, February 2008 |
DOI | 10.1007/s10822-008-9189-4 |
Pubmed ID | |
Authors | John J. Irwin |
Abstract | Ligand enrichment among top-ranking hits is a key metric of virtual screening. To avoid bias, decoys should resemble ligands physically, so that enrichment is not attributable to simple differences of gross features. We therefore created a directory of useful decoys (DUD) by selecting decoys that resembled annotated ligands physically but not topologically to benchmark docking performance. DUD has 2950 annotated ligands and 95,316 property-matched decoys for 40 targets. It is by far the largest and most comprehensive public data set for benchmarking virtual screening programs that I am aware of. This paper outlines several ways that DUD can be improved to provide better telemetry to investigators seeking to understand both the strengths and the weaknesses of current docking methods. I also highlight several pitfalls for the unwary: a risk of over-optimization, questions about chemical space, and the proper scope for using DUD. Careful attention to both the composition of benchmarks and how they are used is essential to avoid being misled by overfitting and bias. |
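
The abstract names ligand enrichment among top-ranking hits as the key metric. As a minimal sketch of one common way to quantify it, the enrichment factor compares the ligand rate in the top-ranked fraction of a screen with the ligand rate in the whole set. The function name, the `top_fraction` parameter, the toy scores, and the convention that a lower score ranks better are all assumptions made for illustration; none of them comes from the paper or from DUD itself.

```python
def enrichment_factor(scores, is_ligand, top_fraction=0.01):
    """Ligand rate in the top-ranked fraction divided by the overall ligand rate.

    Assumption: a lower score means a better rank (as with docking energies).
    """
    if len(scores) != len(is_ligand):
        raise ValueError("scores and is_ligand must have the same length")
    n_total = len(scores)
    n_ligands = sum(1 for flag in is_ligand if flag)
    if n_total == 0 or n_ligands == 0:
        raise ValueError("need at least one compound and at least one ligand")

    # Size of the top-ranked subset, never smaller than one compound.
    n_top = max(1, round(top_fraction * n_total))

    # Rank compounds from best (lowest) to worst (highest) score.
    ranked = sorted(zip(scores, is_ligand), key=lambda pair: pair[0])
    hits_in_top = sum(1 for _, flag in ranked[:n_top] if flag)

    return (hits_in_top / n_top) / (n_ligands / n_total)


# Toy data (hypothetical): 10 compounds, 2 of them annotated ligands.
toy_scores = [-9.1, -8.7, -8.2, -7.9, -7.5, -7.0, -6.8, -6.1, -5.5, -5.0]
toy_labels = [True, False, False, True, False, False, False, False, False, False]

ef_top20 = enrichment_factor(toy_scores, toy_labels, top_fraction=0.2)
ef_top50 = enrichment_factor(toy_scores, toy_labels, top_fraction=0.5)
print(ef_top20, ef_top50)
```

A value of 1 corresponds to random ranking. The abstract's point about property-matched decoys is that a value well above 1 is only informative when the decoys cannot be separated from the ligands by gross physical features alone.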
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
Germany | 4 | 2% |
Brazil | 4 | 2% |
Spain | 3 | 1% |
France | 2 | <1% |
United States | 2 | <1% |
China | 2 | <1% |
Czechia | 1 | <1% |
Singapore | 1 | <1% |
India | 1 | <1% |
Other | 4 | 2% |
Unknown | 196 | 89% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Researcher | 45 | 20% |
Student > Ph. D. Student | 40 | 18% |
Student > Master | 30 | 14% |
Student > Bachelor | 21 | 10% |
Professor > Associate Professor | 14 | 6% |
Other | 43 | 20% |
Unknown | 27 | 12% |
Readers by discipline | Count | As % |
---|---|---|
Chemistry | 82 | 37% |
Agricultural and Biological Sciences | 34 | 15% |
Computer Science | 18 | 8% |
Medicine and Dentistry | 15 | 7% |
Biochemistry, Genetics and Molecular Biology | 12 | 5% |
Other | 23 | 10% |
Unknown | 36 | 16% |