Chapter title | Criteria to Extract High-Quality Protein Data Bank Subsets for Structure Users. |
---|---|
Chapter number | 7 |
Book title | Data Mining Techniques for the Life Sciences |
Published in | Methods in molecular biology, January 2016 |
DOI | 10.1007/978-1-4939-3572-7_7 |
Pubmed ID | |
Book ISBNs | 978-1-4939-3570-3, 978-1-4939-3572-7 |
Authors |
Oliviero Carugo, Kristina Djinović-Carugo |
Editors |
Oliviero Carugo, Frank Eisenhaber |
Abstract | It is often necessary to build subsets of the Protein Data Bank to extract structural trends and average values. For this purpose it is mandatory that the subsets are non-redundant and of high quality. The first problem can be solved relatively easily at the sequence level or at the structural level. The second, on the contrary, needs special attention. It is not sufficient, in fact, to consider only the crystallographic resolution; other features must be taken into account: the absence of strings of residues from the electron density maps and from the files deposited in the Protein Data Bank; the B-factor values; the appropriate validation of the structural models; the quality of the electron density maps, which is not uniform; and the temperature of the diffraction experiments. More stringent criteria produce smaller subsets, which can be enlarged with more tolerant selection criteria. The incessant growth of the Protein Data Bank and especially of the number of high-resolution structures is allowing the use of more stringent selection criteria, with a consequent improvement of the quality of the subsets of the Protein Data Bank. |
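The trade-off described in the abstract, where stricter criteria give a smaller subset and looser ones a larger subset, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Entry` and `Criteria` types, the field names, the placeholder identifiers, and the cutoff values are hypothetical and are not taken from the chapter. Treating a lower data-collection temperature as preferable is likewise only an assumption. Redundancy removal, model validation, and electron density map quality are left out for brevity.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Entry:
    """Hypothetical per-structure summary; field names are illustrative."""
    pdb_id: str            # placeholder identifiers, not real PDB entries
    resolution: float      # crystallographic resolution, in angstroms
    missing_residues: int  # residues absent from the deposited model
    mean_b_factor: float   # average B-factor, in square angstroms
    temperature: float     # data-collection temperature, in kelvin


@dataclass(frozen=True)
class Criteria:
    """Illustrative cutoffs only; not values recommended by the chapter."""
    max_resolution: float
    max_missing_residues: int
    max_mean_b_factor: float
    max_temperature: float  # assumption: colder data collection is preferred


def select(entries: List[Entry], c: Criteria) -> List[Entry]:
    """Keep only the entries that satisfy every criterion at once."""
    return [
        e for e in entries
        if e.resolution <= c.max_resolution
        and e.missing_residues <= c.max_missing_residues
        and e.mean_b_factor <= c.max_mean_b_factor
        and e.temperature <= c.max_temperature
    ]


# Made-up records, used only to exercise the filter.
ENTRIES = [
    Entry("xxx1", 1.2, 0, 15.0, 100.0),
    Entry("xxx2", 1.8, 3, 28.0, 100.0),
    Entry("xxx3", 2.4, 12, 45.0, 293.0),
    Entry("xxx4", 1.5, 0, 22.0, 293.0),
]

STRINGENT = Criteria(1.5, 0, 25.0, 120.0)
TOLERANT = Criteria(2.0, 5, 35.0, 300.0)

stringent_ids = [e.pdb_id for e in select(ENTRIES, STRINGENT)]
tolerant_ids = [e.pdb_id for e in select(ENTRIES, TOLERANT)]
print(len(stringent_ids), len(tolerant_ids))
```

With these made-up records, the stringent cutoffs keep one entry and the tolerant cutoffs keep three, which mirrors the relationship between stringency and subset size stated in the abstract.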
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
Unknown | 10 | 100% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Student > Bachelor | 2 | 20% |
Lecturer | 1 | 10% |
Student > Ph. D. Student | 1 | 10% |
Researcher | 1 | 10% |
Student > Postgraduate | 1 | 10% |
Other | 0 | 0% |
Unknown | 4 | 40% |
Readers by discipline | Count | As % |
---|---|---|
Biochemistry, Genetics and Molecular Biology | 2 | 20% |
Agricultural and Biological Sciences | 1 | 10% |
Computer Science | 1 | 10% |
Medicine and Dentistry | 1 | 10% |
Chemistry | 1 | 10% |
Other | 0 | 0% |
Unknown | 4 | 40% |