
An ensemble micro neural network approach for elucidating interactions between zinc finger proteins and their target DNA

Overview of attention for article published in BMC Genomics, December 2016

About this Attention Score

  • Good Attention Score compared to outputs of the same age (70th percentile)
  • Good Attention Score compared to outputs of the same age and source (66th percentile)

Mentioned by

1 X user
1 patent

Citations

9 Dimensions

Readers on

18 Mendeley
Title
An ensemble micro neural network approach for elucidating interactions between zinc finger proteins and their target DNA
Published in
BMC Genomics, December 2016
DOI 10.1186/s12864-016-3323-9
Authors

Shayoni Dutta, Spandan Madan, Harsh Parikh, Durai Sundar

Abstract

The ability to engineer zinc finger proteins that bind to a DNA sequence of choice is essential for targeted genome editing. Experimental techniques and molecular docking have been successful in predicting protein-DNA interactions; however, they are highly time- and resource-intensive. Here, we present a novel algorithm designed for high-throughput prediction of the optimal zinc finger protein for a 9 bp DNA sequence of choice. In accordance with the principles of information theory, a subset identified using K-means clustering was used as a representative of the space of all possible 9 bp DNA sequences. The modeling and simulation results obtained from this subset, assuming a synergistic mode of binding, were used to train an ensemble micro neural network. The synergistic mode of binding is the closest to the DNA-protein binding seen in nature and gives much higher-quality predictions, but the time and resources required increase exponentially as a trade-off. Our algorithm is inspired by the ensemble machine learning approach and incorporates the predictions made by 100 parallel neural networks, each with a different hidden-layer architecture designed to pick up different features from the training dataset, to predict optimal zinc finger proteins for any 9 bp target DNA. The model achieved an average sequence identity of 83% on the testing dataset. The BLAST e-values are well within the statistical confidence interval of E-05 for 100% of the testing samples. The geometric mean and median of the BLAST e-values were found to be 1.70E-12 and 7.00E-12, respectively. For final validation of the approach, we compared our predictions against optimal ZFPs reported in the literature for a set of experimentally studied DNA sequences.
The accuracy, measured as the average string identity between our predictions and the optimal zinc finger protein reported in the literature for a 9 bp DNA target, was found to be as high as 81% for DNA targets with the consensus sequence GCNGNNGCN. Moreover, the average string identity of our predictions was 71% for a catalogue of over 100 9 bp DNA targets for which the optimal zinc finger protein has been reported in the literature. Validation with experimental data shows that our tool is capable of domain adaptation and thus scales with high accuracy to datasets other than the training set. As synergistic binding comes closest to the ideal mode of binding, our algorithm predicts biologically relevant results that agree with the experimental data present in the literature. While disjointed attempts to approach this problem synergistically have been reported in the literature, no existing work covers the whole sample space. Our algorithm allows zinc finger proteins to be designed for DNA targets of the user's choice, opening up new frontiers in the field of targeted genome editing. This algorithm is also available as an easy-to-use web server, ZifNN, at http://web.iitd.ac.in/~sundar/ZifNN/ .
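The evaluation measures named in the abstract (average string identity, and the geometric mean and median of BLAST e-values) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `percent_identity` and `mean_identity` are hypothetical, identity is assumed to be a position-by-position comparison of equal-length strings, and the sequences and e-values below are invented for illustration rather than taken from the paper.

```python
from statistics import geometric_mean, median


def percent_identity(predicted: str, reference: str) -> float:
    """Percentage of positions at which two equal-length strings match.

    Assumption: identity is a plain position-wise comparison; the paper
    may have computed it differently (for example, from an alignment).
    """
    if len(predicted) != len(reference):
        raise ValueError("sequences must have the same length")
    if not predicted:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return 100.0 * matches / len(predicted)


def mean_identity(pairs) -> float:
    """Average percent identity over (predicted, reference) pairs."""
    scores = [percent_identity(p, r) for p, r in pairs]
    return sum(scores) / len(scores)


# Invented (predicted, reference) strings, for illustration only.
pairs = [
    ("RSDELVR", "RSDELVR"),  # 7 of 7 positions match
    ("QSGDLRR", "QSSDLRR"),  # 6 of 7 positions match
    ("RSDHLTT", "RSDNLTT"),  # 6 of 7 positions match
]
print(f"average identity: {mean_identity(pairs):.1f}%")

# Invented e-values; the geometric mean suits numbers that span
# several orders of magnitude, as e-values typically do.
e_values = [1e-12, 1e-10, 1e-14]
print(f"geometric mean: {geometric_mean(e_values):.2e}")
print(f"median: {median(e_values):.2e}")
```

The geometric mean averages the exponents rather than the raw values, so a single comparatively large e-value does not dominate the summary the way it would in an arithmetic mean.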

X Demographics

The data shown below were collected from the profile of 1 X user who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 18 Mendeley readers of this research output.

Geographical breakdown

Country    Count    As %
Unknown    18       100%

Demographic breakdown

Readers by professional status          Count    As %
Researcher                              5        28%
Student > Bachelor                      4        22%
Student > Ph. D. Student                3        17%
Other                                   2        11%
Librarian                               2        11%
Other                                   0        0%
Unknown                                 2        11%

Readers by discipline                   Count    As %
Agricultural and Biological Sciences    3        17%
Medicine and Dentistry                  2        11%
Nursing and Health Professions          1        6%
Physics and Astronomy                   1        6%
Computer Science                        1        6%
Other                                   2        11%
Unknown                                 8        44%
Attention Score in Context

This research output has an Altmetric Attention Score of 4. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 08 June 2023.
All research outputs: #6,972,253 of 24,293,076 outputs
Outputs from BMC Genomics: #3,014 of 10,943 outputs
Outputs of similar age: #124,086 of 428,882 outputs
Outputs of similar age from BMC Genomics: #75 of 232 outputs
Altmetric has tracked 24,293,076 research outputs across all sources so far. This one has received more attention than most of these and is in the 69th percentile.
So far Altmetric has tracked 10,943 research outputs from this source. They receive a mean Attention Score of 4.8. This one has gotten more attention than average, scoring higher than 71% of its peers.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 428,882 tracked outputs that were published within six weeks on either side of this one in any source. This one has gotten more attention than average, scoring higher than 70% of its contemporaries.
We're also able to compare this research output to 232 others from the same source and published within six weeks on either side of this one. This one has gotten more attention than average, scoring higher than 66% of its contemporaries.