
Bayesian molecular design with a chemical language model

Overview of attention for article published in Perspectives in Drug Discovery and Design, March 2017

About this Attention Score

  • In the top 25% of all research outputs scored by Altmetric
  • Among the highest-scoring outputs from this source (#40 of 949)
  • High Attention Score compared to outputs of the same age (87th percentile)
  • High Attention Score compared to outputs of the same age and source (87th percentile)

Mentioned by

8 X users
6 patents

Citations

116 Dimensions

Readers on

214 Mendeley

Title: Bayesian molecular design with a chemical language model
Published in: Perspectives in Drug Discovery and Design, March 2017
DOI: 10.1007/s10822-016-0008-z
Pubmed ID
Authors: Hisaki Ikebata, Kenta Hongo, Tetsu Isomura, Ryo Maezono, Ryo Yoshida

Abstract

The aim of computational molecular design is to identify promising hypothetical molecules with a predefined set of desired properties. We address the problem of accelerating materials discovery with state-of-the-art machine learning techniques. The method involves two types of prediction: forward and backward. The objective of the forward prediction is to build a set of machine learning models that predict various properties of a given molecule. Inverting the trained forward models through Bayes' law, we derive a posterior distribution for the backward prediction, conditioned on a desired property requirement. By exploring high-probability regions of the posterior with a sequential Monte Carlo technique, molecules that exhibit the desired properties can be created computationally. One major difficulty in the computational creation of molecules is excluding chemically unfavorable structures. To circumvent this issue, we derive a chemical language model that acquires commonly occurring patterns of chemical fragments through natural language processing of ASCII strings of existing compounds, which follow the SMILES chemical language notation. In the backward prediction, the trained language model is used to refine chemical strings such that the properties of the resulting structures fall within the desired property region while chemically unfavorable structures are removed. The method is demonstrated through the design of small organic molecules with property requirements on the HOMO-LUMO gap and internal energy. The R package iqspr is available from the CRAN repository.
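The Bayesian inversion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and not the iqspr API. It assumes a character bigram model as a stand-in for the chemical language model (the prior), a hypothetical `toy_forward_model` that simply counts carbon characters as a stand-in for a trained property predictor, and Gaussian prediction noise. Instead of sequential Monte Carlo, it normalizes the posterior over a small, hand-picked candidate list.

```python
import math
from collections import Counter, defaultdict

BOS, EOS = "^", "$"  # start and end markers (assumed; not SMILES characters)

def train_bigram(smiles_list, alpha=1.0):
    """Fit an add-alpha smoothed character bigram model (the toy prior)."""
    counts = defaultdict(Counter)
    vocab = {EOS}
    for s in smiles_list:
        chars = [BOS] + list(s) + [EOS]
        vocab.update(chars[1:])
        for prev, cur in zip(chars, chars[1:]):
            counts[prev][cur] += 1
    return counts, sorted(vocab), alpha

def log_prior(s, model):
    """Log-probability of string s under the bigram model."""
    counts, vocab, alpha = model
    chars = [BOS] + list(s) + [EOS]
    total = 0.0
    for prev, cur in zip(chars, chars[1:]):
        num = counts[prev][cur] + alpha
        den = sum(counts[prev].values()) + alpha * len(vocab)
        total += math.log(num / den)
    return total

def toy_forward_model(s):
    """Hypothetical property predictor: the number of 'C' characters."""
    return float(s.count("C"))

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def log_likelihood(s, lo, hi, sigma=0.5):
    """Log-probability that the property lies in [lo, hi], assuming
    Gaussian noise with standard deviation sigma around the prediction."""
    mu = toy_forward_model(s)
    p = normal_cdf((hi - mu) / sigma) - normal_cdf((lo - mu) / sigma)
    return math.log(max(p, 1e-300))  # guard against log(0)

def posterior(candidates, model, lo, hi):
    """Bayes' law over a finite candidate set: posterior is proportional
    to likelihood times prior, normalized to sum to one."""
    logw = [log_prior(s, model) + log_likelihood(s, lo, hi) for s in candidates]
    m = max(logw)
    w = [math.exp(v - m) for v in logw]
    z = sum(w)
    return {s: wi / z for s, wi in zip(candidates, w)}

# Illustrative data only: a handful of simple SMILES strings.
model = train_bigram(["CCO", "CCC", "CCN", "CCCO", "CC(=O)O"])
post = posterior(["CCCO", "CCO", "OOOO", "CCC"], model, lo=2.5, hi=3.5)
for s, p in sorted(post.items(), key=lambda kv: -kv[1]):
    print(f"{s:6s} {p:.4f}")
```

In this sketch the prior penalizes strings whose character patterns are rare in the training set, while the likelihood penalizes strings whose predicted property falls outside the target region. A sampler such as sequential Monte Carlo would replace the explicit enumeration when the candidate space is too large to list.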

X Demographics

The data shown below were collected from the profiles of 8 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 214 Mendeley readers of this research output.

Geographical breakdown

Country           Count   As %
United Kingdom        1    <1%
Germany               1    <1%
Unknown             212    99%

Demographic breakdown

Readers by professional status                  Count   As %
Researcher                                         40    19%
Student > Ph. D. Student                           39    18%
Other                                              19     9%
Student > Bachelor                                 19     9%
Student > Master                                   18     8%
Other                                              33    15%
Unknown                                            46    21%

Readers by discipline                           Count   As %
Chemistry                                          43    20%
Computer Science                                   24    11%
Materials Science                                  22    10%
Engineering                                        19     9%
Biochemistry, Genetics and Molecular Biology        7     3%
Other                                              38    18%
Unknown                                            61    29%

Attention Score in Context

This research output has an Altmetric Attention Score of 17. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 09 August 2022.
All research outputs: #2,117,989 of 25,461,852 outputs
Outputs from Perspectives in Drug Discovery and Design: #40 of 949 outputs
Outputs of similar age: #39,519 of 321,299 outputs
Outputs of similar age from Perspectives in Drug Discovery and Design: #1 of 8 outputs
Altmetric has tracked 25,461,852 research outputs across all sources so far. Compared to these, this one has done particularly well and is in the 91st percentile: it's in the top 10% of all research outputs ever tracked by Altmetric.
So far Altmetric has tracked 949 research outputs from this source. They typically receive a little more attention than average, with a mean Attention Score of 5.3. This one has done particularly well, scoring higher than 95% of its peers.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 321,299 tracked outputs that were published within six weeks on either side of this one in any source. This one has done well, scoring higher than 87% of its contemporaries.
We're also able to compare this research output to 8 others from the same source and published within six weeks on either side of this one. This one has scored higher than all of them.
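The percentile figures quoted on this page can be reproduced from the ranks with simple arithmetic. The sketch below assumes that a percentile is the share of tracked outputs ranked below this one, rounded down. That rule matches the numbers shown here but is an assumption, not Altmetric's documented method, and the function name `percent_outscored` is hypothetical.

```python
import math

def percent_outscored(rank, total):
    """Percentage of the `total` outputs ranked below position `rank`
    (rank 1 is the highest-scoring output)."""
    return 100.0 * (total - rank) / total

# (rank, total) pairs taken from the rankings listed on this page.
contexts = {
    "All research outputs": (2_117_989, 25_461_852),
    "Outputs from this source": (40, 949),
    "Outputs of similar age": (39_519, 321_299),
    "Outputs of similar age from this source": (1, 8),
}

for label, (rank, total) in contexts.items():
    pct = percent_outscored(rank, total)
    print(f"{label}: {pct:.2f}% (rounded down: {math.floor(pct)})")
```

Rounding down gives 91, 95, 87, and 87, which agree with the percentiles stated in the text.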