Report for: Integrating multiple immunogenetic data sources for feature extraction and mining somatic hypermutation patterns: the case of “towards analysis” in chronic lymphocytic leukaemia

Title	Integrating multiple immunogenetic data sources for feature extraction and mining somatic hypermutation patterns: the case of “towards analysis” in chronic lymphocytic leukaemia
Published in	BMC Bioinformatics, June 2016
DOI	10.1186/s12859-016-1044-3
Pubmed ID	27295298
Authors	Ioannis Kavakiotis, Aliki Xochelli, Andreas Agathangelidis, Grigorios Tsoumakas, Nicos Maglaveras, Kostas Stamatopoulos, Anastasia Hadzidimitriou, Ioannis Vlahavas, Ioanna Chouvarda
Abstract	Somatic Hypermutation (SHM) refers to the introduction of mutations within rearranged V(D)J genes, a process that increases the diversity of Immunoglobulins (IGs). The analysis of SHM has offered critical insight into the physiology and pathology of B cells, leading to strong prognostication markers for clinical outcome in chronic lymphocytic leukaemia (CLL), the most frequent adult B-cell malignancy. In this paper we present a methodology for integrating multiple immunogenetic and clinicobiological data sources in order to extract features and create high-quality datasets for SHM analysis in IG receptors of CLL patients. This dataset is used as the basis for a higher-level integration procedure, inspired by social choice theory. This is applied in the Towards Analysis, our attempt to investigate the potential ontogenetic transformation of genes belonging to specific stereotyped CLL subsets towards other genes or gene families, through SHM. The data integration process, followed by feature extraction, resulted in the generation of a dataset containing information about mutations occurring through SHM. The Towards analysis, performed on the integrated dataset by applying voting techniques, revealed the distinct behaviour of subset #201 compared to other subsets, as regards SHM-related movements among gene clans, both in allele-conserved and non-conserved gene areas. With respect to movement between genes, a high percentage of movement towards pseudogenes was found in all CLL subsets. This data integration and feature extraction process can set the basis for exploratory analysis or a fully automated computational data mining approach on many as yet unanswered, clinically relevant biological questions.

X Demographics

The data shown below were collected from the profile of 1 X user who shared this research output.

Geographical breakdown

Country	Count	As %
Unknown	1	100%

Demographic breakdown

Type	Count	As %
Members of the public	1	100%

Mendeley readers

The data shown below were compiled from readership statistics for 17 Mendeley readers of this research output.

Geographical breakdown

Country	Count	As %
Unknown	17	100%

Demographic breakdown

Readers by professional status	Count	As %
Student > Ph. D. Student	4	24%
Researcher	3	18%
Lecturer > Senior Lecturer	2	12%
Professor	2	12%
Librarian	2	12%
Other	2	12%
Unknown	2	12%

Readers by discipline	Count	As %
Medicine and Dentistry	7	41%
Biochemistry, Genetics and Molecular Biology	5	29%
Agricultural and Biological Sciences	2	12%
Computer Science	1	6%
Unknown	2	12%
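
The "As %" columns above can be reproduced from the counts alone. The following is a minimal sketch, assuming each percentage is the count divided by the table total and rounded to the nearest whole number; that rounding rule is an assumption that happens to match the figures shown, and the helper name `as_percent` is hypothetical.

```python
def as_percent(counts):
    """Return each count as a whole-number percentage of the table total.

    Assumption: the page rounds to the nearest integer percent.
    """
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


# Counts copied from the two Mendeley reader tables above.
by_status = {
    "Student > Ph. D. Student": 4,
    "Researcher": 3,
    "Lecturer > Senior Lecturer": 2,
    "Professor": 2,
    "Librarian": 2,
    "Other": 2,
    "Unknown": 2,
}
by_discipline = {
    "Medicine and Dentistry": 7,
    "Biochemistry, Genetics and Molecular Biology": 5,
    "Agricultural and Biological Sciences": 2,
    "Computer Science": 1,
    "Unknown": 2,
}

for table in (by_status, by_discipline):
    for label, pct in as_percent(table).items():
        print(f"{label}\t{table[label]}\t{pct}%")
```

Because each row is rounded independently, the percentages in a table need not add up to exactly 100%.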

Attention Score in Context

This research output has an Altmetric Attention Score of 1. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 14 June 2016.

Comparison group	Rank	Out of
All research outputs	#20,333,181	22,877,793 outputs
Outputs from BMC Bioinformatics	#6,872	7,298 outputs
Outputs of similar age	#293,313	340,767 outputs
Outputs of similar age from BMC Bioinformatics	#84	90 outputs

Altmetric has tracked 22,877,793 research outputs across all sources so far. This one is in the 1st percentile – i.e., 1% of other outputs scored the same or lower than it.

So far Altmetric has tracked 7,298 research outputs from this source. They typically receive a little more attention than average, with a mean Attention Score of 5.4. This one is in the 1st percentile – i.e., 1% of its peers scored the same or lower than it.

Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 340,767 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 1st percentile – i.e., 1% of its contemporaries scored the same or lower than it.

We're also able to compare this research output to 90 others from the same source and published within six weeks on either side of this one. This one is in the 1st percentile – i.e., 1% of its contemporaries scored the same or lower than it.
