Title |
Eulertigs: minimum plain text representation of k-mer sets without repetitions in linear time
|
---|---|
Published in |
Algorithms for Molecular Biology, July 2023
|
DOI | 10.1186/s13015-023-00227-1 |
Pubmed ID | |
Authors |
Sebastian Schmidt, Jarno N. Alanko |
Abstract |
A fundamental operation in computational genomics is to reduce the input sequences to their constituent k-mers. For maximum performance of downstream applications it is important to store the k-mers in small space, while keeping the representation easy and efficient to use (i.e. without k-mer repetitions and in plain text). Recently, heuristics were presented to compute a near-minimum such representation. We present an algorithm to compute a minimum representation in optimal (linear) time and use it to evaluate the existing heuristics. Our algorithm first constructs the de Bruijn graph in linear time and then uses a Eulerian-cycle-based algorithm to compute the minimum representation, in time linear in the size of the output. |
X Demographics
Geographical breakdown
Country | Count | As % |
---|---|---|
France | 1 | 50% |
Unknown | 1 | 50% |
Demographic breakdown
Type | Count | As % |
---|---|---|
Scientists | 1 | 50% |
Members of the public | 1 | 50% |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
Unknown | 4 | 100% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Researcher | 2 | 50% |
Student > Bachelor | 1 | 25% |
Other | 1 | 25% |
Readers by discipline | Count | As % |
---|---|---|
Biochemistry, Genetics and Molecular Biology | 3 | 75% |
Chemical Engineering | 1 | 25% |