Sugarcane (Saccharum spp.) is one of the most important economic crops because of its high sugar production and biofuel potential. Due to the high polyploid level and complex genome of sugarcane, it has been a huge challenge to investigate genomic sequence variations, which are critical for identifying alleles contributing to important agronomic traits. In order to mine the genetic variations in sugarcane, genotyping by sequencing (GBS), was used to genotype 14 representative Saccharum complex accessions. GBS is a method to generate a large number of markers, enabled by next generation sequencing (NGS) and the genome complexity reduction using restriction enzymes.
To use GBS for high throughput genotyping highly polyploid sugarcane, the GBS analysis pipelines in 14 Saccharum complex accessions were established by evaluating different alignment methods, sequence variants callers, and sequence depth for single nucleotide polymorphism (SNP) filtering. By using the established pipeline, a total of 76,251 non-redundant SNPs, 5642 InDels, 6380 presence/absence variants (PAVs), and 826 copy number variations (CNVs) were detected among the 14 accessions. In addition, non-reference based universal network enabled analysis kit and Stacks de novo called 34,353 and 109,043 SNPs, respectively. In the 14 accessions, the percentages of single dose SNPs ranged from 38.3% to 62.3% with an average of 49.6%, much more than the portions of multiple dosage SNPs. Concordantly called SNPs were used to evaluate the phylogenetic relationship among the 14 accessions. The results showed that the divergence time between the Erianthus genus and the Saccharum genus was more than 10 million years ago (MYA). The Saccharum species separated from their common ancestors ranging from 0.19 to 1.65 MYA.
The GBS pipelines including the reference sequences, alignment methods, sequence variant callers, and sequence depth were recommended and discussed for the Saccharum complex and other related species. A large number of sequence variations were discovered in the Saccharum complex, including SNPs, InDels, PAVs, and CNVs. Genome-wide SNPs were further used to illustrate sequence features of polyploid species and demonstrated the divergence of different species in the Saccharum complex. The results of this study showed that GBS was an effective NGS-based method to discover genomic sequence variations in highly polyploid and heterozygous species.