MS-based proteomics was applied to the analysis of the medicinal plant Artemisia annua, exploiting a recently published contig sequence database (Graham et al. (2010) Science 327, 328-331) alongside other genomic and proteomic sequence databases for comparison. A. annua is the predominant natural source of artemisinin, the precursor for artemisinin-based combination therapies (ACTs), which are the WHO-recommended treatment for Plasmodium falciparum malaria.
A comparison of several databases containing A. annua sequences (NCBInr/viridiplantae, UniProt/viridiplantae, UniProt/A. annua, an A. annua trichome Trinity contig database, the above contig database and another A. annua EST database) revealed significant differences in their suitability for proteomic analysis. It showed that an organism-specific database that has undergone extensive curation, leading to longer contig sequences, can greatly increase the number of true positive protein identifications while reducing the number of false positives. Compared with previously published data, an order of magnitude more proteins were identified from trichome-enriched A. annua samples. These include proteins known to be involved in the biosynthesis of artemisinin, as well as other highly abundant proteins that suggest additional enzymatic processes within the trichomes that are important for artemisinin biosynthesis.
The newly gained information raises the possibility of an enzymatic pathway, utilizing peroxidases, for the less well understood final stages of artemisinin biosynthesis, as an alternative to the known non-enzymatic in vitro conversion of dihydroartemisinic acid to artemisinin. Data are available via ProteomeXchange with identifier PXD000703.