Authors: Santosh Kumar, Niraj Shah, Vanika Garg, Sabhyata Bhatia
Publish Date: 2014/02/01
Volume: 33, Issue: 6, Pages: 905-918
Abstract
Next-generation sequencing is an efficient system for generating high-throughput complete transcripts/genes and developing molecular markers. We present here the transcriptome sequencing of 26-day-old Catharanthus roseus seedling tissue using the Illumina GAIIX platform, which resulted in a total of 3.37 Gb of nucleotide sequence data comprising 29,964,104 reads that were de novo assembled into 26,581 unigenes. Based on similarity searches, 58% of the unigenes were annotated, of which 13,580 unique transcripts were assigned 5,016 gene ontology terms. Further, 7,687 of the unigenes were found to have Cluster of Orthologous Group classifications, and 4,006 were assigned to 289 Kyoto Encyclopedia of Genes and Genomes pathways. Also, 5,221 (19.64%) of transcripts were distributed to 81 known transcription factor (TF) families. In-silico analysis of the transcriptome resulted in identification of 11,004 SSRs in 2662 transcripts, from which 2,520 SSR markers were designed, which exhibited a non-random pattern of distribution. The most abundant were the trinucleotide repeats (AAG/CTT), followed by the dinucleotide repeats (AG/CT). Location-specific analysis of SSRs revealed that they were preferentially associated with the 5′-UTRs, with a predicted role in regulation of gene expression. PCR validation of a set of 48 primers revealed 97.9% successful amplification, and 76.6% of them showed polymorphism across different Catharanthus species as well as accessions of C. roseus. In summary, this study will provide insight into seedling development and resources for novel gene discovery and SSR development for utilization in marker-assisted selective breeding in C. roseus.
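As a rough illustration of what an in-silico SSR scan involves, the sketch below finds perfect di- and trinucleotide repeats in a single sequence with a regular expression. It is not the tool or the parameters used in the study: the function name `find_ssrs` and the minimum repeat counts (6 for dinucleotides, 5 for trinucleotides) are assumptions chosen only for the example.

```python
import re

# Assumed thresholds (not from the study): minimum repeat count per motif length.
MIN_REPEATS = {2: 6, 3: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect repeats."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in min_repeats.items():
        # ([ACGT]{k}) captures one motif; \1{n,} demands at least n more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # ignore homopolymer runs such as AAAAAA
            repeat_count = len(match.group(0)) // motif_len
            hits.append((match.start(), motif, repeat_count))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence: an (AAG)6 block and an (AG)7 block separated by spacers.
    example = "GGC" + "AAG" * 6 + "TTC" + "AG" * 7 + "C"
    for start, motif, count in find_ssrs(example):
        print(start, motif, count)
```

A real pipeline would also handle longer motifs, compound or imperfect repeats, and reverse-complement classes such as AAG/CTT, all of which this sketch leaves out.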
Keywords: