Date of Award

Spring 5-29-2019

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Computational and Data Sciences

First Advisor

Hagop Atamian, PhD

Second Advisor

Cyril Rakovski, PhD

Third Advisor

Gennady Verkhivker, PhD

Abstract

Salvia hispanica L. (commonly known as chia) is gaining popularity worldwide, and especially in the US, as a healthy oil and food supplement for human and animal consumption because of its favorable oil composition and its high protein, fiber, and antioxidant content. Despite these benefits and growing public demand, very limited gene sequence information is currently available in public databases. In this project, we generated 90 million high-quality 150 bp paired-end sequences from chia leaf and root tissues. The sequences were assembled de novo into 103,367 contigs with an average length of 1,445 bp. The resulting assembly represented 92.2% transcriptome completeness. Around 69% of the assembled contigs were annotated against the UniProt database and represented a diverse array of functional and biological categories. A total of 14,267 contigs showed significant differences in expression between the leaf and root tissues, with 6,151 and 8,116 contigs upregulated in the leaf and root, respectively. The sequence data generated in this project will provide a valuable resource for future functional genomic research in chia. With transcriptome sequences available, it will be possible to identify genes involved in the important metabolic pathways that give chia its unique nutritional and medicinal properties. Finally, the generated data will contribute to efforts to genetically improve chia to better meet public demand.

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.

Available for download on Monday, June 01, 2020
