Genome sequencing of clinically important strains of Toxoplasma gondii
Toxoplasma gondii is a widespread protozoan parasite of animals and an important opportunistic pathogen in humans, causing disease in congenitally infected infants and in immunocompromised individuals. T. gondii is transmitted by cats, the definitive host, and infects a wide range of intermediate hosts. Humans are not a natural part of the life cycle, but they become infected by ingesting tissue cysts in undercooked meat or oocysts that are shed by cats and can contaminate water and food. Because of this potential for food- and waterborne infection, NIAID classifies T. gondii as a Category B biodefense agent. Recent research on T. gondii has revealed more genetic variation than previously expected in lineages from North America and Europe. The population structure is strongly subdivided by geographic region and by the existence of clonal lineages in some regions. Moreover, sampling from South America showed that strains from this continent are highly divergent and may comprise new groups with both clonal and non-clonal genotypes.
There are currently 12 major lineages of T. gondii, grouping more than 900 strains, but to date only three strains from three lineages have undergone genome sequencing, so the available genomes reflect limited genetic and biological diversity. This project aims to generate sequence for 47 T. gondii strains, including nine prototypic strains from the 12 major lineages, in order to complete whole-genome sequencing of the presently known lineages, and to re-sequence the ME49 reference strain. For comparative purposes, we will obtain genome sequences of closely related protozoan parasites outside the genus Toxoplasma, including Gregarina niphandrodes and Hammondia hammondi. In addition to genome sequencing using 454 and Illumina platforms, Sanger end sequencing of 10,000 ME49-derived cosmid clones will be performed to develop the cosmid clone resource for community use.
The proposed studies will provide whole-genome sequences for members of these additional genetic lineages, thus defining their gene content, profiling differences in expression, and expanding our understanding of genetic diversity. High-coverage sequencing of designated prototypic strains, combined with moderate-coverage sequencing of additional isolates from each group, will support further studies on genetic diversity within and between lineages. Collectively, these data will allow comparison of genetic composition, chromosome organization, gene content, gene expression, synteny, diversity, and estimated ancestry within and between the major lineages.
The initial white paper submitted can be downloaded here. Because white papers are not always approved exactly as submitted, this document may not describe the final form of the project exactly. Please contact email@example.com if you have any questions.
Investigators and Collaborators
Sibley, David L.
Professor, Washington University School of Medicine
Assistant Professor, J. Craig Venter Institute