Evolution of Host Specificity and Virulence
The type III secretion system is a specialized protein injection apparatus found in both plant and animal bacterial pathogens. The effectors that traverse this system are particularly interesting from both an evolutionary and a functional perspective, since they play central roles in determining host specificity and the fate of host-pathogen interactions. These effectors are injected directly into the host cytoplasm, where they target and destabilize host cellular processes, compromise host signal transduction, and play a particularly important role in suppressing the host immune system. These activities promote pathogen growth and transmission, ultimately resulting in disease.
We are studying the evolution and function of type III secreted effector proteins in the plant pathogen Pseudomonas syringae and the human pathogen Pseudomonas aeruginosa. We use screens and comparative genomics to identify new effectors, and functional and molecular methods to investigate the specific roles these effectors play in the infection process. Overlaying all of this work is an evolutionary framework that focuses on the selective significance of natural genetic variation in type III effectors. These evolutionary approaches permit us to identify the targets and mechanisms of natural selection, the role and significance of recombination, the power of gene flow, and the constraints imposed by history. These studies are also perhaps the most powerful means to identify candidates for functional and mechanistic studies.
The sequencing of genomes from closely related bacterial strains has revealed a remarkable and unexpected pattern: strains from the same species often differ by more than half of their genomic content. This variable component of a species’ genome has been called the flexible genome, in contrast to the core genome, which is the conserved component found in all strains of a species. While the core genome contains the so-called housekeeping genes, the flexible genome largely determines a strain’s ecological niche. Understanding the ecological and evolutionary potential of a strain therefore requires an understanding of its flexible genome. This in turn requires sequencing and comparing whole genomes from a large collection of strains, data that were unimaginable until very recently.
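The distinction between core and flexible genome can be made concrete with a small sketch. It assumes each strain has already been reduced to a set of gene-family labels; the strain names, the labels, and the function name `partition_pangenome` are all hypothetical, and real analyses must first cluster genes into families, which is not shown here.

```python
def partition_pangenome(strain_genes):
    """Split gene families into core and flexible sets.

    strain_genes maps a strain name to the set of gene-family labels
    found in that strain. Core = present in every strain;
    flexible = present in at least one strain but not in all.
    """
    gene_sets = [set(genes) for genes in strain_genes.values()]
    if not gene_sets:
        raise ValueError("at least one strain is required")
    pan = set().union(*gene_sets)         # every family seen in any strain
    core = set.intersection(*gene_sets)   # families shared by all strains
    flexible = pan - core
    return core, flexible


# Made-up presence/absence data, for illustration only.
strain_genes = {
    "strain_A": {"hk1", "hk2", "hk3", "eff_a", "eff_b"},
    "strain_B": {"hk1", "hk2", "hk3", "eff_b", "eff_c"},
    "strain_C": {"hk1", "hk2", "hk3", "eff_d"},
}

core, flexible = partition_pangenome(strain_genes)
print(sorted(core))      # ['hk1', 'hk2', 'hk3']
print(sorted(flexible))  # ['eff_a', 'eff_b', 'eff_c', 'eff_d']
```

In this toy data set the flexible component (four families) is larger than the core (three families), which mirrors the pattern described above.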
We are addressing this problem by taking advantage of the massive data generation capabilities of the next-generation Illumina sequencing platforms currently available at the University of Toronto’s Centre for the Analysis of Genome Evolution and Function (CAGEF). The Illumina MiSeq produces as much as 15 billion bases of sequence from each run, enough data to sequence multiple Pseudomonas strains in one run. We are using this new technology to understand how genomic variability among strains influences their ability to interact with their environment and hosts.
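A back-of-the-envelope calculation shows why one run covers multiple strains. The 15 billion bases per run comes from the text above; the genome size of roughly 6 million bases and the target depth of 100x are assumptions chosen for illustration, and the estimate ignores read-quality losses and uneven pooling.

```python
def strains_per_run(run_yield_bases, genome_size_bases, target_depth):
    """Upper bound on genomes sequenced to target_depth from one run."""
    bases_per_strain = genome_size_bases * target_depth
    return run_yield_bases // bases_per_strain


run_yield = 15_000_000_000  # bases per MiSeq run, as stated in the text
genome_size = 6_000_000     # ASSUMPTION: approximate Pseudomonas genome size
depth = 100                 # ASSUMPTION: desired average coverage per genome

print(strains_per_run(run_yield, genome_size, depth))  # 25
```

Halving the target depth doubles the number of strains that fit in a run, so the choice of depth is the main lever in this estimate.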
Evolution and Ecology of the Pseudomonads
The Pseudomonads are arguably one of the most fascinating and ecologically significant bacterial genera. This group contains strains of intense medical, agricultural, and biotechnological interest. P. syringae and P. aeruginosa, for example, are both large and diverse species complexes that infect a wide variety of hosts and cause a tremendous range of diseases. We are using Multilocus Sequence Typing (MLST) to systematically study the clonal relationships among strains within this genus. MLST is a rapid strain typing method in which the nucleotide sequences of housekeeping loci are used to assess the genetic variation that accumulates as strains diverge from a common ancestor. The use of housekeeping genes focuses on the ‘core genome’: those loci that are required by all related organisms and are consequently less likely to have been horizontally exchanged or to be under strong or unusual selective pressures. This work provides an excellent comparative foundation for identifying and studying the genetic factors that underlie evolutionary adaptation and ecological specialization.
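The bookkeeping behind MLST can be sketched in a few lines. This follows the common allele-numbering convention, in which each distinct sequence at a locus gets an allele number and each distinct combination of allele numbers gets a sequence type (ST). It is an illustration under assumptions, not a description of any particular typing scheme: the locus names, strain names, toy sequences, and the function name `assign_sequence_types` are hypothetical.

```python
def assign_sequence_types(strain_loci, loci):
    """Assign allele numbers per locus and a sequence type per profile.

    strain_loci maps strain -> {locus: nucleotide sequence}.
    Returns strain -> (allelic profile tuple, sequence type number).
    Numbers are assigned in order of first appearance.
    """
    allele_ids = {locus: {} for locus in loci}  # locus -> {sequence: allele no.}
    profile_to_st = {}                          # profile -> ST number
    result = {}
    for strain, sequences in strain_loci.items():
        profile = []
        for locus in loci:
            seq = sequences[locus].upper()
            known = allele_ids[locus]
            if seq not in known:
                known[seq] = len(known) + 1     # new allele at this locus
            profile.append(known[seq])
        profile = tuple(profile)
        if profile not in profile_to_st:
            profile_to_st[profile] = len(profile_to_st) + 1
        result[strain] = (profile, profile_to_st[profile])
    return result


# Toy data: real MLST loci are several hundred bases long.
loci = ["locus_a", "locus_b"]
strain_loci = {
    "s1": {"locus_a": "ATGC", "locus_b": "GGTA"},
    "s2": {"locus_a": "ATGC", "locus_b": "GGTA"},
    "s3": {"locus_a": "ATGT", "locus_b": "GGTA"},
}

types = assign_sequence_types(strain_loci, loci)
for strain, (profile, st) in types.items():
    print(strain, profile, st)
```

Here s1 and s2 share a profile and therefore a sequence type, while s3 differs at one locus and receives a new one. Strains with identical profiles are treated as members of the same clone.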
We are using metagenomics to characterize the composition, structure, dynamics, and function of bacterial communities found in habitats as diverse as the human lung and arctic soils. Metagenomics is the study of communities of organisms via genomic material taken directly from their natural environment. Its power comes from its ability to assess community structure and dynamics without needing to propagate organisms in the laboratory. Since it is entirely free of the constraints and limitations imposed by culture-based techniques, it provides a largely unbiased and semi-quantitative assessment of community diversity and activity. For this work we use both CAGEF’s Illumina MiSeq and their new Illumina NextSeq, which can produce a staggering 120 billion bases per run, allowing entire communities to be studied without any prior knowledge and without the need for specific selective conditions. This is done by isolating and characterizing DNA or RNA directly from the metagenome (the combined genomes or transcriptomes of all species present in the community of interest).
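One simple way to summarize community composition from such data is to convert per-taxon read counts into relative abundances and a diversity index. The sketch below uses the Shannon index, chosen here only as a familiar example and not named in the text above. It assumes reads have already been assigned to taxa by some upstream classifier; the taxon labels, the counts, and the function names are hypothetical.

```python
import math


def relative_abundance(read_counts):
    """Convert taxon -> read count into taxon -> fraction of all reads."""
    total = sum(read_counts.values())
    if total <= 0:
        raise ValueError("read counts must sum to a positive number")
    return {taxon: n / total for taxon, n in read_counts.items()}


def shannon_index(read_counts):
    """Shannon diversity H = -sum(p * ln p) over taxa with nonzero counts."""
    fractions = relative_abundance(read_counts)
    return -sum(p * math.log(p) for p in fractions.values() if p > 0)


# Made-up read counts for two communities, for illustration only.
even_community = {"taxon_1": 250, "taxon_2": 250, "taxon_3": 250, "taxon_4": 250}
skewed_community = {"taxon_1": 970, "taxon_2": 10, "taxon_3": 10, "taxon_4": 10}

print(round(shannon_index(even_community), 3))
print(round(shannon_index(skewed_community), 3))
```

The evenly distributed community scores higher than the one dominated by a single taxon, even though both contain the same four taxa. Because read counts reflect sequencing and extraction biases as well as true abundance, such values are best read as semi-quantitative.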