Friday, January 23, 2009

Plasmid gene content analysis

Shared gene content patterns, also called phylogenetic profiles, have been used in prokaryotic genomes to build phylogenetic trees (Snel et al. 1999), to predict protein function (Pellegrini et al. 1999), and to reconstruct the gene content of ancestral species (Kunin et al. 2003). Phylogenetic reconstruction based on gene content is particularly useful for mobile genetic elements such as phages and plasmids, where universally shared homologous sequences, a prerequisite for sequence-based phylogenetic analyses, are not always available. Gene content analysis has recently been applied to phages (Lima-Mendez et al. 2008). Most recently, Brilli et al. (2008) applied it to plasmids from the Enterobacteriaceae family of gamma-Proteobacteria, including the genera Escherichia, Salmonella, and Shigella. The authors stated that 'most of plasmids does not form tight clusters coherent with the taxonomic status of their respective host species (E. coli, Salmonella or Shigella). This finding suggest a complex evolutionary history of such plasmid replicons with massive horizontal transfer and gene rearrangements.'
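To make the idea of a gene content comparison concrete, here is a minimal sketch in Python. The plasmid names and gene assignments are hypothetical toy data, and the Jaccard distance is used only as a simple stand-in for a gene content distance; it is not necessarily the measure used in the papers cited above.

```python
from itertools import combinations

# Hypothetical toy data: each plasmid is described by the set of
# gene families it carries (names are illustrative only).
gene_content = {
    "plasmid_A": {"repA", "parA", "traI", "blaTEM"},
    "plasmid_B": {"repA", "parA", "traI"},
    "plasmid_C": {"repB", "mobA"},
}


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two sets of gene families."""
    union = a | b
    if not union:
        return 0.0  # two empty gene sets are treated as identical
    return 1.0 - len(a & b) / len(union)


def distance_matrix(content):
    """Build a symmetric dict of pairwise gene content distances."""
    names = sorted(content)
    dist = {(name, name): 0.0 for name in names}
    for x, y in combinations(names, 2):
        d = jaccard_distance(content[x], content[y])
        dist[(x, y)] = dist[(y, x)] = d
    return dist


distances = distance_matrix(gene_content)
for (x, y), d in sorted(distances.items()):
    if x < y:
        print(f"{x} vs {y}: {d:.2f}")
```

A distance matrix of this kind could then be passed to a clustering or tree-building method; which method is appropriate for plasmids is exactly the question raised in this post.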

Unlike earlier studies, Brilli et al. (2008) did not discuss how well phylogenetic profiling methods perform for plasmids. For example, Snel et al. (1999) demonstrated that prokaryotic phylogeny based on gene content correlates with that based on sequence similarity of 16S rRNA. Lima-Mendez et al. (2008) clustered phage genes by their phylogenetic profiles to define evolutionarily cohesive modules, and showed that in temperate phages evolutionary modules correspond better to functional modules, whereas in virulent phages they span several functional categories. This suggests that phylogenetic profiling does not always work well at predicting protein function in phages.

This has inspired us to validate the performance of gene content analysis in two ways: with the set of genes shared as orthologs by all members of an evolutionarily coherent plasmid group (such as IncFI, IncFII, IncI1, IncN, IncP-1, and IncW), and by focusing on functionally linked proteins such as those involved in the replication, maintenance, and conjugative transfer of plasmids.
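The two checks described above can be sketched as follows. This is only an illustration under stated assumptions: the plasmid identifiers, the group membership, and the gene assignments are hypothetical toy data, and a real analysis would start from ortholog assignments rather than gene names.

```python
# Hypothetical toy data: plasmid -> set of ortholog families it carries.
plasmids = {
    "IncP_1": {"trfA", "korA", "traI", "traG", "merA"},
    "IncP_2": {"trfA", "korA", "traI", "traG"},
    "IncW_1": {"repA", "traI", "traG"},
}

# Hypothetical assignment of plasmids to incompatibility groups.
group_members = {
    "IncP-1": ["IncP_1", "IncP_2"],
    "IncW": ["IncW_1"],
}


def core_genes(members, content):
    """Genes present in every member of one plasmid group."""
    return set.intersection(*(content[m] for m in members))


def profile(gene, content):
    """Phylogenetic profile of a gene: the set of plasmids carrying it."""
    return frozenset(name for name, genes in content.items() if gene in genes)


# Check 1: genes shared by all members of a coherent group.
incp_core = core_genes(group_members["IncP-1"], plasmids)
print("IncP-1 core:", sorted(incp_core))

# Check 2: do two functionally linked genes (here, two genes assumed to be
# involved in conjugative transfer) show the same presence/absence pattern?
same_pattern = profile("traI", plasmids) == profile("traG", plasmids)
print("traI and traG share a profile:", same_pattern)
```

If gene content analysis performs well, core genes should recover the known groups, and functionally linked genes should tend to have similar profiles.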

Brilli M, Mengoni A, Fondi M, Bazzicalupo M, Lio P, Fani R. BMC Bioinformatics. (2008) 9(1):551. Analysis of plasmid genes by phylogenetic profiling and visualization of homology relationships using Blast2Network.

Snel B, Bork P, Huynen MA. Nat Genet. (1999) 21(1):108-10. Genome phylogeny based on gene content.

Pellegrini M, Marcotte EM, Thompson MJ, Eisenberg D, Yeates TO. Proc Natl Acad Sci U S A. (1999) 96:4285-8. Assigning protein functions by comparative genome analysis: protein phylogenetic profiles.

Kunin V, Ouzounis CA. Bioinformatics. (2003) 19:1412-6. GeneTRACE-reconstruction of gene content of ancestral species.

Lima-Mendez G, Van Helden J, Toussaint A, Leplae R. Mol Biol Evol. (2008) 25:762-77. Reticulate representation of evolutionary and functional relationships between phage genomes.

Dr. Haruo Suzuki
University of Idaho
