BME205, Fall 2011, Section 01: Research Paper Assignment

Inversions in Pyrobaculum oguniense

Due: Monday, November 28, 2011


The genome of Pyrobaculum oguniense includes a number of genomic rearrangements that we can infer using paired-end sequencing data.  In class, I described these as:

  • an extra-chromosomal element that we now believe to be a virus;
  • a simple inversion; and
  • two loop inversions.

These feature types are described in the lecture notes.

We are now interested in a more thorough investigation of these variants. 

Here are some questions that you should consider in your paper:


  • How often do these variants occur in the population?  We first need to consider that only mappable contigs make it into the from-to table.  Reads that cannot be uniquely mapped, or that have only one end of a pair, are discarded.  We may also need to consider the length of the contigs that make it into the from-to table.
  • Now that we see the specific regions that are involved in an inversion, can we learn about the biology of those genomic regions?  Might the specific genes at these positions gain a selective advantage resulting from the inversion?  In loop inversions, the genes at the termini are duplicates; is the relative orientation of these genes suggestive of the underlying mechanism?  You might consider examining the suspected mechanism whereby multiple copies of ribosomal RNA genes retain near-exact sequences.  Search terms such as "Chi-sites" or "biased gene conversion" may be useful.  Is the relative orientation of these terminal genes conserved in this genus?  Is it conserved outside of this genus?  Is there any evidence that the underlying inversion is conserved outside of this species?
  • To what extent is the extra-chromosomal element (ECE) involved in the inversions?  Some viruses integrate at a specific location in the genome, usually a specific tRNA gene.  Since these integrations are not present in every cell sequenced, they can produce divergent paths in the assembly, which the genome assembler handles by simply terminating the contig.  Is the Pog ECE involved in other contig breakpoints?  Does it seem to be integrating in specific genes or sequences?
  • What is known in the literature about the specific loci you have found, the nature of inversions within genomes, or the mechanism of integration implied by your findings? Cite any sources that you use to address this question.
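
As one possible starting point for the first question above, the sketch below estimates the fraction of the sequenced population that carries a variant at a single junction, together with a Wilson score interval for that fraction.  It is only an illustration under stated assumptions: the function name and the read-pair counts are hypothetical, the counts are assumed to have been tallied already from uniquely mapped pairs spanning the junction, and no correction is made for mappability or contig length.

```python
import math


def variant_fraction(variant_pairs, reference_pairs, z=1.96):
    """Estimate the fraction of read pairs supporting a variant junction.

    variant_pairs and reference_pairs are hypothetical counts of uniquely
    mapped read pairs that support the rearranged and the reference
    arrangement, respectively.  Returns (estimate, low, high), where
    low..high is the Wilson score interval (z=1.96 gives roughly 95%).
    """
    n = variant_pairs + reference_pairs
    if n == 0:
        raise ValueError("no informative read pairs at this junction")

    p = variant_pairs / n

    # Wilson score interval for a binomial proportion.
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom

    return p, max(0.0, center - half), min(1.0, center + half)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    estimate, low, high = variant_fraction(30, 70)
    print(f"variant fraction {estimate:.2f} (interval {low:.2f} to {high:.2f})")
```

A simple ratio of this kind treats every spanning read pair as an independent draw from the population; whether that assumption holds for these data is part of what your derivation should address.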



Write a short research paper (fewer than 5000 words) describing your findings.  Your paper should include an introduction, results, conclusion and references.  You should consider the questions posed above (but not feel limited by them) to explain the underlying viral and evolutionary mechanisms--and the biological implications--suggested by these data.

Some of your conclusions will require a formal mathematical derivation, and you should include this within your paper.  I have previously used numbered equations as a means to both describe and refer to derived results.

You will no doubt need to use the Archaeal genome browser for this work.  Logon credentials will be provided.  I have provided the sequence of contigs used in this assembly, along with the fromTo map discussed in class.  The final Map, derived as part of earlier work, is also provided for reference.