lab_2

Differences

This shows you the differences between two versions of the page.

lab_2 [2017/04/30 22:19]
scott /* Lab assignment 2 */
lab_2 [2017/05/02 09:09] (current)
scott /* Lab assignment 2 */
Line 75: Line 75:
  
 Example full credit answers Example full credit answers
-1.
 +1. Most of you got this one right. The most common mistake was to include too much information and too many steps (although that generally did not cost you any points).
  
 zgrep -w 'rs671' hu916767_20170324191934.1kgALTallele.withHeader.snpEff.vcf.gz zgrep -w 'rs671' hu916767_20170324191934.1kgALTallele.withHeader.snpEff.vcf.gz
Line 83: Line 84:
 This individual has 0 alternate alleles, so their genotype is G/G: two reference alleles. This individual has 0 alternate alleles, so their genotype is G/G: two reference alleles.
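
The step from "0 alternate alleles" to "G/G" can be sketched as a small shell helper. This is only an illustration: `decode_gt` is a hypothetical name, not part of any VCF toolkit, and it assumes you pass it the REF, ALT, and GT values copied from the matching VCF record (for rs671 here, assumed to be REF G, ALT A, GT 0/0).

```shell
# decode_gt REF ALT GT  -> prints the genotype as bases, e.g. G/G
# Hypothetical helper for illustration; not a standard VCF tool.
decode_gt() {
  awk -v ref="$1" -v alt="$2" -v gt="$3" 'BEGIN {
    # Allele index 0 means REF; 1, 2, ... index into the comma-separated ALT list.
    split(alt, alts, ",")
    # Keep the separator used in GT: "|" for phased, "/" for unphased.
    sep = (index(gt, "|") > 0) ? "|" : "/"
    m = split(gt, idx, "[/|]")
    out = ""
    for (i = 1; i <= m; i++) {
      if (idx[i] == ".")     base = "."          # missing call
      else if (idx[i] == 0)  base = ref
      else                   base = alts[idx[i]]
      out = out (i > 1 ? sep : "") base
    }
    print out
  }'
}

decode_gt G A 0/0   # prints G/G
```

A heterozygous call such as `decode_gt G A 0/1` would print G/A under the same assumptions.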
  
-2. There are multiple ways to answer this. One of the most straightforward is as follows, although we could quibble over whether should have included any splicing variants.
+2. There are multiple ways to answer this. One of the most straightforward was as follows, although we could quibble over whether it should have included any splicing variants.
  
 zgrep 'synonymous\|missense\|start_gain\|start_lost\|stop_gain\|stop_lost\|3_prime_UTR_variant\|5_prime_UTR_variant' hu916767_20170324191934.1kgALTallele.withHeader.snpEff.vcf.gz | wc -l zgrep 'synonymous\|missense\|start_gain\|start_lost\|stop_gain\|stop_lost\|3_prime_UTR_variant\|5_prime_UTR_variant' hu916767_20170324191934.1kgALTallele.withHeader.snpEff.vcf.gz | wc -l
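
One variation on the command above, offered as a sketch rather than as the expected answer: it skips VCF header lines (those starting with `#`) before counting, so that a header that happens to mention one of the effect terms cannot inflate the total. Whether any header line in this particular file matches is not known; `count_effects` is a hypothetical helper name. Each record is counted once even if several of its annotations match.

```shell
# count_effects FILE.vcf.gz  -> number of non-header records matching the effect terms
# Hypothetical helper for illustration.
count_effects() {
  gzip -dc "$1" \
    | grep -v '^#' \
    | grep -c -E 'synonymous|missense|start_gain|start_lost|stop_gain|stop_lost|3_prime_UTR_variant|5_prime_UTR_variant'
}
```

Usage would look like `count_effects hu916767_20170324191934.1kgALTallele.withHeader.snpEff.vcf.gz`. The `-E` form avoids relying on `\|`, which is a GNU extension in basic regular expressions.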
Line 89: Line 90:
  
  
-3.
+3. One good answer was:
 +"a) This person is likely to be lactose intolerant. One primary variant for lactose intolerance is rs4988235; another is rs182549. For both of these, the genotype associated with lactose intolerance is C/C. This person had the C/C genotype at both sites.
 + 
 + 
 +zgrep rs182549 hu916767_20170324191934.1kgALTallele.withHeader.snpEff.vcf.gz  
 +2 136616754 rs182549 C T . .  
 +ANN=T|intron_variant|MODIFIER|MCM6|ENSG00000076003|transcript|ENST00000264156 
 +|protein_coding|9/16|c.1362+117G>A||||||,T|intron_variant|MODIFIER|MCM6 
 +|ENSG00000076003|transcript|ENST00000492091|processed_transcript|2/
 +|n.181+3423G>A|||||| GT 0/
 + 
 + 
 +zgrep rs4988235 hu916767_20170324191934.1kgALTallele.withHeader.snpEff.vcf.gz 
 +2 136608646 rs4988235 G A . .  
 +ANN=A|intron_variant|MODIFIER|MCM6|ENSG00000076003|transcript|ENST00000264156|protein_coding 
 +|13/16|c.1917+326C>T||||||,A|intron_variant|MODIFIER|MCM6|ENSG00000076003|transcript|ENST00000492091 
 +|processed_transcript|3/5|n.343+326C>T||||||,A|intron_variant|MODIFIER|MCM6|ENSG00000076003 
 +|transcript|ENST00000483902|retained_intron|1/1|n.544+326C>T|||||| GT 0/
 + 
 + 
 +Though the genotype at the second site looks like G/G, that is read from the positive strand. On the negative strand [which is used on SNPedia, and is the transcribed strand], it would be C/C.
 + 
 +b) Geographical distribution for the allele frequency of rs182549 
 + 
 +The minor allele frequency is 0 in Africa, southern Europe, and Asia, while the minor allele is more prevalent (and even becomes the major allele) in Eastern Europe and the western US. The minor allele is slightly prevalent in northern South America."
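
The strand remark in the answer above (G/G on the positive strand reads as C/C on the negative strand) amounts to complementing each base. A minimal sketch, assuming single-base alleles only; `complement_gt` is a hypothetical helper name, and multi-base alleles would also need to be reversed, which this does not do.

```shell
# complement_gt GENOTYPE  -> the same genotype read from the opposite strand
# Hypothetical helper for illustration; valid for single-base alleles only.
complement_gt() {
  printf '%s\n' "$1" | tr 'ACGTacgt' 'TGCAtgca'
}

complement_gt G/G   # prints C/C
```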
  
  
lab_2.1493612342.txt.gz · Last modified: 2017/04/30 22:19 by scott