Differences

This shows you the differences between two versions of the page.

--- lab_1 [2017/04/13 13:27]
scott /* Lab 1 Assignment */
+++ lab_1 [2017/04/25 10:49] (current)
scott /* Lab 1 Assignment */
@@ Line 19: / Line 19: @@
 ### Question 1 (4 points)
 ### a) What does "positive strand" mean in the header of the genotype file?
 ### Question 2 (2 points)
@@ Line 45: / Line 44: @@
 ###    the website  http://popgen.uchicago.edu/ggv/
+Example full credit answers:
+  - Question 1
+    - "The positive strand refers to the leading strand of DNA being sequenced (e.g., the strand that RNA would be replicated against)."
+    - "Each DNA strand is a double helix - it has two strands. The first strand given is the positive strand; the second strand is based on the first and is called the negative strand. For example, if the positive strand is ATCGG, then the negative strand is TAGCC (T always pairs with A, and G always pairs with C). The header is stating that the genome provided is only based on the first strand (the positive strand)."
+  - Question 2
+    - awk '{print $2}' hu916767_20170324191934.txt
+    - cut -f2 hu916767_20170324191934.txt
+  - Question 3
+    - awk '{print $2}' hu916767_20170324191934.txt | sort -u
+    - cut -f2 hu916767_20170324191934.txt | sort -u
+    - The command extracts the second column from a tab-delimited file, alphanumerically sorts it, and removes all duplicate lines.
+  - Question 4
+    - grep 'rs671' hu916767_20170324191934.txt
+    - Output: rs671 12 112241766 GG
+    - "Interpretation: This individual does not flush, has a normal risk for alcoholism, normal risk of esophageal cancer, and Disulfiram is effective for alcoholism for this individual."
+  - Question 5
+    - The minor allele is A in individuals of European ancestry, with a minor allele frequency (MAF) of 0.36
+    - In individuals of African ancestry, the MAF is 0.021
+    - The SNP is associated with thinking cilantro tastes like soap
+    - "The minor allele is most common in central/southern Asia and western Europe, and least common in Africa, with the Americas in between."
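As a rough illustration of the Question 3 pipeline, the sketch below runs it on a tiny made-up file instead of the real genotype file. The four sample rows and the temporary file are assumptions for demonstration only. The second pipeline, which counts rows per value with `uniq -c`, is an extra that the question does not ask for.

```shell
# Hypothetical 4-row sample in the same tab-delimited layout
# (rsID, chromosome, position, genotype); not real data.
sample=$(mktemp)
printf 'rs1\t1\t100\tAA\nrs2\t1\t200\tAG\nrs3\t2\t300\tGG\nrs4\tX\t400\tCT\n' > "$sample"

# Extract column 2, sort it, and drop duplicate lines.
chroms=$(cut -f2 "$sample" | sort -u)
echo "$chroms"

# Variation: count how many rows carry each value of column 2.
counts=$(cut -f2 "$sample" | sort | uniq -c | awk '{print $2 ":" $1}')
echo "$counts"

rm -f "$sample"
```

Note that `cut -f2` splits on tabs only, while `awk '{print $2}'` splits on any run of whitespace, so the two forms agree only when no field contains spaces.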
@@ Line 161: / Line 181: @@
 ### We can also grab both variants, if we wanted to
 grep -E 'rs8176719|rs9430244' hu916767_20170324191934.txt
+### What if we have a variant where we don't know the rsID,
+### but only the chromosome, position, genome build, and alleles?
+### Well, to get chromosome 1, position 11850750, we can do this:
+grep -E '\s1\s11850750\s' hu916767_20170324191934.txt
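When matching on chromosome and position, comparing whole fields can be less fragile than a regular expression, and `\s` is not part of POSIX extended regular expressions, so it may not work in every grep. Below is a minimal sketch with awk, assuming the columns are rsID, chromosome, position, and genotype separated by tabs (as the rs671 output above suggests). The three sample rows and their rsIDs are made up.

```shell
# Hypothetical sample rows; rsA/rsB/rsC are placeholder IDs, not real variants.
sample=$(mktemp)
printf 'rsA\t1\t11850750\tCC\nrsB\t11\t11850750\tAG\nrsC\t1\t118507500\tTT\n' > "$sample"

# Keep only rows whose chromosome field equals chr and whose position field equals pos.
# Whole-field comparison means chromosome 11 and position 118507500 do not match.
hit=$(awk -F'\t' -v chr=1 -v pos=11850750 '$2 == chr && $3 == pos' "$sample")
echo "$hit"

rm -f "$sample"
```

On the real file, the same awk command could be pointed at hu916767_20170324191934.txt in place of the temporary sample.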
@@ Line 195: / Line 221: @@
 grep -E '1' hu916767_20170324191934.txt
+====== Useful databases ======
+**Geography of Genetic Variants Browser** Interactively browse geographic distribution of genetic variants. Can compare to 1000 Genomes, ExAC, and POPRES (Euro-centric). http://popgen.uchicago.edu/ggv/?data=%221000genomes%22&chr=11&pos=6889648
+**dbSNP** A fairly exhaustive database of SNPs in humans. https://www.ncbi.nlm.nih.gov/projects/SNP/
+**ExAC** A good source for exonic variants. Very user friendly. http://exac.broadinstitute.org/
