Differences

This shows you the differences between two versions of the page.

--- keller_and_evans_lab:meeting_notes [2017/09/06 12:14]
richard_border
+++ keller_and_evans_lab:meeting_notes [2017/09/06 12:14]
richard_border
@@ Line 11: / Line 11: @@
 Overall study structure
 k px has subcomponents:
-    - phenotyping changed over course of study (eg personality only available for a subset)
+  - phenotyping changed over course of study (eg personality only available for a subset)
-    - QC datafile contains batch variable for every individual- see if it contains "BiLEVE"
+     - QC datafile contains batch variable for every individual- see if it contains "BiLEVE"
-    - differences between online/in person data
+     - differences between online/in person data
-    - two genotypings
+     - two genotypings
-    - 50k on one of the chips where half heavy smokers
+     - 50k on one of the chips where half heavy smokers
-    - two affy arrays but there are sig difs in call rates for particular SNPs
+     - two affy arrays but there are sig difs in call rates for particular SNPs
 - phenotyping confounding with snp arrays and ascn for heavy smoking
   - smoking also confounded with batch
-   - Phenotype data available as .csv and .Rdata file generated by provided R script; possible for SAS as well
+    - Phenotype data available as .csv and .Rdata file generated by provided R script; possible for SAS as well
-   !! rdata file is large and will exceed memory allocated to login nodes
+    !! rdata file is large and will exceed memory allocated to login nodes
-        - object is `bd`
+         - object is `bd`
-   - each "project"/request has its own file as IDs have been randomized; `f.eid` is the randomized ID linking phen/gene data within requests; can establish bijection between eids across projects via plink sample files (ie eid1 <-> pos <-> eid2)
+    - each "project"/request has its own file as IDs have been randomized; `f.eid` is the randomized ID linking phen/gene data within requests; can establish bijection between eids across projects via plink sample files (ie eid1 <-> pos <-> eid2)
-   - f.50.0.0 : 0 is initial visit; 1: reax (~20k indiv); 2: imaging visit
+    - f.50.0.0 : 0 is initial visit; 1: reax (~20k indiv); 2: imaging visit
 - 50^ is var id
 - details on phenotype page on wiki
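
The eid1 <-> pos <-> eid2 idea above can be sketched as follows. This is a minimal illustration, assuming the eids have already been read from each project's sample file in file order and that row position identifies the same individual in both files. The function name and the example eids are hypothetical.

```python
from typing import Dict, List


def eid_bijection(sample_eids_a: List[int], sample_eids_b: List[int]) -> Dict[int, int]:
    """Map project A eids to project B eids by shared row position.

    Assumes both lists come from the two projects' sample files, in file
    order, so that position i refers to the same individual in both.
    """
    if len(sample_eids_a) != len(sample_eids_b):
        raise ValueError("sample files differ in length; positions cannot be aligned")
    if len(set(sample_eids_a)) != len(sample_eids_a):
        raise ValueError("duplicate eids in project A; mapping would not be a bijection")
    if len(set(sample_eids_b)) != len(sample_eids_b):
        raise ValueError("duplicate eids in project B; mapping would not be a bijection")
    return dict(zip(sample_eids_a, sample_eids_b))


# Hypothetical eids, for illustration only
project_a = [1000011, 1000024, 1000037]
project_b = [5550913, 5550007, 5550462]

a_to_b = eid_bijection(project_a, project_b)
b_to_a = {b: a for a, b in a_to_b.items()}  # inverse direction
```

The length and duplicate checks are there because a position-based mapping is only one-to-one if both files list the same individuals exactly once each.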
@@ Line 29: / Line 29: @@
 Phenotypes available
-    - psychiatric sx data (now available) -- need to submit additional application if interested in using (particularly suicide)
+  - psychiatric sx data (now available) -- need to submit additional application if interested in using (particularly suicide)
-    - wiki with list of fields out to email
+     - wiki with list of fields out to email
-    - data on rc `/work/ibg/` but some still in kellerlab; still waiting on data availability
+     - data on rc `/work/ibg/` but some still in kellerlab; still waiting on data availability
-    - for storage, important to use generic bgen files
+     - for storage, important to use generic bgen files
 Data cleaning - need to ensure consistency across projects
- -  genotype data
+  -  genotype data
-  - vcf files,
+   - vcf files,
-  - ld-pruned relatedness files
+   - ld-pruned relatedness files
-  - gargi will send out parameters (HWE, MAF cutoffs, etc) of cleaned files and location on directory (discussed previously by gargi and luke)
+   - gargi will send out parameters (HWE, MAF cutoffs, etc) of cleaned files and location on directory (discussed previously by gargi and luke)
- - QC
+  - QC
 - raw data will remain available
 - one set of files that have a bare minimum of QC (e.g., for imputed data: info score >= 0.3; indels removed; individuals whose self-reported and genetic sex differ excluded; singletons and doubletons excluded; the two phases of imputation contain some error, so HRC SNPs should be used, and luke removed the UK10K-only and 1KG-only SNPs)
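
The bare-minimum QC rules listed above could be expressed roughly as below. This is only a sketch under stated assumptions, not the pipeline actually used: the `Variant` record, its field names, and the helper functions are hypothetical, "indel" is approximated as any ref/alt pair that is not single-base, and "singleton/doubleton" is read as a minor allele count of 1 or 2.

```python
from dataclasses import dataclass
from typing import Iterable, List

INFO_MIN = 0.3        # info score threshold from the notes
MIN_MINOR_COUNT = 3   # excludes singletons (1) and doubletons (2)


@dataclass
class Variant:
    # Hypothetical record layout, for illustration only
    snp_id: str
    ref: str
    alt: str
    info: float
    minor_allele_count: int


def is_indel(v: Variant) -> bool:
    # Assumption: anything other than a single-base ref/alt pair is an indel
    return len(v.ref) != 1 or len(v.alt) != 1


def passes_min_qc(v: Variant) -> bool:
    return (
        v.info >= INFO_MIN
        and not is_indel(v)
        and v.minor_allele_count >= MIN_MINOR_COUNT
    )


def filter_variants(variants: Iterable[Variant]) -> List[Variant]:
    return [v for v in variants if passes_min_qc(v)]


def keep_individual(self_reported_sex: str, genetic_sex: str) -> bool:
    # Individuals whose self-reported and genetic sex differ are excluded
    return self_reported_sex == genetic_sex


# Made-up example variants
variants = [
    Variant("rs_a", "A", "G", 0.95, 120),   # passes
    Variant("rs_b", "A", "G", 0.25, 120),   # fails info score
    Variant("rs_c", "AT", "A", 0.99, 120),  # indel
    Variant("rs_d", "C", "T", 0.99, 2),     # doubleton
]
kept_ids = [v.snp_id for v in filter_variants(variants)]
```

Restricting to HRC SNPs (dropping UK10K-only and 1KG-only SNPs) would need a reference panel list and is not shown here.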
