This shows you the differences between two versions of the page.
keller_and_evans_lab:gscan [2016/08/29 10:24] scott /* GSCAN GWAS */ |
keller_and_evans_lab:gscan [2016/12/19 15:46] scott /* TOPMed */ |
||
---|---|---|---|
Line 35: | Line 35: | ||
The analysis plan and phenotypes are described in files linked below (makes it easier to keep track of versioning!). Coding of phenotypes is described in the aptly-named " | The analysis plan and phenotypes are described in files linked below (makes it easier to keep track of versioning!). Coding of phenotypes is described in the aptly-named " | ||
- | {{file_gscan_gwas_analysis_plan-v1_2.pdfClick | + | {{file_gscan_gwas_analysis_plan-v1_3.docxClick |
{{file_gscan_gwas_phenotype_definitions-2-24-2016.pdfClick here to find the GSCAN GWAS phenotype definitions.}} | {{file_gscan_gwas_phenotype_definitions-2-24-2016.pdfClick here to find the GSCAN GWAS phenotype definitions.}} | ||
Line 56: | Line 56: | ||
- | ====== [[gscan_db_ga_p]] ====== | + | ====== [[gscan_db_ga_p]] |
The studies included from dbGaP and the process by which phenotypes and genotypes were constructed and merged are outlined on the [[gscan_db_ga_p]] page. | The studies included from dbGaP and the process by which phenotypes and genotypes were constructed and merged are outlined on the [[gscan_db_ga_p]] page. | ||
- | ====== | + | ======= GSCAN Sequencing ======= |
- | More information about the files used for [[uk_biobank|UKBiobank are here]]. In brief, we used the UK10K + 1kgp3 imputed VCFs provided by UK Biobank and added dosages with this Python script: | ||
- | import gzip, argparse, re, os, datetime | + | ====== TOPMed ====== |
- | from subprocess import Popen, PIPE | + | |
- | def add_dosage(pair): | + | We hope to update this section with detailed descriptions of how we derived phenotypes for each TOPMed cohort for which we have access to raw data. For now, the R scripts that go from the source phenotype file to the final derived phenotype are located here: |
- | a, b = pair | + | / |
- | probs = b.split(b' | + | |
- | dose = float(probs[1]) + (float(probs[2]) * 2) | + | |
- | return a + b':' | + | |
- | def gziplines(fname): | ||
- | f = Popen([' | ||
- | for line in f.stdout: | ||
- | yield line | ||
- | parser | + | ===== Phenotype definitions and analysis plan for external studies ===== |
- | parser.add_argument(' | + | |
- | args = parser.parse_args() | + | |
- | flag = False | + | Phenotype definitions and analysis plans for the TOPMed studies are {{file_topmed_smoking_analysis_plan-v0_2.docxcontained in this document}}. |
- | for line in gziplines(args.inputVCF): | + | The list of dbGaP studies |
- | if line.startswith(b'#' | + | |
- | os.write(1, line.rstrip() + b' | + | |
- | if not flag: | + | |
- | os.write(1, b'## | + | |
- | os.write(1, b'## | + | |
- | str(datetime.datetime.now()).encode(' | + | |
- | flag = True | + | |
- | else: | + | |
- | elements = re.split(b' | + | |
- | first8 = elements[:8] | + | |
- | genotypes = elements[10: | + | |
- | form = b' | + | |
- | genotypes_split = zip(genotypes[:: | ||
- | try: | ||
- | dose_genos = [add_dosage(pair) for pair in genotypes_split] | ||
- | except (ValueError, | ||
- | os.write(2, " | ||
- | os.write(2, line + " | ||
- | raise e | ||
- | os.write(1, b' | ||
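The dosage step in the removed script above can be sketched on its own. This is a minimal illustration, not the lab's script: the visible lines show only that the dosage is the second probability plus twice the third. The comma separator, the function names `dosage_from_probs` and `append_dosage`, and the three-decimal output format are all assumptions, because the corresponding parts of the original are truncated in this diff.

```python
def dosage_from_probs(probs: str, sep: str = ",") -> float:
    """Expected alternate-allele count from three genotype probabilities.

    `probs` holds P(ref/ref), P(ref/alt), P(alt/alt) joined by `sep`.
    The separator is an assumption; the original script's value is truncated.
    """
    p_ref_ref, p_het, p_alt_alt = (float(x) for x in probs.split(sep))
    # Same arithmetic as the visible line: probs[1] + 2 * probs[2].
    return p_het + 2.0 * p_alt_alt


def append_dosage(sample_field: str, probs: str, sep: str = ",") -> str:
    """Append the dosage to a per-sample field as an extra colon-delimited value.

    The three-decimal formatting is a hypothetical choice for illustration.
    """
    dose = dosage_from_probs(probs, sep)
    return f"{sample_field}:{dose:.3f}"
```

A heterozygous call with probabilities `0,1,0` gives a dosage of 1, and a homozygous-alternate call with `0,0,1` gives 2. Intermediate probabilities give fractional dosages between 0 and 2.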
+ | ======= Authorship guidelines ======= | ||
- | ======= | + | While authorship is decided on an individual basis for each GSCAN paper, typically, authorship is arranged in groups. We hope the GIANT investigators will forgive us for adopting their authorship guidelines. |
- | + | * A group of 6 or fewer junior investigators who strongly led the efforts, usually starred to denote equal contribution, | |
- | ====== TOPMed ====== | + | |
- | + | | |
- | Preliminary | + | |
- | + | | |
- | The list of dbGaP studies | + | * In alphabetical order, senior investigators who participated strongly in GSCAN activities but did not strongly lead/oversee the writing and/or analysis for the paper. Typically, these might be leaders of key GSCAN activities. |
+ | * The senior investigators who strongly led/oversaw the writing and/or analysis of the paper, including a subset that are co-corresponding authors (usually 6 or fewer). | ||