GSCAN--or the GWAS & Sequencing Consortium of Alcohol and Nicotine use--is an international genetic association meta-analysis consortium. Our goal is to aggregate genetic association findings across scores of studies with millions of individuals. GSCAN is composed of three independent but related projects: 1) an exome chip meta-analysis of low-frequency non-synonymous variants, 2) a GWAS meta-analysis, and 3) a whole genome sequencing association meta-analysis. 

This wiki page helps the coordinating investigators organize GSCAN efforts. If you represent a study that may be interested in participating in GSCAN, you can find more information on our [[http://gscan.sph.umich.edu|public-facing website]]. Look on the right-hand side of the page to find analysis plans for each of the three projects.


======= Meetings =======

Regular conference calls are held, and the minutes are [[https://docs.google.com/document/d/1ZK9VIXxcej3lat_oD_oxPP0ajHwj8yX_FKaPh53svVo/edit#|**available here**]].

Other meeting materials from CO internal meetings are here:

[[gscan_6:16:16_--_db_ga_p_gf_g]]


======= GSCAN Exome Chip =======


====== Phenotype definitions and analysis plan ======

{{:file_gscan_exome_chip_analysis_plan-v2_2.pdf|Exome chip analysis plan and phenotype definitions}}


====== File Locations ======

**Freeze 1.** We concluded a pilot freeze of the exome chip project in 2015 and are writing up our results now. All of the summary statistics are on twins at /net/twins/svrieze/everything-else/wp/GSCAN/freeze1-25-Mar-2015.

**Freeze 2.** New studies that will be included in Freeze 2 are located on RC at /work/KellerLab/GSCAN/EXOME. Each folder in that directory is the name of a study and includes two subfolders, one for Phenotypes and one for Genotypes. Genotypes are split by chromosome to facilitate analyses.
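The Freeze 2 layout described above can be checked with a short script. The following is a minimal sketch, not an existing GSCAN tool: it assumes the two subfolders are literally named ''Phenotypes'' and ''Genotypes'' (the actual names on RC may differ), and the function name ''check_study_layout'' is made up for illustration. It makes no assumption about how the per-chromosome genotype files are named.

```python
from pathlib import Path

# Assumed subfolder names; the actual names on RC may differ.
EXPECTED_SUBFOLDERS = ("Phenotypes", "Genotypes")


def check_study_layout(root):
    """Map each study folder under root to the expected subfolders it lacks."""
    missing = {}
    for study_dir in sorted(Path(root).iterdir()):
        if not study_dir.is_dir():
            continue  # skip stray files at the top level
        missing[study_dir.name] = [
            name for name in EXPECTED_SUBFOLDERS
            if not (study_dir / name).is_dir()
        ]
    return missing


if __name__ == "__main__":
    root = Path("/work/KellerLab/GSCAN/EXOME")
    if root.is_dir():
        for study, absent in check_study_layout(root).items():
            status = "OK" if not absent else "missing: " + ", ".join(absent)
            print(study, status)
    else:
        print(f"{root} not found; this check is meant to be run on RC")
```

A study with an empty list has both subfolders in place; any other entry names the subfolder that still needs to be created or uploaded.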


======= GSCAN GWAS =======


====== Workgroups ======

  * **Phenotype workgroup:** Laura Bierut, Marilyn Cornelis, Dave Hinds, Youna Hu, Jaakko Kaprio, Eric Jorgenson, Dajiang Liu, Matt McGue, Marcus Munafo, Gunter Schumann, Scott Vrieze, Luisa Zuccolo
  * **Analysis workgroup:** Goncalo Abecasis, David Hinds, Youna Hu, Eric Jorgenson, Charles Kooperberg, Pete Kraft, Penelope Lind, Dajiang Liu, Nancy Saccone, Dan Stram, Scott Vrieze, Xiaowei Zhan


====== Phenotype definitions and analysis plan ======

The analysis plan and phenotypes are described in the files linked below (which makes it easier to keep track of versioning!). Coding of phenotypes is described in the aptly named "phenotype definitions" file, whereas the genome-wide analysis plan is in the all-too-aptly named "analysis plan" document. Please note that the phenotype definitions document only contains information on how to code the eight smoking/drinking phenotypes. File formats for those phenotypes, which many will recognize as standard pedigree formats, are included in the analysis plan. Everything else should be fairly straightforward.

{{:file_gscan_gwas_analysis_plan-v1_3.docx|Click here to find the GSCAN GWAS analysis plan}}

{{:file_gscan_gwas_phenotype_definitions-2-24-2016.pdf|Click here to find the GSCAN GWAS phenotype definitions}}


====== Coordination and organization ======

Progress, both internal and external, is tracked in [[https://docs.google.com/document/d/1kWaY40n-bSURoLW7VcU9CFv08zVx360RHmxvL7DIreU/edit|**this Google Doc**]]. More specific progress on internal studies is [[https://docs.google.com/spreadsheets/d/1canvCaAJW70LjSHidtvwrJgyDMa_ZlT7dvpzOsz6PNY/edit#gid=0|**tracked here**]].

Study contact info is tracked in [[https://docs.google.com/spreadsheets/d/11apZaSyesNy4hl4MIgrKRYSASrwZM2iEJsuuFQByCfI/edit#gid=0|**this Google Sheet**]].

Studies available in dbGaP, along with their accession numbers and related details, are tracked in [[https://airtable.com/tblzZUtQWcZSlfjrA/viwhISDznphLfST8m|**this Airtable**]].


====== File locations ======

Study data to which we have direct access are located on either twins or RC. Twins data are organized in the folder /net/twins/svrieze/everything-else/wp/GSCAN/GWAS. Within this folder, the studies for which we have raw data access are in the folder //CU_Boulder_samples// (for lack of a better name!). Summary stats generated on these samples are organized within //summary_stats_generated_internally//. Summary stats generated by outside groups and submitted for meta-analysis are organized within //summary_stats_generated_externally//.

On RC the organization is similar. Everything is located within the folder /work/KellerLab/GSCAN/GWAS. Studies for which we have raw data access are in the folder //individual_level_study_data//. Summary stats generated on these samples are organized within //summary_stats_generated_internally//. Summary stats generated by outside groups and submitted for meta-analysis are organized within //summary_stats_generated_externally//.
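Because the two machines use different names for the raw-data folder, it can help to keep the mapping in one place when writing scripts that must run on both. The sketch below is only an illustration. The host labels ''twins'' and ''rc'', the kind labels ''raw'', ''internal'' and ''external'', and the function ''gwas_folder'' are hypothetical. It also assumes the three folders sit directly under the GWAS root, which the description above suggests but does not guarantee.

```python
from pathlib import PurePosixPath

# GWAS roots on each machine, as given on this page.
GWAS_ROOTS = {
    "twins": PurePosixPath("/net/twins/svrieze/everything-else/wp/GSCAN/GWAS"),
    "rc": PurePosixPath("/work/KellerLab/GSCAN/GWAS"),
}

# The raw-data folder is the only one whose name differs between machines.
RAW_DATA_FOLDER = {
    "twins": "CU_Boulder_samples",
    "rc": "individual_level_study_data",
}

# Summary-statistics folders share the same names on both machines.
SUMMARY_FOLDERS = {
    "internal": "summary_stats_generated_internally",
    "external": "summary_stats_generated_externally",
}


def gwas_folder(host, kind):
    """Return the folder for kind ('raw', 'internal', 'external') on host.

    Assumes the folder sits directly under the GWAS root.
    """
    root = GWAS_ROOTS[host]
    if kind == "raw":
        return root / RAW_DATA_FOLDER[host]
    return root / SUMMARY_FOLDERS[kind]
```

For example, ''gwas_folder("rc", "raw")'' gives /work/KellerLab/GSCAN/GWAS/individual_level_study_data, while ''gwas_folder("twins", "raw")'' gives the //CU_Boulder_samples// folder under the twins root.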


====== dbGaP & UK Biobank ======

The studies included from dbGaP, and the process by which phenotypes and genotypes were constructed and merged, are outlined on the [[:gscan_db_ga_p]] page.


======= GSCAN Sequencing =======


====== TOPMed ======

We hope to update this section with detailed descriptions of how we have conducted phenotype derivations for each TOPMed cohort for which we have raw data access.

  *  For now, the R scripts that go from each source phenotype file to the eventual derived phenotypes are located here:
  /net/twins/svrieze/everything-else/wp/GSCAN/TOPMed/README

  *  We're tracking analyses in [[https://docs.google.com/document/d/1HtJY6DzPWqr2XGTAD8HzoiIUoC45nSj116neRb3do3s/edit|**this Google doc**]]


===== Phenotype definitions and analysis plan for external studies =====

Phenotype definitions and analysis plans for the TOPMed studies are {{:file_topmed_smoking_analysis_plan-v0_2.docx|contained in this document}}.

The list of dbGaP studies in TOPMed is in [[https://airtable.com/shryD6CMaM6R5sA3e/tblUKENXX5WmgNXQ8|**this Airtable**]].


======= Authorship guidelines =======


Authorship is decided on an individual basis for each GSCAN paper, but authors are typically arranged in the groups listed below, in this order. We hope the GIANT investigators will forgive us for adopting their authorship guidelines.

  - A group of 6 or fewer junior investigators who strongly led the efforts, usually starred to denote equal contribution, followed by additional junior investigators who played key, central roles.
  - In alphabetical order, junior investigators who had substantial individual contributions but not as much as those in Group 1. Typically, these might be lead analysts or other junior investigators who made a sizable contribution such as GWA analyses performed specifically for the paper.
  - In alphabetical order, junior investigators who had notable individual contributions but not as much as those in Groups 1 or 2. Typically, these might be lead analysts for replication cohorts, providing results for a group of top hits.
  - In alphabetical order, junior and senior investigators who had contributions worthy of authorship (participating in analysis, phenotype collection, genotyping, oversight of cohorts, etc. that was specific to the paper) but not as much as those in the other groups.
  - In alphabetical order, senior investigators who had contributions worthy of authorship and contributed more than those in Group 4. Typically, these might be a lead PI of a participating cohort who did not participate as strongly in GSCAN activities as those in Group 6.
  - In alphabetical order, senior investigators who participated strongly in GSCAN activities but did not strongly lead/oversee the writing and/or analysis for the paper. Typically, these might be leaders of key GSCAN activities.
  - The senior investigators who strongly led/oversaw the writing and/or analysis of the paper, including a subset that are co-corresponding authors (usually 6 or fewer).