Polygenic Scores (PGSs) Computed for CATSLife Subjects

Daniel Gustavson has computed polygenic scores (PGSs) using a standardized pipeline.

THIS PAGE IS STILL UNDER CONSTRUCTION. IF YOU HAVE QUESTIONS PLEASE EMAIL daniel.gustavson@colorado.edu

Information About the PGS Pipeline

Plink files for computing PGSs in CATSLife can be found on Research Computing: /pl/active/IBG/data/CATSLife/derived/PGS/

We started with the lightly QC'd genetic data files from Luke Evans, which were imputed to the Haplotype Reference Consortium (HRC) panel. Additional QC was then applied to remove variants with INFO/R2 < .80, remove multiallelic variants, and rename variants to RSIDs (build 37). Additional filters in plink included: --geno .05 --maf .01 --mind .05 --hwe 1e-10
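The plink filters above can be assembled into a single command. The sketch below builds (but does not run) such a command in Python. The file prefixes `catslife_hrc` and `catslife_hrc_qc` are hypothetical placeholders, and the use of `--bfile` and `--make-bed` assumes a PLINK 1.9 binary fileset, which this page does not state.

```python
import shlex

def build_plink_qc_command(in_prefix, out_prefix):
    """Return the plink argument list for the QC thresholds listed above."""
    return [
        "plink",
        "--bfile", in_prefix,   # assumption: input is a .bed/.bim/.fam fileset
        "--geno", "0.05",       # drop variants with more than 5% missing calls
        "--maf", "0.01",        # drop variants with minor allele frequency below 1%
        "--mind", "0.05",       # drop samples with more than 5% missing genotypes
        "--hwe", "1e-10",       # drop variants with Hardy-Weinberg p-value below 1e-10
        "--make-bed",           # write the filtered data as a new binary fileset
        "--out", out_prefix,
    ]

# Hypothetical prefixes, for illustration only.
cmd = build_plink_qc_command("catslife_hrc", "catslife_hrc_qc")
print(shlex.join(cmd))
```

To actually run the command, the list could be passed to `subprocess.run(cmd, check=True)` on a machine where plink is installed.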

PGSs were then computed using PRScs (Ge et al., 2019) and/or SbayesR (Lloyd-Jones et al., 2019) software. See the table below for the traits that have been scored so far. Example scripts for generating your own scores are provided on RC. Please contact Daniel Gustavson if you have any questions or would like specific scores generated in the future.
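Whatever software produces the per-variant weights, the score itself is a weighted sum of effect-allele counts. The sketch below illustrates that final step on made-up data. The variant names, weights, and dosages are invented for illustration, and in practice this step is typically handled by dedicated tools (for example plink's `--score`) rather than by hand; this is not the pipeline's own code.

```python
def polygenic_score(weights, dosages):
    """Sum weight times effect-allele dosage over variants present in both inputs.

    weights: dict of variant ID -> effect size for the effect allele
    dosages: dict of variant ID -> effect-allele count or dosage in [0, 2]
    """
    shared = weights.keys() & dosages.keys()
    return sum(weights[v] * dosages[v] for v in shared)

# Invented weights and genotypes, for illustration only.
weights = {"rsA": 0.5, "rsB": -0.25, "rsC": 1.0}
subject = {"rsA": 2, "rsB": 1, "rsD": 2}  # rsC is not genotyped, rsD has no weight

score = polygenic_score(weights, subject)
print(score)  # 0.5*2 + (-0.25)*1 = 0.75
```

Variants missing from either input are simply skipped here, which is one simple way to handle incomplete overlap between the weight file and the genotype data.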

Ancestry and Principal Components

Information about Genetic Ancestry and Principal Components can be downloaded here and can also be found on RC: /pl/active/IBG/data/CATSLife/derived/PGS/

The current file includes important pieces of information recommended for PGS analyses.

  • Conversion from genetic IIDs to standard CATSLife IDs needed to merge PGS output with phenotypic information.
  • Continental ancestry group. These were determined by mapping CATSLife data onto the 1000 Genomes Phase 3 reference panel. Subjects were classified as EUR (European-like), AFR (African-like), SAS (South Asian-like), or EAS (East Asian-like) if they fell within 5 SDs of that population's means on the first 4 principal components. Because there is much more variability in the AMR (American-like) group, classification into this group was based on 2 SDs on the first 4 principal components and required that the subject not already be classified in one of the other groups.
  • Two sets of PCs are provided in this datafile. Principal components labeled “PC1”, “PC2” (etc.) are derived from a PCA within this sample. Principal components labeled “PC1_1KG”, “PC2_1KG” (etc.) are derived from mapping our data onto 1000 Genomes. For polygenic scores involving CATSLife data, it is recommended that you use the within-sample PCs, though there may be exceptions.
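The ancestry classification rule described in the list above can be sketched as follows. This is an illustration under stated assumptions, not the code used to build the file: it assumes "within N SDs" is checked separately on each of the 4 PCs, it assigns the first matching group if a subject falls inside more than one group's range, and all reference means and SDs are invented numbers.

```python
PRIMARY_GROUPS = ("EUR", "AFR", "SAS", "EAS")

def within_sds(pcs, means, sds, n_sd):
    """True if every PC lies within n_sd standard deviations of the group mean."""
    return all(abs(x - m) <= n_sd * s for x, m, s in zip(pcs, means, sds))

def classify_ancestry(pcs, ref_stats):
    """Return a continental group label for one subject, or None if unclassified.

    pcs: the subject's first 4 PCs projected onto the reference panel
    ref_stats: dict of group -> (means, sds), each a sequence of 4 values
    """
    for group in PRIMARY_GROUPS:
        means, sds = ref_stats[group]
        if within_sds(pcs, means, sds, n_sd=5):
            return group  # assumption: first match wins if ranges overlap
    # AMR uses the tighter 2 SD rule and is only reached if no other group matched.
    means, sds = ref_stats["AMR"]
    if within_sds(pcs, means, sds, n_sd=2):
        return "AMR"
    return None

# Invented reference statistics (means, SDs) for the first 4 PCs.
ref_stats = {
    "EUR": ((20.0, 0.0, 0.0, 0.0), (1.0, 1.0, 1.0, 1.0)),
    "AFR": ((-40.0, 0.0, 0.0, 0.0), (2.0, 2.0, 2.0, 2.0)),
    "SAS": ((0.0, -30.0, 0.0, 0.0), (1.0, 1.0, 1.0, 1.0)),
    "EAS": ((0.0, 40.0, 0.0, 0.0), (1.0, 1.0, 1.0, 1.0)),
    "AMR": ((5.0, 5.0, 0.0, 0.0), (3.0, 3.0, 3.0, 3.0)),
}

print(classify_ancestry((21.0, 1.0, 0.5, -0.5), ref_stats))   # EUR
print(classify_ancestry((6.0, 4.0, 1.0, 1.0), ref_stats))     # AMR
print(classify_ancestry((100.0, 100.0, 0.0, 0.0), ref_stats)) # None
```

Subjects who match no group are returned as `None` rather than forced into the nearest group, mirroring the idea that classification requires falling inside a group's range.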

READMEs for Computing Polygenic Scores in CATSLife

Readme for PRScs (Ge et al., 2019)

Readme for SbayesR (Lloyd-Jones et al., 2019)

Polygenic Scores Currently Available

TRAIT CITATION PRScs SbayesR
Alzheimer's Disease Bellenguez 2023 X X
Alzheimer's Disease Kunkle 2019 X
Educational Attainment Okbay 2022 X X
Executive Function Hatoum 2023 X
Impulsivity (3 BIS subscales) Sanchez-Roige 2023 X*
Impulsivity (5 UPPS subscales) Sanchez-Roige 2023 X*
General Cognitive Ability Davies 2018 X
Intelligence (Childhood) Benyamin 2014 X
Neuroticism Baselmans 2019 X
Frailty Atkins 2021 X

* Access is restricted under the 23andMe agreement; contact Dan Gustavson & Naomi Friedman.

Download merged PGSs computed using PRScs

catslife/instruments/polygenic_scores_pgss.txt · Last modified: 2025/04/18 13:18 by gustavsd