Jeff Lessem (he/him) (jeff.lessem@colorado.edu)
2021-04-26 10:16:15

@Jeff Lessem (he/him) has joined the channel

Michel Nivard (m.g.nivard@vu.nl)
2021-04-30 09:53:40

@Michel Nivard has joined the channel

Andrew Grotzinger (agrotzin@utexas.edu)
2021-04-30 09:53:40

@Andrew Grotzinger has joined the channel

Perline Demange (p.a.d.demange@vu.nl)
2021-04-30 09:53:40

@Perline Demange has joined the channel

Javier de la Fuente (j.delafuente@utexas.edu)
2021-04-30 09:53:40

@Javier de la Fuente has joined the channel

Jackson Thorp (jackson.thorp@qimrberghofer.edu.au)
2021-04-30 09:53:40

@Jackson Thorp has joined the channel

Andrea Allegrini (andrea.allegrini@kcl.ac.uk)
2021-04-30 09:53:40

@Andrea Allegrini has joined the channel

Margherita Malanchini (m.malanchini@qmul.ac.uk)
2021-04-30 09:53:41

@Margherita Malanchini has joined the channel

Mark Adams (mark.adams@ed.ac.uk)
2021-04-30 09:53:41

@Mark Adams has joined the channel

Gunn-Helen Moen (g.moen@uq.edu.au)
2021-05-03 13:13:42

@Gunn-Helen Moen has joined the channel

Test Student (test-student@ibg.colorado.edu)
2021-05-06 11:38:58

@Test Student has joined the channel

Bridget Joyner (bnj13@my.fsu.edu)
2021-05-10 13:00:00

@Bridget Joyner has joined the channel

Sally Kuo (ickuo@vcu.edu)
2021-05-10 13:30:20

@Sally Kuo has joined the channel

Aislinn Bowler (aislinnbowler@gmail.com)
2021-05-10 13:30:27

@Aislinn Bowler has joined the channel

Morgan Driver (driverm@vcu.edu)
2021-05-10 13:31:04

@Morgan Driver has joined the channel

Sarah Brislin (she/her) (sarah.brislin@gmail.com)
2021-05-10 13:31:37

@Sarah Brislin (she/her) has joined the channel

Lisa Dinkler (lisa.dinkler@gu.se)
2021-05-10 13:31:43

@Lisa Dinkler has joined the channel

Katie Bountress (kaitlin.bountress@vcuhealth.org)
2021-05-10 13:32:21

@Katie Bountress has joined the channel

Peter Tanksley (peter.tanksley@austin.utexas.edu)
2021-05-10 13:32:32

@Peter Tanksley has joined the channel

Tong Chen (tuc548@psu.edu)
2021-05-10 13:34:05

@Tong Chen has joined the channel

Charlotte Viktorsson (viktorsson.charlotte@gmail.com)
2021-05-10 13:34:34

@Charlotte Viktorsson has joined the channel

Jacob Kunkel (kunke104@umn.edu)
2021-05-10 13:35:32

@Jacob Kunkel has joined the channel

Matthieu de Hemptinne (matthieu.dehemptinne@gmail.com)
2021-05-10 13:36:00

@Matthieu de Hemptinne has joined the channel

Jay Ross (jay.ross@mail.mcgill.ca)
2021-05-10 13:38:34

@Jay Ross has joined the channel

Sam Freis (she/her) (Samantha.Freis@colorado.edu)
2021-05-10 13:38:41

@Sam Freis (she/her) has joined the channel

Jeremy Elman (jaelman@health.ucsd.edu)
2021-05-10 13:38:56

@Jeremy Elman has joined the channel

Spencer Moore (spmo3925@colorado.edu)
2021-05-10 13:39:52

@Spencer Moore has joined the channel

Maizy Brasher (mabr7162@colorado.edu)
2021-05-10 13:39:53

@Maizy Brasher has joined the channel

Jenny Phan (jphan5@wisc.edu)
2021-05-10 13:39:58

@Jenny Phan has joined the channel

Meng Huang (meng.huang.cn@gmail.com)
2021-05-10 13:41:18

@Meng Huang has joined the channel

Jung Chen (jchen378@ucmerced.edu)
2021-05-10 13:41:58

@Jung Chen has joined the channel

Stephanie Zellers (she/her/hers) (zelle063@umn.edu)
2021-05-10 13:42:17

@Stephanie Zellers (she/her/hers) has joined the channel

Grace Wu (yakew@email.unc.edu)
2021-05-10 13:42:31

@Grace Wu has joined the channel

Gladi Thng (s2124928@ed.ac.uk)
2021-05-10 13:43:47

@Gladi Thng has joined the channel

Zoe Schmilovich (zoe.schmilovich@mail.mcgill.ca)
2021-05-10 13:43:50

@Zoe Schmilovich has joined the channel

Olivia Rennie (olivia.rennie@alum.utoronto.ca)
2021-05-10 13:43:58

@Olivia Rennie has joined the channel

Christina Sheerin (Christina.sheerin@vcuhealth.org)
2021-05-10 13:43:59

@Christina Sheerin has joined the channel

William McAuliffe (williamhbmcauliffe@gmail.com)
2021-05-10 13:44:17

@William McAuliffe has joined the channel

Chloe Myers (cmyer011@ucr.edu)
2021-05-10 13:44:20

@Chloe Myers has joined the channel

Francis Vergunst (he/him) (francis.vergunst@umontreal.ca)
2021-05-10 13:44:33

@Francis Vergunst (he/him) has joined the channel

Ravi Bhatt (ravibot93@gmail.com)
2021-05-10 13:44:48

@Ravi Bhatt has joined the channel

Nathan Bell (n.y.bell@student.vu.nl)
2021-05-10 14:46:30

@Nathan Bell has joined the channel

Emil Uffelmann (e.uffelmann@vu.nl)
2021-05-10 14:46:48

@Emil Uffelmann has joined the channel

Kristen Kelly (k.m.kelly@vu.nl)
2021-05-10 14:47:50

@Kristen Kelly has joined the channel

Benjamin Neale (bneale@broadinstitute.org)
2021-06-07 07:19:41

@Andrew Grotzinger hi

Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-07 07:24:02

@Benjamin Neale hi! I put this on the faculty general channel but I'll be intermittently stuck in clinical responsibilities throughout the workshop (but of course there for all of Genomic SEM day) and currently in seminar but will be on in 40 minutes! I'll also be there for the whole afternoon session

Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-07 07:24:32

but i'll be sure to log in as soon as i'm done and scream out my favorite beverage and name as a very cool casual way of showing up an hour late and introducing myself

Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-07 07:28:05

or you were just saying hi and you can just ignore my explanation

Benjamin Neale (bneale@broadinstitute.org)
2021-06-07 07:53:05

*Thread Reply:* just saying hi 🙂

Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-07 07:53:31

*Thread Reply:* cool cool cool

matthew keller (matthew.c.keller@gmail.com)
2021-06-10 07:38:04

*Thread Reply:* LOL. @Benjamin Neale, when you did the @Andrew Grotzinger in the meeting as an example, I knew Andrew would interpret it this way given he’d already said he couldn’t make that meeting

Jeff Lessem (he/him) (jeff.lessem@colorado.edu)
2021-06-08 15:27:18

@Jeff Lessem (he/him) has renamed the channel from "genomicsem" to "day06-genomicsem"

Uku Vainik (ukuvainik@gmail.com)
2021-06-10 07:01:18

Hi genomic SEM crowd! I've been wondering if the GWAS-based genetic correlation approach could be extended to compare phenotypic profiles? For instance, I run a phewas in sample A and another phewas in sample B. Assuming the set of phenotypes is the same in both samples, is there a way to compare the two phewas profiles and conclude that they have a correlation of 0.XX with a p value of 0.XXX?

Michel Nivard (m.g.nivard@vu.nl)
2021-06-10 07:12:05

*Thread Reply:* So there are interesting similarities between genetic correlation between traits and the correlation in terms of effect size between SNPs (what you propose), but there are also differences. For example 2 SNPs can be entirely uncorrelated (i.e. no LD) and have highly correlated effects on a set of traits. Though if SNPs are in high LD their effects will also be correlated.... Then what I think is really a big issue to work through is that the correlation between SNPs, in terms of their effect on traits, is entirely dependent on the set of traits, and if we add enough traits unrelated to either SNP, we will eventually dilute the correlation. We don't really have similar issues with LDSC as we know GWAS chips give an imperfect but fairly complete coverage of common variation.

Michel Nivard (m.g.nivard@vu.nl)
2021-06-10 07:14:54

*Thread Reply:* so I guess I am saying the elegance of LDSC is that we know we measure a substantial and quantifiable portion of the entire domain of common genetic variation, something we'll never be able to replicate in the domain of phenotypes (of which there are infinite, or at least for most practical purposes infinite)

👍 Andrew Grotzinger
matthew keller (matthew.c.keller@gmail.com)
2021-06-10 07:41:34

*Thread Reply:* Thanks @Michel Nivard. Follow-up question: if we have a phenotypic correlation matrix and a genetic correlation matrix, can we use GenomicSEM to formally test the null that the two correlation matrices are the same?

Michel Nivard (m.g.nivard@vu.nl)
2021-06-10 07:53:53

*Thread Reply:* no 🙂

Michel Nivard (m.g.nivard@vu.nl)
2021-06-10 07:54:12

*Thread Reply:* genetics only for now (and likely in the future)

Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-10 08:05:24

*Thread Reply:* that being said, it can be interesting to compare the factor structure in phenotype and genetic space when possible. For example, if you are examining a set of traits that are all within UK Biobank it should be relatively straightforward to fit the model for the genetic correlation matrix using Genomic SEM and to fit a model using standard approaches to the phenotypic correlation matrix. You can then do something like eyeball the differences between standardized factor loadings to get a sense of if/where the factor structure shifts. A great example of this is from the Genomic SEM g-factor paper (de la Fuente et al., [2021]) where they plot the two sets of loadings against each other in the attached figure

👍 Katerina Zorina-Lichtenwalter
Uku Vainik (ukuvainik@gmail.com)
2021-06-10 08:05:47

*Thread Reply:* @Michael Neale thanks. The pheWAS analogue might have been confusing, let's forget about the SNPs.

Long story short - I want to compare the personality profiles for phenotypes like it is done in LDSC. I did a crude comparison in this paper, but I'd like to incorporate the SEs somehow. Right now, the p values assume N=30 (30 personality facets) https://www.nature.com/articles/s41562-019-0752-x

Nature Human Behaviour
Michel Nivard (m.g.nivard@vu.nl)
2021-06-10 08:14:40

*Thread Reply:* okay, resample the correlations you are correlating from the distribution: N(cor,se^2)? then you would need to account for the fact that the s.e. only reflect the uncertainty, not the dependence between uncertainties (which does arise).. do you have the raw data?

Uku Vainik (ukuvainik@gmail.com)
2021-06-10 08:15:29

*Thread Reply:* Depends on the dataset. In that NHB paper I don't have the raw data

Uku Vainik (ukuvainik@gmail.com)
2021-06-10 08:21:50

*Thread Reply:* sorry,what does the N() mean here=

Uku Vainik (ukuvainik@gmail.com)
2021-06-10 08:21:51

*Thread Reply:* ?

Michel Nivard (m.g.nivard@vu.nl)
2021-06-10 08:35:26

*Thread Reply:* normal

Michel Nivard (m.g.nivard@vu.nl)
2021-06-10 08:35:37

*Thread Reply:* so rnorm(mean=cor,sd=se)
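
A minimal sketch of the resampling idea above, written in Python rather than R so that it runs with the standard library alone. The two profiles and their standard errors are made-up numbers, and `resampled_profile_correlation` is a hypothetical helper, not part of any package. Each draw treats the estimates as independent, so it ignores the dependence between uncertainties mentioned earlier in the thread.

```python
import random


def pearson(x, y):
    """Plain Pearson correlation between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def resampled_profile_correlation(cors_a, ses_a, cors_b, ses_b, draws=2000, seed=1):
    """Draw each profile from N(cor, se^2) and correlate the drawn profiles.

    Returns the point estimate and a 95% percentile interval.
    """
    rng = random.Random(seed)
    sims = []
    for _ in range(draws):
        draw_a = [rng.gauss(c, s) for c, s in zip(cors_a, ses_a)]
        draw_b = [rng.gauss(c, s) for c, s in zip(cors_b, ses_b)]
        sims.append(pearson(draw_a, draw_b))
    sims.sort()
    lower = sims[int(0.025 * draws)]
    upper = sims[int(0.975 * draws) - 1]
    return pearson(cors_a, cors_b), lower, upper


# Invented example profiles (four traits) with invented standard errors.
profile_a = [0.10, 0.30, 0.50, 0.20]
profile_b = [0.20, 0.25, 0.60, 0.10]
print(resampled_profile_correlation(profile_a, [0.05] * 4, profile_b, [0.05] * 4))
```

The width of the interval then reflects the SEs of the inputs rather than only the number of traits in the profile.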

Uku Vainik (ukuvainik@gmail.com)
2021-06-10 08:56:12

*Thread Reply:* apologies, i'm not fully following. would you have a moment for a zoom anytime tomorrow or next week, before this seminar?

Michel Nivard (m.g.nivard@vu.nl)
2021-06-10 08:58:45

*Thread Reply:* sure, shoot me an email because this is too far adrift of the workshop material I think (but perhaps I misunderstand): m.g.nivard@vu.nl

Uku Vainik (ukuvainik@gmail.com)
2021-06-10 08:59:36

*Thread Reply:* it is a bit different, but definitely inspired by the genetic correlation approach!

Michel Nivard (m.g.nivard@vu.nl)
2021-06-11 03:01:57

Video instruction on how to copy the files and get set for the practical

Michel Nivard (m.g.nivard@vu.nl)
2021-06-13 23:12:08

Hi everyone, this is the practical worksheet in PDF format. Copy the files using: "cp -r /faculty/andrew/GenomicSEM_practical/ ./" or alternatively download the files from box: https://utexas.app.box.com/s/sounavy84gwygj0j2askcyaoo2ostbgu It's a light set of files so it shouldn't crash the system if you all copy at the same time on Monday, but feel free to download earlier.

Uku Vainik (ukuvainik@gmail.com)
2021-06-14 04:44:48

*Thread Reply:* I think it copies everything necessary. But there is a warning/error associated with it: cp: cannot access '/faculty/andrew/GenomicSEM_practical/.Rproj.user/233136F9/viewer-cache': Permission denied

Michel Nivard (m.g.nivard@vu.nl)
2021-06-14 04:52:49

*Thread Reply:* yeah you can safely ignore that warning!

Uku Vainik (ukuvainik@gmail.com)
2021-06-14 04:46:07

Great stuff! Is there any repository of munged gwas summary scores so they could be more easily plugged to followup analyses? Or could genomic SEM be linked up with gwas.mrcieu.ac.uk for API-like access, for instance?

Michel Nivard (m.g.nivard@vu.nl)
2021-06-14 04:58:53

*Thread Reply:* yeah totally on our wishlist and have spoken to Gibran about it, will have to make sure we code it properly because munged sumstats are heavier than what they usually pull from that server (i.e. a couple of hundred SNPs for MR). Though I feel in many cases it may make sense to run your own munge because you may have specific SNPs you want to keep in or study-specific reference LD for example.

Anna Furtjes (anna.furtjes@kcl.ac.uk)
2021-06-14 04:55:32

Hi there 🙂 Thanks for the very clear and interesting videos. I was wondering if you could elaborate on how the off-diagonal in the V matrix is computed and what it exactly represents .. I don't think I can quite grasp it yet. Thanks!

Michel Nivard (m.g.nivard@vu.nl)
2021-06-14 05:02:41

*Thread Reply:* because a lot of GWAS have sample overlap, the estimates in S are interdependent. To model this we jackknife the estimates in S, which basically means we estimate S 200 times, each time omitting one part of the genome in a bootstrap-like procedure. This gives us a way to estimate the variance of each element in S (which is closely related to the variance over the 200 S's) but also the covariance between the elements in S.
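
A toy Python sketch of how a delete-one-block jackknife yields both variances and covariances. This is the textbook formula, not a claim about the exact computation inside LDSC or Genomic SEM; `jackknife_covariance` and the four-block input are illustrative assumptions (the real procedure described above uses 200 blocks).

```python
def jackknife_covariance(delete_one_estimates):
    """Textbook delete-one jackknife covariance.

    delete_one_estimates: one vector of estimates per left-out block,
    e.g. the unique elements of S re-estimated without that block.
    """
    n_blocks = len(delete_one_estimates)
    n_elements = len(delete_one_estimates[0])
    means = [
        sum(est[j] for est in delete_one_estimates) / n_blocks
        for j in range(n_elements)
    ]
    scale = (n_blocks - 1) / n_blocks
    cov = [[0.0] * n_elements for _ in range(n_elements)]
    for i in range(n_elements):
        for j in range(n_elements):
            cov[i][j] = scale * sum(
                (est[i] - means[i]) * (est[j] - means[j])
                for est in delete_one_estimates
            )
    return cov


# Invented input: 4 blocks, 2 elements; the second element is twice the first.
toy = [[3.0, 6.0], [8 / 3, 16 / 3], [7 / 3, 14 / 3], [2.0, 4.0]]
V = jackknife_covariance(toy)
print(V)  # diagonal: sampling variances; off-diagonal: sampling covariances
```

With k elements in S the result is a k-by-k matrix, so every pair of elements in S gets its own off-diagonal entry.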

Anna Furtjes (anna.furtjes@kcl.ac.uk)
2021-06-14 06:06:07

*Thread Reply:* Okay, thanks! Does that mean we have one off-diagonal element per possible combination of elements in S?

Michel Nivard (m.g.nivard@vu.nl)
2021-06-14 06:07:08

*Thread Reply:* YES

👍 Anna Furtjes
🔥 Anna Furtjes
Michel Nivard (m.g.nivard@vu.nl)
2021-06-14 06:07:50

*Thread Reply:* caps lock, I wasn't shouting

😄 Anna Furtjes
Lucía de Hoyos (Lucia.DeHoyos@mpi.nl)
2021-06-14 05:35:27

Hi, I have a question about today's videos. In which situation do you use the ML estimator and in which situation do you use DWLS and why? Thanks. 🙂

Michel Nivard (m.g.nivard@vu.nl)
2021-06-14 05:45:11

*Thread Reply:* so we (almost exclusively) use the DWLS estimator after having evaluated both in the initial genomicSEM paper (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6520146/). It should not matter too much if all your GWAS are similarly powered, but I realise that's not always the case. If your GWASes do vary in sample size it might be worth it to try both and see whether your conclusions remain unchanged regardless.

PubMed Central (PMC)
👍 Abigail ter Kuile, Cato, Lucía de Hoyos
Lucía de Hoyos (Lucia.DeHoyos@mpi.nl)
2021-06-14 05:58:00

*Thread Reply:* Thanks! I am still wondering: so, if your GWAS is not similarly powered you better use DWLS?

Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-14 06:02:08

*Thread Reply:* i would personally still go with DWLS, but it's ultimately up to you in terms of what you think makes the most sense. the reason the two will differ is that ML estimation does not specifically prioritize recapturing the parts of the genetic covariance matrix that are estimated with greater precision (i.e. smaller standard errors), which will reflect those GWAS that are better powered. DWLS on the other hand prioritizes producing model estimates that match the genetic covariance estimates with the smallest standard errors. that does not mean that the model estimated in Genomic SEM using DWLS will just be dominated by the better powered GWAS of a certain trait; if that particular trait is relatively less genetically correlated with the other traits in your model then DWLS will prioritize producing smaller estimates for that trait. so it comes down to, do you think the model should use the information available and prioritize those better powered GWAS (in which case go with DWLS) or do you want it to treat each part of the genetic covariance matrix equally (in which case go with ML).

👍 Michel Nivard, Lucía de Hoyos
Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-14 06:06:46

*Thread Reply:* but for us at this point we just go with DWLS (even if the GWAS are not similarly powered) because our stance is the model should produce estimates that reflect that differential power and not ignore it
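
To make the weighting idea concrete, here is a deliberately tiny Python illustration. It fits a single common value to two estimates, once weighting each squared discrepancy by 1/SE^2 (the diagonal-weighting idea behind DWLS) and once with equal weights. The equal-weight version is only a stand-in for "treat every element equally"; it is not ML estimation, and the numbers and the function name are invented.

```python
def common_value(estimates, ses, weight_by_precision=True):
    """Value minimising sum_k w_k * (estimate_k - value)^2.

    With weight_by_precision=True, w_k = 1 / se_k^2, so precise estimates
    dominate; otherwise every estimate gets the same weight.
    """
    weights = [1.0 / se ** 2 if weight_by_precision else 1.0 for se in ses]
    return sum(w * e for w, e in zip(weights, estimates)) / sum(weights)


# Invented example: a well-powered estimate (small SE) and a noisy one.
estimates = [0.60, 0.20]
ses = [0.02, 0.10]
print(common_value(estimates, ses, weight_by_precision=True))
print(common_value(estimates, ses, weight_by_precision=False))
```

The precision-weighted answer sits close to the well-powered estimate, while the equal-weight answer lands halfway between the two.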

Lucía de Hoyos (Lucia.DeHoyos@mpi.nl)
2021-06-14 06:25:32

*Thread Reply:* okay, got it, thank you. ☺

Laura (lhaver01@mail.bbk.ac.uk)
2021-06-14 06:09:47

@Andrew Grotzinger Did I understand correctly that the Qsnp estimates don’t reveal details about the specific effects of the SNPs on the indicators, and that they are just testing the common factor vs independent pathways models against each other? If so, would it make sense to specify the independent pathways model (where there are significant Qsnp estimates) to get the path specific estimates? (sorry if you covered something like that and I missed it)

Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-14 06:22:23

*Thread Reply:* it's a great question! since QSNP captures deviations from the factor model, it can be informative to do something like plot the unstandardized factor loadings from the common factor model that does not include the individual SNPs against the SNP-phenotype betas from the GWAS sumstats for the particular SNP that was significant for Q (i.e., not going through the work of pulling the independent pathways estimates from a Genomic SEM model, though they should be similar). within the scatterplot, if you then draw a line of best fit through the estimates a QSNP significant SNP will typically show one or two univariate GWAS estimates that strongly deviate from the line indicating that this trait is driving the QSNP effect. For example, in the attached scatter plot from the g-factor paper from de la Fuente et al. (2020) they see that reaction time (RT) strongly deviates from the line; this reflects a SNP that has a particularly strong association with RT, but RT does not load strongly on the g-factor, so it does not fit the model and is identified as significant for QSNP. if you have a lot of QSNP effects, what I have done is create a table that lists the univariate GWAS betas and Z-statistics for those SNPs, in which case there tends to be some effects that stand out as the outlier in the bunch that are clearly driving the QSNP effect. I've then created a scatterplot for just the top 4 or 5 most significant QSNP effects.
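
A rough Python sketch of the "which trait deviates from the line" check. One assumption to flag loudly: the line is forced through the origin here, because under a common factor model the SNP effect on each trait is proportional to its loading; an ordinary line of best fit with an intercept is an equally reasonable reading of the message. The loadings and betas are invented, not taken from the g-factor paper.

```python
def largest_deviation(traits, loadings, betas):
    """Fit beta = slope * loading (no intercept) and report the biggest residual."""
    slope = sum(x * y for x, y in zip(loadings, betas)) / sum(x * x for x in loadings)
    residuals = {t: y - slope * x for t, x, y in zip(traits, loadings, betas)}
    outlier = max(residuals, key=lambda t: abs(residuals[t]))
    return outlier, residuals


# Invented numbers: RT loads weakly on the factor but has a large SNP beta.
traits = ["T1", "T2", "T3", "T4", "RT"]
loadings = [0.8, 0.7, 0.6, 0.5, 0.2]
betas = [0.016, 0.014, 0.012, 0.010, 0.030]
outlier, residuals = largest_deviation(traits, loadings, betas)
print(outlier, residuals)
```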

Laura (lhaver01@mail.bbk.ac.uk)
2021-06-14 06:31:51

*Thread Reply:* Cool! Thank you!

Uku Vainik (ukuvainik@gmail.com)
2021-06-14 07:31:07

Is there a good trade-off between reducing the SNP set and computation time while getting sensible results? When playing around with real data, I imagine that trying stuff out will take a lot of time due to processing time. Ideally, I would use a reduced datset to get my syntax right and then run the model with full data, once I am happy with it.

Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-14 07:35:15

for munge, ldsc, and the usermodel and commonfactor functions i would use the full set of SNPs even when in the troubleshooting phase. collectively these will typically only take 20-30 minutes for 5-10 traits. when troubleshooting multivariate GWAS (the userGWAS or commonfactorGWAS functions) I would run sumstats using the full set of SNPs, and then subset out maybe 1,000 SNPs to run before setting up a full run

👍 Uku Vainik
Michel Nivard (m.g.nivard@vu.nl)
2021-06-14 07:52:19

@channel paste your model(s) and fit during the practical as a reply to this!

Cato (c.romero@student.vu.nl)
2021-06-14 07:55:38

*Thread Reply:* F1=~NA*SCZ+BIP+MDD+INSOM F2=~NA*EA+MDD+INSOM F1~~1*F1 F2~~1*F2 AIC CFI SRMR 75.30192 0.9481178 0.05215512

Michel Nivard (m.g.nivard@vu.nl)
2021-06-14 07:57:20

*Thread Reply:* Interesting, now try and beat your own model!

Jacob Kunkel (kunke104@umn.edu)
2021-06-14 08:06:40

*Thread Reply:* #Group3 MY.model<-"F1=~NA*SCZ+BIP+MDD F1~~1*F1 INSOM~F1 EA~INSOM EA~SCZ EA~MDD EA~BIP " YourModel$modelfit chisq df p_chisq AIC CFI SRMR df 113.47442 2 2.2873985e-25 139.47442 0.87773131 0.1127415

Giulio Centorame (giulio.centorame@outlook.it)
2021-06-14 08:11:56

*Thread Reply:* F1=~NA*MDD+BIP F2=~NA*BIP+SCZ F1~~1*F1 F2~~1*F2 INSOM~F1 EA~F2 YourModel$modelfit chisq df p_chisq AIC CFI SRMR df 98.9405 2 3.275996e-22 124.9405 0.8936726 0.07281335

Uku Vainik (ukuvainik@gmail.com)
2021-06-14 07:54:52

Group2: mdd.model<-"F1=~NA*SCZ+BIP+MDD F1~~1*F1

INSOM~F1

INSOM~MDD EA~INSOM"

chisq df p_chisq AIC CFI SRMR df 111.3445 5 2.129702e-22 131.3445 0.883358 0.08609289

Michel Nivard (m.g.nivard@vu.nl)
2021-06-14 07:56:12

*Thread Reply:* That's a great start, how about a second model that's only regressions and no latent variables?

Uku Vainik (ukuvainik@gmail.com)
2021-06-14 07:56:41

*Thread Reply:* ok. can we see modification indices to explore further?

Michel Nivard (m.g.nivard@vu.nl)
2021-06-14 07:58:18

*Thread Reply:* no modindices yet...

Uku Vainik (ukuvainik@gmail.com)
2021-06-14 07:58:39

*Thread Reply:* mdd.path.model<-" INSOM~MDD EA~INSOM"

Uku Vainik (ukuvainik@gmail.com)
2021-06-14 07:58:48

*Thread Reply:* chisq df p_chisq AIC CFI SRMR df 2.427048 1 0.1192573 12.42705 0.9962303 0.01897433

Amelia Edmondson-Stait (amelia.edmondson-stait@ed.ac.uk)
2021-06-14 08:07:39

*Thread Reply:* MY.model2 <-" INSOM~MDD EA~INSOM+BIP+SCZ" chisq df p_chisq AIC CFI SRMR df 26.54096 3 7.3472907e-06 50.54096 0.97417953 0.049227234

Uku Vainik (ukuvainik@gmail.com)
2021-06-14 08:11:15

*Thread Reply:* mediation model! no model fit, though

model <- ' # direct effect
         EA ~ c*MDD
         # mediator
         INSOM ~ a*MDD
         EA ~ b*INSOM
         # indirect effect (a*b)
         ab := a*b
         # total effect
         total := c + (a*b)

         perc:=ab/total
     '
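
The defined parameters in a mediation model like this are simple functions of the three path coefficients. A small Python illustration with made-up values for `a`, `b` and `c` (these are not estimates from the practical data, and `mediation_summary` is a hypothetical helper):

```python
def mediation_summary(a, b, c):
    """Indirect effect, total effect and proportion mediated.

    a: MDD -> INSOM, b: INSOM -> EA, c: direct MDD -> EA.
    """
    indirect = a * b
    total = c + indirect
    return {"ab": indirect, "total": total, "perc": indirect / total}


# Invented path coefficients, purely for illustration.
print(mediation_summary(a=0.5, b=-0.2, c=-0.3))
```

Note that `perc` is only easy to interpret when the direct and indirect effects have the same sign, and it is undefined when the total effect is zero.
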
Madhur Singh (he/him) (drmadhurbain@gmail.com)
2021-06-14 08:01:05

Group 9: MY.model<-"F1 =~ NA*SCZ + BIP + MDD F2 =~ NA*MDD + INSOM F1 ~~ 1*F1 F2 ~~ 1*F2 EA ~ F1 + F2"

chisq df p_chisq AIC CFI SRMR df 49.30197 2 1.96885e-11 75.30197 0.9481177 0.05215518
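
The fit tables posted in this channel appear consistent with AIC = chisq + 2 x (number of free parameters), where the number of free parameters equals the count of unique elements in the genetic covariance matrix minus the model df. The Python sketch below (hypothetical helper names) checks that against two of the posted models; it is an observation about these numbers, not a statement of how the package computes AIC internally.

```python
def free_parameters(n_traits, df):
    """Unique covariance elements, p(p+1)/2, minus the model degrees of freedom."""
    return n_traits * (n_traits + 1) // 2 - df


def aic_from_chisq(chisq, n_traits, df):
    """AIC implied by chisq plus twice the number of free parameters."""
    return chisq + 2 * free_parameters(n_traits, df)


# Five-trait model posted by Group 9 (chisq 49.30197, df 2, AIC 75.30197).
print(aic_from_chisq(49.30197, n_traits=5, df=2))
# Three-trait path model posted earlier (chisq 2.427048, df 1, AIC 12.42705).
print(aic_from_chisq(2.427048, n_traits=3, df=1))
```

This also explains why a model with a lower chisq can still have a higher AIC: each extra free parameter costs 2.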

Barbara Molz (Barbara.Molz@mpi.nl)
2021-06-14 08:01:52

Group5

Barbara Molz (Barbara.Molz@mpi.nl)
2021-06-14 08:02:02

"F1=~NA*SCZ+BIP+MDD F1~~1*F1 INSOM~F1 EA~INSOM BIP~~MDD SCZ~~BIP"

Barbara Molz (Barbara.Molz@mpi.nl)
2021-06-14 08:02:26

chisq df p_chisq AIC CFI SRMR df 59.93606 3 6.066072e-13 83.93606 0.9375507 0.06284154

Caitlin Decina (cd629@exeter.ac.uk)
2021-06-14 08:03:32

Group 14 ```> MY.model<-"F1=~NA*INSOM+BIP+MDD
+ F1~~1*F1
+ SCZ~F1
+ EA~SCZ"
> YourModel$modelfit
   chisq df p_chisq AIC CFI SRMR
df 505.2232 5 5.9537e-107 525.2232 0.4513393 0.1615591```
Jessica Salvatore (jesalvatore@vcu.edu)
2021-06-14 08:05:56

```> MY.model<-"F1=~NA*SCZ+BIP+EA
+ F2=~NA*MDD+INSOM+EA
+ F1~~1*F1
+ F2~~1*F2"
> YourModel$modelfit
   chisq df p_chisq AIC CFI SRMR
df 111.4907 3 5.242001e-24 135.4907 0.8810039 0.09761694```
Nathan Bell (n.y.bell@student.vu.nl)
2021-06-14 08:08:55

MY.model<-"F1=~NA*SCZ+BIP F1~~1*F1 F2=~NA*MDD+INSOM F2~~1*F2 EA~F1+MDD+INSOM" chisq df p_chisq AIC CFI SRMR df 49.301941 2 1.9688815e-11 75.301941 0.94811773 0.052155135

Isabella Loft (ilof@regionsjaelland.dk)
2021-06-14 08:09:59

Group 6 MY.model2 <- "F1=~NA*SCZ+BIP+MDD F1~~1*F1 EA~F1 INSOM~EA INSOM~MDD"

Isabella Loft (ilof@regionsjaelland.dk)
2021-06-14 08:10:33

> YourModel$modelfit chisq df p_chisq AIC CFI SRMR df 132.85012 4 9.5671851e-28 154.85012 0.8586731 0.084212699

Sage Hawn (shawn1@bu.edu)
2021-06-14 08:11:30

OUR.model3 <- "F1=~NA*SCZ+BIP F2=~NA*MDD+INSOM F1~~1*F1 F2~~1*F2 EA~F1+F2 F1~~F2"

Sarah Brislin (she/her) (sarah.brislin@gmail.com)
2021-06-14 08:11:31

Group 4: ```MY.model2<-"F1=~NA*SCZ+BIP+MDD
+ F1~~1*F1
+ SCZ ~~ BIP
+ BIP ~~ MDD
+ INSOM~F1
+ EA~INSOM"
> YourModel$modelfit
   chisq df p_chisq AIC CFI SRMR
df 59.93606 3 6.066056e-13 83.93606 0.9375507 0.06284153```
Sage Hawn (shawn1@bu.edu)
2021-06-14 08:11:39

chisq df p_chisq AIC CFI SRMR df 111.49075 3 5.241987e-24 135.49075 0.88100391 0.097616936

Peter Barr (pbarr2@vcu.edu)
2021-06-14 08:11:46

Group 7: ```MY.model<-"F1=~NA*SCZ+BIP+MDD F2=~NA*MDD+INSOM+EA F1~~F2 F1~~1*F1 F2~~1*F2 "

   chisq df     p_chisq      AIC       CFI       SRMR
df 61.76184 3 2.47033e-13 85.76184 0.9355481 0.06025005```

Laura (lhaver01@mail.bbk.ac.uk)
2021-06-14 08:33:42

Room 14: ```> table(pfactor_GWAS$warning)

                                                   0 
                                                  88

lavaan WARNING: some estimated ov variances are negative 4 ``` is this correct? what are these warnings related to?

Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-14 09:26:17

*Thread Reply:* hi Laura, that is correct. It means that the model is estimating the residual variance (the variance left after accounting for the variance explained by the common factor) of one of the variables as negative. This is what the userGWAS code in the next section is designed to troubleshoot by putting model constraints on each of the residuals to be above .001

Laura (lhaver01@mail.bbk.ac.uk)
2021-06-14 11:01:31

*Thread Reply:* Thank you @Andrew Grotzinger.. so does it mean that the ‘4’ in the output is referring to 4 individual SNPs contributing to the negative residual variance for one of the variables? (or have I got that completely wrong?!)

Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-14 11:02:33

*Thread Reply:* the GWAS functions are running as many separate models as there are SNPs, so that means that 4 SNPs from 4 different models produced the same warning (may be what you were already saying)

Laura (lhaver01@mail.bbk.ac.uk)
2021-06-14 11:03:36

*Thread Reply:* yes, ok, great. thanks so much!

Sage Hawn (shawn1@bu.edu)
2021-06-14 09:00:32

Perhaps a dumb question but could you please clarify the conceptual difference between a PRS and the partitioned heritability via LDSC that goes into genomic SEM? Essentially what is the difference between genetic liability and genetic risk?

Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-14 09:57:20

*Thread Reply:* not dumb at all! just a point of clarification first, but partitioned heritability refers to when you split up the heritability across classes of genes (e.g., genes expressed in the brain) whereas LDSC from the practical today is estimating the heritability explained by the SNPs. PRS and heritability are related to the extent that PRS are going to be better powered for traits with higher heritability estimates. They diverge in that heritability is a population level estimate, while PRS is designed to predict individual risk, and as such the PRS is going to be really bad at predicting risk for certain individuals. Take as an example an individual genetic variant that is super rare in the population such that it does not have a large effect on heritability at a population level, but for those individuals that carry the risk conferring variant it has a huge effect on the outcome. In this case the PRS is going to miss that piece of the puzzle entirely. The two papers below do a nice job of talking about some of these considerations: Harden, K. P., & Koellinger, P. D. (2020). Using genetics for social science. Nature Human Behaviour, 4(6), 567-576.

Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-14 09:57:28

*Thread Reply:* Torkamani, A., Wineinger, N. E., & Topol, E. J. (2018). The personal and clinical utility of polygenic risk scores. Nature Reviews Genetics, 19(9), 581-590.

Sage Hawn (shawn1@bu.edu)
2021-06-15 17:01:00

*Thread Reply:* Hi there @Andrew Grotzinger! Thanks so much for taking the time to provide such a helpful response. Yes, I think I was hearing Michel talk about partitioned heritability when I was writing out my question - my mistake!

Barbara Molz (Barbara.Molz@mpi.nl)
2021-06-14 09:46:50

I just wanted to know what's best practice when you have GWAS sumstats of relatively small sample sizes (10-30k), with significant heritability, but rather large SE, as the model fit indices don’t give SE’s or 95% CIs. Are there any aspects of the model that you can use to determine whether the model is reliable, besides the model fit indices?

Michel Nivard (m.g.nivard@vu.nl)
2021-06-14 09:53:30

*Thread Reply:* we could think about implementing bootstrapped confidence intervals on the fit statistics. before any type of fit comes into play though, if you have a small sample some of the first considerations are: 1. What does the model teach you about the relation between traits that you cannot learn without the model and 2. are you convinced of all the causal assumptions and directions implied in your model? if you can't articulate an answer to 1 and your answer to 2 is "no" then should you proceed? there is the quote from George Box that goes: "All models are wrong, some models are useful".

Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-14 10:00:16

*Thread Reply:* you can also try doing what Michel suggested during the practical today and splitting by odd and even chromosomes and re-running the model. if the power of the GWAS is low enough that the model completely shifts in terms of the theoretical conclusions drawn then you might be more tentative about your results. of course, this assumes that LDSC still produces estimates within the realm of possibility when you split a small sample size GWAS across odd/even (e.g., no negative heritability estimates)
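
A minimal Python sketch of the odd/even chromosome split, assuming the summary statistics are available as rows with a numeric chromosome field. The `CHR` and `SNP` keys and the toy rows are assumptions for illustration; real files may name or code these columns differently.

```python
def split_odd_even(rows, chr_key="CHR"):
    """Split summary-statistic rows into odd- and even-chromosome subsets."""
    odd = [row for row in rows if int(row[chr_key]) % 2 == 1]
    even = [row for row in rows if int(row[chr_key]) % 2 == 0]
    return odd, even


# Invented toy rows; only the chromosome number matters for the split.
sumstats = [
    {"SNP": "snp_a", "CHR": 1},
    {"SNP": "snp_b", "CHR": 2},
    {"SNP": "snp_c", "CHR": 7},
    {"SNP": "snp_d", "CHR": 22},
]
odd_rows, even_rows = split_odd_even(sumstats)
print(len(odd_rows), len(even_rows))
```

Each subset would then go through the usual munge, LDSC and model-fitting steps separately, and the two sets of model estimates can be compared.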

Barbara Molz (Barbara.Molz@mpi.nl)
2021-06-16 09:31:47

*Thread Reply:* Many thanks for the answers, makes a lot of sense and we'll give the splitting across odd /even a try. One thing we also wondered is how to compare between different exploratory models and further determine what factor loadings to include then in my confirmatory model ?

Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-16 11:31:20

*Thread Reply:* in most cases when you run an exploratory model you will use some sort of cut-off to determine the number of factors to retain and an additional cut-off based on the loadings to determine what items to include on which factors. These decision points can frankly be somewhat arbitrary, but I'll offer a few options that I've seen. As far as how many factors to retain, one approach is to stop including additional factors at the point at which the additional factors are only explaining so much additional variance across the items. There are lots of formal tests out there for this (Kaiser rule, optimal coordinates, acceleration factor, scree test) that use some version of this principle to land on the number of factors, or you might say that at the point where my factors are explaining, just as an example, < 20% of the variance across items then I'll stop adding additional factors. As far as factor loading cut-offs, I've seen standardized loading cut-offs as low as .2 and as high as .7, but have personally used around .35 in the work I've done.
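
Two of the decision rules mentioned above are easy to write down. The Python sketch below applies the Kaiser rule (retain factors with eigenvalue above 1) and a standardized-loading cut-off of .35 to invented eigenvalues and loadings; the helper names and all numbers are illustrative assumptions, not output from any real exploratory model.

```python
def kaiser_n_factors(eigenvalues):
    """Kaiser rule: number of eigenvalues greater than 1."""
    return sum(1 for value in eigenvalues if value > 1.0)


def items_per_factor(loadings, cutoff=0.35):
    """Assign each item to every factor where |standardized loading| >= cutoff."""
    factors = {}
    for item, row in loadings.items():
        for factor, loading in row.items():
            if abs(loading) >= cutoff:
                factors.setdefault(factor, []).append(item)
    return factors


# Invented eigenvalues of a correlation matrix and invented EFA loadings.
eigenvalues = [2.6, 1.2, 0.7, 0.3, 0.2]
loadings = {
    "item_A": {"F1": 0.80, "F2": 0.10},
    "item_B": {"F1": 0.75, "F2": 0.15},
    "item_C": {"F1": 0.40, "F2": 0.60},
    "item_D": {"F1": 0.05, "F2": 0.70},
}
print(kaiser_n_factors(eigenvalues))
print(items_per_factor(loadings))
```

An item that clears the cut-off on two factors (item_C here) would be specified as a cross-loading in the confirmatory model.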

Barbara Molz (Barbara.Molz@mpi.nl)
2021-06-17 02:45:20

*Thread Reply:* Thanks so much for the great suggestions, will give this a go!

Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-14 12:48:25

just putting at the front of the messages here for the Session B crew that the files for the Genomic SEM tutorial can be copied using: "cp -r /faculty/andrew/GenomicSEM_practical/ ./" the files can also be downloaded from box: https://utexas.app.box.com/s/sounavy84gwygj0j2askcyaoo2ostbgu. the pdf for the tutorial is also attached. see everyone in a few hours!

Alejandra Medina-Rivera (amedina@liigh.unam.mx)
2021-06-14 16:56:45

Model for Room3

Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-14 16:59:27

*Thread Reply:* great! also post your model fit if it converged

Alejandra Medina-Rivera (amedina@liigh.unam.mx)
2021-06-14 17:18:58

*Thread Reply:* We updated the model as it did not converge

Alejandra Medina-Rivera (amedina@liigh.unam.mx)
2021-06-14 17:19:14

*Thread Reply:* YourModel2$modelfit chisq df p_chisq AIC CFI SRMR df 0.4441685 1 0.50511733 18.444168 1 0.0093329907

Ravi Bhatt (ravibot93@gmail.com)
2021-06-14 16:59:15

Group 10 ```MY.model<-"F1=~NA*SCZ+BIP+MDD F2=~NA_2SCZ+BIP+MDD F2~~1*F2 F1~~1*F1 INSOM~F2 EA~F1"

> YourModel$modelfit chisq df p_chisq AIC CFI SRMR df 17.21888 1 3.331094e-05 45.21888 0.9822106 0.03500922```

Jared Balbona (jaba5258@colorado.edu)
2021-06-14 17:05:33

Group 1:

Jared Balbona (jaba5258@colorado.edu)
2021-06-14 17:05:44

MY.model2<-"F1=~NA*SCZ + a*SCZ + a*BIP
F2 =~ NA*BIP + MDD + INSOM
EA ~ F1 + F2
F1~~1*F1
F2~~1*F2
F1~~F2"

Jared Balbona (jaba5258@colorado.edu)
2021-06-14 17:06:39

*Thread Reply:* > YourModel2$modelfit: chisq = 108.97976, df = 3, p_chisq = 1.8192433e-23, AIC = 132.97976, CFI = 0.88375804, SRMR = 0.095027369

Zoe Schmilovich (zoe.schmilovich@mail.mcgill.ca)
2021-06-14 17:06:01

Group 11: MY.model <-"F1=~NA*BIP+SCZ+INSOM
F1~~1*F1
EA~F1
INSOM~MDD"
Model fit: chisq = 404.69035, df = 5, p_chisq = 2.8928548e-85, AIC = 424.69035, CFI = 0.56160695, SRMR = 0.15616543

Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-14 17:06:20

also share your model fit if you got the model to run

Geng Wang (geng.wang@uq.edu.au)
2021-06-14 17:15:40

Group 9 MY.model<-"F1=~NA*SCZ+BIP+MDD+INSOM
F1~~1*F1
EA~F1
SCZ~~BIP"

Geng Wang (geng.wang@uq.edu.au)
2021-06-14 17:16:22

chisq = 155.6856, df = 4, p_chisq = 1.23045e-32, AIC = 177.6856, CFI = 0.8336264, SRMR = 0.1289839

Emily Daubney (emily.daubney@hotmail.com)
2021-06-14 17:19:53

Group 4

Emily Daubney (emily.daubney@hotmail.com)
2021-06-14 17:20:11

M3.model <- "SCZ~INSOM
BIP~INSOM
MDD~INSOM
INSOM~EA
SCZ~~BIP
SCZ~~MDD
MDD~~BIP"

Emily Daubney (emily.daubney@hotmail.com)
2021-06-14 17:20:22

> YourModel3$modelfit: chisq = 59.936037, df = 3, p_chisq = 6.0661301e-13, AIC = 83.936037, CFI = 0.93755075, SRMR = 0.062841542

Kiana Jodeiry (kiana.jodeiry@emory.edu)
2021-06-14 17:21:47

mod1 <- "F1=~ NA*SCZ+BIP
F1~~1*F1
F1~~MDD
INSOM ~ MDD + F1
EA ~ INSOM"

Stephanie Zellers (she/her/hers) (zelle063@umn.edu)
2021-06-14 17:21:51

Group 8 MY.model2<- "F1=~NA*SCZ+BIP+EA
F2=~NA*BIP+MDD+INSOM
F1~~1*F1
F2~~1*F2
F1~~0*F2"
> YourModel2$modelfit: chisq = 425.08459, df = 4, p_chisq = 1.0556926e-90, AIC = 447.08459, CFI = 0.53814106, SRMR = 0.13811987

Katerina Zorina-Lichtenwalter (kazo7929@colorado.edu)
2021-06-14 17:21:53

our_model5 <- 'F1 =~ SCZ + BIP + EA
F2 =~ MDD + INSOM + EA
F1 ~~ 0*F2'

Kiana Jodeiry (kiana.jodeiry@emory.edu)
2021-06-14 17:22:06

The model fit is: Model1$modelfit: chisq = 66.58176, df = 4, p_chisq = 1.1943541e-13, AIC = 88.58176, CFI = 0.93135834, SRMR = 0.061508945

👀 AleRuiz
Penelope Lind (penelope.lind@qimrberghofer.edu.au)
2021-06-14 18:26:31

I was wanting to work out how to create the anthro.model from your tutorial PPT (slide 57 with the Overweight and Early Life factors). Could you please let me know what I have coded incorrectly, as the following model won't converge?
covstruc<-anthro
anthro.model<-"Overweight=~NA*BMI+WHR+CO+Waist+Hip
EarlyLife=~NA*Hip+Height+IHC+BL+BW
Overweight~~1*Overweight
EarlyLife~~1*EarlyLife
Overweight~~EarlyLife
BL~~1*BL
BW~~1*BW
IHC~~1*IHC
Height~~1*Height
Hip~~1*Hip
Waist~~1*Waist
CO~~1*CO
WHR~~1*WHR"

Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-14 20:01:31

*Thread Reply:* you'll want to take out the pieces where you fix the residual variances of the indicators to 1. I can see why you added that in based on the path diagram; path diagrams typically depict the residual variance as a latent factor because it is not something that is actually observed in the data, so it's common to show it as a latent with a factor loading fixed to 1, with the variance of that latent depicting the residual. so to run that model you would write:
"Overweight=~NA*BMI+WHR+CO+Waist+Hip
EarlyLife=~NA*Hip+Height+IHC+BL+BW
Overweight~~1*Overweight
EarlyLife~~1*EarlyLife
Overweight~~EarlyLife"

Penelope Lind (penelope.lind@qimrberghofer.edu.au)
2021-06-14 20:41:36

*Thread Reply:* Thanks so much Andrew! This corrected model aligns to the path diagram. Could you also please let me know the consequence of this warning/note from running the model? "The S matrix was smoothed prior to model estimation due to a non-positive definite matrix. The largest absolute difference in a cell between the smoothed and non-smoothed matrix was 0.000128450830733567 As a result of the smoothing, the largest Z-statistic change for the genetic covariances was 0.0157580978138476 . We recommend setting the smooth_check argument to true if you are going to run a multivariate GWAS."

Penelope Lind (penelope.lind@qimrberghofer.edu.au)
2021-06-14 20:41:54

*Thread Reply:* I apologise if this has been covered in the online videos

Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-15 08:49:02

*Thread Reply:* no it wasn't covered so I'm glad you asked! That warning reflects the fact that the genetic covariance (S) matrix is non-positive definite for this set of traits, so it was smoothed to positive definite prior to running the model. In this case, the largest difference in the genetic covariance matrix pre and post-smoothing is very small (.0001) so it's not something to be overly concerned about. We write on the github towards the bottom of one of the wiki pages (https://github.com/GenomicSEM/GenomicSEM/wiki/3.-Models-without-Individual-SNP-effects) that if the difference is > .025 this would be an interpretive concern and since smoothing at that level is going to oftentimes be caused by less well-powered GWAS being included in the model you might consider excluding those to see how estimates change.
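
As a rough illustration of the check described above, the Python sketch below tests whether a matrix is positive definite (by attempting a Cholesky factorization) and compares the largest cell-wise change after smoothing against the .025 guideline. The two matrices are hypothetical, the "smoothed" one is hand-picked rather than produced by the package's smoothing routine, and the function names are made up for this example.

```python
import math

def is_positive_definite(m):
    """Attempt a Cholesky factorization; it only succeeds for positive definite matrices."""
    n = len(m)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                d = m[i][i] - s
                if d <= 0:
                    return False
                chol[i][j] = math.sqrt(d)
            else:
                chol[i][j] = (m[i][j] - s) / chol[j][j]
    return True

def max_abs_diff(a, b):
    """Largest absolute cell-wise difference between two equally sized matrices."""
    return max(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))

def smoothing_is_concern(diff, threshold=0.025):
    """Guideline quoted above: a difference > .025 would be an interpretive concern."""
    return diff > threshold

# Hypothetical genetic covariance matrix that is not positive definite,
# and a hand-picked (not package-produced) smoothed version of it.
raw = [[0.10, 0.11], [0.11, 0.10]]
smoothed = [[0.105, 0.1049], [0.1049, 0.105]]

diff = max_abs_diff(raw, smoothed)
print(is_positive_definite(raw), is_positive_definite(smoothed))
print(smoothing_is_concern(diff))

# The difference reported in the warning quoted in this thread.
print(smoothing_is_concern(0.000128450830733567))
```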

Penelope Lind (penelope.lind@qimrberghofer.edu.au)
2021-06-15 19:55:53

*Thread Reply:* Thank you!

Giacomo Bignardi (giacomo.bignardi@maxplanckschools.de)
2021-06-15 05:28:33

Hi, thanks a lot for all the time you put into answering my question yesterday.

I still have some questions. I was thinking about model identification. I am still not 100 % clear about what we can infer from a saturated model (if I understand correctly, the GWAS by subtraction in @Perline Demange et al. is an example of a saturated model. 3x3 varcovar matrix, 6 observed 6 estimated).

Is it the case that in a saturated model, being a perfect match of the data (?), all the fit indices become meaningless? And if so, is the interpretability of the results only conditional on the assumption that the model’s specification is (very) good?

In sum, I would like to understand how to evaluate results obtained from a saturated model (e.g., should I look at the path estimates and their se?).

Thanks!

Michel Nivard (m.g.nivard@vu.nl)
2021-06-15 06:17:58

*Thread Reply:* you can't distinguish between saturated models, or test their fit based on the data you have in hand. In the cogNonCog paper we do a lot of sensitivity analyses where we test whether specific violations of the model (reverse causation or cog and noncog being correlated) would influence our findings.
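
The counting behind "saturated" in the question above can be sketched in a few lines of Python. The helper names are hypothetical; the only inputs are the number of observed variables and the number of freely estimated parameters.

```python
def unique_moments(k):
    """Number of unique variances and covariances among k observed variables."""
    return k * (k + 1) // 2

def model_df(k, n_free_params):
    """Degrees of freedom = unique moments minus freely estimated parameters."""
    return unique_moments(k) - n_free_params

# The case described in the question: a 3 x 3 matrix (6 unique elements)
# with 6 estimated parameters.
print(model_df(3, 6))  # 0, i.e. saturated

# A hypothetical five-variable model with 11 free parameters for contrast.
print(model_df(5, 11))  # 4
```

With 0 degrees of freedom the model reproduces the observed matrix exactly, so there is nothing left over for the chi-square to test.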

👍 Giacomo Bignardi
Giacomo Bignardi (giacomo.bignardi@maxplanckschools.de)
2021-06-15 06:33:26

*Thread Reply:* I was writing a long reply but I got it 🎉(for the cog non cog example), thanks. Last question (I hope): in genomic SEM, if you were testing the effect of only one SNP, would you base the interpretation of your results on the significance of the path estimate?

Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-15 08:57:47

*Thread Reply:* If you are running a model in which the SNP is predicting a factor, I would interpret the results based on two findings: the significance of the effect of the SNP on the factor and the significance of QSNP. If you have a significant SNP effect and not a significant QSNP that reflects a SNP that is likely to operate through the general pathways of the latent factor. If you have something that is significant for both Q and the latent factor that reflects a SNP that does not conform to the factor loadings, and oftentimes is observed when a SNP has a much larger effect on one of the traits or the SNP has directionally opposing effects on the traits. The commonfactorGWAS function automatically produces Q, and if the model is being run in the context of userGWAS you would compute Q via the chi-square difference across two models: a model in which the SNP predicts only the latent factor (i.e., common pathways model) against a model in which the SNP directly predicts the individual traits that define the factor (i.e., independent pathways model)
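
The chi-square difference described above for the userGWAS case can be sketched as follows. This is a stdlib-only Python illustration, not the package's own computation: the function names are hypothetical and the chi-square values for the two models are made-up numbers for a single SNP.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 1:
        a = 0.5
        q = math.erfc(math.sqrt(half))  # closed form for df = 1
    else:
        a = 1.0
        q = math.exp(-half)  # closed form for df = 2
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while 2 * a < df:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        a += 1.0
    return q

def q_snp(chisq_common, df_common, chisq_independent, df_independent):
    """Chi-square difference between the common and independent pathways models."""
    q = chisq_common - chisq_independent
    df = df_common - df_independent
    return q, df, chi2_sf(q, df)

# Hypothetical fit statistics for one SNP under the two models.
q, df, p = q_snp(chisq_common=21.4, df_common=5,
                 chisq_independent=6.1, df_independent=2)
print(q, df, p)
```

A small p-value here would flag a SNP whose effects on the individual traits are not well described by a single path through the factor.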

👍 Giacomo Bignardi
Laura (lhaver01@mail.bbk.ac.uk)
2021-06-15 09:12:22

*Thread Reply:* @Andrew Grotzinger I’m trying to get my head around these models where SNPs are predicting a common factor, and trying to understand how this works given that the SNPs are also indirect indicators of the common factor (at least I think they are?!)

Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-15 09:19:44

*Thread Reply:* is your question relating to the fact that ld-score regression uses the SNPs to produce the genome-wide genetic covariance estimates, and then we specify an individual SNP to predict a factor that is defined by the ld-score regression genetic covariance results (i.e., are we somehow "double dipping" from the SNP effects in some way?)

Laura (lhaver01@mail.bbk.ac.uk)
2021-06-15 09:30:27

*Thread Reply:* yes! I wasn’t sure how to articulate it but that perfectly sums up my question

Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-15 10:19:50

*Thread Reply:* great question! so i think there are two pieces to answering this. The first is what the SNP-level analyses are offering aside from just modeling the factor. The SNP-level estimates are useful because in many cases the SNPs may not fit the factor model. For example, it could be that the genetic covariance between trait 1 and trait 2 operates through very different biological pathways than the covariance between trait 2 and trait 3. In this case, even though you can model a common factor, the SNPs are generally not going to operate through the factor, which is something you can assess using the QSNP statistic. This is a fairly unique advantage to Genomic SEM, namely the ability to stress test the factors and see whether they have utility for understanding shared biological processes at different levels of analysis (e.g., the SNP-level). The second piece to this is whether or not double dipping on the SNP effects would bias estimates in some way, which is to say is it statistically appropriate? The short answer is that because the ld-score regression results are produced for approximately 1.1 million SNPs, and you are only including one SNP in the model, the dependency between these estimates is effectively 0. We talk more about this in the method section of the original Genomic SEM paper: Grotzinger, A. D., Rhemtulla, M., de Vlaming, R., Ritchie, S. J., Mallard, T. T., Hill, W. D., ... & Tucker-Drob, E. M. (2019). Genomic structural equation modelling provides insights into the multivariate genetic architecture of complex traits. Nature Human Behaviour, 3(5), 513-525.

Laura (lhaver01@mail.bbk.ac.uk)
2021-06-15 10:23:37

*Thread Reply:* what a great answer, thank you so much.

Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-15 10:43:15

*Thread Reply:* of course!

Abigail ter Kuile (k1456980@kcl.ac.uk)
2021-06-17 05:23:40

Does using summary statistics from more recent LMM methods such as ReGenie, FastGWA and FastGWA-GLMM cause any issues in Genomic SEM?

🔥 Anna Furtjes
Mark Adams (mark.adams@ed.ac.uk)
2021-06-17 05:40:20

*Thread Reply:* These can be used in GenomicSEM but the thing to check for is inflated LDSC intercepts for these sumstats (check values of diag(covstruc$I) that are well above 1). It's mostly an issue with highly heritable traits from large cohorts.
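
The intercept check mentioned above (looking at the diagonal of the intercept matrix) can be sketched in Python as below. The trait names and intercept values are hypothetical, and the 1.10 threshold is an arbitrary stand-in for "well above 1", which the thread does not pin down.

```python
def inflated_intercepts(trait_names, intercept_matrix, threshold=1.10):
    """Return (trait, intercept) pairs whose diagonal intercept exceeds the threshold.

    The threshold is an arbitrary illustration value, not a recommended cut-off.
    """
    flagged = []
    for i, name in enumerate(trait_names):
        value = intercept_matrix[i][i]
        if value > threshold:
            flagged.append((name, value))
    return flagged

# Hypothetical traits and LDSC intercept matrix (diagonal = univariate intercepts).
traits = ["trait_a", "trait_b", "trait_c"]
intercepts = [
    [1.02, 0.10, 0.05],
    [0.10, 1.31, 0.08],
    [0.05, 0.08, 0.99],
]

print(inflated_intercepts(traits, intercepts))  # [('trait_b', 1.31)]
```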

Michel Nivard (m.g.nivard@vu.nl)
2021-06-17 05:40:47

*Thread Reply:* yes/no/maybe? So we have been warning ppl that the LDSC model for heritability (but less for genetic correlation) breaks down slightly when you use LMM, though in practice we haven't noticed very pronounced differences other than for highly heritable traits in large cohorts

Michel Nivard (m.g.nivard@vu.nl)
2021-06-17 05:41:09

*Thread Reply:* haha ok well at least we seem to be on the same page among ourselves 🙂

🧑‍🤝‍🧑 Mark Adams
❤ Abigail ter Kuile
Abigail ter Kuile (k1456980@kcl.ac.uk)
2021-06-17 05:59:03

*Thread Reply:* great thank you 🙂 I'm guessing the same applies to SAIGE?

Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-17 06:26:48

*Thread Reply:* i've looked at SAIGE using the european subsample from the panUKB analyses, FastGWA, and Hail results in UKB for BMI and found h2 estimates that were all within 1-2% of each other so it seems fine enough. I will flag that the intercept problem brought up above was the most pronounced for SAIGE, and we had someone recently reach out using SAIGE estimates who didn't set the intercept to 1 and were finding severely deflated signal for multivariate GWAS results.

Alex Miller (apmfz5@mail.missouri.edu)
2021-06-18 09:13:33

Hello, I have a couple of questions about some GenomicSEM models I am estimating. The first is about allowing residuals to correlate across factors in a userGWAS model. I am specifying a correlated two-factor model with 3 indicator GWAS for each factor. I have reason to believe that it may be appropriate to account for residual correlation between some indicators on separate factors. This model has good fit without SNPs (X2 = 6.47, df = 5, P = 0.26, AIC = 38.47, CFI = 0.985, SRMR = 0.088), but when I fit this model to SNP-level data I get the following warning for about half of my SNPs: “Covariance matrix of the residuals of the observed variables (theta) is not positive definite.” Any ideas about what may cause this discrepancy between the model without SNPs and that with SNPs?

The second is about a GWAS-by-subtraction model. When I fit a GWAS-by-subtraction model for two of my traits, one of my resulting sets of summary statistics has almost no variants with p-values less than 0.05 similar to this post on the GenomicSEM Google group : https://groups.google.com/g/genomic-sem-users/c/PAubVTiI6Is. Any Ideas about why this might occur?

I will post my questions to the GenomicSEM Google group as well in case these questions are more appropriate for that platform. Thank you!

Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-22 13:12:53

*Thread Reply:* Hi Alex! I would suggest saving the full model output for a handful of SNPs that are producing that warning and comparing to the model output for the model without individual SNP effects. If it doesn't look like it's producing estimates that are any different than the model without individual SNPs, then that is some indication that this may be a relatively benign warning since the model without SNPs is not producing that warning about the non-positive definite theta matrix. The most common culprit for a theta (i.e., residual) matrix type warning or error is a negative residual variance, which you can check when looking at the runs for the individual SNPs. This can also occur in the specific context of estimating residual covariances. This reflects a somewhat less common situation where you are estimating residual covariances between some variables that have very little residual variance to begin with after accounting for the variance explained by the factors. My sense is the userGWAS model is then throwing this error when the variance explained by the SNP via the factor just pushes that theta matrix to be non-positive definite. In the absence of negative residuals, a misspecified model, or really bizarre model estimates I would consider this a safe warning to ignore (particularly because you do not see the warning in the model without SNPs).

For your question about the p-values > .05 for the GWAS-by-subtraction model, someone else ran into that same issue (https://github.com/GenomicSEM/GenomicSEM/issues/38) and found that the culprit was misspecified arguments for the sumstats function. You might check that the p-values from your sumstats output are not also attenuated first, which would be some indication that the arguments are off there.
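
For the first suggestion above (checking the saved per-SNP output for negative residual variances), a minimal Python sketch might look like this. The row layout with lhs, op, rhs, and est keys is an assumption modelled on a lavaan-style parameter table, not the exact column names of the userGWAS output, and the estimates are made up.

```python
def negative_variances(rows):
    """Return variables whose variance parameter (lhs ~~ lhs) is estimated below zero."""
    return [
        r["lhs"]
        for r in rows
        if r["op"] == "~~" and r["lhs"] == r["rhs"] and r["est"] < 0
    ]

# Hypothetical parameter rows for one SNP; key names are assumed for illustration.
rows = [
    {"lhs": "F1", "op": "=~", "rhs": "T1", "est": 0.70},
    {"lhs": "F1", "op": "=~", "rhs": "T2", "est": 1.01},
    {"lhs": "T1", "op": "~~", "rhs": "T1", "est": 0.51},
    {"lhs": "T2", "op": "~~", "rhs": "T2", "est": -0.03},
    {"lhs": "T1", "op": "~~", "rhs": "T2", "est": 0.02},
]

print(negative_variances(rows))  # ['T2']
```

Running the same check on the model without SNPs gives a baseline to compare the flagged SNP runs against.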

Alex Miller (apmfz5@mail.missouri.edu)
2021-06-22 13:35:26

*Thread Reply:* Thanks so much Andrew! I had similar thoughts about the first issue, so that’s encouraging! I’ll definitely look into the sumstats I used for the GWAS-by-subtraction to make sure I haven’t misspecified something. Thanks again!

Andrew Grotzinger (agrotzin@utexas.edu)
2021-06-22 13:59:03

*Thread Reply:* of course!