Thanks for uploading the video. Are those subtitles edited at all? They don't have any timing information, so if they're purely auto generated, then I'll generate them again with timing.
*Thread Reply:* I did give an edit to them
*Thread Reply:* but if you want to autogenerate and have me edit again, that's fine
*Thread Reply:* I'll see how it looks without timing.
*Thread Reply:* I think google can do the autosync - see if it works?
*Thread Reply:* Yeah, I know that uploading a file without timing is an option on YouTube.
The workbook Conor is talking about is also available at this URL: https://www.colorado.edu/ibg/sites/default/files/attached-files/b2021_ctd_p1.pdf
Can you share a screenshot of what it’s stuck on?
*Thread Reply:* It can usually be fixed by deleting ~/.local/share/rstudio
*Thread Reply:* @Laura can you try this? You can ssh into the cluster and rm ~/.local/share/rstudio
*Thread Reply:* that should be rm -rf ~/.local/share/rstudio, but I already ran that for you
*Thread Reply:* thank you - it is working now! now I’m just struggling with the working directory / loading in the data file (using a Mac)
> setwd("~/\\home\\conor\\Boulder2021")
> dataTw=read.table(file='dataTw.dat', header=T)
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
cannot open file 'dataTw.dat': No such file or directory
*Thread Reply:* Ah, the slashes are not right. it should be [edited to fix typo]
setwd("~conor/Boulder2021")
*Thread Reply:* setwd("~/conor/Boulder2021")
Error in setwd("~/conor/Boulder2021") : cannot change working directory
*Thread Reply:* Or /faculty/conor or /home/conor or ~conor, because those all refer to the exact same place.
*Thread Reply:* Oops, my mistake: setwd("~conor/Boulder2021") without the slash after the tilde
*Thread Reply:* how to save script? when I select my username from the list (using ‘save as’ option) it says ‘no such file or directory’..
*Thread Reply:* You went to "save as" under the file menu, and then what is it showing you? Particularly, there should be a line that says "File name:" and then below that a line that says what directory you're in. What directory does it show you in?
*Thread Reply:* > / > home / conor / Boulder2021
*Thread Reply:* Right, the problem is you can't save to that location. Unfortunately it defaults to trying to save to the working directory.
You have a few options. You can click on "home" and then scroll through the long list and find your username and click to go into your directory. Quicker might be to click on the square with the ... in it on the right side of that row (or the one above), and just put a ~ into the dialog box, then select ok. That will send it to your home directory. Once the main box is showing your home directory, then you can navigate to whatever folder you want, and put in a name for the file to save, etc.
*Thread Reply:* That works, thank you! Am I saving somewhere remotely by doing that? If so, will I still be able to access the scripts after the workshop has finished?
*Thread Reply:* That is saving it on the workshop server, which is "local" to RStudio, because RStudio is running on the workshop server. I'll make all of the files in your account available for you to download after the workshop. You can also download them during the course with an sftp or scp program, or using the file browser in RStudio. In RStudio, I think it is called "Export".
No need to spend too much time on downloading things, as I'll make it all available as a single zip file after the course.
*Thread Reply:* thanks so much, that’s really helpful
*Thread Reply:* I have successfully ‘saved as’ but it is not letting me ‘save’… should I have faith that it is being saved on the server even though it is not updating at my end?
*Thread Reply:* What is your username?
*Thread Reply:* I see you have an R/day1.R which was written to at 28 past the hour, so it has not been saved since then.
*Thread Reply:* that’s the one, saved earlier, but obviously not letting me save currently.. any ideas why?
*Thread Reply:* I'm not sure. It's hard to know without seeing an error, or what it looks like.
*Thread Reply:* ok, thanks. There’s no error, the script file name just ‘stays red’ and doesn’t save. I’ll copy paste the script into text edit and try and save it again in R tomorrow…
*Thread Reply:* Try hitting reload or killing the tab and going back to https://workshop.colorado.edu/rstudio
data reading code: dataTw=read.table(file="/home/conor/Boulder2021/dataTw.dat", header=T)
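A minimal sketch of the same idea, for anyone hitting the "cannot open file" error above: read the file by its absolute path and check that it exists first, so the working directory no longer matters. The helper name read_data_checked is hypothetical (it is not part of the workshop scripts); the path is the one quoted in the message above.

```r
# Hypothetical helper: read a whitespace-delimited file with a header row,
# stopping early with a clear message if the path does not exist.
read_data_checked <- function(path) {
  if (!file.exists(path)) {
    stop("File not found: ", path, " (working directory is ", getwd(), ")")
  }
  read.table(file = path, header = TRUE)
}

# Absolute path taken from the message above; it only exists on the workshop server.
data_path <- "/home/conor/Boulder2021/dataTw.dat"
if (file.exists(data_path)) {
  dataTw <- read_data_checked(data_path)
  str(dataTw)
}
```

Because the path is absolute, this behaves the same whether or not setwd() succeeded.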
Hi there 🙂 Thanks for a great introduction today. I have a question about Q1.4 in the workbook: Is the value of the variance component interpretable?
*Thread Reply:* Hi Anna, see this thread from the <#C01TLU7JAQK|anonymous-question-box> channel: https://boulder-workshop.slack.com/archives/C01TLU7JAQK/p1623090550008800
*Thread Reply:* Hi Anna: the R^2 in q. 1.4 is .02732 and the explained variance is .02732*15.10257 = 0.412. The .412 is interpretable as additive genetic variance. So the total phenotypic variance is 15.10257, and of this 0.412 is "explained" by the QTL (additively coded). The QTL (additively coded) contributes .412 to the phenotypic variance.
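The arithmetic in the reply above can be sketched in R as follows. The two numbers are the ones quoted in the message; the variable names r2_additive, var_explained and var_residual are illustrative, and only s2ph is a name taken from the practical. With the rounded R^2 the product comes out at roughly 0.41.

```r
r2_additive <- 0.02732   # R^2 of the additively coded QTL (quoted above)
s2ph        <- 15.10257  # total phenotypic variance (quoted above)

# Variance attributable to the additively coded QTL, and what is left over
var_explained <- r2_additive * s2ph
var_residual  <- s2ph - var_explained

print(c(explained = var_explained, residual = var_residual))
```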
Hi all! One student had trouble viewing the PDF of my “slides” (a long scrolling document), finding they were too tiny or “zoomed out” to read. Try downloading the file and opening it separately from the browser; they should be high resolution, it’s just a question of zooming in! These are linked on the relevant syllabus page, along with another document containing references.
Question about the second paragraph at the top of page 6 in the practical - it says: "The proportion of explained variance are 0.02732 (additive) and 0.03658 (additive + dominance). Because the predictors are uncorrelated, and given the phenotypic variance of 15.102 (print(s2ph)), we have the following variance components:"
How can the predictors be uncorrelated? Only heterozygotes (who get coded as '1' on additive) can have dominance coded as 1. So there are conditional probabilities that come out as 0 or 1, eg. P(T1QTLA1=0|T1QTLD1=1) = 0 and P(T1QTLA1=1|T1QTLD1=1) = 1, etc.
But perhaps I'm confusing the definitions for "independent" and "uncorrelated"... can you have the latter without having the former?
Yeah, sounds like I was confusing the definitions - here's a nice article on "independent" vs. "uncorrelated" and why you can have the latter without the former: https://www.themathcitadel.com/uncorrelated-and-independent-related-but-not-equivalent/
*Thread Reply:* Great question Kristen. The additive and dominance coding of a QTL or genotype form an excellent example of a pair of random variables that are uncorrelated but not independent. In fact, the example from the article you posted is essentially the same example for the case of allele frequency p = 1/2!
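A small sketch of the p = 1/2 case, assuming Hardy-Weinberg genotype frequencies (1/4, 1/2, 1/4), additive coding 0/1/2 and dominance coding 0/1/0. The variable names are illustrative. It computes the covariance directly from the genotype distribution, then compares one joint probability with the product of its marginals.

```r
# Assumption: allele frequency 1/2 under Hardy-Weinberg equilibrium
geno_prob <- c(aa = 0.25, Aa = 0.50, AA = 0.25)
A <- c(0, 1, 2)  # additive coding
D <- c(0, 1, 0)  # dominance coding

# Covariance from the genotype distribution: E[AD] - E[A]E[D]
cov_AD <- sum(geno_prob * A * D) - sum(geno_prob * A) * sum(geno_prob * D)

# Independence would require P(A=1, D=1) = P(A=1) * P(D=1)
p_joint   <- sum(geno_prob[A == 1 & D == 1])
p_product <- sum(geno_prob[A == 1]) * sum(geno_prob[D == 1])

print(c(cov_AD = cov_AD, p_joint = p_joint, p_product = p_product))
```

The covariance is zero while the joint probability differs from the product of the marginals, so the two codings are uncorrelated but not independent.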
*Thread Reply:* In general, independent implies uncorrelated but not vice versa. As a further point, for more than two random variables, you can have pairwise independence without full mutual independence, while correlation is always just a pairwise thing.
*Thread Reply:* Finally, one special but important setting where independent is equivalent to uncorrelated is for jointly Gaussian random variables, i.e. components of a multivariate normal distribution.
*Thread Reply:* The property "uncorrelated implies independent" uniquely characterizes the multivariate Gaussian only within the set of continuous multivariate distributions. It's actually rather straightforward to construct multivariate discrete distributions with that property. Shameless plug--I've published research involving such discrete distributions: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4752908/
*Thread Reply:* Cool! (Edited from "the one" to "one" above 🙂)
*Thread Reply:* It’s even true for a pair of Bernoullis. It’s false for a triple, however: if X and Y are independent Bernoulli(1/2) and Z = X + Y mod 2 = X xor Y, then X, Y, Z are pairwise independent but not jointly independent. Do you know of discrete examples beyond the bivariate case, i.e. discrete k-variate distributions with k > 2 for which pairwise uncorrelated implies jointly independent? (I wasn’t sure if the latent-variable reduction construction in your paper gets you there.)
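The xor example above can be checked by enumerating the four equally likely outcomes of (X, Y). This is a sketch; the helper name p_all_one is illustrative. For binary variables it is enough to compare P(all equal 1) with the product of the marginal P(= 1) values, each of which is 1/2 here.

```r
# All four equally likely outcomes of two independent Bernoulli(1/2) variables
outcomes <- expand.grid(X = 0:1, Y = 0:1)
outcomes$Z <- (outcomes$X + outcomes$Y) %% 2  # Z = X xor Y
prob <- rep(1 / 4, nrow(outcomes))

# Probability that every variable named in `vars` equals 1
p_all_one <- function(vars) {
  keep <- rowSums(outcomes[, vars, drop = FALSE] == 1) == length(vars)
  sum(prob[keep])
}

# Pairwise: each joint probability equals 1/2 * 1/2
pairwise <- c(XY = p_all_one(c("X", "Y")),
              XZ = p_all_one(c("X", "Z")),
              YZ = p_all_one(c("Y", "Z")))

# Jointly: compare with 1/2 * 1/2 * 1/2
triple <- p_all_one(c("X", "Y", "Z"))

print(pairwise)
print(c(triple = triple, product_of_marginals = 0.5^3))
```

Every pair factorizes, but the triple does not, because X = Y = 1 forces Z = 0.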
*Thread Reply:* Have a look at the Supplemental Appendices for that paper.
*Thread Reply:* I see — trivariate Poisson fits the bill, with an assist from non-negativity. Thanks!
*Thread Reply:* Yes, it's true that latent-variate reduction doesn't allow for negative correlation.
*Thread Reply:* It might be true that the multivariate Gaussian is the unique distribution for which (1) uncorrelated implies independent, (2) pairwise correlations may be of either sign and of any (non-degenerate) magnitude, and (3) the former 2 properties apply for arbitrarily many dimensions.
*Thread Reply:* And in this case e.g. theta0 + theta12 = 0 implies both are zero, as they must be non-negative — which allows you that extra degree of freedom without sacrificing the property.
*Thread Reply:* Yes, for (2) the only restriction is that the covariance matrix must be non-negative definite (as any covariance matrix must be).
*Thread Reply:* I bring up dimensionality because I am aware of a technique for constructing bivariate discrete distributions that allows for negative correlation, but I am not aware of any generalization to more than 2 dimensions. As cited in my paper, see Lakshminarayana et al. 1999; Kocherlakota and Kocherlakota 2001; Famoye 2010.
*Thread Reply:* Yes, you're correct about the restriction on the covariance matrix; I was hinting at that with my vague phrase "non-degenerate" 🙂
@Jeff Lessem (he/him) has renamed the channel from "intro-genomics-biometric-model" to "day01-intro-genomics-biometric-model"
Hi, I was wondering if anyone knew why an OpenMx script would give !!! and not be able to form confidence intervals around raw estimates, but would be able to for standardized estimates
*Thread Reply:* Do you have a lower or upper bound on the raw estimate? If the confidence interval would go below or above a stated boundary, you could get that !!! flag. If you run summary on your model with verbose=T, the confidence interval diagnostics should print, and those sometimes have more helpful information. I think there's a diagnostic specifically about boundaries. There's also an alpha level not reached for iterations. I've noticed that if I ask for many CIs in one model, the iteration limit sometimes gets hit before all of them can finish, so asking for fewer CIs per model can help, as can upping the maximum iteration limit (though I can't remember off the top of my head if changing iterations is an optimizer-specific option)
Hi! I’ve been going through the material again and I hope it’s ok to still ask a question that is puzzling me. It is about part 1 of the practical and the coding of dominance for QTLD1 to _D10 - the genotypes aa, Aa or aA, and AA were coded as: 0 (aa), 1 (Aa or aA) and 0 (AA). I don’t completely follow why we code them like this, as intuitively it would make more sense to me that Aa or aA or AA were coded with the same value, because they should have a more similar effect on the phenotype (the means would be closer together). Are we coding them like this so we can ensure the dominance and additive effects are not correlated? Thank you in advance!!
*Thread Reply:* That is a great question! Coding as 0 1 0 does not actually ensure that there is orthogonality with the additive component [if you do Gram-Schmidt (https://en.wikipedia.org/wiki/Gram%E2%80%93Schmidt_process) you end up dealing with allele frequency and the coding - this paper has the orthonormalization in it]
*Thread Reply:* it is more a convention about considering this to be the dominance deviation
*Thread Reply:* for the heterozygote, relative to where it would be expected to fall under the additive model
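To illustrate the point about orthogonality and allele frequency, here is a sketch that computes the covariance between the additive (0/1/2) and dominance (0/1/0) codings for a few allele frequencies. It assumes Hardy-Weinberg genotype frequencies, with p taken as the frequency of the allele counted by the additive coding. The function name cov_additive_dominance is illustrative.

```r
# Covariance between additive (0/1/2) and dominance (0/1/0) codings,
# assuming Hardy-Weinberg genotype frequencies for allele frequency p.
cov_additive_dominance <- function(p) {
  q <- 1 - p
  geno_prob <- c(q^2, 2 * p * q, p^2)  # aa, Aa, AA
  A <- c(0, 1, 2)
  D <- c(0, 1, 0)
  sum(geno_prob * A * D) - sum(geno_prob * A) * sum(geno_prob * D)
}

freqs <- c(0.2, 0.5, 0.8)
covs  <- sapply(freqs, cov_additive_dominance)
print(data.frame(p = freqs, cov_AD = covs))
```

Under these assumptions the covariance works out to 2pq(q - p), so the two codings are uncorrelated only at p = 1/2.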