Purpose

The gnomesims package is designed to provide estimates of gene-environment correlation for simulated data. It is a tool for researchers who use polygenic scores of twins, parents and siblings to detect gene-environment correlation and want to address issues of power, sample size and effect size. We focus on two types of gene-environment correlation, namely, cultural transmission (= genetic nurture) and sibling interaction.

Setup

The package can be installed from its GitHub repository using devtools.

# install.packages("devtools")
library(devtools)
devtools::install_github("josefinabernardo/gnomesims")

Next, load it into your current session.

library(gnomesims)

Running a simulation using OpenMx

The core function of gnomesims is gnome_mx_simulation(). It takes the ACE estimates, sample sizes, and effect size measures as arguments and returns a list of two data frames: power, with the power estimates, and params, with the path coefficients.

gnome_mx_simulation(ct = .01, si = .025, npgsloci = 10)
## [1] "Running simulation proportion of genetic variance explained by the PGS is: 0.1 ."
## [1] "The factorial design has 1 setting(s)."
## [1] 1
## $power
##    nmz  ndz         a         c         e    g     b x PGS   A   p1   p2   p3
## 1 4000 4000 0.6324555 0.5477226 0.5477226 0.01 0.025 0 0.1 0.9 0.13 0.18 0.06
##      p4    p5    p6    p7    p8        Smz       Sdz
## 1 0.108 0.122 0.149 0.054 0.079 0.04559689 0.0297855
## 
## $params
##    nmz  ndz         a         c         e    g     b x PGS   A    e1   e2   e3
## 1 4000 4000 0.6324555 0.5477226 0.5477226 0.01 0.025 0 0.1 0.9 0.023 0.03 0.01
##      e4    e5    e6   e7    e8        Smz       Sdz
## 1 0.025 0.029 0.032 0.01 0.025 0.04559689 0.0297855
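
The a, c and e columns in the output above are consistent with path coefficients, that is, square roots of the variance components. The check below is a minimal sketch that assumes variance components of .4, .3 and .3; these values are inferred from the printed output, not taken from the package documentation.

```r
# Assumed variance components (A, C, E), inferred from the printed output.
vc <- c(a = 0.4, c = 0.3, e = 0.3)

# Path coefficients are the square roots of the variance components.
paths <- sqrt(vc)
print(round(paths, 7))

# Squaring the paths recovers a total phenotypic variance of 1.
total_variance <- sum(paths^2)
print(total_variance)
```

The rounded values match the a, c and e columns printed by the simulation.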

Embedded within the simulation is the function gnome_power() to calculate power from alpha, degrees of freedom and the non-centrality parameter.
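
To illustrate the calculation, power for a chi-square test can be obtained in base R from the non-central chi-square distribution. This is a sketch of the general technique, not the source of gnome_power(); the helper name chisq_power() and its argument names are hypothetical.

```r
# Hypothetical helper (not part of gnomesims): power of a chi-square test
# given alpha, degrees of freedom, and the non-centrality parameter (NCP).
chisq_power <- function(alpha, df, ncp) {
  # Critical value of the central chi-square distribution under H0.
  crit <- qchisq(1 - alpha, df = df)
  # Probability of exceeding it under the non-central alternative.
  pchisq(crit, df = df, ncp = ncp, lower.tail = FALSE)
}

# With an NCP of zero the power equals alpha (up to numerical precision).
power_null <- chisq_power(alpha = 0.05, df = 1, ncp = 0)

# An NCP of about 7.85 gives roughly 80% power for a 1-df test at alpha = .05.
power_alt <- chisq_power(alpha = 0.05, df = 1, ncp = 7.85)

print(c(null = power_null, alternative = power_alt))
```

Under this approach a zero effect yields power equal to alpha, which agrees with the 0.050 entries in the in-built power data set for the settings with no effect.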

We recognize that the path coefficients are difficult to interpret. To solve this issue, the function gnome_effect() calculates a readily interpretable effect size measure.

Running a simulation using generalized estimating equations

The package can also simulate results using generalized estimating equations (GEE) with the gnome_gee_simulation() function. Its arguments and output format are similar to those of gnome_mx_simulation().

gnome_gee_simulation(ct = .01, si = .025, npgsloci = 10)
## [1] "Running simulation proportion of genetic variance explained by the PGS is: 0.1 ."
## [1] "The factorial design has 1 setting(s)."
## [1] 1
## $power
##    nmz  ndz         a         c         e    g     b x PGS   A    p1    p2   p3
## 1 4000 4000 0.6324555 0.5477226 0.5477226 0.01 0.025 0 0.1 0.9 0.139 0.181 0.06
##     p4   p5   p6    p7   p8        Smz       Sdz
## 1 0.11 0.12 0.15 0.054 0.08 0.04559689 0.0297855
## 
## $params
##    nmz  ndz         a         c         e    g     b x PGS   A    e1    e2   e3
## 1 4000 4000 0.6324555 0.5477226 0.5477226 0.01 0.025 0 0.1 0.9 0.025 0.061 0.01
##     e4    e5    e6   e7   e8        Smz       Sdz
## 1 0.05 0.029 0.063 0.01 0.05 0.04559689 0.0297855

In-built data sets

To demonstrate the type of data this package can generate, we have included two in-built data sets. They contain the results of the gnome_mx_simulation() function for 3 x 3 = 9 combinations of AC covariance input parameters. The data set gnome_power_data contains the power results and the data set gnome_params_data contains the parameter estimates.

gnome_power_data
##    nmz  ndz    a    c    e   CT   SI x PGS   A CT(m1) MZDZ SI(m2) MZDZ
## 1 4000 4000 0.63 0.55 0.55 0.00 0.00 0 0.1 0.9       0.050       0.050
## 2 4000 4000 0.63 0.55 0.55 0.05 0.00 0 0.1 0.9       0.408       0.161
## 3 4000 4000 0.63 0.55 0.55 0.10 0.00 0 0.1 0.9       0.917       0.474
## 4 4000 4000 0.63 0.55 0.55 0.00 0.05 0 0.1 0.9       0.160       0.399
## 5 4000 4000 0.63 0.55 0.55 0.05 0.05 0 0.1 0.9       0.756       0.754
## 6 4000 4000 0.63 0.55 0.55 0.10 0.05 0 0.1 0.9       0.989       0.945
## 7 4000 4000 0.63 0.55 0.55 0.00 0.10 0 0.1 0.9       0.505       0.929
## 8 4000 4000 0.63 0.55 0.55 0.05 0.10 0 0.1 0.9       0.953       0.992
## 9 4000 4000 0.63 0.55 0.55 0.10 0.10 0 0.1 0.9       0.999       0.999
##   CT(m3) MZDZ SI(m3) MZDZ CT(m1) DZ SI(m2) DZ CT(m3) DZ SI(m3) DZ    Smz    Sdz
## 1       0.050       0.050     0.050     0.050     0.050     0.050  0.00%  0.00%
## 2       0.300       0.050     0.261     0.156     0.152     0.050  6.82%  6.82%
## 3       0.787       0.050     0.724     0.453     0.424     0.050 14.65% 14.65%
## 4       0.050       0.292     0.179     0.302     0.050     0.171  6.57%  3.41%
## 5       0.285       0.281     0.640     0.651     0.146     0.163 13.90% 10.74%
## 6       0.761       0.271     0.940     0.894     0.404     0.156 22.22% 19.06%
## 7       0.050       0.800     0.550     0.819     0.050     0.506 13.65%  7.32%
## 8       0.270       0.783     0.915     0.962     0.140     0.480 21.47% 15.15%
## 9       0.734       0.763     0.994     0.995     0.384     0.454 30.30% 23.97%
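
A table like this can be screened for adequately powered settings. The sketch below does not load the package; it rebuilds the 3 x 3 design and copies two power columns by hand from the output above. The column names power_ct_m1 and power_si_m2 are hypothetical stand-ins for "CT(m1) MZDZ" and "SI(m2) MZDZ", and the 80% threshold is a conventional choice rather than a package default.

```r
# Rebuild the 3 x 3 design. expand.grid() varies its first argument fastest,
# which matches the row order of the printed table.
design <- expand.grid(CT = c(0, 0.05, 0.10), SI = c(0, 0.05, 0.10))

# Power values copied by hand from the printed table
# ("CT(m1) MZDZ" and "SI(m2) MZDZ" columns).
design$power_ct_m1 <- c(0.050, 0.408, 0.917, 0.160, 0.756,
                        0.989, 0.505, 0.953, 0.999)
design$power_si_m2 <- c(0.050, 0.161, 0.474, 0.399, 0.754,
                        0.945, 0.929, 0.992, 0.999)

# Keep the settings in which both tests reach at least 80% power.
well_powered <- subset(design, power_ct_m1 >= 0.8 & power_si_m2 >= 0.8)
print(well_powered)
```

With the installed package, the same subset() call could be applied to gnome_power_data directly, using its actual column names.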

Assortative mating

It is possible to look at results for present but unmodelled assortative mating by specifying the genotypic correlation between the parents. We are working on a version that estimates direct assortative mating when fitting the model.

gnome_mx_simulation(ct = .01, si = .025, npgsloci = 10, assortm = .26)
## [1] "Running simulation proportion of genetic variance explained by the PGS is: 0.1 ."
## [1] "The factorial design has 1 setting(s)."
## [1] 1
## $power
##    nmz  ndz         a         c         e    g     b x PGS   A assortm    p1
## 1 4000 4000 0.6324555 0.5477226 0.5477226 0.01 0.025 0 0.1 0.9    0.26 0.141
##      p2   p3    p4    p5    p6    p7    p8        Smz       Sdz
## 1 0.189 0.06 0.106 0.132 0.157 0.054 0.078 0.04559689 0.0297855
## 
## $params
##    nmz  ndz         a         c         e    g     b x PGS   A assortm    e1
## 1 4000 4000 0.6324555 0.5477226 0.5477226 0.01 0.025 0 0.1 0.9    0.26 0.024
##      e2   e3    e4    e5    e6   e7    e8        Smz       Sdz
## 1 0.031 0.01 0.025 0.029 0.032 0.01 0.025 0.04559689 0.0297855
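
To see how unmodelled assortative mating shifts the results, the power estimates from the two OpenMx runs can be placed side by side. The sketch below copies p1 to p8 by hand from the printed outputs (without assortm and with assortm = .26); the object names are illustrative, and a single setting does not support general conclusions.

```r
# Power estimates p1..p8 copied by hand from the two printed outputs.
no_am   <- c(p1 = 0.130, p2 = 0.180, p3 = 0.060, p4 = 0.108,
             p5 = 0.122, p6 = 0.149, p7 = 0.054, p8 = 0.079)
with_am <- c(p1 = 0.141, p2 = 0.189, p3 = 0.060, p4 = 0.106,
             p5 = 0.132, p6 = 0.157, p7 = 0.054, p8 = 0.078)

# Difference in power (with minus without assortative mating).
comparison <- data.frame(no_am, with_am,
                         diff = round(with_am - no_am, 3))
print(comparison)
```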