The mr_maxlik
function implements the maximum-likelihood method introduced by Burgess et al (2013).
Usage
mr_maxlik(
object,
model = "default",
correl = FALSE,
psi = 0,
distribution = "normal",
alpha = 0.05,
...
)
# S4 method for MRInput
mr_maxlik(
object,
model = "default",
correl = FALSE,
psi = 0,
distribution = "normal",
alpha = 0.05,
...
)
Arguments
- object
An
MRInput
object.- model
What type of model should be used:
"default"
,"random"
or"fixed"
. The method naturally estimates a fixed-effect model, assuming that the same causal effect is estimated by each of the genetic variants. However, if there is heterogeneity in the causal estimates of the different variants, then confidence intervals under a fixed-effect model will be overly narrow. The random-effects model adds additional uncertainty by multiplying the standard error by the square-root of the likelihood ratio heterogeneity statistic divided by the number of genetic variants less one (unless this quantity is less than 1, in which case no modification to the standard error is made). This parallels the residual standard error in a regression model (the Cochran Q heterogeneity test statistic is equal to the square of the RSE multiplied by the number of genetic variants less one). The default setting ("default"
) is to use a fixed-effect model with 3 genetic variants or fewer, and otherwise to use a random-effects model.- correl
If the genetic variants are correlated, then this correlation can be accounted for. The matrix of correlations between must be provided in the
MRInput
object: the elements of this matrix are the correlations between the individual variants (diagonal elements are 1).- psi
The correlation between the association with the exposure and the association with the outcome for each variant resulting from sample overlap.
- distribution
The type of distribution used to calculate the confidence intervals, can be
"normal"
(the default option) or"t-dist"
.- alpha
The significance level used to calculate the confidence interval. The default value is 0.05.
- ...
Additional arguments to be passed to the optimization method.
Value
The output from the function is an MaxLik
object containing:
- Model
A character string giving the type of model used (
"fixed"
,"random"
, or"default"
).- Exposure
A character string giving the name given to the exposure.
- Outcome
A character string giving the name given to the outcome.
- Correlation
The matrix of genetic correlations.
- Psi
The correlation between genetic associations with the exposure and with the outcome.
- Estimate
The value of the causal estimate.
- StdError
Standard error of the causal estimate.
- CILower
The lower bound of the causal estimate based on the estimated standard error and the significance level provided.
- CIUpper
The upper bound of the causal estimate based on the estimated standard error and the significance level provided.
- Alpha
The significance level used when calculating the confidence intervals.
- Pvalue
The p-value associated with the estimate (calculated as Estimate/StdError as per Wald test) using a normal or t-distribution (as specified in
distribution
).- SNPs
The number of genetic variants (SNPs) included in the analysis.
- RSE
The estimated residual standard error from the regression model (always equal to 1, as a fixed-effect model is required.
- Heter.Stat
Heterogeneity statistic (likelihood ratio statistic) and associated p-value: the null hypothesis is that all genetic variants estimate the same causal parameter; rejection of the null is an indication that one or more variants may be pleiotropic.
Details
A likelihood function is defined by assuming that the summarized data for each genetic variant are normally distributed. A bivariate normal distribution is assumed for the associations of each genetic variant with the exposure and with the outcome. The mean of the association with the outcome is taken as the mean association with the exposure multiplied by the causal effect parameter.
Thus, if there are K
genetic variants, then K+1
parameters are estimated by the method: one for each gene--exposure association, plus the causal parameter. If the number of genetic variants is large, then maximization of this function may be an issue. If the maximum likelihood estimate substantially differs from the inverse-variance weighted estimate, this may indicate that convergence has not occurred in the optimization algorithm.
The variance-covariance matrices for the bivariate normal distributions are obtained from the standard error estimates provided. The correlation psi
between genetic associations with the exposure and with the outcome due to sample overlap can be specified; its default value is zero.
Two features why this method may be preferred over the inverse-variance weighted method are the incorporation in the model of uncertainty in the genetic associations with the exposure, and of correlation between the genetic association estimates with exposure and outcome for each variant. The method is implemented both for uncorrelated and correlated genetic variants. It can also be used for a single genetic variant.
The original version of the maximum-likelihood method assumed that all genetic variants identify the same causal estimate; a fixed-effect model. The causal estimate may be overly precise if the fixed-effect model is incorrect and there is substantial heterogeneity in the causal estimates from the different variants. The random-effects analysis implemented here is an ad hoc solution to the problem of heterogeneity, but one that should result in reasonable confidence intervals that incorporate this heterogeneity.
References
Stephen Burgess, Adam S Butterworth, Simon G Thompson. Mendelian randomization analysis with multiple genetic variants using summarized data. Genetic Epidemiology 2013; 37:658-665. doi: 10.1002/gepi.21758.
Examples
mr_maxlik(mr_input(bx = ldlc[1:10], bxse = ldlcse[1:10],
by = chdlodds[1:10], byse = chdloddsse[1:10]))
#>
#> Maximum-likelihood method
#> (variants uncorrelated, random-effect model)
#>
#> Number of Variants : 10
#> ------------------------------------------------------------------
#> Method Estimate Std Error 95% CI p-value
#> MaxLik 2.989 0.328 2.346, 3.631 0.000
#> ------------------------------------------------------------------
#> Residual standard error = 0.921
#> Residual standard error is set to 1 in calculation of confidence interval when its estimate is less than 1.
#> Heterogeneity test statistic = 7.6348 on 9 degrees of freedom, (p-value = 0.5713)
mr_maxlik(mr_input(bx = ldlc[1:10], bxse = ldlcse[1:10],
by = chdlodds[1:10], byse = chdloddsse[1:10]), psi=0.2)
#>
#> Maximum-likelihood method
#> (variants uncorrelated, random-effect model)
#>
#> Number of Variants : 10
#> ------------------------------------------------------------------
#> Method Estimate Std Error 95% CI p-value
#> MaxLik 2.950 0.305 2.353, 3.547 0.000
#> ------------------------------------------------------------------
#> Residual standard error = 0.989
#> Residual standard error is set to 1 in calculation of confidence interval when its estimate is less than 1.
#> Heterogeneity test statistic = 8.7982 on 9 degrees of freedom, (p-value = 0.4561)
mr_maxlik(mr_input(calcium, calciumse, fastgluc, fastglucse, corr=calc.rho))
#>
#> Maximum-likelihood method
#> (variants correlated, random-effect model)
#>
#> Number of Variants : 6
#> ------------------------------------------------------------------
#> Method Estimate Std Error 95% CI p-value
#> MaxLik 2.303 0.709 0.914, 3.692 0.001
#> ------------------------------------------------------------------
#> Residual standard error = 0.588
#> Residual standard error is set to 1 in calculation of confidence interval when its estimate is less than 1.
#> Heterogeneity test statistic = 1.7277 on 5 degrees of freedom, (p-value = 0.8854)
## correlated variants