The mr_ivw function implements the inverse-variance weighted (IVW) method, informally known as the "Toby Johnson" method. With a single genetic variant, this is simply the ratio method.
Usage
mr_ivw(
  object,
  model = "default",
  robust = FALSE,
  penalized = FALSE,
  weights = "simple",
  psi = 0,
  correl = FALSE,
  distribution = "normal",
  alpha = 0.05,
  ...
)
# S4 method for MRInput
mr_ivw(
  object,
  model = "default",
  robust = FALSE,
  penalized = FALSE,
  weights = "simple",
  psi = 0,
  correl = FALSE,
  distribution = "normal",
  alpha = 0.05,
  ...
)
Arguments
- object
An MRInput object.
- model
What type of model should be used: "default", "random" or "fixed". The random-effects model ("random") is a multiplicative random-effects model, allowing overdispersion in the weighted linear regression (the residual standard error is not fixed to be 1, but is not allowed to take values below 1). The fixed-effect model ("fixed") sets the residual standard error to be 1. The "default" setting is to use a fixed-effect model with 3 genetic variants or fewer, and otherwise to use a random-effects model.
- robust
Indicates whether robust regression using the lmrob() function from the package robustbase should be used in the method rather than standard linear regression (lm).
- penalized
Indicates whether a penalty should be applied to the weights to downweight the contribution of genetic variants with outlying ratio estimates to the analysis.
- weights
Which weights to use in the weighted regression. If "simple" (the default option), then the IVW estimate is equivalent to meta-analysing the ratio estimates from each variant using inverse-variance weights based on the simplest expression of the variance for the ratio estimate (the first-order term from the delta expansion: the standard error of the association with the outcome divided by the association with the exposure). If "delta", then the variance expression is the second-order term from the delta expansion. The second-order term incorporates uncertainty in the genetic association with the exposure; this uncertainty is ignored using the simple weighting.
- psi
The correlation between the genetic association with the exposure and the genetic association with the outcome for each variant resulting from sample overlap. The default value is 0, corresponding to a strict two-sample Mendelian randomization analysis (no overlap). If there is complete overlap between the samples, then the correlation should be set to the observational correlation between the exposure and the outcome. This correlation is only used in the calculation of standard errors if the option weights is set to "delta".
- correl
If the genetic variants are correlated, then this correlation can be accounted for. The matrix of correlations between the genetic variants must be provided in the MRInput object: the elements of this matrix are the correlations between the individual variants (diagonal elements are 1). If a correlation matrix is specified in the MRInput object, then correl is set to TRUE. If correl is set to TRUE, then the values of robust and penalized are taken as FALSE, and weights is set to "simple".
- distribution
The type of distribution used to calculate the confidence intervals. Options are "normal" (default) or "t-dist".
- alpha
The significance level used to calculate the confidence interval. The default value is 0.05.
- ...
Additional arguments to be passed to the regression method.
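The difference between the "simple" and "delta" weights, and the role of psi, can be sketched in base R. This is an illustration rather than the package's internal code: the toy values and the variable names (bx, bxse, by, byse, var_simple, var_delta) are assumptions, and the second-order expression is the usual delta-method variance of a ratio.

```r
# Toy summarized data for three variants (made-up values for illustration).
bx   <- c(0.10, 0.20, 0.15)   # genetic associations with the exposure
bxse <- c(0.01, 0.02, 0.02)   # their standard errors
by   <- c(0.05, 0.09, 0.08)   # genetic associations with the outcome
byse <- c(0.02, 0.03, 0.03)   # their standard errors
psi  <- 0                     # no sample overlap (strict two-sample setting)

# Variance of each ratio estimate by/bx.
var_simple <- byse^2 / bx^2                       # first-order term only
var_delta  <- byse^2 / bx^2 +                     # first-order term
  by^2 * bxse^2 / bx^4 -                          # uncertainty in bx
  2 * psi * by * bxse * byse / bx^3               # sample-overlap term

# Regression form: weights are inverse variances on the scale of by,
# that is, the ratio variances multiplied by bx^2.
fit_simple <- lm(by ~ bx - 1, weights = 1 / (var_simple * bx^2))
fit_delta  <- lm(by ~ bx - 1, weights = 1 / (var_delta * bx^2))

c(simple = unname(coef(fit_simple)), delta = unname(coef(fit_delta)))
```

With psi = 0 the delta variances can only be larger than the simple ones, so variants with imprecisely estimated exposure associations receive relatively less weight.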
Value
The output from the function is an IVW object containing:
- Model
A character string giving the type of model used ("fixed", "random", or "default").
- Exposure
A character string giving the name given to the exposure.
- Outcome
A character string giving the name given to the outcome.
- Correlation
The matrix of genetic correlations.
- Robust
TRUE if robust regression has been used to calculate the estimate, FALSE otherwise.
- Penalized
TRUE if weights have been penalized, FALSE otherwise.
- Estimate
The value of the causal estimate.
- StdError
Standard error of the causal estimate.
- CILower
The lower bound of the confidence interval for the causal estimate, based on the estimated standard error and the significance level provided.
- CIUpper
The upper bound of the confidence interval for the causal estimate, based on the estimated standard error and the significance level provided.
- Alpha
The significance level used when calculating the confidence intervals.
- Pvalue
The p-value associated with the estimate, calculated from the Wald test statistic Estimate/StdError using a normal or t-distribution (as specified in distribution).
- SNPs
The number of genetic variants (SNPs) included in the analysis.
- RSE
The estimated residual standard error from the regression model.
- Heter.Stat
Heterogeneity statistic (Cochran's Q statistic) and associated p-value: the null hypothesis is that all genetic variants estimate the same causal parameter; rejection of the null is an indication that one or more variants may be pleiotropic.
- Fstat
An approximation of the first-stage F statistic for all variants based on the summarized data.
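The relationship between the heterogeneity statistic, the residual standard error and the I^2 value shown in the printed output can be sketched in base R. This is a minimal sketch for uncorrelated variants with simple weights, using made-up data; the variable names are hypothetical and the code is not taken from the package.

```r
# Toy summarized data (made-up values for illustration).
bx   <- c(0.10, 0.20, 0.15, 0.25)   # genetic associations with the exposure
by   <- c(0.05, 0.09, 0.08, 0.15)   # genetic associations with the outcome
byse <- c(0.02, 0.03, 0.03, 0.04)   # standard errors of by

# IVW estimate from weighted regression through the origin.
fit   <- lm(by ~ bx - 1, weights = 1 / byse^2)
theta <- unname(coef(fit))

# Cochran's Q: weighted sum of squared deviations around the IVW estimate,
# compared with a chi-squared distribution on (number of variants - 1) df.
df    <- length(bx) - 1
Q     <- sum((by - theta * bx)^2 / byse^2)
p_het <- pchisq(Q, df = df, lower.tail = FALSE)

# The residual standard error of the weighted regression is sqrt(Q / df).
rse <- summary(fit)$sigma

# I^2: proportion of variability attributed to heterogeneity, truncated at 0.
I2 <- max(0, (Q - df) / Q)

c(Q = Q, p_value = p_het, RSE = rse, I2 = I2)
```

The printed output in the Examples section is consistent with this: Q = 99.5304 on 27 degrees of freedom gives sqrt(99.5304 / 27), approximately 1.920, for the residual standard error, and (99.5304 - 27) / 99.5304, approximately 72.9%, for I^2.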
Details
With multiple uncorrelated genetic variants, this estimate can be thought of as: 1) the inverse-variance weighted combination of the ratio estimates from a meta-analysis; 2) the ratio estimate from combining the genetic variants into a weighted score and then using this score as an instrumental variable (the same estimate is obtained from the two-stage least squares method using individual-level data); 3) the coefficient from weighted linear regression of the associations with the outcome on the associations with the risk factor fixing the intercept to zero and using the inverse-variance weights.
Here, we implement the method using weighted linear regression. If the variants are correlated, the method is implemented using generalized weighted linear regression; this is hard coded using matrix algebra.
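The generalized weighted regression for correlated variants can be sketched with a few lines of matrix algebra. This is a sketch under stated assumptions, not the package's source: the function name ivw_correlated, the toy data and the correlation matrix rho are hypothetical, and the standard error shown is the fixed-effect one.

```r
# Toy summarized data (made-up values for illustration).
bx   <- c(0.10, 0.20, 0.15)   # genetic associations with the exposure
by   <- c(0.05, 0.09, 0.08)   # genetic associations with the outcome
byse <- c(0.02, 0.03, 0.03)   # standard errors of by
rho  <- matrix(c(1.0, 0.3, 0.1,
                 0.3, 1.0, 0.2,
                 0.1, 0.2, 1.0), nrow = 3)   # correlations between variants

ivw_correlated <- function(bx, by, byse, rho) {
  # Covariance matrix of the outcome associations implied by the correlations.
  omega     <- (byse %o% byse) * rho
  omega_inv <- solve(omega)
  denom     <- as.numeric(t(bx) %*% omega_inv %*% bx)
  estimate  <- as.numeric(t(bx) %*% omega_inv %*% by) / denom
  c(estimate = estimate, se_fixed = sqrt(1 / denom))
}

ivw_correlated(bx, by, byse, rho)

# With an identity correlation matrix the estimate reduces to the one from
# ordinary weighted regression through the origin.
uncorrelated <- ivw_correlated(bx, by, byse, diag(3))
fit <- lm(by ~ bx - 1, weights = 1 / byse^2)
```

Because the off-diagonal elements of rho enter the estimate directly, their signs must refer to the same effect alleles as the association estimates, as the note printed with the correlated-variants example warns.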
The causal estimate is obtained by regression of the associations with the outcome on the associations with the risk factor, with the intercept set to zero and weights being the inverse-variances of the associations with the outcome.
With a single genetic variant, the estimate is the ratio of coefficients betaY/betaX and the standard error is the first term of the delta method approximation betaYse/betaX.
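The equivalence between the meta-analysis and regression formulations, and the treatment of the residual standard error under the fixed-effect and random-effects models, can be checked numerically in base R. The data and variable names below are made up for illustration, and the sketch covers only uncorrelated variants with simple weights.

```r
# Toy summarized data (made-up values for illustration).
bx   <- c(0.10, 0.20, 0.15, 0.25)   # genetic associations with the exposure
by   <- c(0.05, 0.09, 0.08, 0.15)   # genetic associations with the outcome
byse <- c(0.02, 0.03, 0.03, 0.04)   # standard errors of by

# Formulation 1: inverse-variance weighted meta-analysis of ratio estimates.
ratio    <- by / bx
ratio_se <- byse / bx                # first-order standard error
w        <- 1 / ratio_se^2
est_meta <- sum(w * ratio) / sum(w)
se_meta  <- sqrt(1 / sum(w))

# Formulation 3: weighted regression through the origin.
fit     <- summary(lm(by ~ bx - 1, weights = 1 / byse^2))
est_reg <- fit$coefficients[1, "Estimate"]
se_lm   <- fit$coefficients[1, "Std. Error"]

# Fixed-effect model: residual standard error set to 1.
se_fixed <- se_lm / fit$sigma
# Multiplicative random-effects model: residual standard error estimated,
# but not allowed to fall below 1.
se_random <- se_lm / min(fit$sigma, 1)

# Single variant: ratio estimate and first-order standard error.
single <- c(estimate = by[1] / bx[1], se = byse[1] / bx[1])

c(meta = est_meta, regression = est_reg, se_fixed = se_fixed, se_random = se_random)
```

The two point estimates agree exactly, and the fixed-effect standard error matches the meta-analysis standard error; the random-effects standard error is never smaller than the fixed-effect one.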
References
Original implementation: The International Consortium for Blood Pressure Genome-Wide Association Studies. Genetic variants in novel pathways influence blood pressure and cardiovascular disease risk. Nature 2011; 478:103-109. doi: 10.1038/nature10405.
Detailed description of method: Stephen Burgess, Adam S Butterworth, Simon G Thompson. Mendelian randomization analysis with multiple genetic variants using summarized data. Genetic Epidemiology 2013; 37:658-665. doi: 10.1002/gepi.21758.
Robust and penalized weights: Stephen Burgess, Jack Bowden, Frank Dudbridge, Simon G Thompson. Robust instrumental variable methods using multiple candidate instruments with application to Mendelian randomization. arXiv 2016; 1606.03729.
Heterogeneity test: Fabiola del Greco, Cosetta Minelli, Nuala A Sheehan, John R Thompson. Detecting pleiotropy in Mendelian randomisation studies with summary data and a continuous outcome. Stat Med 2015; 34(21):2926-2940. doi: 10.1002/sim.6522.
Simple versus delta weights (first-order versus second-order): Stephen Burgess, Jack Bowden. Integrating summarized data from multiple genetic variants in Mendelian randomization: bias and coverage properties of inverse-variance weighted methods. arXiv:1512.04486.
Examples
mr_ivw(mr_input(bx = ldlc, bxse = ldlcse, by = chdlodds, byse = chdloddsse))
#>
#> Inverse-variance weighted method
#> (variants uncorrelated, random-effect model)
#>
#> Number of Variants : 28
#>
#> ------------------------------------------------------------------
#> Method Estimate Std Error 95% CI p-value
#> IVW 2.834 0.530 1.796, 3.873 0.000
#> ------------------------------------------------------------------
#> Residual standard error = 1.920
#> Heterogeneity test statistic (Cochran's Q) = 99.5304 on 27 degrees of freedom, (p-value = 0.0000). I^2 = 72.9%.
#> F statistic = 28.0.
mr_ivw(mr_input(bx = ldlc, bxse = ldlcse, by = chdlodds, byse = chdloddsse),
  robust = TRUE)
#>
#> Inverse-variance weighted method
#> (variants uncorrelated, random-effect model)
#>
#> Number of Variants : 28
#> Robust regression used.
#> ------------------------------------------------------------------
#> Method Estimate Std Error 95% CI p-value
#> IVW 2.797 0.307 2.195, 3.399 0.000
#> ------------------------------------------------------------------
#> Residual standard error = 1.987
#> Heterogeneity test statistic (Cochran's Q) = 106.5638 on 27 degrees of freedom, (p-value = 0.0000). I^2 = 74.7%.
#> F statistic = 28.0.
mr_ivw(mr_input(bx = ldlc, bxse = ldlcse, by = chdlodds, byse = chdloddsse),
  penalized = TRUE)
#>
#> Inverse-variance weighted method
#> (variants uncorrelated, random-effect model)
#>
#> Number of Variants : 28
#> Weights of genetic variants with heterogeneous causal estimates have been penalized.
#> ------------------------------------------------------------------
#> Method Estimate Std Error 95% CI p-value
#> IVW 2.561 0.413 1.752, 3.370 0.000
#> ------------------------------------------------------------------
#> Residual standard error = 1.400
#> Heterogeneity is not calculated when weights are penalized, or when there is only one variant in the analysis.
#> F statistic = 28.0.
mr_ivw(mr_input(calcium, calciumse, fastgluc, fastglucse, corr=calc.rho))
#>
#> Inverse-variance weighted method
#> (variants correlated, random-effect model)
#>
#> Number of Variants : 6
#>
#> ------------------------------------------------------------------
#> Method Estimate Std Error 95% CI p-value
#> IVW 2.245 0.643 0.984, 3.505 0.000
#> ------------------------------------------------------------------
#> Residual standard error = 0.641
#> Residual standard error is set to 1 in calculation of confidence interval when its estimate is less than 1.
#> Heterogeneity test statistic (Cochran's Q) = 2.0530 on 5 degrees of freedom, (p-value = 0.8418). I^2 = 0.0%.
#> F statistic = 11.5.
#>
#> (Estimates with correlated variants are sensitive to the signs in the correlation matrix
#> - please ensure that your correlations are expressed with respect to the same effect alleles as your summarized association estimates.)
## correlated variants