Inverse-variance weighted method — mr_ivw • MendelianRandomization

The mr_ivw function implements the inverse-variance weighted (IVW) method, informally known as the "Toby Johnson" method. With a single genetic variant, this reduces to the ratio method.

Usage

mr_ivw(
  object,
  model = "default",
  robust = FALSE,
  penalized = FALSE,
  weights = "simple",
  psi = 0,
  correl = FALSE,
  distribution = "normal",
  alpha = 0.05,
  ...
)

# S4 method for MRInput
mr_ivw(
  object,
  model = "default",
  robust = FALSE,
  penalized = FALSE,
  weights = "simple",
  psi = 0,
  correl = FALSE,
  distribution = "normal",
  alpha = 0.05,
  ...
)

Arguments

object: An MRInput object.
model: What type of model should be used: "default", "random" or "fixed". The random-effects model ("random") is a multiplicative random-effects model, allowing overdispersion in the weighted linear regression (the residual standard error is not fixed to be 1, but is not allowed to take values below 1). The fixed-effect model ("fixed") sets the residual standard error to be 1. The "default" setting is to use a fixed-effect model with 3 genetic variants or fewer, and otherwise to use a random-effects model.
robust: Indicates whether robust regression using the lmrob() function from the package robustbase should be used in the method rather than standard linear regression (lm).
penalized: Indicates whether a penalty should be applied to the weights to downweight the contribution of genetic variants with outlying ratio estimates to the analysis.
weights: Which weights to use in the weighted regression. If "simple" (the default option), then the IVW estimate is equivalent to meta-analysing the ratio estimates from each variant using inverse-variance weights based on the simplest expression of the variance for the ratio estimate (the first-order term from the delta expansion: the standard error of the association with the outcome divided by the association with the exposure). If "delta", then the variance expression also includes the second-order term from the delta expansion. The second-order term incorporates uncertainty in the genetic association with the exposure; this uncertainty is ignored when using the simple weighting.
psi: The correlation between the genetic associations with the exposure and the association with the outcome for each variant resulting from sample overlap. The default value is 0, corresponding to a strict two-sample Mendelian randomization analysis (no overlap). If there is complete overlap between the samples, then the correlation should be set to the observational correlation between the exposure and the outcome. This correlation is only used in the calculation of standard errors if the option weights is set to "delta".
correl: If the genetic variants are correlated, then this correlation can be accounted for. The matrix of correlations between the variants must be provided in the MRInput object: the elements of this matrix are the pairwise correlations between the individual variants (diagonal elements are 1). If a correlation matrix is specified in the MRInput object, then correl is set to TRUE. If correl is set to TRUE, then the values of robust and penalized are taken as FALSE, and weights is set to "simple".
distribution: The type of distribution used to calculate the confidence intervals. Options are "normal" (default) or "t-dist".
alpha: The significance level used to calculate the confidence interval. The default value is 0.05.
...: Additional arguments to be passed to the regression method.
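
The difference between the "simple" and "delta" weights described above can be sketched in base R. This is an illustration only: the data values are hypothetical, the formulas are the standard delta-method expansion for the variance of a ratio, and the package internals may differ in detail.

```r
# Hypothetical summarized data for 5 variants (illustrative values only)
bx   <- c(0.10, 0.15, 0.08, 0.20, 0.12)       # associations with the exposure
bxse <- c(0.010, 0.012, 0.009, 0.015, 0.011)  # their standard errors
by   <- c(0.25, 0.41, 0.18, 0.62, 0.27)       # associations with the outcome
byse <- c(0.05, 0.06, 0.05, 0.07, 0.05)       # their standard errors
psi  <- 0                                     # no sample overlap

# First-order ("simple") variance of each ratio estimate by/bx
var_simple <- byse^2 / bx^2

# Second-order ("delta") variance: adds the uncertainty in bx and,
# when psi is non-zero, a covariance term due to sample overlap
var_delta <- byse^2 / bx^2 + by^2 * bxse^2 / bx^4 -
  2 * psi * by * byse * bxse / bx^3

# Inverse-variance weighted combination of the ratio estimates
ratio      <- by / bx
ivw_simple <- sum(ratio / var_simple) / sum(1 / var_simple)
ivw_delta  <- sum(ratio / var_delta)  / sum(1 / var_delta)

print(c(simple = ivw_simple, delta = ivw_delta))
```

With psi = 0, the delta variances are always larger than the simple ones, so variants with imprecise exposure associations are downweighted relative to the simple weighting.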

Value

The output from the function is an IVW object containing:

Model: A character string giving the type of model used ("fixed", "random", or "default").
Exposure: A character string giving the name given to the exposure.
Outcome: A character string giving the name given to the outcome.
Correlation: The matrix of genetic correlations.
Robust: TRUE if robust regression has been used to calculate the estimate, FALSE otherwise.
Penalized: TRUE if weights have been penalized, FALSE otherwise.
Estimate: The value of the causal estimate.
StdError: Standard error of the causal estimate.
CILower: The lower bound of the causal estimate based on the estimated standard error and the significance level provided.
CIUpper: The upper bound of the causal estimate based on the estimated standard error and the significance level provided.
Alpha: The significance level used when calculating the confidence intervals.
Pvalue: The p-value associated with the estimate, calculated from the Wald test statistic Estimate/StdError using a normal or t-distribution (as specified in distribution).
SNPs: The number of genetic variants (SNPs) included in the analysis.
RSE: The estimated residual standard error from the regression model.
Heter.Stat: Heterogeneity statistic (Cochran's Q statistic) and associated p-value: the null hypothesis is that all genetic variants estimate the same causal parameter; rejection of the null is an indication that one or more variants may be pleiotropic.
Fstat: An approximation of the first-stage F statistic for all variants based on the summarized data.
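
As a rough guide to how several of these quantities relate to one another, the following base R sketch computes Cochran's Q, its p-value, I^2, a confidence interval and a Wald p-value from hypothetical summarized data. It assumes uncorrelated variants, simple weights, a fixed-effect standard error and a normal distribution; it is not the package's own code.

```r
# Hypothetical summarized data for 5 variants (illustrative values only)
bx   <- c(0.10, 0.15, 0.08, 0.20, 0.12)
by   <- c(0.25, 0.41, 0.18, 0.62, 0.27)
byse <- c(0.05, 0.06, 0.05, 0.07, 0.05)
J    <- length(bx)

# IVW estimate from weighted regression through the origin
fit   <- lm(by ~ bx - 1, weights = byse^-2)
theta <- unname(coef(fit))

# Cochran's Q: weighted sum of squared residuals, on J - 1 degrees of freedom
Q   <- sum((by - theta * bx)^2 / byse^2)
Q_p <- pchisq(Q, df = J - 1, lower.tail = FALSE)
I2  <- max(0, (Q - (J - 1)) / Q)

# The residual standard error of the regression equals sqrt(Q / (J - 1))
rse <- summary(fit)$sigma

# Fixed-effect standard error, confidence interval and Wald p-value
alpha <- 0.05
se    <- 1 / sqrt(sum(bx^2 / byse^2))
ci    <- theta + c(-1, 1) * qnorm(1 - alpha / 2) * se
p     <- 2 * pnorm(-abs(theta / se))

print(list(Estimate = theta, StdError = se, CI = ci, Pvalue = p,
           Q = Q, Q_pvalue = Q_p, I2 = I2, RSE = rse))
```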

Details

With multiple uncorrelated genetic variants, this estimate can be thought of as: 1) the inverse-variance weighted combination of the ratio estimates from a meta-analysis; 2) the ratio estimate from combining the genetic variants into a weighted score and then using this score as an instrumental variable (the same estimate is obtained from the two-stage least squares method using individual-level data); 3) the coefficient from weighted linear regression of the associations with the outcome on the associations with the risk factor fixing the intercept to zero and using the inverse-variance weights.
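
The equivalence between formulations 1) and 3) can be checked numerically. The sketch below uses hypothetical summarized data and base R only; formulation 2) requires individual-level data and is not shown.

```r
# Hypothetical summarized data for 5 variants (illustrative values only)
bx   <- c(0.10, 0.15, 0.08, 0.20, 0.12)
by   <- c(0.25, 0.41, 0.18, 0.62, 0.27)
byse <- c(0.05, 0.06, 0.05, 0.07, 0.05)

# 1) Inverse-variance weighted meta-analysis of the ratio estimates,
#    using the first-order variance byse^2 / bx^2 for each ratio
ratio    <- by / bx
w        <- bx^2 / byse^2
est_meta <- sum(w * ratio) / sum(w)

# 3) Weighted regression of by on bx through the origin,
#    with weights equal to the inverse variances of by
est_reg <- unname(coef(lm(by ~ bx - 1, weights = byse^-2)))

# The two estimates agree up to floating-point error
stopifnot(isTRUE(all.equal(est_meta, est_reg)))
print(c(meta_analysis = est_meta, regression = est_reg))
```

Both expressions reduce algebraically to sum(bx * by / byse^2) / sum(bx^2 / byse^2), which is why they coincide.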

Here, we implement the method using weighted linear regression. If the variants are correlated, the method is implemented using generalized weighted linear regression; this is hard coded using matrix algebra.
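
A minimal sketch of the generalized weighted regression for correlated variants is given below. The helper name ivw_correlated, the data values and the correlation matrix are all hypothetical, and the standard error follows the multiplicative random-effects convention described for the model argument; the package's internal code may differ.

```r
# Hypothetical helper: generalized weighted regression for correlated variants
ivw_correlated <- function(bx, by, byse, rho) {
  J         <- length(bx)
  Omega     <- (byse %o% byse) * rho   # covariance matrix of the by estimates
  Omega_inv <- solve(Omega)
  denom     <- drop(t(bx) %*% Omega_inv %*% bx)
  theta     <- drop(t(bx) %*% Omega_inv %*% by) / denom
  resid     <- by - theta * bx
  rse       <- sqrt(drop(t(resid) %*% Omega_inv %*% resid) / (J - 1))
  se_fixed  <- sqrt(1 / denom)
  list(estimate  = theta,
       se_fixed  = se_fixed,
       se_random = se_fixed * max(1, rse),  # RSE not allowed below 1
       rse       = rse)
}

# Hypothetical summarized data and correlation matrix (illustrative only)
bx   <- c(0.10, 0.15, 0.08, 0.20, 0.12)
by   <- c(0.25, 0.41, 0.18, 0.62, 0.27)
byse <- c(0.05, 0.06, 0.05, 0.07, 0.05)
rho  <- matrix(0.2, nrow = 5, ncol = 5)
diag(rho) <- 1

print(ivw_correlated(bx, by, byse, rho))
```

When rho is the identity matrix, this reduces to the ordinary weighted regression used for uncorrelated variants.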

The causal estimate is obtained by regression of the associations with the outcome on the associations with the risk factor, with the intercept set to zero and weights being the inverse-variances of the associations with the outcome.
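
The regression, together with the fixed-effect and random-effects standard errors described for the model argument, might be sketched as follows. The data are hypothetical and the code is an illustration in base R, not the package's implementation.

```r
# Hypothetical summarized data for 5 variants (illustrative values only)
bx   <- c(0.10, 0.15, 0.08, 0.20, 0.12)
by   <- c(0.25, 0.41, 0.18, 0.62, 0.27)
byse <- c(0.05, 0.06, 0.05, 0.07, 0.05)
J    <- length(bx)

# Weighted regression of by on bx with the intercept fixed at zero
fit   <- summary(lm(by ~ bx - 1, weights = byse^-2))
theta <- fit$coefficients[1, 1]
se_lm <- fit$coefficients[1, 2]   # scaled by the estimated RSE
sigma <- fit$sigma                # estimated residual standard error

# Fixed-effect model: residual standard error set to 1
se_fixed <- se_lm / sigma

# Multiplicative random-effects model: residual standard error is
# estimated but not allowed to take values below 1
se_random <- se_lm / min(sigma, 1)

# "default" rule: fixed-effect with 3 variants or fewer, otherwise random
model <- if (J <= 3) "fixed" else "random"
se    <- if (model == "fixed") se_fixed else se_random

print(c(estimate = theta, se_fixed = se_fixed, se_random = se_random))
```

Dividing by min(sigma, 1) leaves the lm standard error unchanged when there is overdispersion (sigma > 1) and rescales it to the fixed-effect value otherwise, so the random-effects standard error is never smaller than the fixed-effect one.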

With a single genetic variant, the estimate is the ratio of coefficients betaY/betaX and the standard error is the first term of the delta method approximation betaYse/betaX.
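
For a single variant the calculation is short enough to write out directly. The values below are hypothetical, and the absolute value in the standard error is an assumption made here to keep it positive when betaX is negative.

```r
# Hypothetical summarized data for one variant (illustrative values only)
betaX   <- 0.12   # association with the exposure
betaY   <- 0.30   # association with the outcome
betaYse <- 0.05   # standard error of the association with the outcome

# Ratio estimate and first-order delta-method standard error
estimate  <- betaY / betaX
std_error <- betaYse / abs(betaX)

print(c(estimate = estimate, std_error = std_error))
```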

References

Original implementation: The International Consortium for Blood Pressure Genome-Wide Association Studies. Genetic variants in novel pathways influence blood pressure and cardiovascular disease risk. Nature 2011; 478:103-109. doi: 10.1038/nature10405.

Detailed description of method: Stephen Burgess, Adam S Butterworth, Simon G Thompson. Mendelian randomization analysis with multiple genetic variants using summarized data. Genetic Epidemiology 2013; 37:658-665. doi: 10.1002/gepi.21758.

Robust and penalized weights: Stephen Burgess, Jack Bowden, Frank Dudbridge, Simon G Thompson. Robust instrumental variable methods using multiple candidate instruments with application to Mendelian randomization. arXiv 2016; 1606.03729.

Heterogeneity test: Fabiola del Greco, Cosetta Minelli, Nuala A Sheehan, John R Thompson. Detecting pleiotropy in Mendelian randomisation studies with summary data and a continuous outcome. Stat Med 2015; 34(21):2926-2940. doi: 10.1002/sim.6522.

Simple versus delta weights (first-order versus second-order): Stephen Burgess, Jack Bowden. Integrating summarized data from multiple genetic variants in Mendelian randomization: bias and coverage properties of inverse-variance weighted methods. arXiv:1512.04486.

Examples

mr_ivw(mr_input(bx = ldlc, bxse = ldlcse, by = chdlodds, byse = chdloddsse))
#> 
#> Inverse-variance weighted method
#> (variants uncorrelated, random-effect model)
#> 
#> Number of Variants : 28 
#> 
#> ------------------------------------------------------------------
#>  Method Estimate Std Error 95% CI       p-value
#>     IVW    2.834     0.530 1.796, 3.873   0.000
#> ------------------------------------------------------------------
#> Residual standard error =  1.920 
#> Heterogeneity test statistic (Cochran's Q) = 99.5304 on 27 degrees of freedom, (p-value = 0.0000). I^2 = 72.9%. 
#> F statistic = 28.0. 
mr_ivw(mr_input(bx = ldlc, bxse = ldlcse, by = chdlodds, byse = chdloddsse),
  robust = TRUE)
#> 
#> Inverse-variance weighted method
#> (variants uncorrelated, random-effect model)
#> 
#> Number of Variants : 28 
#> Robust regression used.
#> ------------------------------------------------------------------
#>  Method Estimate Std Error 95% CI       p-value
#>     IVW    2.797     0.307 2.195, 3.399   0.000
#> ------------------------------------------------------------------
#> Residual standard error =  1.987 
#> Heterogeneity test statistic (Cochran's Q) = 106.5638 on 27 degrees of freedom, (p-value = 0.0000). I^2 = 74.7%. 
#> F statistic = 28.0. 
mr_ivw(mr_input(bx = ldlc, bxse = ldlcse, by = chdlodds, byse = chdloddsse),
  penalized = TRUE)
#> 
#> Inverse-variance weighted method
#> (variants uncorrelated, random-effect model)
#> 
#> Number of Variants : 28 
#> Weights of genetic variants with heterogeneous causal estimates have been penalized. 
#> ------------------------------------------------------------------
#>  Method Estimate Std Error 95% CI       p-value
#>     IVW    2.561     0.413 1.752, 3.370   0.000
#> ------------------------------------------------------------------
#> Residual standard error =  1.400 
#> Heterogeneity is not calculated when weights are penalized, or when there is only one variant in the analysis.
#> F statistic = 28.0. 
mr_ivw(mr_input(calcium, calciumse, fastgluc, fastglucse, corr=calc.rho))
#> 
#> Inverse-variance weighted method
#> (variants correlated, random-effect model)
#> 
#> Number of Variants : 6 
#> 
#> ------------------------------------------------------------------
#>  Method Estimate Std Error 95% CI       p-value
#>     IVW    2.245     0.643 0.984, 3.505   0.000
#> ------------------------------------------------------------------
#> Residual standard error =  0.641 
#> Residual standard error is set to 1 in calculation of confidence interval when its estimate is less than 1.
#> Heterogeneity test statistic (Cochran's Q) = 2.0530 on 5 degrees of freedom, (p-value = 0.8418). I^2 = 0.0%. 
#> F statistic = 11.5. 
#> 
#> (Estimates with correlated variants are sensitive to the signs in the correlation matrix
#>  - please ensure that your correlations are expressed with respect to the same effect alleles as your summarized association estimates.) 
  ## correlated variants