Multivariable MR-Lasso method — mr_mvlasso • MendelianRandomization

The mr_mvlasso function performs the multivariable MR-Lasso method, which applies lasso-type penalization to the direct effects of genetic variants on the outcome. The causal estimates are described as post-lasso estimates, and are obtained by performing the multivariable IVW method using only those genetic variants that are identified as valid by the lasso procedure.

Usage

mr_mvlasso(
  object,
  orientate = 1,
  distribution = "normal",
  alpha = 0.05,
  lambda = numeric(0)
)

# S4 method for MRMVInput
mr_mvlasso(
  object,
  orientate = 1,
  distribution = "normal",
  alpha = 0.05,
  lambda = numeric(0)
)

Arguments

object: An MRMVInput object.
orientate: The risk factor that genetic associations are orientated to. The default option is 1, meaning that genetic associations with the first risk factor are set to be positive.
distribution: The type of distribution used to calculate the confidence intervals. Options are "normal" (default) or "t-dist".
alpha: The significance level used to calculate the confidence intervals. The default value is 0.05.
lambda: The value of the tuning parameter used by the lasso procedure which controls the level of sparsity. If not specified, the tuning parameter will be calculated by the heterogeneity stopping rule.
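
As a hedged illustration of these arguments, the call below replaces the defaults with a fixed tuning parameter, t-distribution intervals and a different significance level. It assumes the MendelianRandomization package is installed and reuses the data objects from the Examples section (ldlc, hdlc, trig, chdlodds and their standard errors). The values lambda = 0.5 and alpha = 0.1 are arbitrary choices for illustration, not recommendations. Results are read from the slots named under Value.

```r
# Illustrative call only; skipped entirely when the package is not installed.
if (requireNamespace("MendelianRandomization", quietly = TRUE)) {
  library(MendelianRandomization)

  mv_input <- mr_mvinput(
    bx = cbind(ldlc, hdlc, trig),
    bxse = cbind(ldlcse, hdlcse, trigse),
    by = chdlodds,
    byse = chdloddsse
  )

  res <- mr_mvlasso(
    mv_input,
    distribution = "t-dist", # t-distribution intervals instead of normal
    alpha = 0.1,             # 90% confidence intervals
    lambda = 0.5             # arbitrary fixed value; bypasses the stopping rule
  )

  print(res@Estimate)  # post-lasso causal estimates
  print(res@Valid)     # number of variants retained as valid instruments
  print(res@Lambda)    # tuning parameter actually used
}
```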

Value

The output from the function is an MVLasso object containing:

Exposure: A character vector with the names given to the exposures.
Outcome: A character string with the name given to the outcome.
Estimate: A vector of causal estimates from the multivariable MR-Lasso method. These are the post-lasso estimates.
StdError: A vector of standard errors of the causal estimates from the multivariable MR-Lasso method.
CILower: The lower bounds of the confidence intervals for the causal estimates based on the estimated standard errors and the significance level provided.
CIUpper: The upper bounds of the confidence intervals for the causal estimates based on the estimated standard errors and the significance level provided.
Alpha: The significance level used when calculating the confidence intervals.
Pvalue: The p-values associated with the (post-lasso) causal estimates using a normal or t-distribution (as specified in distribution).
SNPs: The number of genetic variants (SNPs) included in the analysis.
RegEstimate: The estimates from the regularized regression model used in the multivariable MR-Lasso method.
RegIntercept: The intercept estimates from the regularized regression model used in the multivariable MR-Lasso method.
Valid: The number of genetic variants that have been identified as valid instruments.
ValidSNPs: The names of genetic variants that have been identified as valid instruments.
Lambda: The value of the tuning parameter used to compute RegEstimate.
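
The relationship between Estimate, StdError, Alpha, CILower, CIUpper and Pvalue can be sketched in base R for the "normal" option. This assumes the interval is the usual estimate plus or minus a normal quantile times the standard error, which the descriptions above suggest but do not spell out. The inputs are the rounded values printed for exposure_1 in the Examples output, so the result should land close to the printed interval, with small differences in the last digit expected.

```r
# Normal-approximation confidence interval and two-sided p-value.
# Inputs are the rounded values printed for exposure_1 in the Examples output.
estimate <- 1.726
std_error <- 0.383
alpha <- 0.05

z <- qnorm(1 - alpha / 2)                   # critical value for a 95% interval
ci <- estimate + c(-1, 1) * z * std_error   # lower and upper bounds
p_value <- 2 * pnorm(-abs(estimate / std_error))

print(round(ci, 3))
print(signif(p_value, 2))
```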

Details

Multivariable MR-Lasso extends the multivariable IVW model to include an intercept term for each genetic variant. These intercept terms represent associations between the genetic variants and the outcome which bypass the risk factors. The regularized regression model is estimated by multivariable weighted linear regression where the intercept terms are subject to lasso-type penalization. The lasso penalization will tend to shrink the intercept terms corresponding to the valid instruments to zero.
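
The model described above can be sketched in base R. This is a minimal illustration, not the package's implementation: it assumes the objective sum_j w_j (by_j - t0_j - bx_j' theta)^2 + lambda * sum_j |t0_j| with weights w_j = 1 / byse_j^2, and solves it by alternating a weighted least-squares update for theta with a soft-thresholding update for the intercepts t0. The function names, the simulated data and the lambda values are all hypothetical, and the lambda scale here need not match the Lambda reported by mr_mvlasso.

```r
# Soft-thresholding operator used for the lasso update of the intercepts.
soft_threshold <- function(z, g) sign(z) * pmax(abs(z) - g, 0)

# Sketch: weighted regression of by on bx with one lasso-penalized
# intercept per variant, fitted by block coordinate descent.
mvlasso_sketch <- function(bx, by, byse, lambda, iters = 500, tol = 1e-10) {
  w <- 1 / byse^2
  t0 <- numeric(length(by))
  theta <- numeric(ncol(bx))
  for (i in seq_len(iters)) {
    # Update theta: weighted least squares of (by - t0) on bx, no constant.
    theta_new <- drop(solve(crossprod(bx, w * bx),
                            crossprod(bx, w * (by - t0))))
    # Update intercepts: soft-threshold the residuals variant by variant.
    r <- by - drop(bx %*% theta_new)
    t0 <- soft_threshold(r, lambda / (2 * w))
    converged <- max(abs(theta_new - theta)) < tol
    theta <- theta_new
    if (converged) break
  }
  list(theta = theta, intercepts = t0, valid = which(t0 == 0))
}

# Simulated summary data: 30 variants, 2 risk factors, 3 pleiotropic variants.
set.seed(1)
n <- 30
bx <- cbind(runif(n, 0.1, 0.3), rnorm(n, 0, 0.1))
byse <- rep(0.02, n)
pleio <- c(rep(0.3, 3), rep(0, n - 3))
by <- drop(bx %*% c(0.5, -0.2)) + pleio + rnorm(n, 0, byse)

fit <- mvlasso_sketch(bx, by, byse, lambda = 200)
print(fit$theta)
print(fit$valid)

# With a very large penalty every intercept is zero and theta is plain WLS.
fit_big <- mvlasso_sketch(bx, by, byse, lambda = 1e6)
```

Larger values of lambda shrink more intercepts to zero, so more variants are treated as valid; smaller values leave more non-zero intercepts.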

The lasso penalty relies on a tuning parameter which controls the level of sparsity. The default is to use a heterogeneity stopping rule, but a fixed value may be specified.

As part of the analysis, the genetic variants are orientated so that all of the associations with one of the risk factors are positive (the first risk factor is used by default). Re-orientation of the genetic variants is performed automatically as part of the function.
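
The function handles orientation internally, so the following base R sketch is only meant to show what the step amounts to. It assumes orientation means flipping the sign of every association for a variant (with all risk factors and with the outcome) whenever its association with the chosen risk factor is negative. The function name and the toy numbers are hypothetical.

```r
# Sketch: flip the sign of all associations for variants whose association
# with the chosen risk factor is negative.
orientate_sketch <- function(bx, by, orientate = 1) {
  flip <- ifelse(bx[, orientate] < 0, -1, 1)
  # Multiplying a matrix by a vector of length nrow(bx) scales each row.
  list(bx = bx * flip, by = by * flip)
}

# Toy data: three variants, two risk factors; variant 2 needs flipping.
bx <- matrix(c(0.1, -0.2, 0.3,
               0.05, 0.04, -0.06), ncol = 2)
by <- c(0.2, -0.1, 0.3)

out <- orientate_sketch(bx, by, orientate = 1)
print(out$bx)
print(out$by)
```

Flipping all associations for a variant together corresponds to switching its effect allele, so the causal estimates themselves are not changed by the sign flip alone.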

The MR-Lasso method is performed in two steps. First, a regularized regression model is fitted, and the genetic variants whose intercept terms are shrunk to zero are identified as valid instruments. Second, causal effects are estimated using standard multivariable IVW with only the valid genetic variants. The post-lasso step is performed as long as the number of genetic variants identified as valid instruments is greater than the number of risk factors; the default heterogeneity stopping rule always satisfies this condition. The main estimates given by the method are the post-lasso estimates. However, parameter estimates from the regularized regression model used to identify invalid variants are also provided for completeness.
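
The second step can be sketched in base R as a weighted regression through the origin restricted to the valid variants. This is an illustration under assumptions, not the package's code: it treats multivariable IVW as weighted least squares of by on bx with weights 1 / byse^2, returns point estimates only, and takes the set of valid variants as given. The function name and the toy data are hypothetical.

```r
# Sketch: post-lasso step as multivariable IVW on the valid variants only.
post_lasso_sketch <- function(bx, by, byse, valid) {
  # The step needs more valid variants than risk factors.
  if (length(valid) <= ncol(bx)) {
    stop("need more valid variants than risk factors")
  }
  y <- by[valid]
  x <- bx[valid, , drop = FALSE]
  w <- 1 / byse[valid]^2
  # Weighted regression with no constant term.
  unname(coef(lm(y ~ x - 1, weights = w)))
}

# Toy data: five variants, two risk factors, true effects 0.5 and -0.2.
# Variant 5 has a direct (pleiotropic) effect on the outcome.
bx <- cbind(c(0.1, 0.2, 0.3, 0.4, 0.25),
            c(0.05, -0.02, 0.04, 0.01, 0.03))
by <- drop(bx %*% c(0.5, -0.2))
by[5] <- by[5] + 0.3
byse <- rep(0.02, 5)

est <- post_lasso_sketch(bx, by, byse, valid = 1:4)
print(est)
```

Because the toy data are noise-free, dropping the pleiotropic variant lets the regression recover the effects used to generate the data; with real summary statistics the estimates would carry sampling error.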

If a substantial proportion of genetic variants are removed from the analysis, the multivariable MR-Lasso method may give a false impression of confidence in the causal estimate due to homogeneity of the variant-specific causal estimates amongst the remaining variants. In that situation, it is not reasonable to claim that there is strong evidence for a causal effect, because a large number of variants with heterogeneous estimates have been removed from the analysis.

References

Andrew J Grant, Stephen Burgess. Pleiotropy robust methods for multivariable Mendelian randomization. arXiv 2020; 2008.11997

Examples

mr_mvlasso(mr_mvinput(bx = cbind(ldlc, hdlc, trig), bxse = cbind(ldlcse, hdlcse, trigse),
   by = chdlodds, byse = chdloddsse))
#> 
#> Multivariable MR-Lasso method 
#> 
#> Orientated to exposure : 1 
#> Number of variants : 28 
#> Number of valid instruments : 25 
#> Tuning parameter : 0.4641104 
#> ------------------------------------------------------------------
#>    Exposure Estimate Std Error  95% CI       p-value
#>  exposure_1    1.726     0.383  0.976, 2.477   0.000
#>  exposure_2   -0.126     0.441 -0.991, 0.738   0.775
#>  exposure_3    0.858     0.185  0.496, 1.220   0.000
#> ------------------------------------------------------------------