| Title: | Joint Sparse Regression & Network Learning with Missing Data |
|---|---|
| Description: | Simultaneously estimates sparse regression coefficients and response network structure in multivariate models with missing data. Unlike traditional approaches requiring imputation, handles missingness natively through unbiased estimating equations (MCAR/MAR compatible). Employs dual L1 regularization with automated selection via cross-validation or information criteria. Includes parallel computation, warm starts, adaptive grids, publication-ready visualizations, and prediction methods. Ideal for genomics, neuroimaging, and multi-trait studies with incomplete high-dimensional outcomes. See Zeng et al. (2025) <doi:10.48550/arXiv.2507.05990>. |
| Authors: | Yixiao Zeng [aut, cre, cph], Celia Greenwood [ths, aut] |
| Maintainer: | Yixiao Zeng <[email protected]> |
| License: | GPL-2 |
| Version: | 1.5.1 |
| Built: | 2026-05-07 08:10:08 UTC |
| Source: | https://github.com/yixiao-zeng/missonet |
missoNet fits a joint multivariate regression and conditional
dependency (precision-matrix) model when some response entries are missing.
The method estimates a sparse coefficient matrix $B$ linking
predictors $X$ to multivariate responses $Y$, together with a sparse
inverse covariance $\Theta$ for the residuals in
$Y = \mathbf{1}\mu^\top + XB + E$, $E \sim \mathcal{N}(0, \Theta^{-1})$.
Responses may contain missing values (e.g., MCAR/MAR); predictors must be
finite. The package provides cross-validation, prediction, publication-ready
plotting, and simple simulation utilities.
Key features
- Joint estimation of $B$ (regression) and $\Theta$ (conditional network).
- $\ell_1$-regularization on both $B$ and $\Theta$ with user-controlled grids.
- K-fold cross-validation with optional 1-SE model selection.
- Heatmap and 3D surface visualizations for CV error or GoF across $(\lambda_B, \lambda_\Theta)$.
- Fast prediction for new data using stored solutions.
- Lightweight data generator for simulation studies.
Workflow
1. Fit a model across a grid of penalties with missoNet, or select penalties via cv.missoNet.
2. Visualize the CV error/GoF surface with plot.missoNet.
3. Predict responses for new observations with predict.missoNet.
- missoNet: Fit models over user-specified penalty grids for $\lambda_B$ and $\lambda_\Theta$; returns estimated $\hat{\mu}$, $\hat{B}$, $\hat{\Theta}$, and metadata (grids, GoF).
- cv.missoNet: Perform k-fold cross-validation over a penalty grid; stores est.min and (optionally) est.1se.beta, est.1se.theta.
- plot.missoNet: S3 plotting method; heatmap or 3D scatter of CV error or GoF.
- predict.missoNet: S3 prediction method; returns $\hat{Y} = \mathbf{1}\hat{\mu}^\top + X_{\mathrm{new}}\hat{B}$ for a chosen solution.
- generateData: Generate synthetic datasets with controllable dimensions, signal, and missingness mechanisms for benchmarking.
GPL-2.
Maintainer: Yixiao Zeng [email protected] [copyright holder]
Authors:
Celia Greenwood [email protected] [thesis advisor]
missoNet, cv.missoNet, plot.missoNet,
predict.missoNet, generateData,
and browseVignettes("missoNet") for tutorials.

    sim <- generateData(n = 100, p = 8, q = 5, rho = 0.1, missing.type = "MCAR")

    fit <- missoNet(X = sim$X, Y = sim$Z)              # fit over a grid
    plot(fit)                                          # GoF heatmap

    cvfit <- cv.missoNet(X = sim$X, Y = sim$Z, kfold = 5, compute.1se = TRUE)
    plot(cvfit, type = "scatter", plt.surf = TRUE)     # CV error surface

    yhat <- predict(cvfit, newx = sim$X, s = "lambda.min")

Perform $k$-fold cross-validation to select the regularization pair
(lambda.beta, lambda.theta) for missoNet. For each
fold the model is trained on $k-1$ partitions and evaluated on the held-out
partition over a grid of lambda pairs; the pair with minimum mean CV error is
returned, with optional 1-SE models for more regularized solutions.
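The fold construction described above can be sketched in base R. This is only an illustration: `make_folds` is a hypothetical helper, not a missoNet function, and the package's internal fold assignment may differ in detail.

```r
# Hypothetical helper (not part of missoNet): assign n observations to
# kfold roughly equal folds, optionally shuffling with a fixed seed.
make_folds <- function(n, kfold = 5, shuffle = TRUE, seed = NULL) {
  stopifnot(kfold >= 2, kfold <= n)
  if (!is.null(seed)) set.seed(seed)
  ids <- rep(seq_len(kfold), length.out = n)   # 1, 2, ..., k, 1, 2, ...
  if (shuffle) ids <- sample(ids)              # random permutation of labels
  ids
}

folds <- make_folds(n = 23, kfold = 5, seed = 486)

# For each fold, train on the other k - 1 partitions and hold one out.
for (k in 1:5) {
  train.idx <- which(folds != k)
  test.idx  <- which(folds == k)
  # a model would be fit on train.idx and scored on test.idx here
}
table(folds)
```

Fixing the seed makes the partition, and therefore the CV error surface, reproducible across runs.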

    cv.missoNet(
      X, Y,
      kfold = 5,
      rho = NULL,
      lambda.beta = NULL, lambda.theta = NULL,
      lambda.beta.min.ratio = NULL, lambda.theta.min.ratio = NULL,
      n.lambda.beta = NULL, n.lambda.theta = NULL,
      beta.pen.factor = NULL, theta.pen.factor = NULL,
      penalize.diagonal = NULL,
      beta.max.iter = 10000, beta.tol = 1e-05,
      theta.max.iter = 10000, theta.tol = 1e-05,
      eta = 0.8, eps = 1e-08,
      standardize = TRUE, standardize.response = TRUE,
      compute.1se = TRUE,
      relax.net = FALSE,
      adaptive.search = FALSE,
      shuffle = TRUE, seed = NULL,
      parallel = FALSE, cl = NULL,
      verbose = 1
    )

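| Argument | Description |
|---|---|
| X | Numeric matrix ($n \times p$) of predictors; entries must be finite. |
| Y | Numeric matrix ($n \times q$) of responses; may contain missing values (NA). |
| kfold | Integer. Number of cross-validation folds. Default 5. |
| rho | Optional numeric vector of length $q$ of working missingness probabilities for the response columns; estimated from Y when NULL (default). |
| lambda.beta, lambda.theta | Optional numeric vectors. Candidate regularization paths for $B$ and $\Theta$; generated automatically when NULL (default). |
| lambda.beta.min.ratio, lambda.theta.min.ratio | Optional numerics controlling the smallest value of each automatically generated lambda path relative to its largest. Default NULL. |
| n.lambda.beta, n.lambda.theta | Optional integers. Lengths of the automatically generated lambda paths (ignored if the corresponding lambda vector is supplied). |
| beta.pen.factor | Optional $p \times q$ matrix of penalty factors for the entries of $B$. |
| theta.pen.factor | Optional $q \times q$ matrix of penalty factors for the entries of $\Theta$. |
| penalize.diagonal | Logical or NULL (default). Whether the diagonal of $\Theta$ is penalized. |
| beta.max.iter, theta.max.iter | Integers. Maximum iterations for the $B$ and $\Theta$ updates. Default 10000. |
| beta.tol, theta.tol | Numerics. Convergence tolerances for the $B$ and $\Theta$ updates. Default 1e-05. |
| eta | Numeric. Default 0.8. |
| eps | Numeric. Default 1e-08. |
| standardize | Logical. Standardize columns of X? Default TRUE. |
| standardize.response | Logical. Standardize columns of Y? Default TRUE. |
| compute.1se | Logical. Also compute 1-SE solutions? Default TRUE. |
| relax.net | (Experimental) Logical. If TRUE, also return a de-biased network re-estimated on the active set. Default FALSE. |
| adaptive.search | (Experimental) Logical. Use adaptive two-stage lambda search? Default FALSE. |
| shuffle | Logical. Randomly shuffle fold assignments? Default TRUE. |
| seed | Optional integer seed (used when shuffle = TRUE). |
| parallel | Logical. Evaluate folds in parallel using a provided cluster? Default FALSE. |
| cl | Optional cluster from parallel::makeCluster(). |
| verbose | Integer verbosity level. Default 1. |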
Internally, predictors X and responses Y can be standardized
for optimization; all reported estimates are re-scaled back to the original
data scale. Missingness in Y is handled via unbiased estimating
equations using column-wise observation probabilities estimated from Y
(or supplied via rho). This is appropriate when the missingness of each
response is independent of its unobserved value (e.g., MCAR).
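To make the idea of correcting for missingness with column-wise observation probabilities concrete, the sketch below builds an inverse-probability-weighted estimate of a response covariance under MCAR. It is a generic textbook-style construction in base R: the helper `ipw_cov` is hypothetical, the data are assumed to have mean zero, and nothing here is claimed to reproduce missoNet's internal estimating equations.

```r
set.seed(1)
n <- 5000; q <- 3
Sigma <- matrix(c(1.0, 0.5, 0.2,
                  0.5, 1.0, 0.3,
                  0.2, 0.3, 1.0), q, q)
Y <- matrix(rnorm(n * q), n, q) %*% chol(Sigma)   # rows ~ N(0, Sigma)

rho <- c(0.1, 0.2, 0.3)                           # per-column missing rates
Z <- Y
for (j in seq_len(q)) Z[runif(n) < rho[j], j] <- NA   # MCAR masking

# Hypothetical helper: zero-fill the NAs, then divide each cross-product by
# the probability that the entries involved were observed.
ipw_cov <- function(Z, rho = colMeans(is.na(Z))) {
  obs <- 1 - rho                 # observation probabilities per column
  Z0 <- Z
  Z0[is.na(Z0)] <- 0
  W <- tcrossprod(obs)           # off-diagonal weights: obs_j * obs_k
  diag(W) <- obs                 # diagonal weights: obs_j
  crossprod(Z0) / (nrow(Z) * W)
}

S.ipw   <- ipw_cov(Z)                                # corrected estimate
S.naive <- crossprod(replace(Z, is.na(Z), 0)) / n    # shrunk toward zero
round(S.ipw, 2)
```

The naive zero-filled estimate shrinks every entry by the observation probabilities, whereas the reweighted version is unbiased when missingness does not depend on the unobserved values.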
If adaptive.search = TRUE, a fast two-stage pre-optimization narrows
the lambda grid before computing fold errors on a focused neighborhood; this
can be substantially faster on large grids but may occasionally miss the global
optimum.
When compute.1se = TRUE, two additional solutions are reported:
the largest lambda.beta and the largest lambda.theta whose CV
error is within one standard error of the minimum (holding the other lambda
fixed at its optimal value). At the end, three special lambda pairs are identified:
- lambda.min: Parameters giving minimum CV error
- lambda.1se.beta: Largest $\lambda_B$ within 1 SE of minimum (with $\lambda_\Theta$ fixed at optimum)
- lambda.1se.theta: Largest $\lambda_\Theta$ within 1 SE of minimum (with $\lambda_B$ fixed at optimum)

The 1-SE rules provide more regularized models that may generalize better.
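The selection rules above can be traced on a toy grid. The numbers below are made up for illustration, and the threshold is taken as the minimum mean error plus its standard error, which is the usual convention and is assumed here rather than confirmed from the package source.

```r
# Toy CV summary on a 4 x 3 grid: rows index lambda.beta, columns lambda.theta.
lambda.beta  <- c(1, 0.5, 0.25, 0.125)
lambda.theta <- c(1, 0.3, 0.1)
cv.mean <- matrix(c(2.00, 1.60, 1.70,
                    1.50, 1.10, 1.20,
                    1.08, 1.00, 1.15,
                    1.45, 1.05, 1.25), nrow = 4, byrow = TRUE)
cv.se <- matrix(0.12, nrow = 4, ncol = 3)

# lambda.min: the pair with the smallest mean CV error.
best <- which(cv.mean == min(cv.mean), arr.ind = TRUE)[1, ]
threshold <- cv.mean[best["row"], best["col"]] + cv.se[best["row"], best["col"]]

# 1-SE for beta: hold lambda.theta at its optimum and take the largest
# lambda.beta whose mean error stays within the threshold.
ok.beta <- cv.mean[, best["col"]] <= threshold
lambda.1se.beta <- max(lambda.beta[ok.beta])

# 1-SE for theta: hold lambda.beta at its optimum instead.
ok.theta <- cv.mean[best["row"], ] <= threshold
lambda.1se.theta <- max(lambda.theta[ok.theta])

c(min.beta = lambda.beta[best["row"]], min.theta = lambda.theta[best["col"]],
  se.beta = lambda.1se.beta, se.theta = lambda.1se.theta)
```

Here the minimum sits at (0.25, 0.3); the 1-SE rule moves lambda.beta up to 0.5 along that column and lambda.theta up to 1 along that row.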
A list of class "missoNet" with components:
- est.min: List of estimates at the CV minimum: Beta ($p \times q$), Theta ($q \times q$), intercept mu (length $q$), lambda.beta, lambda.theta, lambda.beta.idx, lambda.theta.idx, and (if requested) relax.net.
- est.1se.beta: List of estimates at the 1-SE lambda.beta (if compute.1se = TRUE); NULL otherwise.
- est.1se.theta: List of estimates at the 1-SE lambda.theta (if compute.1se = TRUE); NULL otherwise.
- Length-$q$ vector of working missingness probabilities.
- Number of folds used.
- Integer vector of length $n$ giving fold assignments (names are "fold-k").
- Unique lambda values explored along the grid for $\lambda_B$ and $\lambda_\Theta$.
- Logical indicating whether the diagonal of $\Theta$ was penalized.
- Penalty factor matrices actually used.
- List with CV diagnostics: n, p, q, standardize, standardize.response, mean errors cv.errors.mean, bounds cv.errors.upper/lower, and the evaluated grids cv.grid.beta, cv.grid.theta (length equals number of fitted models).
Yixiao Zeng [email protected], Celia M. T. Greenwood
Zeng, Y., et al. (2025). Multivariate regression with missing response data for modelling regional DNA methylation QTLs. arXiv:2507.05990.
missoNet for model fitting;
generic methods such as plot() and predict() for objects of class
"missoNet".

    sim <- generateData(n = 120, p = 12, q = 6, rho = 0.1)
    X <- sim$X; Y <- sim$Z

    # Basic 5-fold cross-validation
    cvfit <- cv.missoNet(X = X, Y = Y, kfold = 5, verbose = 0)

    # Extract optimal estimates
    Beta.min  <- cvfit$est.min$Beta
    Theta.min <- cvfit$est.min$Theta

    # Extract 1SE estimates (if computed)
    if (!is.null(cvfit$est.1se.beta)) {
      Beta.1se <- cvfit$est.1se.beta$Beta
    }
    if (!is.null(cvfit$est.1se.theta)) {
      Theta.1se <- cvfit$est.1se.theta$Theta
    }

    # Make predictions
    newX <- matrix(rnorm(10 * 12), 10, 12)
    pred.min <- predict(cvfit, newx = newX, s = "lambda.min")
    pred.1se <- predict(cvfit, newx = newX, s = "lambda.1se.beta")

    # Parallel cross-validation
    library(parallel)
    cl <- makeCluster(min(detectCores() - 1, 2))
    cvfit2 <- cv.missoNet(X = X, Y = Y, kfold = 5, parallel = TRUE, cl = cl)
    stopCluster(cl)

    # Adaptive search for efficiency
    cvfit3 <- cv.missoNet(X = X, Y = Y, kfold = 5, adaptive.search = TRUE)

    # Reproducible CV with specific lambdas
    cvfit4 <- cv.missoNet(X = X, Y = Y, kfold = 5,
                          lambda.beta = 10^seq(0, -2, length = 20),
                          lambda.theta = 10^seq(0, -2, length = 20),
                          seed = 486)

    # Plot CV results
    plot(cvfit, type = "heatmap")
    plot(cvfit, type = "scatter")

Generates synthetic data from a conditional Gaussian graphical model with user-specified missing data mechanisms. This function is designed for simulation studies and testing of the missoNet package, supporting three types of missingness: Missing Completely At Random (MCAR), Missing At Random (MAR), and Missing Not At Random (MNAR).

    generateData(
      n, p, q, rho,
      missing.type = "MCAR",
      X = NULL, Beta = NULL, E = NULL, Theta = NULL, Sigma.X = NULL,
      Beta.row.sparsity = 0.2,
      Beta.elm.sparsity = 0.2,
      seed = NULL
    )

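| Argument | Description |
|---|---|
| n | Integer. Sample size (number of observations). Must be at least 2. |
| p | Integer. Number of predictor variables. Must be at least 1. |
| q | Integer. Number of response variables. Must be at least 2. |
| rho | Numeric scalar or vector of length $q$. Target proportion of missing values for each response column. |
| missing.type | Character string specifying the missing data mechanism. One of "MCAR" (default), "MAR", or "MNAR". |
| X | Optional $n \times p$ predictor matrix. |
| Beta | Optional $p \times q$ coefficient matrix. |
| E | Optional $n \times q$ error matrix. |
| Theta | Optional $q \times q$ precision matrix of the errors. |
| Sigma.X | Optional $p \times p$ covariance matrix of the predictors. |
| Beta.row.sparsity | Numeric in [0, 1]. Proportion of rows in Beta that contain at least one non-zero element. Default is 0.2. Only used when Beta is not supplied. |
| Beta.elm.sparsity | Numeric in [0, 1]. Proportion of non-zero elements within active rows of Beta. Default is 0.2. Only used when Beta is not supplied. |
| seed | Optional integer. Random seed for reproducibility. |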
The function generates data through the following model:
$$Y = XB + E$$
where:
- $X \in \mathbb{R}^{n \times p}$ is the predictor matrix
- $B \in \mathbb{R}^{p \times q}$ is the coefficient matrix
- $E \sim \mathcal{N}(0, \Theta^{-1})$ (row-wise) is the error matrix
- $Y \in \mathbb{R}^{n \times q}$ is the complete response matrix

Missing values are then introduced to create $Z$ (the observed response
matrix with NAs) according to the specified mechanism:
- MCAR: Each element has probability rho[j] of being missing, independent of all variables.
- MAR: Missingness depends on the predictors through a logistic model, with a per-column term calibrated to achieve the target missing rate.
- MNAR: The lowest rho[j] proportion of values in each column are set as missing.
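The three mechanisms can be imitated in a few lines of base R. The sketch below is an assumption-laden illustration, not the code inside generateData: in particular, the MAR step uses a logistic model in the linear predictor with an intercept found by `uniroot`, which is one plausible reading of the description above.

```r
set.seed(857)
n <- 2000; p <- 3; q <- 2
X <- matrix(rnorm(n * p), n, p)
B <- matrix(c(1, 0, -1, 0.5, 0.5, 0), p, q)
Y <- X %*% B + matrix(rnorm(n * q), n, q)   # complete responses, Theta = I
rho <- c(0.1, 0.3)                          # target missing rate per column

Z.mcar <- Z.mar <- Z.mnar <- Y
for (j in seq_len(q)) {
  # MCAR: independent coin flips with probability rho[j]
  Z.mcar[runif(n) < rho[j], j] <- NA

  # MAR (illustrative): logistic model in a fully observed predictor score,
  # with an intercept chosen so the average probability equals rho[j]
  score <- drop(X %*% B[, j])
  c.j <- uniroot(function(c0) mean(plogis(score + c0)) - rho[j],
                 interval = c(-20, 20))$root
  Z.mar[runif(n) < plogis(score + c.j), j] <- NA

  # MNAR: the lowest rho[j] proportion of values in the column go missing
  Z.mnar[Y[, j] <= quantile(Y[, j], rho[j]), j] <- NA
}

rbind(MCAR = colMeans(is.na(Z.mcar)),
      MAR  = colMeans(is.na(Z.mar)),
      MNAR = colMeans(is.na(Z.mnar)))
```

All three masks hit roughly the same marginal missing rates, yet only the MNAR mask removes values because of what they are, which is why it is the hardest case for any method that models observation probabilities column-wise.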
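A list containing:

| Component | Description |
|---|---|
| X | $n \times p$ predictor matrix. |
| Y | $n \times q$ complete response matrix (no missing values). |
| Z | $n \times q$ observed response matrix, with NAs where values are missing. |
| Beta | $p \times q$ coefficient matrix. |
| Theta | $q \times q$ precision matrix. |
| rho | Numeric vector of length $q$ of missing rates. |
| missing.type | Character string. The missing mechanism used. |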
Yixiao Zeng [email protected], Celia M. T. Greenwood
missoNet for fitting models to data with missing values,
cv.missoNet for cross-validation

    # Example 1: Basic usage with default settings
    sim.dat <- generateData(n = 300, p = 50, q = 20, rho = 0.1, seed = 857)

    # Check dimensions and missing rate
    dim(sim.dat$X)            # 300 x 50
    dim(sim.dat$Z)            # 300 x 20
    mean(is.na(sim.dat$Z))    # approximately 0.1

    # Example 2: Variable missing rates with MAR mechanism
    rho.vec <- seq(0.05, 0.25, length.out = 20)
    sim.dat <- generateData(n = 300, p = 50, q = 20, rho = rho.vec,
                            missing.type = "MAR")

    # Example 3: High sparsity in coefficient matrix
    sim.dat <- generateData(n = 500, p = 100, q = 30, rho = 0.15,
                            Beta.row.sparsity = 0.1,   # 10% active predictors
                            Beta.elm.sparsity = 0.3)   # 30% active in each row

    # Example 4: User-supplied matrices
    n <- 300; p <- 50; q <- 20
    X <- matrix(rnorm(n * p), n, p)
    Beta <- matrix(rnorm(p * q) * rbinom(p * q, 1, 0.1), p, q)   # 10% non-zero
    Theta <- diag(q) + 0.1                                       # Simple precision structure
    sim.dat <- generateData(X = X, Beta = Beta, Theta = Theta,
                            n = n, p = p, q = q, rho = 0.2,
                            missing.type = "MNAR")

    # Example 5: Use generated data with missoNet
    library(missoNet)
    sim.dat <- generateData(n = 400, p = 50, q = 10, rho = 0.15)

    # Split into training and test sets
    train.idx <- 1:300
    test.idx <- 301:400

    # Fit missoNet model
    fit <- missoNet(X = sim.dat$X[train.idx, ], Y = sim.dat$Z[train.idx, ],
                    lambda.beta = 0.1, lambda.theta = 0.1)

    # Evaluate on test set
    pred <- predict(fit, newx = sim.dat$X[test.idx, ])

Fit a penalized multi-task regression with a response network ($\Theta$)
under missing responses. The method jointly estimates the coefficient matrix
$B$ and the precision matrix $\Theta$ via penalized
likelihood with $\ell_1$ penalties on $B$ and the off-diagonal
entries of $\Theta$.

    missoNet(
      X, Y,
      rho = NULL,
      GoF = "eBIC",
      lambda.beta = NULL, lambda.theta = NULL,
      lambda.beta.min.ratio = NULL, lambda.theta.min.ratio = NULL,
      n.lambda.beta = NULL, n.lambda.theta = NULL,
      beta.pen.factor = NULL, theta.pen.factor = NULL,
      penalize.diagonal = NULL,
      beta.max.iter = 10000, beta.tol = 1e-05,
      theta.max.iter = 10000, theta.tol = 1e-05,
      eta = 0.8, eps = 1e-08,
      standardize = TRUE, standardize.response = TRUE,
      relax.net = FALSE,
      adaptive.search = FALSE,
      parallel = FALSE, cl = NULL,
      verbose = 1
    )

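| Argument | Description |
|---|---|
| X | Numeric matrix ($n \times p$) of predictors; entries must be finite. |
| Y | Numeric matrix ($n \times q$) of responses; may contain missing values (NA). |
| rho | Optional numeric vector of length $q$ of working missingness probabilities for the response columns; estimated from Y when NULL (default). |
| GoF | Character. Goodness-of-fit criterion: "AIC", "BIC", or "eBIC" (default). |
| lambda.beta, lambda.theta | Optional numeric vectors (or scalars). Candidate regularization paths for $B$ and $\Theta$; generated automatically when NULL (default). |
| lambda.beta.min.ratio, lambda.theta.min.ratio | Optional numerics controlling the smallest value of each automatically generated lambda path relative to its largest. Default NULL. |
| n.lambda.beta, n.lambda.theta | Optional integers. Lengths of automatically generated lambda paths (ignored if the corresponding lambda vector is supplied). |
| beta.pen.factor | Optional $p \times q$ matrix of penalty factors for the entries of $B$. |
| theta.pen.factor | Optional $q \times q$ matrix of penalty factors for the entries of $\Theta$. |
| penalize.diagonal | Logical or NULL (default). Whether the diagonal of $\Theta$ is penalized. |
| beta.max.iter, theta.max.iter | Integers. Maximum iterations for the $B$ and $\Theta$ updates. Default 10000. |
| beta.tol, theta.tol | Numerics. Convergence tolerances for the $B$ and $\Theta$ updates. Default 1e-05. |
| eta | Numeric. Default 0.8. |
| eps | Numeric. Default 1e-08. |
| standardize | Logical. Standardize columns of X? Default TRUE. |
| standardize.response | Logical. Standardize columns of Y? Default TRUE. |
| relax.net | (Experimental) Logical. If TRUE, also return a de-biased network re-estimated on the active set. Default FALSE. |
| adaptive.search | (Experimental) Logical. Use adaptive two-stage lambda search? Default FALSE. |
| parallel | Logical. Evaluate parts of the grid in parallel using a provided cluster? Default FALSE. |
| cl | Optional cluster from parallel::makeCluster(). |
| verbose | Integer verbosity level. Default 1. |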
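The conditional Gaussian model is
$$Y_i = \mu + B^\top X_i + \varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}_q(0, \Theta^{-1}), \qquad i = 1, \ldots, n,$$
where:
- $Y_i \in \mathbb{R}^q$ is the $i$-th observation of responses
- $X_i \in \mathbb{R}^p$ is the $i$-th observation of predictors
- $B \in \mathbb{R}^{p \times q}$ is the coefficient matrix
- $\Theta \in \mathbb{R}^{q \times q}$ is the precision matrix
- $\mu \in \mathbb{R}^q$ is the intercept vector

The parameters are estimated by solving:
$$\min_{B,\; \Theta \succ 0} \; g(B, \Theta) + \lambda_B \lVert B \rVert_1 + \lambda_\Theta \lVert \Theta \rVert_{1,\mathrm{off}},$$
where $g$ is the negative log-likelihood and $\lVert \Theta \rVert_{1,\mathrm{off}}$ sums the absolute values of the off-diagonal entries of $\Theta$.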
Missing values in Y are accommodated through unbiased estimating equations
using column-wise observation probabilities. Internally, X and Y
may be standardized for numerical stability; returned estimates are re-scaled
back to the original units.
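The re-scaling step mentioned above is plain arithmetic and can be checked in isolation. The sketch below uses ordinary least squares as a stand-in for the penalized fit, since only the back-transformation is of interest; with $\ell_1$ penalties the estimates themselves would differ, and the exact internals of missoNet are not assumed.

```r
set.seed(42)
n <- 200; p <- 3; q <- 2
# Predictors on very different scales (column sds of 1, 10, and 0.1)
X <- matrix(rnorm(n * p, mean = 5, sd = c(1, 10, 0.1)), n, p, byrow = TRUE)
B <- matrix(c(2, 0, -1, 0.5, 0.1, 0), p, q)
Y <- 1 + X %*% B + matrix(rnorm(n * q), n, q)

# Fit on standardized data (least squares stands in for the penalized fit)
Xs <- scale(X); Ys <- scale(Y)
B.std <- solve(crossprod(Xs), crossprod(Xs, Ys))

# Undo the scaling: B[j, k] = B.std[j, k] * sd(Y_k) / sd(X_j)
sx <- attr(Xs, "scaled:scale"); sy <- attr(Ys, "scaled:scale")
B.orig  <- diag(1 / sx) %*% B.std %*% diag(sy)
mu.orig <- colMeans(Y) - drop(colMeans(X) %*% B.orig)

# Reference fit on the raw data for comparison
ref <- coef(lm(Y ~ X))
max(abs(unname(ref[-1, ]) - B.orig))
```

The intercept is recovered from the column means once the slopes are back on the original scale, so predictions can be formed directly from raw predictors.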
The grid search spans lambda.beta and lambda.theta. The optimal
pair is selected by the user-chosen goodness-of-fit criterion GoF:
"AIC", "BIC", or "eBIC" (default). If
adaptive.search = TRUE, a two-stage pre-optimization narrows the grid
before the main search (faster on large problems, with a small risk of missing
the global optimum).
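A small base-R sketch of the grid search just described: build two log-spaced lambda paths, enumerate every pair, and pick the pair with the smallest criterion value. The helper `lambda_path`, the choice of largest lambda, and the criterion values are all hypothetical stand-ins; real AIC/BIC/eBIC scores would come from the fitted models.

```r
# Hypothetical helper: log-spaced path from lambda.max down to
# lambda.max * min.ratio (a common construction for lasso-type paths).
lambda_path <- function(lambda.max, min.ratio = 0.01, n.lambda = 10) {
  exp(seq(log(lambda.max), log(lambda.max * min.ratio), length.out = n.lambda))
}

lambda.beta  <- lambda_path(1, min.ratio = 0.01, n.lambda = 5)
lambda.theta <- lambda_path(2, min.ratio = 0.1,  n.lambda = 4)

# Every (lambda.beta, lambda.theta) pair is one candidate model.
grid <- expand.grid(lambda.beta = lambda.beta, lambda.theta = lambda.theta)

# Stand-in criterion values; in practice these would be AIC/BIC/eBIC scores.
grid$gof <- (log10(grid$lambda.beta) + 1)^2 + (log10(grid$lambda.theta))^2

best <- grid[which.min(grid$gof), ]
best
```

An exhaustive search costs one model fit per row of `grid`, which is what an adaptive two-stage search tries to avoid on large grids.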
A list of class "missoNet" with components:
- est.min: List at the selected lambda pair: Beta ($p \times q$), Theta ($q \times q$), intercept mu (length $q$), lambda.beta, lambda.theta, lambda.beta.idx, lambda.theta.idx, scalar gof (AIC/BIC/eBIC at optimum), and (if requested) relax.net.
- Length-$q$ vector of working missingness probabilities.
- Unique lambda values explored along the grid for $\lambda_B$ and $\lambda_\Theta$.
- Logical indicating whether the diagonal of $\Theta$ was penalized.
- Penalty factor matrices actually used.
- List with fitting diagnostics: n, p, q, standardize, standardize.response, the vector of criterion values gof, and the evaluated grids gof.grid.beta, gof.grid.theta (length equals number of fitted models).
Yixiao Zeng [email protected], Celia M. T. Greenwood
Zeng, Y., et al. (2025). Multivariate regression with missing response data for modelling regional DNA methylation QTLs. arXiv:2507.05990.
cv.missoNet for cross-validated selection;
generic methods such as plot() and predict() for objects of class
"missoNet".

    sim <- generateData(n = 120, p = 10, q = 6, rho = 0.1)
    X <- sim$X; Y <- sim$Z

    # Fit with defaults (criterion = eBIC)
    fit1 <- missoNet(X, Y)

    # Extract the optimal estimates
    Beta.hat  <- fit1$est.min$Beta
    Theta.hat <- fit1$est.min$Theta

    # Plot missoNet results
    plot(fit1, type = "heatmap")
    plot(fit1, type = "scatter")

    # Provide short lambda paths
    fit2 <- missoNet(
      X, Y,
      lambda.beta  = 10^seq(0, -2, length.out = 5),
      lambda.theta = 10^seq(0, -2, length.out = 5),
      GoF = "BIC"
    )

    # Test single lambda choice
    fit3 <- missoNet(
      X, Y,
      lambda.beta  = 0.1,
      lambda.theta = 0.1
    )

    # De-biased network on the active set
    fit4 <- missoNet(X, Y, relax.net = TRUE, verbose = 0)

    # Adaptive search for large problems
    fit5 <- missoNet(X = X, Y = Y, adaptive.search = TRUE, verbose = 0)

    # Parallel (requires a cluster)
    library(parallel)
    cl <- makeCluster(2)
    fit_par <- missoNet(X, Y, parallel = TRUE, cl = cl, verbose = 0)
    stopCluster(cl)

missoNet and cross-validated fits
Visualize either the cross-validation (CV) error surface or the
goodness-of-fit (GoF) surface over the $\lambda_B$-$\lambda_\Theta$
grid for objects returned by missoNet or cv.missoNet.
Two display types are supported:
a 2D heatmap (default) and a 3D scatter surface.

    ## S3 method for class 'missoNet'
    plot(
      x,
      type = c("heatmap", "scatter"),
      detailed.axes = TRUE,
      plt.surf = TRUE,
      ...
    )

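| Argument | Description |
|---|---|
| x | A fitted object returned by missoNet or cv.missoNet. |
| type | Character string specifying the plot type. One of "heatmap" (default) or "scatter". |
| detailed.axes | Logical; controls how much detail the axis labels show. Default TRUE. |
| plt.surf | Logical; for type = "scatter", whether to add light lattice lines tracing the surface. Default TRUE. |
| ... | Additional graphical arguments forwarded to ComplexHeatmap::Heatmap (heatmap) or scatterplot3d::scatterplot3d (scatter). |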
This S3 method detects whether x contains cross-validation results and
chooses an appropriate plotting backend:
- Heatmap: uses ComplexHeatmap::Heatmap with a viridis-like color ramp (via circlize::colorRamp2). The selected $(\lambda_B, \lambda_\Theta)$ pair is outlined in white; 1-SE choices (if present) are highlighted with dashed/dotted outlines.
- Scatter: uses scatterplot3d::scatterplot3d to draw the error/GoF surface on log-scaled $\lambda$ axes. When plt.surf = TRUE, light lattice lines are added, and the minimum is marked.
For type = "heatmap": a ComplexHeatmap Heatmap
object (invisibly drawn by ComplexHeatmap).
For type = "scatter": a "scatterplot3d" object,
returned invisibly.
- CV objects (created by cv.missoNet or any missoNet object that carries CV results): the color encodes the mean CV error for each $(\lambda_B, \lambda_\Theta)$ pair. The minimum-error solution is outlined; if 1-SE solutions were computed, they are also marked (dashed/dotted outlines).
- Non-CV objects (created by missoNet without CV): the color encodes the GoF value over the grid; the selected minimum (best) solution is outlined.

For heatmaps, axes show the raw $\lambda$ values, with one penalty indexing the rows and the other the columns.
For 3D scatter plots, both $\lambda$ axes are shown on a logarithmic
scale for readability.
A viridis-like palette is used. Breaks are based on distribution quantiles of the CV error or GoF values to enhance contrast across the grid.
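Quantile-based breaks are easy to reproduce in base R. The sketch below uses `hcl.colors()` with the "viridis" palette and simulated error values as stand-ins; it illustrates the idea only and does not reproduce the package's ComplexHeatmap/circlize code.

```r
set.seed(7)
# Stand-in error surface: mostly small values with a heavy upper tail.
err <- matrix(rexp(40, rate = 2), nrow = 5, ncol = 8)

n.col <- 9
# Quantile-based breaks put roughly equal numbers of cells in each color bin,
# so a few extreme cells do not wash out contrast among the rest.
brks <- quantile(err, probs = seq(0, 1, length.out = n.col + 1))
pal  <- hcl.colors(n.col, palette = "viridis")

# Map every grid cell to its bin, then to a color.
bin <- cut(err, breaks = brks, include.lowest = TRUE, labels = FALSE)
cell.col <- matrix(pal[bin], nrow = nrow(err))
table(bin)
```

Passing the same `brks` and `pal` to `image(err, breaks = brks, col = pal)` would render the surface with base graphics.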
Requires ComplexHeatmap, circlize, scatterplot3d, and grid.
Yixiao Zeng [email protected], Celia M. T. Greenwood
missoNet, cv.missoNet,
Heatmap, scatterplot3d

    sim <- generateData(n = 150, p = 10, q = 8, rho = 0.1, missing.type = "MCAR")

    ## Fit a model without CV (plots GoF surface)
    fit <- missoNet(X = sim$X, Y = sim$Z, verbose = 0)
    plot(fit, type = "heatmap")                     # GoF heatmap
    plot(fit, type = "scatter", plt.surf = TRUE)    # GoF 3D scatter

    ## Cross-validation (plots CV error surface)
    cvfit <- cv.missoNet(X = sim$X, Y = sim$Z, verbose = 0)
    plot(cvfit, type = "heatmap", detailed.axes = FALSE)
    plot(cvfit, type = "scatter", plt.surf = FALSE)

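missoNet models
Generate predicted responses for new observations from a fitted
missoNet (or cross-validated) model. The prediction at a given
regularization choice $(\lambda_B, \lambda_\Theta)$ uses the fitted
intercept(s) $\hat{\mu}$ and coefficient matrix $\hat{B}$:
$$\hat{Y} = \mathbf{1}\hat{\mu}^\top + X_{\mathrm{new}}\hat{B}.$$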

    ## S3 method for class 'missoNet'
    predict(
      object,
      newx,
      s = c("lambda.min", "lambda.1se.beta", "lambda.1se.theta"),
      ...
    )

    ## S3 method for class 'cv.missoNet'
    predict(
      object,
      newx,
      s = c("lambda.min", "lambda.1se.beta", "lambda.1se.theta"),
      ...
    )

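| Argument | Description |
|---|---|
| object | A fitted missoNet (or cross-validated) object. |
| newx | Numeric matrix of predictors with $p$ columns, in the same column order as the training data. |
| s | Character string selecting the stored solution; one of "lambda.min" (default), "lambda.1se.beta", or "lambda.1se.theta". |
| ... | Ignored; included for S3 compatibility. |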
This method does not modify or standardize newx. If the model was
trained with standardization, ensure that newx has been prepared in
the same way as the training data (same centering/scaling and column order).
A numeric matrix of predicted responses of dimension
$n_{\mathrm{new}} \times q$. Row names are taken from newx (if any),
and column names are inherited from the fitted coefficient matrix (if any).
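The prediction itself is a single matrix product plus an intercept, which can be written out by hand. In the sketch below, `mu.hat` and `Beta.hat` are made-up stand-ins for the mu and Beta components of a stored solution such as est.min; the masked error at the end is one reasonable way, assumed here, to score predictions against responses that contain NAs.

```r
set.seed(123)
p <- 4; q <- 3
mu.hat   <- c(0.5, -1, 2)                       # stand-in intercepts
Beta.hat <- matrix(c(1, 0, 0,  0,
                     0, 2, 0,  0,
                     0, 0, 0, -1), p, q,
                   dimnames = list(NULL, paste0("Y", 1:q)))
newX <- matrix(rnorm(5 * p), 5, p,
               dimnames = list(paste0("obs", 1:5), NULL))

# yhat = 1 mu^T + newX %*% Beta: add the intercept to every row
yhat <- sweep(newX %*% Beta.hat, 2, mu.hat, "+")

# Score against responses with NAs by averaging over observed cells only
Ynew <- yhat + matrix(rnorm(5 * q, sd = 0.1), 5, q)
Ynew[cbind(c(1, 3), c(2, 2))] <- NA
mse.obs <- mean((Ynew - yhat)^2, na.rm = TRUE)

dim(yhat)
```

Row names carry over from `newX` and column names from `Beta.hat`, matching the naming behavior described above.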
The s argument selects the stored solution:
- "lambda.min" (default): the minimum CV error or selected GoF solution, stored in object$est.min.
- "lambda.1se.beta": the 1-SE solution favoring larger $\lambda_B$, stored in object$est.1se.beta.
- "lambda.1se.theta": the 1-SE solution favoring larger $\lambda_\Theta$, stored in object$est.1se.theta.

1-SE solutions are available only if the model was fit with
compute.1se = TRUE during training or cross-validation.
missoNet, cv.missoNet, plot.missoNet

    sim <- generateData(n = 200, p = 8, q = 6, rho = 0.1,
                        missing.type = "MCAR", seed = 123)
    tr  <- 1:150
    tst <- 151:200

    ## Cross-validated fit, keeping 1-SE solutions
    cvfit <- cv.missoNet(X = sim$X[tr, ], Y = sim$Z[tr, ],
                         kfold = 5, compute.1se = TRUE, verbose = 0)

    ## Predict on held-out set
    yhat_min  <- predict(cvfit, newx = sim$X[tst, ], s = "lambda.min")
    yhat_b1se <- predict(cvfit, newx = sim$X[tst, ], s = "lambda.1se.beta")
    yhat_t1se <- predict(cvfit, newx = sim$X[tst, ], s = "lambda.1se.theta")
    dim(yhat_min)   # 50 x q
