In this vignette, we illustrate how to apply FLORAL to fit a Cox model with longitudinal microbiome data. Due to limited availability of public data sets with survival information, we use simulated data for illustrative purposes.
We will use the built-in simulation function simu() to generate longitudinal compositional features and the corresponding time-to-event outcomes. The simulation is based on a piecewise exponential distribution, as described by Hendry (2014).
By default, the first 10 of the 500 features simulated below are associated with the time-to-event outcome.
library(FLORAL)

simdat <- simu(n = 200,            # sample size
               p = 500,            # number of features
               model = "timedep",  # time-dependent features with a survival outcome
               pct.sparsity = 0.8, # proportion of zeros
               rho = 0,            # feature-wise correlation
               longitudinal_stability = TRUE # simulate longitudinal features with stable trajectories
)
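To make the piecewise exponential idea concrete, the following base R sketch draws event times when the hazard is constant within each interval, which is the situation that arises when covariates only change at observation times. It is an illustration of the general technique, not the code inside simu(); the function name rpwexp_one and the arguments breaks and hazards are hypothetical.

```r
# Sketch only: draw one event time from a piecewise exponential distribution.
# `breaks` holds the interval start times (the first must be 0) and `hazards`
# the constant hazard rate on each interval. All names here are hypothetical.
rpwexp_one <- function(breaks, hazards) {
  stopifnot(length(breaks) == length(hazards),
            breaks[1] == 0,
            all(hazards > 0))
  target <- rexp(1)               # unit-exponential draw = cumulative hazard at the event
  widths <- c(diff(breaks), Inf)  # the last interval is open-ended
  cumhaz <- 0
  for (k in seq_along(hazards)) {
    step <- hazards[k] * widths[k]  # hazard accumulated across interval k
    if (cumhaz + step >= target) {
      # the event falls inside interval k: solve for the exact time
      return(breaks[k] + (target - cumhaz) / hazards[k])
    }
    cumhaz <- cumhaz + step
  }
}

set.seed(1)
times <- replicate(1000, rpwexp_one(breaks = c(0, 1, 2), hazards = c(0.5, 1, 2)))
summary(times)
```

The sketch inverts the cumulative hazard: an event occurs when the accumulated hazard first reaches a unit-exponential draw, so a higher hazard in later intervals makes late events progressively less likely to be delayed further.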
With the simulated data, the log-ratio lasso Cox model with time-dependent features can be fitted by running the function below. Here we describe each argument in detail:
- Specify longitudinal = TRUE so that the algorithm uses the appropriate method to handle longitudinal data.
- x should be the count matrix, where rows specify samples and columns specify features.
- The patient IDs corresponding to the rows of x should be input as id.
- The observation times corresponding to the rows of x should be input as tobs.
- A Surv object (Surv(time, status)) of unique patients should be input as y. Please note that the survival data should be sorted with respect to the IDs specified in id.
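library(survival) # provides Surv()

fit <- FLORAL(x = simdat$xcount,                 # count matrix (samples x features)
              y = Surv(simdat$data_unique$t,
                       simdat$data_unique$d),    # one row per unique patient
              family = "cox",
              longitudinal = TRUE,
              id = simdat$data$id,               # patient ID of each sample
              tobs = simdat$data$t0,             # observation time of each sample
              progress = FALSE,
              plot = TRUE)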
fit$selected
#> $min
#> [1] "taxa1" "taxa2" "taxa27" "taxa366" "taxa38" "taxa5" "taxa6"
#> [8] "taxa8" "taxa9"
#>
#> $`1se`
#> [1] "taxa1" "taxa5" "taxa6" "taxa8" "taxa9"
#>
#> $min.2stage
#> [1] "taxa2" "taxa366" "taxa38" "taxa5" "taxa6" "taxa8" "taxa9"
#>
#> $`1se.2stage`
#> [1] "taxa1" "taxa5" "taxa6" "taxa8" "taxa9"
The list of selected features is saved in fit$selected, as shown above.
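As a quick check of the selection, the sketch below compares each selected set with the features that were simulated as associated with the outcome. The feature names are copied from the output shown above so that the block runs on its own. It assumes that the first 10 features are named taxa1 through taxa10, which is inferred from the printed names rather than stated by the package.

```r
# Selected features, copied from the fit$selected output shown above.
selected <- list(
  min          = c("taxa1", "taxa2", "taxa27", "taxa366", "taxa38",
                   "taxa5", "taxa6", "taxa8", "taxa9"),
  `1se`        = c("taxa1", "taxa5", "taxa6", "taxa8", "taxa9"),
  min.2stage   = c("taxa2", "taxa366", "taxa38", "taxa5", "taxa6",
                   "taxa8", "taxa9"),
  `1se.2stage` = c("taxa1", "taxa5", "taxa6", "taxa8", "taxa9")
)

# Assumption: the 10 truly associated features are named taxa1, ..., taxa10.
truth <- paste0("taxa", 1:10)

# One row per selection rule: how many features were selected, and how many
# of those are among the truly associated features.
selection_summary <- t(sapply(selected, function(s) {
  c(n_selected = length(s),
    true_pos   = sum(s %in% truth),
    false_pos  = sum(!(s %in% truth)))
}))
selection_summary
```

In this run, both 1se rules select five features, all of which are among the first ten, while the min rules select more features at the cost of a few false positives.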
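To prepare the data appropriately in practice, we have the following recommendations:
- Prepare the survival data with one row per unique patient, and create a Surv object from it for input as y.
- Prepare the sample-level data with one row per sample, including the columns used for id and tobs. Save the feature table for input as x.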
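The recommendations above can be sketched with a small toy example. All object names and values below (samples, counts, patients) are hypothetical. Ordering samples by patient and time, and matching the patient rows to the order in which IDs appear in id, is one conservative reading of the sorting requirement rather than a documented rule of the package.

```r
library(survival)

# Hypothetical sample-level data: one row per collected sample.
samples <- data.frame(
  id   = c("P2", "P1", "P1", "P3", "P2"),
  tobs = c(0, 0, 7, 0, 14),
  stringsAsFactors = FALSE
)

# Hypothetical count table whose rows are aligned with `samples`.
counts <- matrix(c(5, 0, 3, 8, 0,
                   0, 2, 0, 1, 4,
                   7, 7, 0, 0, 2),
                 nrow = 5,
                 dimnames = list(NULL, c("taxa1", "taxa2", "taxa3")))

# Hypothetical patient-level data: one row per unique patient.
patients <- data.frame(
  id     = c("P3", "P1", "P2"),
  time   = c(30, 12, 25),
  status = c(0, 1, 1),
  stringsAsFactors = FALSE
)

# Order samples by patient and observation time; reorder the counts identically.
ord     <- order(samples$id, samples$tobs)
samples <- samples[ord, ]
counts  <- counts[ord, , drop = FALSE]

# Put the patient rows in the order in which IDs appear in the sample data.
patients <- patients[match(unique(samples$id), patients$id), ]

# Survival outcome for the unique patients.
y <- Surv(patients$time, patients$status)

# Basic consistency checks before fitting.
stopifnot(nrow(counts) == nrow(samples),
          identical(patients$id, unique(samples$id)),
          all(samples$tobs <= patients$time[match(samples$id, patients$id)]))
```

With objects prepared this way, counts would be passed as x, y as y, samples$id as id, and samples$tobs as tobs, mirroring the call shown earlier in this vignette.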