We use the built-in dataset bladder1_recforest for this
example. We split its individuals into two subsamples, one for
training the model and one for testing it.
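Because each individual contributes several rows, the split is made on ids rather than on rows. The following is a minimal sketch of that idea on a hypothetical toy data frame (`toy` and every `toy_*` name are illustrative, not part of the package), with a check that no individual ends up in both sets.

```r
# Hypothetical long-format data: several rows per individual.
set.seed(42)  # makes the toy data and the random split reproducible
toy <- data.frame(
  id = rep(1:10, times = c(2, 3, 1, 2, 4, 1, 3, 2, 1, 2)),
  x  = rnorm(21)
)

# Sample ids, not rows, so all rows of an individual stay together.
toy_ids <- unique(toy$id)
toy_train_ids <- sample(toy_ids, size = 7, replace = FALSE)
toy_test_ids <- setdiff(toy_ids, toy_train_ids)

toy_train <- toy[toy$id %in% toy_train_ids, ]
toy_test <- toy[toy$id %in% toy_test_ids, ]

# No individual appears in both sets, and no row is lost.
stopifnot(length(intersect(toy_train$id, toy_test$id)) == 0)
stopifnot(nrow(toy_train) + nrow(toy_test) == nrow(toy))
```

Calling set.seed() before sample() is what makes such a split repeatable from one run to the next.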
data("bladder1_recforest")
id_individuals_bladder1_recforest <- unique(bladder1_recforest$id)
train_ids <- sample(id_individuals_bladder1_recforest, size = 100, replace = FALSE)
test_ids <- setdiff(id_individuals_bladder1_recforest, train_ids)
train_bladder1_recforest <- bladder1_recforest %>%
  filter(id %in% train_ids)
test_bladder1_recforest <- bladder1_recforest %>%
  filter(id %in% test_ids)

Hyperparameters are fixed by the user here; in real-world settings
they should be tuned. Given the small number of predictors, mtry
was set to 2. For further details on hyperparameters, call
?train_forest.
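As a rough illustration of two of these settings, the sketch below assumes the usual random-forest reading of mtry (the number of covariates drawn at random as split candidates at a node) and treats n_bootstrap as a sample size of about two thirds of the training individuals. Both readings are assumptions made for illustration; ?train_forest remains the reference. The names `covariate_names`, `split_candidates`, and `n_train` are hypothetical.

```r
covariate_names <- c("treatment", "number", "size")
mtry <- 2

# Assumed meaning of mtry: at one hypothetical node, only mtry of the
# three covariates are considered as split candidates.
set.seed(1)
split_candidates <- sample(covariate_names, size = mtry)

# Bootstrap size used in this example: about two thirds of the
# 100 training individuals.
n_train <- 100
n_bootstrap <- round(2 * n_train / 3)
n_bootstrap  # 67
```

With only three covariates, mtry = 2 still leaves some randomness in which covariates each split can use.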
set.seed(1234)
trained_forest <- train_forest(
  data = train_bladder1_recforest,
  id_var = "id",
  covariates = c("treatment", "number", "size"),
  time_vars = c("t.start", "t.stop"),
  death_var = "death",
  event = "event",
  n_trees = 3,
  n_bootstrap = round(2 * length(train_ids) / 3),
  mtry = 2,
  minsplit = 3,
  nodesize = 15,
  method = "NAa",
  min_score = 5,
  max_nodes = 20,
  seed = 111,
  parallel = FALSE,
  verbose = FALSE
)

Predictions from a recforest model are the expected mean cumulative number of recurrent events for each individual at the end of follow-up. Evaluation on new data with three metrics (C-index, integrated MSE, and integrated score, each adapted to recurrent events) will be available soon.
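To give a feel for the quantity being predicted, here is a minimal base R sketch of a nonparametric mean cumulative function estimate (a Nelson-Aalen-type estimator) on hypothetical toy data. It ignores terminal events and covariates, and it is not a description of how recforest computes its individual predictions; the data frames `events` and `followup` are invented for illustration.

```r
# Hypothetical toy data: one row per recurrent event, plus each
# individual's end of follow-up (individual 4 has no event).
events <- data.frame(
  id   = c(1, 1, 2, 3, 3, 3),
  time = c(2, 5, 3, 1, 4, 6)
)
followup <- data.frame(id = 1:4, end = c(6, 4, 7, 5))

event_times <- sort(unique(events$time))

# Number of events observed at each distinct event time.
n_events <- sapply(event_times, function(t) sum(events$time == t))

# Number of individuals still under follow-up at each event time.
n_at_risk <- sapply(event_times, function(t) sum(followup$end >= t))

# Mean cumulative function: running sum of events per individual at risk.
mcf <- data.frame(
  time = event_times,
  mcf  = cumsum(n_events / n_at_risk)
)
mcf
```

The last value of mcf$mcf is the estimated mean number of events per individual by the latest event time, which is the population-level analogue of the per-individual prediction described above.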