Overview

The MASTA algorithm is a semi-supervised learning method that requires three input data sets:

  • Long-form longitudinal encounter data used to predict the time-to-event outcomes [longitudinal]

  • Follow-up time data giving the length of follow-up for each patient [follow_up_time]

  • Labeled data with time-to-event outcomes and baseline predictors [survival]

In Step I of the MASTA algorithm, longitudinal and follow_up_time are used to extract features from the estimated subject-specific intensity functions of individual encounters. In Step II, survival and follow_up_time are used to train and evaluate risk prediction models for the survival outcomes. The MASTA package includes sample versions of all three data sets.

library(MASTA)
## Loading required package: survival

Longitudinal Encounter Data

?longitudinal
head(longitudinal)
##   code id time
## 1    1  4    1
## 2    1  4    1
## 3    1  4    1
## 4    1  4    2
## 5    1  5    4
## 6    1  5    4
table(longitudinal$code)
## 
##      1      2      3 
## 168374  68242  21501
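
Because longitudinal is in long form, each row is one encounter, and a subject can have several rows with the same code and time. The sketch below shows one way to count encounters per subject and code. It is only an illustration: it uses a toy copy of the six rows printed above rather than the packaged data, and the names long, counts, and n_encounters are made up for this example.

```r
# Toy copy of the six rows shown by head(longitudinal); not the full data set.
long <- data.frame(
  code = c(1, 1, 1, 1, 1, 1),
  id   = c(4, 4, 4, 4, 5, 5),
  time = c(1, 1, 1, 2, 4, 4)
)

# Count rows (encounters) within each id/code combination.
counts <- aggregate(time ~ id + code, data = long, FUN = length)

# aggregate() keeps the column name "time"; rename it to say what it holds.
names(counts)[names(counts) == "time"] <- "n_encounters"

print(counts)
```

With the packaged data, the same call should work after replacing long with longitudinal.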

Follow-up Time Data with the Training/Validation Indicator

Each subject has exactly one record in this data set. The variable train_valid indicates which cohort each subject belongs to: training (1) or validation (2).

?follow_up_time
head(follow_up_time)
##   id  fu_time train_valid
## 1  1 49.41273           1
## 2  2 13.93018           1
## 3  3 12.55031           1
## 4  4 14.85010           1
## 5  5 80.65708           1
## 6  6 42.64476           1
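
The one-record-per-subject rule and the train_valid coding described above can be checked before any modeling. The following is a minimal sketch on a toy copy of the six rows printed above, not a function of the MASTA package; the names fu, train_ids, and valid_ids are assumptions for this example. All six printed rows belong to the training cohort, so the validation set is empty in this excerpt.

```r
# Toy copy of the six rows shown by head(follow_up_time); not the full data set.
fu <- data.frame(
  id          = 1:6,
  fu_time     = c(49.41273, 13.93018, 12.55031, 14.85010, 80.65708, 42.64476),
  train_valid = c(1, 1, 1, 1, 1, 1)
)

# Each subject should appear exactly once.
stopifnot(anyDuplicated(fu$id) == 0)

# train_valid should only take the values 1 (training) or 2 (validation).
stopifnot(all(fu$train_valid %in% c(1, 2)))

# Subject ids in each cohort.
train_ids <- fu$id[fu$train_valid == 1]
valid_ids <- fu$id[fu$train_valid == 2]

cat("training:", length(train_ids), " validation:", length(valid_ids), "\n")
```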

Time-to-Event Data

Each subject has exactly one record in this data set. The variable event_ind indicates whether the subject had an event (1) or not (0). For subjects without an event (i.e., event_ind=0), event_time in this data set should equal fu_time in follow_up_time. The current version of the MASTA package requires at least one baseline predictor in this data set.

?survival
head(survival)
##   id event_ind event_time cov_1 cov_2 cov_3
## 1  1         1    9.36345    79     1     0
## 2  2         0   13.93018    81     0     0
## 3  3         0   12.55031    55     1     1
## 4  4         0   14.85010    72     1     0
## 5  5         0   80.65708    83     1     1
## 6  6         1   15.70431    47     1     0
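
The requirements stated for this data set (censored subjects have event_time equal to fu_time, and at least one baseline predictor is present) can be verified with a short check. The sketch below is illustrative only: it uses toy copies of the rows printed in this document, the names surv, fu, chk, mismatch_ids, and cov_cols are made up, and the numerical tolerance of 1e-6 is an assumption rather than anything the package prescribes.

```r
# Toy copies of the rows printed by head(); not the full packaged data.
fu <- data.frame(
  id      = 1:6,
  fu_time = c(49.41273, 13.93018, 12.55031, 14.85010, 80.65708, 42.64476)
)
surv <- data.frame(
  id         = 1:6,
  event_ind  = c(1, 0, 0, 0, 0, 1),
  event_time = c(9.36345, 13.93018, 12.55031, 14.85010, 80.65708, 15.70431),
  cov_1      = c(79, 81, 55, 72, 83, 47),
  cov_2      = c(1, 0, 1, 1, 1, 1),
  cov_3      = c(0, 0, 1, 0, 1, 0)
)

# Join on subject id so each row carries both event_time and fu_time.
chk <- merge(surv, fu, by = "id")

# Censored subjects (event_ind == 0) should have event_time equal to fu_time.
censored <- chk$event_ind == 0
mismatch_ids <- chk$id[censored & abs(chk$event_time - chk$fu_time) > 1e-6]

# Every column other than id, event_ind, and event_time is treated as a
# baseline predictor; at least one is required.
cov_cols <- setdiff(names(surv), c("id", "event_ind", "event_time"))

stopifnot(length(mismatch_ids) == 0, length(cov_cols) >= 1)
```

If mismatch_ids is not empty, it lists the subjects whose censoring time disagrees with their recorded follow-up time.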