Degree Name

BS in Statistics


Statistics Department


Rebecca Ottesen and Hunter Glanz



Background Nationally, the 5-year survival rate for patients with breast cancer is higher than that for patients diagnosed with other types of cancer. In addition to higher survival rates, breast cancer patients also tend to have higher rates of loss to follow-up than patients with other cancers. When a patient becomes lost, the occurrence of distant metastasis cannot be reliably ascertained unless the patient had a breast cancer-specific death. Because information is missing for lost patients, results from statistical analyses that include them may not adequately reflect the actual recurrence and disease-free survival rates. This study explored the impact of lost patients on the unadjusted and adjusted disease-free survival (DFS) of breast cancer patients seen at the City of Hope National Medical Center in Los Angeles from 1997 to 2012.

Methods Female breast cancer patients with stage 0, I, II, or III disease at diagnosis were included in the analyses (N = 2,358). Of these patients, 1,937 were deemed non-lost and 421 were lost. Kaplan-Meier estimates of DFS were stratified by lost status. Cox proportional hazards models were built to adjust for multiple predictors: age group at diagnosis, race, comorbidity score, stage at diagnosis, health insurance type, employment status, and lymphovascular invasion (LVI). A logistic model was used to predict each patient's probability of being lost from the categorical distance between the patient's residence and the City of Hope, age at diagnosis, stage at diagnosis, insurance type, hormone receptor status, and HER2/neu status. Patients were separated into 20 groups based on the resulting propensity scores. Lost patients were then removed from their assigned propensity score group and replaced with simulated lost patients from the corresponding group. Simulated lost patients were sampled with replacement from the non-lost patients within each group, and information from different assessment periods was then removed from those patients. The new 5-year DFS rate and hazard ratios were calculated. The process of simulating lost patients and recalculating the 5-year DFS and hazard ratio was bootstrapped 1,000 times.
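The resampling procedure described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the record fields (`group`, `lost`), the helper names, and the choice of equal-sized, rank-based propensity groups are all assumptions made for the example. The step that removes assessment-period information from simulated patients and the Kaplan-Meier and Cox calculations are left to a caller-supplied `statistic` function.

```python
import random


def assign_groups(scores, n_groups=20):
    """Rank patients by propensity score and split them into n_groups
    near-equal bins (assumption: the source does not say how the 20
    groups were formed)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    groups = [0] * len(scores)
    for rank, i in enumerate(order):
        groups[i] = rank * n_groups // len(scores)
    return groups


def simulate_lost(patients, rng):
    """Replace each actually lost patient with a copy of a non-lost patient
    drawn with replacement from the same propensity score group."""
    donors_by_group = {}
    for p in patients:
        if not p["lost"]:
            donors_by_group.setdefault(p["group"], []).append(p)

    simulated = []
    for p in patients:
        if not p["lost"]:
            simulated.append(p)
            continue
        donors = donors_by_group.get(p["group"])
        if donors:  # a group with no non-lost donors contributes nothing
            donor = dict(rng.choice(donors))
            donor["simulated_lost"] = True  # hypothetical flag for later steps
            simulated.append(donor)
    return simulated


def bootstrap(patients, statistic, n_rep=1000, seed=0):
    """Repeat the simulation n_rep times and collect the chosen statistic."""
    rng = random.Random(seed)
    return [statistic(simulate_lost(patients, rng)) for _ in range(n_rep)]
```

In this sketch, `statistic` would be whatever summary is of interest, such as a 5-year DFS estimate computed after truncating the follow-up of the records flagged `simulated_lost`.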

Results The 5-year DFS was 95.1% for lost patients and 84.6% for non-lost patients. After adjusting for age, race, comorbidity score, stage, insurance, employment, and LVI, the risk of death or recurrence was 61.5% lower for lost patients than for non-lost patients (HR = .385, P
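The 61.5% figure follows directly from the reported hazard ratio: a hazard ratio below 1 corresponds to a relative reduction in hazard of (1 - HR). A small sketch of that arithmetic, with a function name chosen only for illustration:

```python
def percent_risk_reduction(hazard_ratio):
    """Convert a hazard ratio below 1 into a percent reduction in hazard."""
    return (1.0 - hazard_ratio) * 100.0


# Hazard ratio reported in the abstract for lost vs. non-lost patients.
print(round(percent_risk_reduction(0.385), 1))  # 61.5
```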

Conclusion A higher-than-average number of assessments had to be removed from the simulated lost patients to reproduce the disease-free survival rates of the actual lost patients. This indicates that the differences in disease-free survival rates between non-lost and actual lost patients are not due only to missing information; lost patients may actually be healthier than their non-lost counterparts, which could be a reason that these patients stopped following up at City of Hope.