ESTIMATE gives a population-based estimate of the risk of death for hormone receptor-positive (HR+) breast cancer patients based on specific characteristics. Estimates are based on data obtained from the Surveillance, Epidemiology, and End Results (SEER) program. This tool should not be used to make risk predictions for individuals, but rather can be used to learn about mortality risks among given subgroups. For more information about this tool please visit the About section. Patients using this tool should consult their doctors.
Select among the following characteristics, and specify the timeframe for estimation. Click “Calculate” when ready. This button must be used to re-estimate.
The ESTIMATE (ESTImating Mortality in breAsT cancEr) tool provides a population-based, non-parametric estimate of the cumulative risk of breast cancer-specific mortality (BCSM), non-BCSM, and all-cause mortality for women with HR+ breast cancer based on specific characteristics, and for flexible periods of time after initial diagnosis. The user interface (UI) enables selection among several patient/tumor characteristics – including age at diagnosis, tumor size (T), nodal status (N), tumor grade, and the number of years survived since initial diagnosis – and the desired estimation time frame, which is the number of years over which to estimate, subsequent to the years already survived. The 'Years survived since initial diagnosis' and 'Subsequent years over which to estimate' inputs are interdependent: together they cannot extend beyond 20 years post-diagnosis. Based on the specified inputs, ESTIMATE subsets the full study cohort and estimates the cumulative incidence of death in the chosen subgroup. Point estimates and plots of the cumulative incidence functions for BCSM, non-BCSM, and all-cause mortality are provided, each with optional confidence intervals. Brief interpretive descriptions are also included with each set of output. This tool was developed using the shiny package in R.
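The link between the two time inputs can be illustrated with a short sketch. This is written in Python for illustration only and is not the tool's R/shiny code; the function names and the constant MAX_HORIZON_YEARS are hypothetical, and the only rule taken from the description above is that estimation cannot extend past 20 years post-diagnosis.

```python
from typing import Tuple

# Assumption: the 20-year limit described above, expressed as a constant.
MAX_HORIZON_YEARS = 20


def max_subsequent_years(years_survived: int) -> int:
    """Largest estimation window still allowed after surviving this many years."""
    return MAX_HORIZON_YEARS - years_survived


def estimation_window(years_survived: int, subsequent_years: int) -> Tuple[int, int]:
    """Return (start, end) in years since diagnosis, or raise if out of range."""
    if years_survived < 0 or subsequent_years <= 0:
        raise ValueError("years_survived must be >= 0 and subsequent_years > 0")
    end = years_survived + subsequent_years
    if end > MAX_HORIZON_YEARS:
        raise ValueError(
            f"window ends at year {end}; at most "
            f"{max_subsequent_years(years_survived)} subsequent years are allowed"
        )
    return years_survived, end
```

For example, a patient who has already survived 5 years could request at most 15 subsequent years, and a request for 5 survived plus 10 subsequent years covers years 5 through 15 after diagnosis.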
Data were obtained from the National Cancer Institute’s SEER 18-registry (1973-2016) database. The cohort includes women with a microscopically confirmed HR+ (estrogen- or progesterone receptor-positive) breast cancer diagnosed between 1990 and 2006, such that there is a minimum of 10 years of potential follow-up time. Women with any cancers diagnosed prior to the eligible HR+ breast cancer were excluded, because cause of death in this SEER database is attributed only to the first cancer. If patients with prior cancers were included, any cancer-related death would be linked to their first cancer diagnosis in SEER, regardless of whether that cancer was the true cause of death. In such a situation, if the actual cause of death was the HR+ breast cancer but attribution went to a previous cancer, the breast cancer-related death would not be captured. Restricting the cohort to patients whose first lifetime cancer diagnosis was the HR+ breast cancer of interest was therefore required to capture deaths caused by HR+ breast cancer. However, patients may still have had any number of subsequent cancers, and we acknowledge that this also poses a potential bias to accurate attribution of cause of death. Please see Limitations for additional discussion. The figure to the right diagrams selection of the HR+ study cohort.
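The eligibility rules above can be summarized as a simple filter. The following Python sketch is illustrative only: the record fields are hypothetical and do not correspond to actual SEER variable names, and it assumes the 1990 and 2006 bounds are inclusive.

```python
from dataclasses import dataclass


@dataclass
class CaseRecord:
    # Hypothetical fields for illustration; not actual SEER variable names.
    sex: str
    diagnosis_year: int
    er_positive: bool
    pr_positive: bool
    microscopically_confirmed: bool
    prior_cancer_count: int  # cancers diagnosed before this breast cancer


def is_eligible(record: CaseRecord) -> bool:
    """Apply the cohort criteria described above to one record."""
    hormone_receptor_positive = record.er_positive or record.pr_positive
    return (
        record.sex == "female"
        and record.microscopically_confirmed
        and hormone_receptor_positive
        # Assumption: both boundary years are included.
        and 1990 <= record.diagnosis_year <= 2006
        # The breast cancer must be the first lifetime cancer diagnosis.
        and record.prior_cancer_count == 0
    )
```

Subsequent cancers are deliberately not checked here, mirroring the description above: only cancers diagnosed before the eligible breast cancer lead to exclusion.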
BCSM is defined as the interval from initial breast cancer diagnosis to death from breast cancer, and patients still alive are censored at last follow-up. Similarly, non-BCSM and all-cause mortality are defined as the intervals from initial diagnosis to death from causes other than breast cancer, or any cause (respectively), and patients still alive are censored at last follow-up. The cumulative incidences of BCSM and non-BCSM are estimated non-parametrically via the method of Gray (1988), as implemented in the cmprsk package in R. All-cause mortality is estimated by taking the complement of the Kaplan-Meier survival function, as implemented in the survival package. Because the approach is non-parametric, it is possible to specify a subgroup and estimation time frame for which there are too few patients (and too few deaths) in SEER to yield reasonable estimates. For this reason, estimates are displayed only when there are at least 20 total deaths within the specified estimation time frame in the subgroup of interest; when there are fewer than 20 deaths, estimates are withheld and an “Inestimable” message is shown.
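To make the estimation step concrete, here is a minimal Python sketch of a non-parametric cumulative incidence estimate with two competing causes of death, together with the 20-death display rule. It is not the tool's implementation, which uses the cmprsk and survival packages in R. The function name, the status codes, and the handling of years already survived (keeping only patients still under follow-up after that point) are assumptions made for illustration, and confidence intervals are omitted.

```python
from typing import Dict, List, Optional, Tuple

CENSORED, BCSM, NON_BCSM = 0, 1, 2  # hypothetical status codes
MIN_DEATHS = 20  # display threshold described above


def cumulative_incidence(
    records: List[Tuple[float, int]], start: float, end: float
) -> Optional[Dict[str, float]]:
    """Estimate risks of death between `start` and `end` years after diagnosis.

    `records` holds (years from diagnosis to death or last follow-up, status).
    Returns None (the "Inestimable" case) when the window has too few deaths.
    """
    # Assumption: condition on survival by keeping patients followed past `start`.
    at_risk = [(t, s) for t, s in records if t > start]
    deaths = sum(1 for t, s in at_risk if s != CENSORED and t <= end)
    if deaths < MIN_DEATHS:
        return None

    surv = 1.0  # Kaplan-Meier probability of surviving all causes so far
    cif = {BCSM: 0.0, NON_BCSM: 0.0}
    event_times = sorted({t for t, s in at_risk if s != CENSORED and t <= end})
    for tj in event_times:
        n = sum(1 for t, _ in at_risk if t >= tj)  # number still at risk at tj
        for cause in cif:
            d = sum(1 for t, s in at_risk if t == tj and s == cause)
            # Cause-specific increment uses survival just before tj.
            cif[cause] += surv * d / n
        d_all = sum(1 for t, s in at_risk if t == tj and s != CENSORED)
        surv *= 1.0 - d_all / n
    return {"bcsm": cif[BCSM], "non_bcsm": cif[NON_BCSM], "all_cause": 1.0 - surv}
```

In this sketch the all-cause estimate is the complement of the Kaplan-Meier survival probability, and it equals the sum of the two cause-specific estimates. A death from the other cause removes a patient from the risk set rather than being treated as censoring, which is what distinguishes a cumulative incidence estimate from a naive one-minus-Kaplan-Meier estimate for a single cause.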
Refer to the demographics table below for descriptive statistics of the HR+ cohort.
ESTIMATE has a number of important strengths, particularly when compared to other risk tools. First, the use of a large, nationally representative database like SEER allows for greater statistical confidence in estimates, as well as enhanced generalizability. Notably, 26% of the sample was aged ≥70 years at diagnosis (N = 68,786), an important group of patients who have been historically underrepresented in clinical trials [1]. Another important strength is the tool’s accommodation of user-defined time periods within 20 years, as opposed to providing rigid 0-10 or 0-20 year estimates. This allows for estimation of residual risks of BCSM and non-BCSM after having already survived a certain number of years since initial diagnosis, which is useful for understanding how risks may change over time. Finally, the tool can be valuable in regions of the world where genomic-based risk tools are either unavailable or difficult to access.
There are several limitations to this tool that are important to consider. First, SEER does not collect data on treatments received, so treatment is not accounted for when estimating mortality. Instead, estimates can be thought of as averaged over all of the types of treatment being utilized for the given disease type during the period for which data are available, with the caveat that the treatments used today are somewhat different from those used during some of the time periods available for estimation; this is one of the primary reasons that the tool should not be used to make predictions for individual patients. Second, there is potential bias induced by the attribution of cause of death in the SEER database. As mentioned above in Data Source and Cohort Selection, the database used to extract the HR+ cohort attributes cancer-related deaths only to the first cancer diagnosis. Such an attribution method is imperfect, and it is impossible to know for sure that the cause of death recorded is correct. Cohort selection was tailored to mitigate this issue, but it likely remains to some extent. A third limitation is that, similar to treatments received, comorbidities and disease recurrences are not captured in SEER and therefore could not be accounted for in mortality estimation, even though both are likely to influence mortality. Additionally, all patients were diagnosed before 2010, so their HER2 status and receipt of HER2-directed therapy are unknown. However, a prior study evaluating tumor subtypes reported that among all HR-positive breast cancers in SEER, only 13% were HER2-positive and 87% were HER2-negative [2], a distribution that is likely to be similar in the present cohort. Moreover, HER2 is unlikely to play a large role in late recurrence, given the low risk of recurrence seen in years 5-10 for HR+, HER2+ breast cancer [3].