Distinguishing aggressive cancers from non-aggressive or non-progressing cancers is an issue of both clinical
and public health importance, particularly for cancers with an available screening test. For
breast cancer, mammographic screening has been shown in randomized trials to reduce breast cancer
mortality, but because its sensitivity and specificity are limited, some breast cancers are missed by
screening. These so-called interval-detected breast cancers, diagnosed between regular screenings, are known
to have a more aggressive clinical profile. In addition, of the cancers detected by mammography, some are
indolent while others are more likely to recur despite treatment. The pilot study proposed herein is highly
responsive to the EDRN supplement titled “Biomarkers to Distinguish Aggressive Cancers from Nonaggressive
or Non-progressing Cancers” in that it addresses both of the research objectives related to these
issues outlined in the notice for this supplement:
Aim 1: To identify biomarkers in tumor tissue related to risk of interval-detected vs. mammography
screen-detected breast cancer, focusing on early-stage invasive disease. We will compare gene
expression profiles, using the whole-genome cDNA-mediated Annealing, Selection, extension and Ligation
(DASL) assay, of 50 screen-detected cancers to those of 50 interval-detected cancers. Through this approach
we will advance our understanding of the molecular characteristics of interval- vs. screen-detected breast
cancers and discover novel biomarkers that distinguish between them.
Aim 2: To identify biomarkers in tumor tissue related to risk of cancer recurrence among patients with
screen-detected early-stage invasive breast cancer. Using the DASL assay, we will compare gene
expression profiles from screen-detected early-stage breast cancers that either recurred or
did not recur within five years. These two groups of patients will be matched on multiple factors, including
tumor stage and treatments received. Our goal with this comparison is to identify novel biomarkers that
discriminate more aggressive tumors that recur from less aggressive tumors
that do not recur.
This project will evaluate well-characterized tumor tissue specimens using a robust high-dimensional laboratory
approach and generate preliminary data that will motivate a larger-scale study of high translational relevance.
Gene expression data will be pre-processed, normalized and cleaned as described in our protocol
(Appendix 1). We will perform analyses at two levels, the gene-set level and the gene level, to identify gene sets and
genes that are associated with interval- vs. screen-detected disease and those that are associated with
recurrent vs. non-recurrent disease. In general, we will account for multiple testing by controlling the false
discovery rate (FDR) [23]. We will use a 5% FDR when assessing statistical significance across all samples, and
a 1% FDR (or lower) when performing subgroup analyses. For our gene-level analysis, we will fit a linear
regression model for each gene to identify genes showing differential expression in our comparisons of interest.
Matching variables will be included as covariates in the linear regression models. For gene set-level analyses, we
will first rank genes from high to low based on their association in each comparison; then, for each gene set, we
will calculate an enrichment score that reflects how strongly the gene set is enriched for genes that
differentiate between our comparison groups (http://www.broad.mit.edu/GSEA) [24,25]. The statistical significance
of enrichment scores will be evaluated against each of the null distributions
formed by 1) permuting exposure status within each matched set and 2) permuting genes. Using both types of
null distributions identifies gene sets associated with a given comparison as well as those particularly enriched
with associated genes. Performing the analysis in two tiers will help us to identify not only the genes that are
individually most likely to differentiate our comparison groups, but also those that may have only a moderate
effect individually but that collectively, as a gene set, may strongly discriminate between our comparison groups.
This will enhance the power to detect associated genes and gene sets.
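The two analysis tiers described above can be illustrated with a minimal Python sketch. It is not the study's actual pipeline, which is specified in Appendix 1. The function names (`gene_level_tests`, `benjamini_hochberg`, `enrichment_score`) are hypothetical. The p-values use a normal approximation to the t distribution so that only NumPy is required. The FDR step assumes the Benjamini-Hochberg procedure, and the enrichment score is the simple unweighted running-sum form, whereas the published GSEA method can also weight hits by the association statistic.

```python
import math
import numpy as np


def gene_level_tests(expr, group, covariates):
    """Per-gene linear regression of expression on group plus matching covariates.

    expr: (n_samples, n_genes); group: (n_samples,) coded 0/1;
    covariates: (n_samples, k). Returns t statistics and two-sided p-values
    for the group coefficient (normal approximation, an assumption here).
    """
    n = expr.shape[0]
    design = np.column_stack([np.ones(n), group, covariates])
    xtx_inv = np.linalg.inv(design.T @ design)
    beta = xtx_inv @ design.T @ expr              # all genes fitted at once
    resid = expr - design @ beta
    dof = n - design.shape[1]
    sigma2 = (resid ** 2).sum(axis=0) / dof       # residual variance per gene
    t_stat = beta[1] / np.sqrt(sigma2 * xtx_inv[1, 1])
    p_val = np.array([math.erfc(abs(t) / math.sqrt(2.0)) for t in t_stat])
    return t_stat, p_val


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    q_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.minimum(q_sorted, 1.0)
    return q


def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score over a ranked gene list."""
    in_set = np.array([g in gene_set for g in ranked_genes])
    n_hit = int(in_set.sum())
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    steps = np.where(in_set, 1.0 / n_hit, -1.0 / n_miss)
    running = np.cumsum(steps)
    return running[np.argmax(np.abs(running))]    # maximum deviation from zero
```

Genes with a q-value below 0.05 would be declared significant at the 5% FDR used for the full-sample analyses. Ranking genes by `t_stat` would supply the ordered list passed to `enrichment_score`.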
We will construct a panel of genes that discriminate between our comparison groups based on
significantly associated genes or gene sets using regularization techniques, which have been shown to
improve prediction and interpretation considerably compared to ordinary regression models without
regularization. Specifically, we will use the elastic net [26] regularization method because it has a feature well
suited to our data: it encourages genes in the same pathway to be selected as a group in the model [27]. Ten-fold
cross-validation will be used to determine the appropriate amount of regularization, and hence the panel of
biomarkers associated with the outcome.
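As an illustration of the panel-selection step, the following sketch fits an elastic net by coordinate descent and chooses the penalty strength by 10-fold cross-validation. It is a simplified stand-in, not the proposal's implementation. The function names, the candidate penalty grid, and the mixing parameter `alpha=0.5` are assumptions. It also uses squared-error loss, whereas a binary outcome such as interval- vs. screen-detected status would more naturally call for a logistic loss.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator that produces the lasso-type sparsity."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def elastic_net_fit(X, y, lam, alpha=0.5, n_iter=200, tol=1e-6):
    """Minimize (1/2n)||y - b0 - Xb||^2 + lam*(alpha*|b|_1 + (1-alpha)/2*|b|_2^2)."""
    n, p = X.shape
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean               # centering absorbs the intercept
    col_sq = (Xc ** 2).sum(axis=0) / n
    beta = np.zeros(p)
    resid = yc.copy()
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue                          # constant column carries no signal
            rho = Xc[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1.0 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                resid -= Xc[:, j] * delta         # keep residual in sync with beta
                beta[j] = new
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return y_mean - x_mean @ beta, beta


def cv_select_lambda(X, y, lambdas, alpha=0.5, n_folds=10, seed=0):
    """Return the penalty with the lowest mean held-out squared error."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    all_idx = np.arange(len(y))
    cv_err = []
    for lam in lambdas:
        errs = []
        for test_idx in folds:
            train = np.setdiff1d(all_idx, test_idx)
            b0, b = elastic_net_fit(X[train], y[train], lam, alpha)
            errs.append(np.mean((y[test_idx] - (b0 + X[test_idx] @ b)) ** 2))
        cv_err.append(np.mean(errs))
    return lambdas[int(np.argmin(cv_err))], np.array(cv_err)
```

After the penalty is chosen, refitting on all samples and keeping the genes with nonzero coefficients yields the candidate panel. The ridge part of the penalty is what tends to keep correlated genes, such as members of the same pathway, in the model together.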
While microarray technology allows simultaneous evaluation of expression levels of thousands of
genes, only a fraction is expected to be associated with the exposures of interest. With respect to statistical
power, the minimum detectable effect size (MDES) is determined by the number (m) of truly altered genes out
of the total p (here 25,000) genes studied, in additio