Available Software
The following software is available in the Microarray Core computer
facility. Follow the links to the official product website or a brief summary
of the software capabilities.
Affymetrix® Microarray Suite 5.0
MAS 5.0 provides the connection between the wet lab processing of microarrays
and the statistical analysis of gene expression information. Probe-level
analysis is performed first, so that the expression information is available
for gene-level analysis.
- Probe-level Analysis
Affymetrix® Microarray Suite 5.0 provides the link between the
physical preparation of the microarray chips and the numerical expression
values for each gene that are used in statistical analysis. In this section,
we first give a brief background of Affymetrix® technology, and
then describe the transition from the initial scan to the intensities recorded
in the .CEL file. Please refer to the Expression Estimates section of this website
for a discussion of MAS' algorithm for obtaining gene-specific expression
estimates from the information in the .CEL file.
An Affymetrix® microarray (Lipshutz et al.,
1999; Lockhart et al., 1996) is well described as a two-dimensional matrix
whose elements are oligonucleotides, each a distinct sequence of 25
nucleotides drawn from A, C, T, and G. The elements of this matrix can be assembled into
smaller 2xJ matrices with each smaller matrix responsible for measuring a
particular mRNA transcript of interest. The first row of each 2xJ matrix
contains J Perfect Match (PM) oligonucleotides, each PM sequence being
perfectly complementary to a region of a particular mRNA transcript sequence
that is (hopefully) not shared by the other mRNA transcripts under
investigation. The second row contains J Mismatch (MM) sequences, the MM being
identical to its column PM partner except for a single nucleotide variation at
the central (13th) position. The vectors of PM/MM pairs are synthesized at
very high density directly on a glass slide by photolithography, producing a
small physical tool that can measure tens of thousands of gene expression
levels simultaneously.
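The 2xJ layout of a probe set can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `mismatch_probe` and `probe_set` are hypothetical, and the sketch assumes that the single-nucleotide change at the 13th position is the Watson-Crick complement of the PM base, which the text above does not specify.

```python
# Assumption: the MM base at the central position is the complement of the PM base.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
PROBE_LENGTH = 25
CENTER_INDEX = 12  # the 13th position, counted from zero


def mismatch_probe(pm):
    """Return the MM partner of a 25-nucleotide PM sequence."""
    if len(pm) != PROBE_LENGTH:
        raise ValueError("expected a sequence of 25 nucleotides")
    center = pm[CENTER_INDEX]
    if center not in COMPLEMENT:
        raise ValueError("unexpected nucleotide: %r" % center)
    return pm[:CENTER_INDEX] + COMPLEMENT[center] + pm[CENTER_INDEX + 1:]


def probe_set(pm_sequences):
    """Return a 2 x J structure: row 0 holds the PM probes, row 1 the MM probes."""
    pm_row = list(pm_sequences)
    mm_row = [mismatch_probe(pm) for pm in pm_row]
    return [pm_row, mm_row]
```

Each column of the returned structure is one PM/MM pair, so the two sequences in a column differ at exactly one position.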
Given the constructed microarray, the target mRNA from the cell sample of
interest is fragmented, fluorescently labeled, and washed over the array.
Ideally, the components of the target mRNA sample will specifically hybridize
to the intended PM strand. The MM strand serves as a control for non-specific
cross-hybridization. The fluorescence levels at each PM and MM are then
measured using a scanning microscope.
MAS software outputs detected fluorescence levels at each probe. When the
scanning microscope reads fluorescence activity, it dissects each probe cell
(that is, each individual PM and MM cell) into a 6x6 square of pixels, and
records the fluorescence levels at each pixel in a file with extension .DAT.
(The .DAT files are not, at this time, made available to users of the
Microarray Core facility.) To summarize the fluorescence at each cell, MAS
ignores the 20 border pixels and returns the 75th percentile fluorescence level
of the remaining 16 pixels as the Average Intensity of the probe cell, and
records these values in the .CEL file.
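The summary step described above can be sketched as follows. This is a minimal illustration rather than MAS code: the function names `percentile` and `cell_intensity` are hypothetical, and the linear-interpolation percentile rule is an assumption, since the exact rule used by MAS is not stated here.

```python
def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between ranks.

    Assumption: this interpolation rule is chosen for illustration only.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values given")
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def cell_intensity(pixels, border=1, q=75.0):
    """Summarize one square probe cell given as a list of pixel rows.

    For a 6x6 cell with border=1, the 20 border pixels are dropped and the
    75th percentile of the remaining 16 pixels is returned.
    """
    size = len(pixels)
    inner = [
        value
        for row in pixels[border:size - border]
        for value in row[border:size - border]
    ]
    return percentile(inner, q)
```

Because the border pixels are discarded before the percentile is taken, bright or dark pixels at the edge of a cell do not affect its reported intensity.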
- Gene-level Analysis
The main use of Affymetrix® Microarray Suite 5.0 software is to
output the fluorescence intensity value for each probe in the .CEL file.
For investigators who wish to work with the expression estimates in the .CHP
file, MAS 5.0 can also compare two sample types by fold change analysis.
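As a rough illustration of what a fold change between two samples means, the sketch below computes a plain ratio and its base-2 logarithm for one gene. It is not the comparison algorithm implemented in MAS 5.0, which is not described here; the function names and the requirement of positive expression values are assumptions made for this example.

```python
import math


def fold_change(baseline, experiment):
    """Ratio of the experiment expression estimate to the baseline estimate."""
    if baseline <= 0 or experiment <= 0:
        raise ValueError("expression estimates must be positive")
    return experiment / baseline


def log2_fold_change(baseline, experiment):
    """Base-2 log of the ratio: +1 is a doubling, -1 is a halving."""
    return math.log2(fold_change(baseline, experiment))
```

For example, a gene with an estimate of 100 in the baseline sample and 400 in the experiment sample has a fold change of 4 and a log2 fold change of 2.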
Third-Party Analysis Software
Several software packages are available in the DFCI Microarray Core Computing
Facility (room 214A of the Jimmy Fund Building) for the statistical analysis of
microarray data. Below are brief descriptions of the capabilities of
dChip,
Bioconductor,
Cluster/TreeView,
GeneCluster,
GeneSpring, and
GeneSight.
- dChip
Below are some of the major analytic capabilities of dChip software:
- Normalization
- Model-based expression intensities (PM-only model, PM-MM model)
- Gene filtering by sd/mean and the presence call percent
- Hierarchical clustering of genes
- Hierarchical clustering of samples
- Finding genes with high/low fold change/difference, and some set operations
- Finding genes with a similar time-course expression pattern to a given
gene (correlation)
- Finding significant GeneOntology, ProteinDomain, or Cytoband similarities
in a given gene list
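The gene filtering item in the list above can be illustrated with a short sketch. It is not dChip code: the function name `passes_filter`, the "P" label for a present call, and all threshold values are assumptions chosen only for the example.

```python
import statistics


def passes_filter(expression, calls, min_cv=0.5, max_cv=10.0, min_present=0.2):
    """Keep a gene if it varies enough across samples and is called present often enough.

    expression: expression values for one gene, one per sample (at least two)
    calls: detection calls for the same samples; "P" is assumed to mean present
    The thresholds are hypothetical defaults, not dChip's.
    """
    mean = statistics.mean(expression)
    if mean <= 0:
        return False
    cv = statistics.stdev(expression) / mean  # the sd/mean ratio
    present_fraction = sum(1 for call in calls if call == "P") / len(calls)
    return min_cv <= cv <= max_cv and present_fraction >= min_present
```

A gene whose values barely change across samples has an sd/mean ratio near zero and is dropped, as is a gene that is rarely called present.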
- Bioconductor
Bioconductor is a set of R packages for use in bioinformatics (requires R 1.5.0
or later). The user must be familiar with programming in R. The most relevant
packages can be downloaded in a Bioconductor bundle from the Bioconductor website. These packages
allow for analyses such as gene filtering, plotting microarray data, correcting
for multiple testing, and annotation for microarrays.
- Cluster/TreeView
Cluster and TreeView are made available through Mike Eisen's lab at
Stanford. Cluster performs hierarchical clustering, k-means clustering, and
self-organizing map construction. The dendrogram results for hierarchical
clustering can be viewed in the complementary software TreeView.
- GeneCluster
GeneCluster is made available through the Whitehead Institute at MIT. It
performs self-organizing map analysis, as demonstrated in Tamayo et al. (1999).
- GeneSpring
GeneSpring is commercially available through Silicon Genetics. It
performs hierarchical clustering, gene list construction by similar expression
pattern, functional classification, regulatory sequence searching, and some
pathway construction.
- GeneSight
GeneSight is commercially available through BioDiscovery. It performs
gene filtering, hierarchical clustering, k-means clustering, self-organizing
map construction, and subset analysis.