Available Software

The following software is available in the Microarray Core computer facility. Follow the links to the official product website or a brief summary of the software capabilities.

Affymetrix® Microarray Suite 5.0 website summary
dChip website summary
Bioconductor website summary
Cluster/TreeView website summary
GeneCluster website summary
GeneSpring website summary
GeneSight website summary

Affymetrix® Microarray Suite 5.0

MAS 5.0 provides the connection between the wet lab processing of microarrays and statistical analysis of gene expression information. Probe-level analysis is first performed so that the gene expression information may be accessed for gene-level analysis.

  • Probe-level Analysis

    Affymetrix® Microarray Suite 5.0 provides the link between the physical preparation of the microarray chips and the numerical expression values for each gene that are used in statistical analysis. In this section, we first give a brief background on Affymetrix® technology and then describe the transition from the initial scan to the intensities recorded in the .CEL file. Please refer to the Expression Estimates section of this website for a discussion of the MAS algorithm for obtaining gene-specific expression estimates from the information in the .CEL file.

    An Affymetrix® microarray (Lipshutz et al., 1999; Lockhart et al., 1996) is well described as a two-dimensional matrix whose elements are oligonucleotides, each a distinct sequence of 25 nucleotides drawn from A, C, G, and T. The elements of this matrix can be grouped into smaller 2xJ matrices, each of which measures a particular mRNA transcript of interest. The first row of each 2xJ matrix contains J Perfect Match (PM) oligonucleotides; each PM sequence is perfectly complementary to a region of the target transcript that is, ideally, not shared by the other transcripts under investigation. The second row contains J Mismatch (MM) sequences; each MM is identical to the PM in the same column except for a single nucleotide substitution at the central (13th) position. The PM/MM pairs are synthesized directly on a glass slide at very high density by photolithography, producing a small physical tool that can measure tens of thousands of gene expression levels simultaneously.
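
    The relationship between a PM probe and its MM partner can be sketched in a few lines of Python. This is an illustration only: the function name and the example sequence are made up, and the sketch assumes that the substituted base at position 13 is the complement of the PM base, which the text above does not state.

```python
# Assumption: the MM base at the central position is the complement of the PM base.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def mismatch_probe(pm: str) -> str:
    """Return the MM sequence for a 25-mer PM probe (hypothetical helper)."""
    if len(pm) != 25:
        raise ValueError("expected a 25-nucleotide probe")
    center = 12  # the 13th position, as a zero-based index
    return pm[:center] + COMPLEMENT[pm[center]] + pm[center + 1:]


pm = "ACGTACGTACGTACGTACGTACGTA"  # made-up sequence, not a real probe
mm = mismatch_probe(pm)
```

    The two sequences differ only at the 13th nucleotide, so any signal difference between a PM cell and its MM cell is attributed to that single base.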

    Given the constructed microarray, the target mRNA from the cell sample of interest is fragmented, fluorescently labeled, and washed over the array. Ideally, the components of the target mRNA sample will specifically hybridize to the intended PM strand. The MM strand serves as a control for non-specific cross-hybridization. The fluorescence levels at each PM and MM are then measured using a scanning microscope.

    MAS software outputs the detected fluorescence level at each probe. When the scanning microscope reads fluorescence activity, it divides each probe cell (that is, each individual PM and MM cell) into a 6x6 square of 36 pixels and records the fluorescence level at each pixel in a file with extension .DAT. (The .DAT files are not, at this time, made available to users of the Microarray Core facility.) To summarize the fluorescence of each cell, MAS ignores the 20 border pixels, takes the 75th percentile fluorescence level of the remaining 16 interior pixels as the Average Intensity of the probe cell, and records this value in the .CEL file.
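
    The pixel summary described above can be sketched as follows. The function name and the input values are hypothetical, and the sketch assumes a nearest-rank definition of the 75th percentile; the interpolation rule that MAS actually uses is not specified here.

```python
import math


def cell_intensity(pixels):
    """Summarize a 6x6 probe cell by the 75th percentile of its 16 interior pixels.

    `pixels` is a list of 6 rows, each a list of 6 fluorescence values.
    """
    if len(pixels) != 6 or any(len(row) != 6 for row in pixels):
        raise ValueError("expected a 6x6 grid of pixel values")
    # Drop the first and last row and column: 36 pixels minus 20 border pixels.
    inner = sorted(value for row in pixels[1:5] for value in row[1:5])
    # Assumption: nearest-rank percentile, i.e. the ceil(0.75 * 16) = 12th smallest value.
    rank = math.ceil(0.75 * len(inner))
    return inner[rank - 1]


# Made-up pixel values 0..35, laid out row by row.
example = [[row * 6 + col for col in range(6)] for row in range(6)]
intensity = cell_intensity(example)
```

    Discarding the border pixels reduces the influence of neighboring cells and of slight misalignment of the grid on the recorded intensity.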

  • Gene-level Analysis

    The major utility of Affymetrix® Microarray Suite 5.0 software is the output of fluorescence intensity values for each probe in the .CEL file. If the investigator wishes to use the expression estimates from the .CHP file, MAS 5.0 can also compare two sample types by fold change analysis.
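
    As a rough illustration of what a fold change comparison computes, the sketch below takes the ratio of two expression estimates for one gene. It is not the comparison algorithm implemented in MAS 5.0; the function names and the `floor` parameter (used to avoid dividing by very small or non-positive values) are assumptions made for this example.

```python
import math


def fold_change(baseline, experiment, floor=1.0):
    """Ratio of experiment to baseline expression, with both values floored."""
    b = max(baseline, floor)
    e = max(experiment, floor)
    return e / b


def log2_fold_change(baseline, experiment, floor=1.0):
    """Fold change on the log2 scale, so up- and down-regulation are symmetric."""
    return math.log2(fold_change(baseline, experiment, floor))


up = fold_change(100.0, 400.0)         # four-fold increase
down = log2_fold_change(400.0, 100.0)  # same magnitude, opposite sign
```

    On the log2 scale a four-fold increase and a four-fold decrease have the same magnitude with opposite signs, which makes the two directions easier to compare.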

Third-Party Analysis Software

Several software packages are available in the DFCI Microarray Core Computing Facility (room 214A of the Jimmy Fund Building) for the statistical analysis of microarray data. Below are brief descriptions of the capabilities of dChip, Bioconductor, Cluster/TreeView, GeneCluster, GeneSpring, and GeneSight.

  • dChip

    Below are some of the major analytic capabilities of dChip software:

    1. Normalization
    2. Model-based expression intensities (PM-only model, PM-MM model)
    3. Gene filtering by sd/mean and the presence call percent
    4. Hierarchical clustering of genes
    5. Hierarchical clustering of samples
    6. Finding genes with high/low fold change/difference, and some set operations
    7. Finding genes with a similar time-course expression pattern to a given gene (correlation)
    8. Finding significant GeneOntology, ProteinDomain, or Cytoband similarities in a given gene list
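
    Item 3 in the list above, filtering genes by sd/mean and presence call percent, can be sketched as follows. This is not dChip's code: the function name, the threshold values, the use of the sample standard deviation, and the "P" label for a presence call are all assumptions chosen for illustration.

```python
import statistics


def passes_filter(values, calls, min_cv=0.5, max_cv=10.0, min_present=0.2):
    """Keep a gene if it varies enough across samples and is called present often enough.

    `values` holds one expression estimate per sample (at least two samples);
    `calls` holds one detection call per sample, with "P" meaning present.
    """
    mean = statistics.mean(values)
    if mean <= 0:
        return False
    cv = statistics.stdev(values) / mean  # the sd/mean ratio
    present_fraction = sum(1 for call in calls if call == "P") / len(calls)
    return min_cv <= cv <= max_cv and present_fraction >= min_present


variable_gene = passes_filter([100, 200, 900, 50], ["P", "A", "P", "A"])
flat_gene = passes_filter([100, 101, 99, 100], ["P", "P", "P", "P"])
```

    A filter of this kind removes genes whose expression barely changes across samples, or that are rarely detected, before clustering.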

  • Bioconductor

    Bioconductor is a set of R packages for use in bioinformatics (requires R 1.5.0 or later). The user must be familiar with programming in R. The most relevant packages can be downloaded in a Bioconductor bundle from the Bioconductor website. These packages allow for analyses such as gene filtering, plotting microarray data, correcting for multiple testing, and annotation for microarrays.
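
    To illustrate what correcting for multiple testing involves, the sketch below computes Benjamini-Hochberg adjusted p-values in plain Python. The procedure is chosen only as a common example; it is not a Bioconductor function, the paragraph above does not say which corrections the packages implement, and the function name and p-values are made up.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is monotone in rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.20]  # made-up p-values, one per gene
adj = benjamini_hochberg(raw)
```

    With tens of thousands of genes tested at once, unadjusted p-values would flag many genes by chance alone, which is why a correction of this kind is applied before a gene list is reported.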

  • Cluster/TreeView

    Cluster and TreeView are made available through Mike Eisen's lab at Stanford. Cluster performs hierarchical clustering, k-means clustering, and self-organizing map construction. The dendrogram results for hierarchical clustering can be viewed in the complementary software TreeView.

  • GeneCluster

    GeneCluster is made available through the Whitehead Institute at MIT. It performs self-organizing map analysis, and is demonstrated in Tamayo et al., 1999.

  • GeneSpring

    GeneSpring is commercially available through Silicon Genetics. It performs hierarchical clustering, gene list construction by similar expression pattern, functional classification, regulatory sequence searching, and some pathway construction.

  • GeneSight

    GeneSight is commercially available through BioDiscovery. It performs gene filtering, hierarchical clustering, k-means clustering, self-organizing map construction, and subset analysis.