We propose MAT, a
novel analysis algorithm, to reliably detect regions enriched by
transcription factor Chromatin ImmunoPrecipitation (ChIP) on Affymetrix
tiling arrays (chip). MAT models the baseline probe behavior by
considering probe sequence and copy number on each array. The
correlation between the baseline probe model estimates and the observed
measurements can be as high as 0.72. MAT standardizes the probe value
via the probe model, eliminating the need for sample normalization. A
novel scoring function is applied to the standardized data to identify
the ChIP-enriched regions, which allows robust p-value and false
discovery rate calculations. MAT can detect ChIP-regions from a single
ChIP sample, multiple ChIP samples, or multiple ChIP samples with
controls, with increasing accuracy. Based on the mock ChIP samples
provided by the ENCODE consortium, MAT achieved 100% accuracy (0 false
positives and 0 false negatives) in detecting the
spike-in plasmids, which are 2-, 4-, 8-, ..., 256-fold enriched compared with the
genomic background. Quantitatively, MAT yielded a 0.95 correlation
coefficient between the spike-in DNA concentration and the predicted
score. Upon further analysis, MAT identified more than 70% of the true
targets at a 5% FDR cutoff from a single ChIP sample. This is a valuable
feature for quickly testing the protocols and antibodies for ChIP-chip,
and easily identifying ChIP-chip samples with questionable quality.

MAT requires four types of input files: the Affymetrix .cel files, which contain the signal value of every probe; the .bpmap library files, which contain the sequence, locations (on the array and on the genome), and copy number of each probe; the repeat-library file, which contains the chromosome coordinates of RepeatMasker repeats, simple repeats, and segmental duplications; and a user-edited .tag (Tiling Array Group) file, which organizes the MAT parameters (including the grouping of the .cel and .bpmap files).

MAT returns two types of output files: the .bar files, which contain the MAT score for each probe and can be imported into the Affymetrix Integrated Genome Browser (IGB) for visualization; and a .bed file, which lists the chromosomal coordinates of all ChIP-regions with their MAT scores and repeat flags (including segmental duplications) and can be loaded into the UCSC Genome Browser.
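The standardization and region-scoring steps described in the overview above can be illustrated with a small sketch. This is not MAT's code: it assumes the baseline model predictions are already available for each probe, and the function names (`standardize`, `window_score`, `score_probes`), the binning scheme, the 10% trim, and the 300 bp half-width are all hypothetical choices made for illustration. The window statistic used here (square root of the probe count times a trimmed mean) is one plausible scoring function, not necessarily the one MAT applies.

```python
import math
import statistics


def standardize(observed, predicted, n_bins=3):
    """Return one t-value per probe: (observed - predicted) / bin std-dev.

    Assumption of this sketch: probes with similar predicted baselines are
    grouped into bins, and each bin gets its own spread estimate.
    """
    if not observed:
        return []
    # Probe indices ordered by predicted baseline, then cut into equal bins.
    order = sorted(range(len(observed)), key=lambda i: predicted[i])
    size = math.ceil(len(order) / n_bins)
    t_values = [0.0] * len(observed)
    for start in range(0, len(order), size):
        members = order[start:start + size]
        residuals = [observed[i] - predicted[i] for i in members]
        spread = statistics.pstdev(residuals) or 1.0  # avoid division by zero
        for i, r in zip(members, residuals):
            t_values[i] = r / spread
    return t_values


def window_score(t_values, trim=0.1):
    """Hypothetical window statistic: sqrt(n) * trimmed mean of the t-values."""
    ordered = sorted(t_values)
    k = int(len(ordered) * trim)  # number of values dropped from each tail
    kept = ordered[k:len(ordered) - k] if k else ordered
    return math.sqrt(len(kept)) * statistics.mean(kept)


def score_probes(positions, t_values, half_width=300):
    """Score each probe from all probes within half_width bp of it."""
    scores = []
    for center in positions:
        window = [t for p, t in zip(positions, t_values)
                  if abs(p - center) <= half_width]
        scores.append(window_score(window))
    return scores


if __name__ == "__main__":
    # Toy data: log intensities and baseline predictions for six probes.
    positions = [0, 100, 200, 300, 5000, 5100]
    observed = [8.1, 9.4, 9.6, 9.0, 7.2, 7.0]
    predicted = [8.0, 8.2, 8.1, 8.3, 7.1, 7.1]
    t = standardize(observed, predicted, n_bins=2)
    print(score_probes(positions, t))
```

Because every probe is rescaled against its own predicted baseline, scores from different arrays are on a comparable scale in this sketch, which is the intuition behind skipping a separate sample normalization step. The quadratic window search is kept for readability; a real implementation would use sorted positions and a moving window.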

Subscribe to MAT.announce with your academic email address for the download password and notification of new releases and bug fixes.
For questions, comments, suggestions, or bug reports, please contact mat.support at gmail.com.
Last updated by Wei Li on
Thursday, May 17, 2007