FAQs

Basic information about microarrays and links to other microarray resources

The Basics


What is a microarray? What can I do with them?

You can find some general information here.

What papers should I read to understand microarrays?
See the Related Sites page.

Microarray Products


Where can I get the microarrays?

There are several suppliers of microarrays. One supplier that we recommend is Agilent; we have significant experience with Agilent microarrays.

What genes are represented on the arrays?
See the Agilent Technical Support page.

Using Arrays


Can you help me design experiments with microarrays?

Design controls carefully, replicate your results, and treat all your samples as identically as possible. Microarray experiments pose some unique challenges because variation is introduced by the biology of the samples, by sample preparation, and by technical analysis. We therefore suggest that you talk to someone experienced with microarray analysis and to a bioinformaticist to confirm that your experimental design will give statistically meaningful results. If you need help with your microarray experiments or a referral to experts, please email Michael Topal at mdtopal@gmail.com.

Can I use total RNA instead of poly A+?
Yes.

Analyzing Arrays


Can you help me with analyzing my arrays?

We provide a database (UNC MD) for data storage and array analysis that should help you get through the initial stages of analysis. For more information on setting up an account and using UNC MD, contact one of the microarray database curators. Currently we do not provide custom analysis services, but several analysis packages are available for free at UNC, including GeneSpring and Ingenuity Pathway Analysis (IPA). Additional help is available from members of the biostatistics and informatics groups at UNC.

What does it mean to 'normalize' a two-color array?
Two-color arrays use two independent RNA samples, which might vary slightly in concentration, and two separate labeling reactions, which might have proceeded at different efficiencies. As a result, the overall signal from one sample is often brighter than that from the other. If this is not corrected, the data will be skewed to indicate that the sample with more RNA or better labeling had higher expression levels (on average) in every spot. Normalization refers to this correction. At least three different methods have been used for normalizing arrays:

1. Measure the total intensity of the two colors, and use the overall ratio as a correction factor. The correction factor is applied to each ratio. This method is based on the assumption that the average gene expression ratio on your array is 1. If your experimental manipulation is expected to cause more than 10-20% of all genes to change expression level in one direction, this method will not be reliable. The advantage of this method is that you can do it on any array. This is the only method currently implemented in our analysis software.

2. Use the expression ratios of spiked control RNAs. This procedure requires that your arrays include a number of control cDNAs whose RNAs will not normally be present in your sample. For example, if you are doing a human array, some bacterial clones might be selected. Equal amounts of these RNAs are then 'spiked' into your samples. The correction factor is calculated as in the previous method, but only the control spots are used in the calculation. The drawback to this method is the extra effort it takes to prepare the controls.

3. Use housekeeping genes which are presumed not to change as controls. This is an extension of the procedure many people use when doing quantitative Northern blots, where actin or GAPDH are commonly used. Obviously this method is only as good as your confidence that the genes you selected really don't change expression level, and for this reason most people don't use this method.
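
As a rough illustration of the first two methods, the sketch below computes a single correction factor from total intensities and applies it to every ratio. It is not the code used in our analysis software; the function names, the red/green channel naming, and the assumption that intensities are background-subtracted and positive are all ours, chosen for the example. Passing a list of control spot positions restricts the calculation to those spots, as in method 2 (or method 3, if the positions point at housekeeping genes).

```python
def correction_factor(red, green, control_idx=None):
    """Return total green intensity divided by total red intensity.

    red, green: per-spot intensities (assumed background-subtracted, > 0).
    control_idx: optional spot positions to use instead of all spots
                 (hypothetical parameter, e.g. spiked control spots).
    """
    if control_idx is not None:
        red = [red[i] for i in control_idx]
        green = [green[i] for i in control_idx]
    total_red = sum(red)
    total_green = sum(green)
    if total_red <= 0 or total_green <= 0:
        raise ValueError("total intensities must be positive")
    return total_green / total_red


def normalized_ratios(red, green, control_idx=None):
    """Apply the same correction factor to every red/green ratio."""
    factor = correction_factor(red, green, control_idx)
    return [factor * r / g for r, g in zip(red, green)]


# Example: the red channel is uniformly twice as bright as the green one,
# so every corrected ratio comes out at 1.
red = [200.0, 400.0, 600.0, 800.0]
green = [100.0, 200.0, 300.0, 400.0]
print(normalized_ratios(red, green))
```

Note that the factor makes the ratio of total intensities equal to 1, which is why the method depends on most genes not changing in one direction.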

How do I know when a change in expression is 'significant'?
This is a difficult question to answer, and one that an increasing number of statisticians are investigating. The main problem in determining significance is the (typically) small number of replicate measurements. In addition, in spotted arrays we have found that most genes show a large variance in expression level, and this variability seems to be either related to the state of the sample (i.e., it is biological in nature) or due to differences in sample handling.

It is important to keep in mind that analyzing array data for 'changed genes' is basically a game of deciding whether false positives or false negatives are more costly. It is difficult to provide a meaningful cutoff such as "2-fold changes are significant", because other factors must be considered such as how confident you are in the individual measurements which make up a ratio. Since the goal of most users is to find candidate genes to study, a better procedure is to rank genes with the assistance of some kind of confidence measure, add a good dose of biological knowledge, and take things from there.
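
One possible way to rank genes with a confidence measure is sketched below. This is only an example and not a recommended or built-in procedure: it assumes you have several replicate log2 ratios per gene and uses a simple one-sample t-like score (mean divided by its standard error) as the confidence measure. The function name and gene names are made up for illustration. With only a few replicates the score itself is noisy, so treat the ranking as a starting list of candidates rather than a significance test.

```python
import math
import statistics


def rank_genes(log_ratios_by_gene):
    """Rank genes by |mean log2 ratio / standard error|, largest first.

    log_ratios_by_gene: dict mapping gene name -> list of replicate
    log2 ratios. Genes with fewer than two replicates are skipped,
    because no variance estimate is possible for them.
    Returns a list of (gene, mean_log_ratio, score) tuples.
    """
    scored = []
    for gene, values in log_ratios_by_gene.items():
        n = len(values)
        if n < 2:
            continue
        mean = statistics.mean(values)
        std_err = statistics.stdev(values) / math.sqrt(n)
        if std_err > 0:
            score = mean / std_err
        elif mean == 0:
            score = 0.0
        else:
            # Identical replicates: no measured variation at all.
            score = math.copysign(math.inf, mean)
        scored.append((gene, mean, score))
    scored.sort(key=lambda row: abs(row[2]), reverse=True)
    return scored


# Hypothetical data: geneA changes consistently, geneB is erratic,
# geneC has a single measurement and is left out of the ranking.
data = {
    "geneA": [1.0, 1.1, 0.9],
    "geneB": [2.0, -1.0, 0.5],
    "geneC": [0.1],
}
for gene, mean, score in rank_genes(data):
    print(gene, round(mean, 2), round(score, 2))
```

In this example geneB has the single largest ratio (2.0) but ranks below geneA, because its replicates disagree with each other.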