<html xmlns:mwsh="http://www.mathworks.com/namespace/mcode/v1/syntaxhighlight.dtd">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<!--
This HTML is auto-generated from an M-file.
To make changes, update the M-file and republish this document.
-->
<title>Microarray Spot Finding Example</title>
<meta name="generator" content="MATLAB 7.0">
<meta name="date" content="2004-08-09">
<meta name="m-file" content="R14_MicroarrayImage_CaseStudy"><style>
body {
background-color: white;
margin:10px;
}
h1 {
color: #990000;
font-size: x-large;
}
h2 {
color: #990000;
font-size: medium;
}
p.footer {
text-align: right;
font-size: xx-small;
font-weight: lighter;
font-style: italic;
color: gray;
}
pre.codeinput {
margin-left: 30px;
}
span.keyword {color: #0000FF}
span.comment {color: #228B22}
span.string {color: #A020F0}
span.untermstring {color: #B20000}
span.syscmd {color: #B28C00}
pre.showbuttons {
margin-left: 30px;
border: solid black 2px;
padding: 4px;
background: #EBEFF3;
}
pre.codeoutput {
color: gray;
font-style: italic;
}
pre.error {
color: red;
}
/* Make the text shrink to fit narrow windows, but not stretch too far in
wide windows. On Gecko-based browsers, the shrink-to-fit doesn't work. */
p,h1,h2,div {
/* for MATLAB's browser */
width: 600px;
/* for Mozilla, but the "width" tag overrides it anyway */
max-width: 600px;
/* for IE */
width:expression(document.body.clientWidth > 620 ? "600px": "auto" );
}
</style></head>
<body>
<h1>Microarray Spot Finding Example</h1>
<introduction>
<p>This example shows a simple method for locating spots on a microarray and extracting the intensities of the spots. It can be
downloaded from <b>MATLAB Central</b>. <a href="http://www.mathworks.com/matlabcentral">http://www.mathworks.com/matlabcentral</a></p>
</introduction>
<h2>Contents</h2>
<div>
<ul>
<li><a href="#1">Start with clean slate</a></li>
<li><a href="#2">Read image file</a></li>
<li><a href="#3">Crop specified region</a></li>
<li><a href="#4">Display red &amp; green layers</a></li>
<li><a href="#5">Convert RGB image to grayscale for spot finding</a></li>
<li><a href="#6">Create horizontal profile</a></li>
<li><a href="#7">Estimate spot spacing by autocorrelation</a></li>
<li><a href="#8">Remove background morphologically</a></li>
<li><a href="#9">Segment peaks</a></li>
<li><a href="#10">Locate centers</a></li>
<li><a href="#11">Determine divisions between spots</a></li>
<li><a href="#12">Transpose and repeat</a></li>
<li><a href="#13">Put bounding boxes around each spot</a></li>
<li><a href="#14">Segment spots from background by thresholding</a></li>
<li><a href="#15">Apply logarithmic transformation then threshold intensities</a></li>
<li><a href="#16">Try local thresholding instead</a></li>
<li><a href="#17">Logically combine local and global thresholds</a></li>
<li><a href="#18">Fill holes to solidify spots</a></li>
<li><a href="#19">Label spot masks by bounding box</a></li>
<li><a href="#20">Extract first spot for measurement</a></li>
<li><a href="#21">Measure spot intensity &amp; relative expression level</a></li>
<li><a href="#22">Remove background, calculate again and compare measurements</a></li>
<li><a href="#23">Set up graphical display for results</a></li>
<li><a href="#24">Repeat measurement for all spots</a></li>
<li><a href="#25">Export spot data to Excel spreadsheet</a></li>
</ul>
</div>
<h2>Start with clean slate<a name="1"></a></h2><pre class="codeinput">clear <span class="comment">%empty workspace (no variables)</span>
close <span class="string">all</span> <span class="comment">%no figures</span>
clc <span class="comment">%empty command window</span>
</pre><h2>Read image file<a name="2"></a></h2>
<p>MATLAB can read many standard image formats including TIFF, GIF and BMP using the <tt>imread</tt> command. In addition, the <b>Image Processing Toolbox</b> provides support for working with specialized image file formats such as DICOM. This microarray image was stored as a JPEG
file. The image is much larger than the screen size, so <tt>imshow</tt> scales it down to fit and lets you know with a warning message.
</p><pre class="codeinput">x = imread(<span class="string">'MicroArraySlide.JPG'</span>);
imageSize = size(x)
screenSize = get(0,<span class="string">'ScreenSize'</span>)
iptsetpref(<span class="string">'ImshowBorder'</span>,<span class="string">'tight'</span>)
imshow(x)
title(<span class="string">'original image'</span>)
</pre><pre class="codeoutput">imageSize =
3248 1248 3
screenSize =
1 1 1024 768
Warning: Image is too big to fit on screen; displaying at 42% scale.
</pre><img vspace="5" hspace="5" src="R14_MicroarrayImage_CaseStudy_01.png"> <h2>Crop specified region<a name="3"></a></h2>
<p>Next we use <tt>imcrop</tt> to extract a region of interest. You can repeat this for all print-tip blocks for a full microarray study.
</p><pre class="codeinput">y = imcrop(x,[622 2467 220 227]);
f1 = figure(<span class="string">'position'</span>,[40 46 285 280]);
imshow(y)
</pre><img vspace="5" hspace="5" src="R14_MicroarrayImage_CaseStudy_02.png"> <h2>Display red &amp; green layers<a name="4"></a></h2>
<p>This image was stored in RGB format. We are only interested in the red and green planes. To extract the red plane, simply
index layer 1. For the green plane, layer 2. Custom colormaps make visualization more intuitive. Notice that spot shapes are
not necessarily the same in both colors.
</p><pre class="codeinput">f2 = figure(<span class="string">'position'</span>,[265 163 647 327]);
subplot(121)
redMap = gray(256);
redMap(:,[2 3]) = 0;
subimage(y(:,:,1),redMap)
axis <span class="string">off</span>
title(<span class="string">'red (layer 1)'</span>)
subplot(122)
greenMap = gray(256);
greenMap(:,[1 3]) = 0;
subimage(y(:,:,2),greenMap)
axis <span class="string">off</span>
title(<span class="string">'green (layer 2)'</span>)
</pre><img vspace="5" hspace="5" src="R14_MicroarrayImage_CaseStudy_03.png"> <h2>Convert RGB image to grayscale for spot finding<a name="5"></a></h2>
<p>Initially we care more about where the spots are located than their red and green intensities. Converting from RGB color to
grayscale allows us to focus first on spot locations.
</p><pre class="codeinput">z = rgb2gray(y);
figure(f1)
imshow(z)
</pre><img vspace="5" hspace="5" src="R14_MicroarrayImage_CaseStudy_04.png"> <h2>Create horizontal profile<a name="6"></a></h2>
<p>We are looking for a regular grid of spots so we start by looking at the mean intensity for each column of the image. This
will help us identify where the centres of the spots are and where the gaps between the spots can be found.
</p><pre class="codeinput">xProfile = mean(z);
f2 = figure(<span class="string">'position'</span>,[39 346 284 73]);
plot(xProfile)
title(<span class="string">'horizontal profile'</span>)
axis <span class="string">tight</span>
</pre><img vspace="5" hspace="5" src="R14_MicroarrayImage_CaseStudy_05.png"> <h2>Estimate spot spacing by autocorrelation<a name="7"></a></h2>
<p>Ideally the spots would be periodically spaced and consistently printed, but in practice they tend to have different sizes and
intensities, so the horizontal profile is irregular. We can use autocorrelation to enhance the self similarity of the profile.
The smooth result promotes peak finding and estimation of spot spacing. The <b>Signal Processing Toolbox</b> allows easy computation of the autocorrelation function using the <tt>xcov</tt> command.
</p><pre class="codeinput">ac = xcov(xProfile); <span class="comment">%autocovariance (mean removed, unscaled)</span>
f3 = figure(<span class="string">'position'</span>,[-3 427 569 94]);
plot(ac)
s1 = diff(ac([1 1:end])); <span class="comment">%left slopes</span>
s2 = diff(ac([1:end end])); <span class="comment">%right slopes</span>
maxima = find(s1&gt;0 &amp; s2&lt;0); <span class="comment">%peaks</span>
estPeriod = round(median(diff(maxima))) <span class="comment">%nominal spacing</span>
hold <span class="string">on</span>
plot(maxima,ac(maxima),<span class="string">'r^'</span>)
hold <span class="string">off</span>
title(<span class="string">'autocorrelation of profile'</span>)
axis <span class="string">tight</span>
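<span class="comment">% Sketch, not part of the original M-file: cross-check the spacing using</span>
<span class="comment">% only the non-negative lags. Assumes xcov returns 2*N-1 lags with zero lag</span>
<span class="comment">% at index N; acPos, isPeak and firstPeakLag are illustrative names.</span>
N = length(xProfile);
acPos = ac(N:end);
acPos = acPos(:)'; <span class="comment">%force a row; acPos(1) is lag 0</span>
isPeak = [false, acPos(2:end-1)&gt;acPos(1:end-2) &amp; acPos(2:end-1)&gt;acPos(3:end), false];
firstPeakLag = find(isPeak,1)-1; <span class="comment">%lag of first side peak, in pixels</span>
<span class="comment">% firstPeakLag should be close to estPeriod. Noise can create an earlier</span>
<span class="comment">% spurious peak, so the median of the peak spacings above is more robust.</span>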
</pre><pre class="codeoutput">estPeriod =
19
</pre><img vspace="5" hspace="5" src="R14_MicroarrayImage_CaseStudy_06.png"> <h2>Remove background morphologically<a name="8"></a></h2>
<p>We can use the spacing estimate to help design a filter to remove the background noise from the intensity profile. We do this
with the <tt>imtophat</tt> function from the <b>Image Processing Toolbox</b>. The <tt>strel</tt> command creates a simple rectangular 1D window or line shaped structuring element.
</p><pre class="codeinput">seLine = strel(<span class="string">'line'</span>,estPeriod,0);
xProfile2 = imtophat(xProfile,seLine);
f4 = figure(<span class="string">'position'</span>,[40 443 285 76]);
plot(xProfile2)
title(<span class="string">'enhanced horizontal profile'</span>)
axis <span class="string">tight</span>
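<span class="comment">% Sketch, not part of the original M-file: a top-hat filter is the signal</span>
<span class="comment">% minus its morphological opening. The opening keeps only structures at</span>
<span class="comment">% least as wide as seLine, which here is the slowly varying background.</span>
<span class="comment">% xBackground and topHatDiff are illustrative names.</span>
xBackground = imopen(xProfile,seLine); <span class="comment">%background estimate</span>
topHatDiff = max(abs(xProfile2-(xProfile-xBackground))); <span class="comment">%expected to be 0</span>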
</pre><img vspace="5" hspace="5" src="R14_MicroarrayImage_CaseStudy_07.png"> <h2>Segment peaks<a name="9"></a></h2>
<p>Now that we have clean and anchored gaps between the peaks, we can number each peak region with the <tt>bwlabel</tt> command. These regions were segmented by thresholding with <tt>im2bw</tt>. The threshold value was automatically determined by statistical properties of the data using <tt>graythresh</tt>. This is a good example of how image processing techniques are often useful for 1D data analysis.
</p><pre class="codeinput">level = graythresh(xProfile2/255)*255
bw = im2bw(xProfile2/255,level/255);
L = bwlabel(bw);
f5 = figure(<span class="string">'position'</span>,[40 540 285 70]);
plot(L)
axis <span class="string">tight</span>
title(<span class="string">'labelled regions'</span>)
</pre><pre class="codeoutput">level =
16
</pre><img vspace="5" hspace="5" src="R14_MicroarrayImage_CaseStudy_08.png"> <h2>Locate centers<a name="10"></a></h2>
<p>We can extract the centroids of the peaks. These correspond to the horizontal centres of the spots. This is a common blob
analysis or feature extraction task that can be done with <tt>regionprops</tt>.
</p><pre class="codeinput">stats = regionprops(L);
centroids = [stats.Centroid];
xCenters = centroids(1:2:end)
figure(f5)
hold <span class="string">on</span>
plot(xCenters,1:max(L),<span class="string">'ro'</span>)
hold <span class="string">off</span>
title(<span class="string">'region centers'</span>)
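<span class="comment">% Sketch, not part of the original M-file: the same centers computed by</span>
<span class="comment">% hand. For a 1-row label vector, each centroid is the mean column index of</span>
<span class="comment">% one labelled region. xCentersCheck is an illustrative name.</span>
xCentersCheck = zeros(1,max(L));
<span class="keyword">for</span> k=1:max(L)
xCentersCheck(k) = mean(find(L==k)); <span class="comment">%mean column index of region k</span>
<span class="keyword">end</span>
<span class="comment">% xCentersCheck is expected to match xCenters.</span>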
</pre><pre class="codeoutput">xCenters =
Columns 1 through 8
23.0000 41.0000 60.0000 79.0000 98.0000 119.5000 137.0000 156.0000
Columns 9 through 10
175.0000 193.5000
</pre><img vspace="5" hspace="5" src="R14_MicroarrayImage_CaseStudy_09.png"> <h2>Determine divisions between spots<a name="11"></a></h2>
<p>The midpoints between adjacent peaks provide the grid point locations.</p><pre class="codeinput">gap = diff(xCenters)/2;
first = xCenters(1)-gap(1);
xGrid = round([first xCenters(1:end)+gap([1:end end])])
figure(f2)
<span class="keyword">for</span> i=1:length(xGrid)
line(xGrid(i)*[1 1],ylim,<span class="string">'color'</span>,<span class="string">'m'</span>)
<span class="keyword">end</span>
title(<span class="string">'vertical separators'</span>)
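<span class="comment">% Sketch, not part of the original M-file: an equivalent construction.</span>
<span class="comment">% Interior separators are midpoints of adjacent centers; the two outer</span>
<span class="comment">% edges mirror the nearest midpoint about the first and last center.</span>
<span class="comment">% mid, edges and xGridCheck are illustrative names.</span>
mid = (xCenters(1:end-1)+xCenters(2:end))/2; <span class="comment">%interior separators</span>
edges = [2*xCenters(1)-mid(1), mid, 2*xCenters(end)-mid(end)];
xGridCheck = round(edges); <span class="comment">%expected to match xGrid</span>
<span class="comment">% For example, centers 23 and 41 give a midpoint of 32 and a left edge of 14.</span>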
</pre><pre class="codeoutput">xGrid =
14 32 51 70 89 109 128 147 166 184 203
</pre><img vspace="5" hspace="5" src="R14_MicroarrayImage_CaseStudy_10.png"> <h2>Transpose and repeat<a name="12"></a></h2>
<p>We have just found the vertical grid lines from the column profile. Now we want the horizontal grid lines. To find them, we simply
transpose the image and repeat all the steps used above, this time without the intermediate graphics commands, which
summarizes the mathematical steps of the algorithm.
</p><pre class="codeinput">yProfile = mean(z'); <span class="comment">%peak profile</span>
ac = xcov(yProfile); <span class="comment">%autocovariance</span>
p1 = diff(ac([1 1:end]));
p2 = diff(ac([1:end end]));
maxima = find(p1&gt;0 &amp; p2&lt;0); <span class="comment">%peak locations</span>
estPeriod = round(median(diff(maxima))) <span class="comment">%spacing estimate</span>
seLine = strel(<span class="string">'line'</span>,estPeriod,0);
yProfile2 = imtophat(yProfile,seLine); <span class="comment">%background removed</span>
level = graythresh(yProfile2/255); <span class="comment">%automatic threshold level</span>
bw = im2bw(yProfile2/255,level); <span class="comment">%binarized peak regions</span>
L = bwlabel(bw); <span class="comment">%labeled regions</span>
stats = regionprops(L);
centroids = [stats.Centroid]; <span class="comment">%centroids</span>
yCenters = centroids(1:2:end) <span class="comment">%Y parts only</span>
gap = diff(yCenters)/2; <span class="comment">%inner region half widths</span>
first = yCenters(1)-gap(1);
<span class="comment">% list defining vertical boundaries between spot regions</span>
yGrid = round([first yCenters(1:end)+gap([1:end end])])
</pre><pre class="codeoutput">estPeriod =
20
yCenters =
Columns 1 through 8
24.0000 43.5000 64.0000 83.5000 104.5000 123.5000 144.5000 164.0000
Columns 9 through 10
183.0000 203.5000
yGrid =
14 34 54 74 94 114 134 154 174 193 214
</pre><h2>Put bounding boxes around each spot<a name="13"></a></h2>
<p>We have now found the rectangular grid. Using pairs of neighboring grid points we can form bounding box regions to address
each spot individually. The position and size coordinates of each bounding box were tabulated for convenience into a 4-column
matrix called <tt>ROI</tt>, which stands for regions of interest.
</p><pre class="codeinput">figure(f1)
imshow(z)
line(xGrid'*[1 1],yGrid([1 end]),<span class="string">'color'</span>,<span class="string">'b'</span>)
line(xGrid([1 end]),yGrid'*[1 1],<span class="string">'color'</span>,<span class="string">'b'</span>)
[X,Y] = meshgrid(xGrid(1:end-1),yGrid(1:end-1));
[dX,dY] = meshgrid(diff(xGrid),diff(yGrid));
ROI = [X(:) Y(:) dX(:) dY(:)];
<span class="comment">% first few rows of ROI table</span>
ROI(1:5,:)
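<span class="comment">% Sketch, not part of the original M-file: ROI rows are in column-major</span>
<span class="comment">% order (meshgrid followed by (:)), so spot i sits at grid row spotRow(i)</span>
<span class="comment">% and grid column spotCol(i). nRows, nCols, spotRow and spotCol are</span>
<span class="comment">% illustrative names.</span>
nRows = length(yGrid)-1;
nCols = length(xGrid)-1;
[spotRow,spotCol] = ind2sub([nRows nCols],(1:size(ROI,1))');
<span class="comment">% Spots 1 to nRows therefore run down the first grid column, which is why</span>
<span class="comment">% the first ROI rows share the same X and step in Y.</span>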
</pre><pre class="codeoutput">ans =
14 14 18 20
14 34 18 20
14 54 18 20
14 74 18 20
14 94 18 20
</pre><img vspace="5" hspace="5" src="R14_MicroarrayImage_CaseStudy_11.png"> <h2>Segment spots from background by thresholding<a name="14"></a></h2>
<p>Applying a single threshold level to the whole image so all spots are detected equally is generally a good idea. However,
in this case it doesn't work so well due to large differences in spot brightness.
</p><pre class="codeinput">fSpots = figure(<span class="string">'position'</span>,[265 163 647 327]);
subplot(121)
imshow(z)
title(<span class="string">'gray image'</span>)
subplot(122)
bw = im2bw(z,graythresh(z));
imshow(bw)
title(<span class="string">'global threshold'</span>)
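<span class="comment">% Sketch, not part of the original M-file: graythresh uses Otsu's method,</span>
<span class="comment">% which picks the level that maximizes the between-class variance of the</span>
<span class="comment">% histogram. A by-hand version for the uint8 image z follows; counts, p,</span>
<span class="comment">% omega, mu, muT, valid, sigmaB2 and otsuLevel are illustrative names.</span>
counts = imhist(z); <span class="comment">%256-bin histogram</span>
p = counts/sum(counts); <span class="comment">%gray level probabilities</span>
omega = cumsum(p); <span class="comment">%class probability up to each level</span>
mu = cumsum(p.*(0:255)'); <span class="comment">%cumulative mean up to each level</span>
muT = mu(end); <span class="comment">%global mean</span>
valid = omega&gt;0 &amp; omega&lt;1; <span class="comment">%avoid division by zero</span>
sigmaB2 = zeros(size(omega));
sigmaB2(valid) = (muT*omega(valid)-mu(valid)).^2 ./ (omega(valid).*(1-omega(valid)));
[maxVal,idx] = max(sigmaB2);
otsuLevel = (idx-1)/255; <span class="comment">%normalized to [0,1]</span>
<span class="comment">% otsuLevel should agree with graythresh(z) when the maximum is unique.</span>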
</pre><img vspace="5" hspace="5" src="R14_MicroarrayImage_CaseStudy_12.png"> <h2>Apply logarithmic transformation then threshold intensities<a name="15"></a></h2>
<p>One way to equalize large variations in magnitude is by transforming intensity values to logarithmic space. This works much
better but some weak spots are still missed.
</p><pre class="codeinput">figure(fSpots)
subplot(121)
z2 = uint8(log(double(z)+1)/log(255)*255);
imshow(z2)
title(<span class="string">'log intensity'</span>)
subplot(122)
bw = im2bw(z2,graythresh(z2));
imshow(bw)
title(<span class="string">'global threshold'</span>)
</pre><pre class="codeoutput">Warning: Conversion rounded non-integer floating point value to nearest uint8 value.
</pre><img vspace="5" hspace="5" src="R14_MicroarrayImage_CaseStudy_13.png"> <h2>Try local thresholding instead<a name="16"></a></h2>
<p>Alternatively, the bounding boxes can be used to determine local threshold values for each spot. The code is a little more
sophisticated, requiring looping and indexing. Unfortunately, the results are mixed. Weak spots showed up well but spots with
bright perimeters were as bad as the original global threshold before log space transformation.
</p><pre class="codeinput">figure(fSpots)
subplot(122)
bw = false(size(z));
<span class="keyword">for</span> i=1:length(ROI)
rows = round(ROI(i,2))+[0:(round(ROI(i,4))-1)];
cols = round(ROI(i,1))+[0:(round(ROI(i,3))-1)];
spot = z(rows,cols);
bw(rows,cols) = im2bw(spot,graythresh(spot));
<span class="keyword">end</span>
imshow(bw)
title(<span class="string">'local threshold'</span>)
</pre><img vspace="5" hspace="5" src="R14_MicroarrayImage_CaseStudy_14.png"> <h2>Logically combine local and global thresholds<a name="17"></a></h2>
<p>Since both have their merits, let's combine the best of both approaches. This can be done using logical operations on the binary
masks. These spot segmentation results are indeed much better.
</p><pre class="codeinput">figure(fSpots)
subplot(121)
bw = im2bw(z2,graythresh(z2));
<span class="keyword">for</span> i=1:length(ROI)
rows = round(ROI(i,2))+[0:(round(ROI(i,4))-1)];
cols = round(ROI(i,1))+[0:(round(ROI(i,3))-1)];
spot = z(rows,cols);
bw(rows,cols) = bw(rows,cols) | im2bw(spot,graythresh(spot));
<span class="keyword">end</span>
imshow(bw)
title(<span class="string">'combined threshold'</span>)
subplot(122)
imshow(z)
title(<span class="string">'linear intensity'</span>)
</pre><img vspace="5" hspace="5" src="R14_MicroarrayImage_CaseStudy_15.png"> <h2>Fill holes to solidify spots<a name="18"></a></h2>
<p>The silhouettes of some spots still contained pinholes. The whole image could be filled using a single call to <tt>imfill</tt> but this may not be a good idea. Notice that some spots run together. If four mutually adjacent spots (sharing a common corner)
were all joined at their edges then a single function call would incorrectly fill in the common corner as well. To avoid that
possibility, it's good insurance to fill each spot one bounding box region at a time by looping. Indeed, the spot segmentation
now looks quite good.
</p><pre class="codeinput">figure(fSpots)
subplot(121)
warning <span class="string">off</span> <span class="string">MATLAB:intConvertOverflow</span>
<span class="keyword">for</span> i=1:length(ROI)
rows = round(ROI(i,2))+[0:(round(ROI(i,4))-1)];
cols = round(ROI(i,1))+[0:(round(ROI(i,3))-1)];
bw(rows,cols) = imfill(bw(rows,cols),<span class="string">'holes'</span>);
<span class="keyword">end</span>
imshow(bw)
title(<span class="string">'filled pinholes'</span>)
</pre><img vspace="5" hspace="5" src="R14_MicroarrayImage_CaseStudy_16.png"> <h2>Label spot masks by bounding box<a name="19"></a></h2>
<p>If the gridding went well, each spot should be a single color. The results here are pretty good. There is still room for improvement.</p>
<p>TODO List:</p>
<div>
<ul>
<li>Due to slightly irregular spacing, for some spots a few pixels were mislabeled. With additional processing, the algorithm
could be extended to reclassify these stray pixels.
</li>
</ul>
</div>
<div>
<ul>
<li>The crescent shaped spot in row 8, column 4 could be completed to be more circular by using the 'ConvexImage' return value
from <tt>regionprops</tt>.
</li>
</ul>
</div>
<div>
<ul>
<li>The few stray pixels that are not attached to any spots could be removed as well.</li>
</ul>
</div>
<p>However, in this case the spot segmentation is good enough to proceed.</p><pre class="codeinput">L = zeros(size(bw));
<span class="keyword">for</span> i=1:length(ROI)
rows = ROI(i,2)+[0:(ROI(i,4)-1)];
cols = ROI(i,1)+[0:(ROI(i,3)-1)];
rectMask = L(rows,cols);
spotMask = bw(rows,cols);
rectMask(spotMask) = i;
L(rows,cols) = rectMask;
<span class="keyword">end</span>
map = [0 0 0; 0.5+0.5*rand(length(ROI),3)];
figure(f1)
imshow(L+1,map)
</pre><img vspace="5" hspace="5" src="R14_MicroarrayImage_CaseStudy_17.png"> <h2>Extract first spot for measurement<a name="20"></a></h2>
<p>We will now examine the first spot closely to see how we can measure its red and green intensities, and ultimately quantify
its gene expression value. The measurement technique can then be repeated for all spots.
</p><pre class="codeinput">rect = ROI(1,:); <span class="comment">%[X Y dX dY]</span>
spot = imcrop(y,rect); <span class="comment">%region around spot</span>
figure(f1)
imshow(spot,<span class="string">'notruesize'</span>)
</pre><img vspace="5" hspace="5" src="R14_MicroarrayImage_CaseStudy_18.png"> <h2>Measure spot intensity &amp; relative expression level<a name="21"></a></h2>
<p>We now simply calculate a nominal (median) intensity over the spot for both the red and green layers. A measure of gene expression
level can then be calculated from the two color intensities. Here a simple log-ratio measurement is shown. Other more robust
measures could be used instead. You could also perform some analysis of the quality of the spot.
</p><pre class="codeinput">mask = imcrop(L,rect)==1;
<span class="keyword">for</span> i=1:2
layer = spot(:,:,i);
intensity(i) = double(median(layer(mask)));
<span class="keyword">end</span>
intensity
expressionLevel = log(intensity(1)/intensity(2))
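<span class="comment">% Sketch, not part of the original M-file: microarray studies commonly</span>
<span class="comment">% report base-2 log ratios, so that a two-fold change maps to +1 or -1.</span>
<span class="comment">% log2Ratio is an illustrative name.</span>
log2Ratio = log2(intensity(1)/intensity(2)); <span class="comment">%equals expressionLevel/log(2)</span>
<span class="comment">% For intensity = [32 81] this is about -1.34.</span>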
</pre><pre class="codeoutput">intensity =
32 81
expressionLevel =
-0.9287
</pre><h2>Remove background, calculate again and compare measurements<a name="22"></a></h2>
<p>As you may have noticed, the background intensity around the spot was not zero. This could bias results. To see how much difference
it makes, we can perform background subtraction around all spots, again using <tt>imtophat</tt> but this time in 2D on the image using a disk shaped structuring element. Then we can calculate color intensities and relative
expression level again to see what effect background bias had on the measurement. In this case the measurement shows more
downregulation with background removed.
</p><pre class="codeinput">seDisk = strel(<span class="string">'disk'</span>,round(estPeriod));
spot2 = imtophat(spot,seDisk);
<span class="keyword">for</span> i=1:2
layer = spot2(:,:,i);
intensity(i) = double(median(layer(mask)));
<span class="keyword">end</span>
intensity
expressionLevel = log(intensity(1)/intensity(2))
</pre><pre class="codeoutput">intensity =
14 70
expressionLevel =
-1.6094
</pre><h2>Set up graphical display for results<a name="23"></a></h2>
<p>It is helpful to see red and green intensity values overlaid onto the respective color images to gain confidence that measured
intensities make sense. It is also helpful to overlay quantitative expression levels onto the original image to provide
additional visual assurance of measurement results. The rectangular grid also helps correlate measured values between images.
The flexibility of MATLAB's powerful <b>Handle Graphics</b> engine allows custom graphics like this to be set up quickly and easily.
</p><pre class="codeinput">f7 = figure(<span class="string">'position'</span>,[52 94 954 425]);
ax(1) = subplot(121);
subimage(y(:,:,1),redMap)
title(<span class="string">'red intensity'</span>)
ax(2) = subplot(122);
subimage(y(:,:,2),greenMap)
title(<span class="string">'green intensity'</span>)
f8 = figure(<span class="string">'position'</span>,[316 34 482 497]);
ax(3) = get(imshow(y,<span class="string">'notruesize'</span>),<span class="string">'parent'</span>);
title(<span class="string">'gene expression'</span>)
<span class="keyword">for</span> i=1:3
axes(ax(i))
axis <span class="string">off</span>
line(xGrid'*[1 1],yGrid([1 end]),<span class="string">'color'</span>,0.5*[1 1 1])
line(xGrid([1 end]),yGrid'*[1 1],<span class="string">'color'</span>,0.5*[1 1 1])
<span class="keyword">end</span>
</pre><img vspace="5" hspace="5" src="R14_MicroarrayImage_CaseStudy_19.png"> <img vspace="5" hspace="5" src="R14_MicroarrayImage_CaseStudy_20.png"> <h2>Repeat measurement for all spots<a name="24"></a></h2>
<p>We now repeat the spot extraction and intensity calculation for all the spots in the grid. Here the measured values were tabulated
as additional columns beside the ROI positions for each spot into a new matrix called <tt>spotData</tt>.
</p><pre class="codeinput">figure(f7), figure(f8)
spotData = [ROI zeros(length(ROI),5)];
<span class="keyword">for</span> i=1:length(ROI)
spot = imcrop(y,ROI(i,:)); <span class="comment">%raw image</span>
spot2 = imtophat(spot,seDisk); <span class="comment">%background removed</span>
mask = imcrop(L,ROI(i,:))==i; <span class="comment">%spot mask</span>
<span class="keyword">for</span> j=1:2
layer = spot2(:,:,j); <span class="comment">%color layer</span>
intensity(j) = double(median(layer(mask)));
text(ROI(i,1)+ROI(i,3)/2,ROI(i,2)+ROI(i,4)/2,sprintf(<span class="string">'%.0f'</span>,intensity(j)),<span class="keyword">...</span>
<span class="string">'color'</span>,<span class="string">'y'</span>,<span class="string">'HorizontalAlignment'</span>,<span class="string">'center'</span>,<span class="string">'parent'</span>,ax(j))
rawLayer = spot(:,:,j); <span class="comment">%color layer before background removal</span>
rawIntensity(j) = double(median(rawLayer(mask)));
<span class="keyword">end</span>
expression = log(intensity(1)/intensity(2));
text(ROI(i,1)+ROI(i,3)/2,ROI(i,2)+ROI(i,4)/2,sprintf(<span class="string">'%.2f'</span>,expression),<span class="keyword">...</span>
<span class="string">'color'</span>,<span class="string">'w'</span>,<span class="string">'HorizontalAlignment'</span>,<span class="string">'center'</span>,<span class="string">'parent'</span>,ax(3))
drawnow
spotData(i,5:9) = [intensity(:)' expression rawIntensity(:)'];
<span class="keyword">end</span>
</pre><img vspace="5" hspace="5" src="R14_MicroarrayImage_CaseStudy_21.png"> <img vspace="5" hspace="5" src="R14_MicroarrayImage_CaseStudy_22.png"> <h2>Export spot data to Excel spreadsheet<a name="25"></a></h2>
<p>MATLAB can write to many standard formats. We will use <tt>xlswrite</tt> to save the <tt>spotData</tt> to an Excel workbook.
</p>
<p>TODO list:</p>
<div>
<ul>
<li>prepend column names first (see <tt>xlswrite</tt> doc example using cell arrays)
</li>
</ul>
</div>
<div>
<ul>
<li>programmatically open spreadsheet in Excel (see <tt>winopen</tt> doc)
</li>
</ul>
</div><pre class="codeinput">xlswrite(<span class="string">'microarray.xls'</span>,spotData)
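<span class="comment">% Sketch for the first TODO item, not part of the original M-file: write</span>
<span class="comment">% column names above the data by passing a cell array. The column labels</span>
<span class="comment">% and the file name 'microarray_named.xls' are illustrative assumptions.</span>
header = {<span class="string">'X'</span>,<span class="string">'Y'</span>,<span class="string">'dX'</span>,<span class="string">'dY'</span>,<span class="string">'red'</span>,<span class="string">'green'</span>,<span class="string">'logRatio'</span>,<span class="string">'rawRed'</span>,<span class="string">'rawGreen'</span>};
xlswrite(<span class="string">'microarray_named.xls'</span>,[header; num2cell(spotData)])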
</pre><p class="footer"><br>
Published with MATLAB&reg; 7.0<br></p>
<!--
##### SOURCE BEGIN #####
%% Microarray Spot Finding Example
% This example shows a simple method for locating spots on a microarray and
% extracting the intensities of the spots. It can be downloaded from *MATLAB
% Central*.
% http://www.mathworks.com/matlabcentral
%% Start with clean slate
clear %empty workspace (no variables)
close all %no figures
clc %empty command window
%% Read image file
% MATLAB can read many standard image formats including TIFF, GIF and BMP
% using the |imread| command. In addition, the *Image Processing Toolbox*
% provides support for working with specialized image file formats such as
% DICOM. This microarray image was stored as a JPEG file. The image is
% much larger than the screen size, so |imshow| scales it down to fit and
% lets you know with a warning message.
x = imread('MicroArraySlide.JPG');
imageSize = size(x)
screenSize = get(0,'ScreenSize')
iptsetpref('ImshowBorder','tight')
imshow(x)
title('original image')
%% Crop specified region
% Next we use |imcrop| to extract a region of interest. You can repeat this
% for all print-tip blocks for a full microarray study.
y = imcrop(x,[622 2467 220 227]);
f1 = figure('position',[40 46 285 280]);
imshow(y)
%% Display red & green layers
% This image was stored in RGB format. We are only interested in the red
% and green planes. To extract the red plane, simply index layer 1. For the
% green plane, layer 2. Custom colormaps make visualization more intuitive.
% Notice that spot shapes are not necessarily the same in both colors.
f2 = figure('position',[265 163 647 327]);
subplot(121)
redMap = gray(256);
redMap(:,[2 3]) = 0;
subimage(y(:,:,1),redMap)
axis off
title('red (layer 1)')
subplot(122)
greenMap = gray(256);
greenMap(:,[1 3]) = 0;
subimage(y(:,:,2),greenMap)
axis off
title('green (layer 2)')
%% Convert RGB image to grayscale for spot finding
% Initially we care more about where the spots are located than their red
% and green intensities. Converting from RGB color to grayscale allows us
% to focus first on spot locations.
z = rgb2gray(y);
figure(f1)
imshow(z)
%% Create horizontal profile
% We are looking for a regular grid of spots so we start by looking at the
% mean intensity for each column of the image. This will help us identify
% where the centres of the spots are and where the gaps between the spots
% can be found.
xProfile = mean(z);
f2 = figure('position',[39 346 284 73]);
plot(xProfile)
title('horizontal profile')
axis tight
%% Estimate spot spacing by autocorrelation
% Ideally the spots would be periodically spaced and consistently printed, but
% in practice they tend to have different sizes and intensities, so the
% horizontal profile is irregular. We can use autocorrelation to enhance
% the self similarity of the profile. The smooth result promotes peak
% finding and estimation of spot spacing. The *Signal Processing Toolbox*
% allows easy computation of the autocorrelation function using the |xcov|
% command.
ac = xcov(xProfile); %autocovariance (mean removed, unscaled)
f3 = figure('position',[-3 427 569 94]);
plot(ac)
s1 = diff(ac([1 1:end])); %left slopes
s2 = diff(ac([1:end end])); %right slopes
maxima = find(s1>0 & s2<0); %peaks
estPeriod = round(median(diff(maxima))) %nominal spacing
hold on
plot(maxima,ac(maxima),'r^')
hold off
title('autocorrelation of profile')
axis tight
%% Remove background morphologically
% We can use the spacing estimate to help design a filter to remove the
% background noise from the intensity profile. We do this with the
% |imtophat| function from the *Image Processing Toolbox*. The |strel|
% command creates a simple rectangular 1D window or line shaped structuring
% element.
seLine = strel('line',estPeriod,0);
xProfile2 = imtophat(xProfile,seLine);
f4 = figure('position',[40 443 285 76]);
plot(xProfile2)
title('enhanced horizontal profile')
axis tight
%% Segment peaks
% Now that we have clean and anchored gaps between the peaks, we can number
% each peak region with the |bwlabel| command. These regions were segmented
% by thresholding with |im2bw|. The threshold value was automatically
% determined by statistical properties of the data using |graythresh|. This
% is a good example of how image processing techniques are often useful for
% 1D data analysis.
level = graythresh(xProfile2/255)*255
bw = im2bw(xProfile2/255,level/255);
L = bwlabel(bw);
f5 = figure('position',[40 540 285 70]);
plot(L)
axis tight
title('labelled regions')
%% Locate centers
% We can extract the centroids of the peaks. These correspond to the
% horizontal centres of the spots. This is a common blob analysis or
% feature extraction task that can be done with |regionprops|.
stats = regionprops(L);
centroids = [stats.Centroid];
xCenters = centroids(1:2:end)
figure(f5)
hold on
plot(xCenters,1:max(L),'ro')
hold off
title('region centers')
%% Determine divisions between spots
% The midpoints between adjacent peaks provide the grid point locations.
gap = diff(xCenters)/2;
first = xCenters(1)-gap(1);
xGrid = round([first xCenters(1:end)+gap([1:end end])])
figure(f2)
for i=1:length(xGrid)
line(xGrid(i)*[1 1],ylim,'color','m')
end
title('vertical separators')
%% Transpose and repeat
% We have just found the vertical grid lines from the column profile. Now we
% want the horizontal grid lines. To find them, we simply transpose the image
% and repeat all the steps used above, this time without the intermediate
% graphics commands, which summarizes the mathematical steps of the
% algorithm.
yProfile = mean(z'); %peak profile
ac = xcov(yProfile); %autocovariance
p1 = diff(ac([1 1:end]));
p2 = diff(ac([1:end end]));
maxima = find(p1>0 & p2<0); %peak locations
estPeriod = round(median(diff(maxima))) %spacing estimate
seLine = strel('line',estPeriod,0);
yProfile2 = imtophat(yProfile,seLine); %background removed
level = graythresh(yProfile2/255); %automatic threshold level
bw = im2bw(yProfile2/255,level); %binarized peak regions
L = bwlabel(bw); %labeled regions
stats = regionprops(L);
centroids = [stats.Centroid]; %centroids
yCenters = centroids(1:2:end) %positions along the profile (image Y)
gap = diff(yCenters)/2; %inner region half widths
first = yCenters(1)-gap(1);
% list of Y positions bounding the rows of spot regions
yGrid = round([first yCenters(1:end)+gap([1:end end])])
%% Put bounding boxes around each spot
% We have now found the rectangular grid. Using pairs of neighboring grid
% points we can form bounding box regions to address each spot
% individually. The position and size coordinates of each bounding box were
% tabulated for convenience into a 4-column matrix called |ROI|, which
% stands for regions of interest.
figure(f1)
imshow(z)
line(xGrid'*[1 1],yGrid([1 end]),'color','b')
line(xGrid([1 end]),yGrid'*[1 1],'color','b')
[X,Y] = meshgrid(xGrid(1:end-1),yGrid(1:end-1));
[dX,dY] = meshgrid(diff(xGrid),diff(yGrid));
ROI = [X(:) Y(:) dX(:) dY(:)];
% first few rows of ROI table
ROI(1:5,:)
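%% Check that the bounding boxes lie inside the image
% The thresholding loops that follow index |z| with rows and columns taken
% from |ROI|, so a grid line that falls outside the image would cause an
% indexing error. This is a minimal sanity check, assuming |z| is the 2D
% grayscale image used for gridding. The names |nRows|, |nCols| and
% |inBounds| are hypothetical and used only in this sketch.
[nRows,nCols] = size(z);
inBounds = ROI(:,1)>=1 & ROI(:,2)>=1 & ...
ROI(:,1)+ROI(:,3)-1<=nCols & ROI(:,2)+ROI(:,4)-1<=nRows;
if ~all(inBounds)
warning('%d bounding boxes extend past the image border.',sum(~inBounds))
end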
%% Segment spots from background by thresholding
% Applying a single threshold level to the whole image so all spots are
% detected equally is generally a good idea. However, in this case it
% doesn't work so well due to large differences in spot brightness.
fSpots = figure('position',[265 163 647 327]);
subplot(121)
imshow(z)
title('gray image')
subplot(122)
bw = im2bw(z,graythresh(z));
imshow(bw)
title('global threshold')
%% Apply logarithmic transformation then threshold intensities
% One way to equalize large variations in magnitude is by transforming
% intensity values to logarithmic space. This works much better but some
% weak spots are still missed.
figure(fSpots)
subplot(121)
z2 = uint8(log(double(z)+1)/log(255)*255);
imshow(z2)
title('log intensity')
subplot(122)
bw = im2bw(z2,graythresh(z2));
imshow(bw)
title('global threshold')
%% Try local thresholding instead
% Alternatively, the bounding boxes can be used to determine local
% threshold values for each spot. The code is a little more sophisticated,
% requiring looping and indexing. Unfortunately, the results are mixed.
% Weak spots showed up well but spots with bright perimeters were as bad as
% the original global threshold before log space transformation.
figure(fSpots)
subplot(122)
bw = false(size(z));
for i=1:length(ROI)
rows = round(ROI(i,2))+[0:(round(ROI(i,4))-1)];
cols = round(ROI(i,1))+[0:(round(ROI(i,3))-1)];
spot = z(rows,cols);
bw(rows,cols) = im2bw(spot,graythresh(spot));
end
imshow(bw)
title('local threshold')
%% Logically combine local and global thresholds
% Since both have their merits, let's combine the best of both approaches.
% This can be done using logical operations on the binary masks. These spot
% segmentation results are indeed much better.
figure(fSpots)
subplot(121)
bw = im2bw(z2,graythresh(z2));
for i=1:length(ROI)
rows = round(ROI(i,2))+[0:(round(ROI(i,4))-1)];
cols = round(ROI(i,1))+[0:(round(ROI(i,3))-1)];
spot = z(rows,cols);
bw(rows,cols) = bw(rows,cols) | im2bw(spot,graythresh(spot));
end
imshow(bw)
title('combined threshold')
subplot(122)
imshow(z)
title('linear intensity')
%% Fill holes to solidify spots
% The silhouettes of some spots still contained pinholes. The whole image
% could be filled using a single call to |imfill| but this may not be a
% good idea. Notice that some spots run together. If four mutually adjacent
% spots (sharing a common corner) were all joined at their edges then a
% single function call would incorrectly fill in the common corner as well.
% To avoid that possibility, it's good insurance to fill each spot one
% bounding box region at a time by looping. Indeed, the spot segmentation
% now looks quite good.
figure(fSpots)
subplot(121)
warning off MATLAB:intConvertOverflow
for i=1:length(ROI)
rows = round(ROI(i,2))+[0:(round(ROI(i,4))-1)];
cols = round(ROI(i,1))+[0:(round(ROI(i,3))-1)];
bw(rows,cols) = imfill(bw(rows,cols),'holes');
end
imshow(bw)
title('filled pinholes')
%% Label spot masks by bounding box
% If the gridding went well, each spot should be a single color. The
% results here are pretty good. There is still room for improvement.
%
% TODO List:
%
% * Due to slightly irregular spacing, for some spots a few pixels were
% mislabeled. With additional processing, the algorithm could be extended
% to reclassify these stray pixels.
%
% * The crescent shaped spot in row 8, column 4 could be completed to be
% more circular by using the 'ConvexImage' return value from |regionprops|.
%
% * The few stray pixels that are not attached to any spots could be
% removed as well.
%
% However, in this case the spot segmentation is good enough to proceed.
L = zeros(size(bw));
for i=1:length(ROI)
rows = ROI(i,2)+[0:(ROI(i,4)-1)];
cols = ROI(i,1)+[0:(ROI(i,3)-1)];
rectMask = L(rows,cols);
spotMask = bw(rows,cols);
rectMask(spotMask) = i;
L(rows,cols) = rectMask;
end
map = [0 0 0; 0.5+0.5*rand(length(ROI),3)];
figure(f1)
imshow(L+1,map)
%% Extract first spot for measurement
% We will now examine the first spot closely to see how we can measure its
% red and green intensities, and ultimately quantify its gene expression
% value. The measurement technique can then be repeated for all spots.
rect = ROI(1,:); %[X Y dX dY]
spot = imcrop(y,rect); %region around spot
figure(f1)
imshow(spot,'notruesize')
%% Measure spot intensity & relative expression level
% We now calculate a nominal intensity, here the median over the spot
% mask, for both the red and green layers. A measure of gene expression
% level can then be calculated from the two color intensities. Here a
% simple log-ratio measurement is shown. Other more robust measures could
% be used instead. You could also perform some analysis of the quality of
% the spot.
mask = imcrop(L,rect)==1;
for i=1:2
layer = spot(:,:,i);
intensity(i) = double(median(layer(mask)));
end
intensity
expressionLevel = log(intensity(1)/intensity(2))
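%% Express the same ratio in base 2
% The cell above uses the natural logarithm. Microarray ratios are often
% reported in base 2, where a value of 1 means a twofold difference. As a
% sketch, the same red/green ratio can be rescaled without measuring again;
% |expressionLevelLog2| is a hypothetical name used only here.
expressionLevelLog2 = expressionLevel/log(2)
% This equals log2(intensity(1)/intensity(2)). If either median intensity
% is zero the ratio is not finite, so such spots need separate handling.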
%% Remove background, calculate again and compare measurements
% You may have noticed that the background intensity around the spot was not
% zero. This could bias results. To see how much difference it makes, we can
% perform background subtraction around all spots, again using |imtophat|
% but this time in 2D on the image using a disk shaped structuring element.
% Then we can calculate color intensities and relative expression level
% again to see what effect background bias had on the measurement. In this
% case the measurement shows more downregulation with background removed.
seDisk = strel('disk',round(estPeriod));
spot2 = imtophat(spot,seDisk);
for i=1:2
layer = spot2(:,:,i);
intensity(i) = double(median(layer(mask)));
end
intensity
expressionLevel = log(intensity(1)/intensity(2))
%% Set up graphical display for results
% It is helpful to see red and green intensity values overlaid onto the
% respective color images to gain confidence that measured intensities make
% sense. It is also helpful to overlay quantitative expression levels
% onto the original image to provide additional visual assurance of
% measurement results. The rectangular grid also helps correlate measured
% values between images. The flexibility of MATLAB's powerful *Handle
% Graphics* engine allows custom graphics like this to be set up quickly and
% easily.
f7 = figure('position',[52 94 954 425]);
ax(1) = subplot(121);
subimage(y(:,:,1),redMap)
title('red intensity')
ax(2) = subplot(122);
subimage(y(:,:,2),greenMap)
title('green intensity')
f8 = figure('position',[316 34 482 497]);
ax(3) = get(imshow(y,'notruesize'),'parent');
title('gene expression')
for i=1:3
axes(ax(i))
axis off
line(xGrid'*[1 1],yGrid([1 end]),'color',0.5*[1 1 1])
line(xGrid([1 end]),yGrid'*[1 1],'color',0.5*[1 1 1])
end
%% Repeat measurement for all spots
% We now repeat the spot extraction and intensity calculation for all the
% spots in the grid. Here the measured values were tabulated as additional
% columns beside the ROI positions for each spot into a new matrix called
% |spotData|.
figure(f7), figure(f8)
spotData = [ROI zeros(length(ROI),5)];
for i=1:length(ROI)
spot = imcrop(y,ROI(i,:)); %raw image
spot2 = imtophat(spot,seDisk); %background removed
mask = imcrop(L,ROI(i,:))==i; %spot mask
for j=1:2
layer = spot2(:,:,j); %color layer
intensity(j) = double(median(layer(mask)));
text(ROI(i,1)+ROI(i,3)/2,ROI(i,2)+ROI(i,4)/2,sprintf('%.0f',intensity(j)),...
'color','y','HorizontalAlignment','center','parent',ax(j))
rawLayer = spot(:,:,j);
rawIntensity(j) = double(median(rawLayer(mask)));
end
expression = log(intensity(1)/intensity(2));
text(ROI(i,1)+ROI(i,3)/2,ROI(i,2)+ROI(i,4)/2,sprintf('%.2f',expression),...
'color','w','HorizontalAlignment','center','parent',ax(3))
drawnow
spotData(i,5:9) = [intensity(:)' expression rawIntensity(:)'];
end
%% Export spot data to Excel spreadsheet
% MATLAB can write to many standard formats. We will use |xlswrite| to save
% the |spotData| to an Excel workbook.
%
% TODO list:
%
% * prepend column names first (see |xlswrite| doc example using cell arrays)
%
% * programmatically open spreadsheet in Excel (see |winopen| doc)
xlswrite('microarray.xls',spotData)
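%% Sketch: export with column names
% One possible way to address the first TODO item, assuming |xlswrite|
% accepts a cell array as in its documentation example. The column names
% and the file name |microarray_named.xls| are hypothetical; the order
% follows how |spotData| was assembled above (ROI, red and green
% intensities after background removal, log ratio, then the two
% |rawIntensity| values).
header = {'X','Y','dX','dY','red','green','logRatio','rawRed','rawGreen'};
xlswrite('microarray_named.xls',[header; num2cell(spotData)])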
##### SOURCE END #####
-->
</body>
</html>