7 Feature Engineering for Image Segmentation
7.1 Why Feature Maps Matter
In a bladder-cytology segmentation project, the task is to label each pixel as nucleus, cytoplasm, or background. Once the nucleus and cytoplasm are separated, one can estimate the nucleus-to-cytoplasm (N/C) ratio, a key morphologic criterion in urinary cytology for assessing high-grade urothelial carcinoma.
The filters below are also the right conceptual bridge to convolutional neural networks. Most of them take a small neighborhood — a kernel sliding across the image — and produce a new feature map emphasizing a particular signal: smoothness, boundaries, or texture. CNNs generalize this idea by learning the best kernels from data. Studying handcrafted filters first makes that leap concrete.

7.2 The Kernel Convolution Widget
Before exploring specific filters, build intuition for how any linear filter works. The widget shows a 3×3 kernel sliding over a small image of pixel values (0–1). Edit the kernel weights and watch the output feature map update live.
How to use: Click any pixel to see the dot-product breakdown. Edit the nine kernel values to define any 3×3 filter. Use the preset buttons to load common filters. Click Animate to sweep the kernel row by row.
The key insight: changing the kernel weights changes what the feature map emphasizes. CNNs learn these weights automatically from data.
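The sliding dot product that the widget animates can be sketched in a few lines of NumPy. This is only an illustration: the function name `slide_kernel`, the 5×5 test image, and the two kernels are assumptions made for the example, not the widget's actual code. The kernel is applied without flipping (cross-correlation), which is the convention CNN libraries use; for symmetric kernels such as the mean filter this gives the same result as true convolution.

```python
import numpy as np

def slide_kernel(image, kernel):
    """Slide a small kernel over the image and take a dot product at each
    position ('valid' mode: the output shrinks by kernel_size - 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            patch = image[r:r + kh, c:c + kw]
            out[r, c] = np.sum(patch * kernel)  # elementwise product, then sum
    return out

# Hypothetical 5x5 image: dark left part, bright right part
image = np.array([[0.0, 0.0, 1.0, 1.0, 1.0]] * 5)

mean_kernel = np.full((3, 3), 1 / 9)                    # smoothing
edge_kernel = np.array([[-1, 0, 1]] * 3, dtype=float)   # simple vertical-edge detector

mean_map = slide_kernel(image, mean_kernel)
edge_map = slide_kernel(image, edge_kernel)
print(mean_map[0])
print(edge_map[0])
```

The same image produces two different feature maps: the mean kernel gives a soft ramp across the boundary, while the edge kernel responds only where the window straddles the dark-to-bright transition.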
7.3 Raw Intensity and Color-Channel Maps

Before filtering, split the image into one or more base channels. In a grayscale image the base map is simply \(I(x,y)\). In a color image: \[I(x,y) = \big(R(x,y),\,G(x,y),\,B(x,y)\big)\]
A standard grayscale projection: \[Y(x,y)=0.2126\,R(x,y)+0.7152\,G(x,y)+0.0722\,B(x,y)\]
Converting to HSV yields \((H,\,S,\,V)\), making hue, saturation, and brightness separate candidate feature maps.
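As a minimal sketch of these base maps, the snippet below computes the grayscale projection and the HSV channels for three made-up pixels. The RGB values are illustrative assumptions (a dark purple, a pink, and a near-white pixel), not measurements from real slides, and the helper name `luma` is hypothetical.

```python
import colorsys
import numpy as np

def luma(rgb):
    """Grayscale projection Y = 0.2126 R + 0.7152 G + 0.0722 B."""
    weights = np.array([0.2126, 0.7152, 0.0722])
    return rgb @ weights

# Hypothetical 1x3 RGB image with values in [0, 1]:
# nucleus-like, cytoplasm-like, and background-like pixels
rgb = np.array([[[0.30, 0.10, 0.45],
                 [0.90, 0.60, 0.75],
                 [0.95, 0.95, 0.95]]])

gray = luma(rgb)

# colorsys converts one pixel at a time, which is fine for a tiny example
hsv = np.array([[colorsys.rgb_to_hsv(*px) for px in row] for row in rgb])
saturation = hsv[..., 1]
value = hsv[..., 2]

print(np.round(gray, 3))
print(np.round(saturation, 3))
print(np.round(value, 3))
```

Each of `gray`, `saturation`, and `value` is a candidate feature map with the same height and width as the image. In this toy case the background pixel is the only one with zero saturation.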
Why this matters for nucleus / cytoplasm / background:
- Nucleus often differs from cytoplasm by darkness or stain concentration.
- Cytoplasm may separate more cleanly in one channel than another.
- Background is often relatively uniform in value or saturation.
The first question in any segmentation pipeline: which channel already gives the cleanest visual separation?
7.4 Gaussian Blur and Mean Blur


These are the clearest examples of a kernel sliding across the image — exactly what the widget above demonstrates.
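A linear blur with a \((2r+1)\times(2r+1)\) kernel \(K\) (so \(r=1\) for a \(3\times 3\) kernel) computes each output pixel as a weighted sum of its neighborhood: \[F(x,y)=(K * I)(x,y)=\sum_{u=-r}^{r}\sum_{v=-r}^{r} K(u,v)\,I(x-u,y-v)\]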
For a \(3\times 3\) mean filter: \[K_{\text{mean}}=\frac{1}{9} \begin{bmatrix} 1&1&1\\ 1&1&1\\ 1&1&1 \end{bmatrix}\]
A Gaussian blur uses a kernel whose values follow a 2D normal distribution: \[G_\sigma(u,v)=\frac{1}{2\pi\sigma^2}\exp\!\left(-\frac{u^2+v^2}{2\sigma^2}\right)\]
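Both kernels can be written out explicitly. The sketch below samples \(G_\sigma\) at integer offsets and renormalizes so the weights sum to 1, which is a common practical step because a truncated Gaussian no longer sums exactly to 1. The function name `gaussian_kernel` and the choice \(\sigma = 1\) are assumptions for illustration.

```python
import numpy as np

def gaussian_kernel(radius, sigma):
    """Sample G_sigma(u, v) at integer offsets and renormalize to sum to 1."""
    offsets = np.arange(-radius, radius + 1)
    u, v = np.meshgrid(offsets, offsets, indexing="ij")
    kernel = np.exp(-(u**2 + v**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)
    return kernel / kernel.sum()

mean_3x3 = np.full((3, 3), 1 / 9)                   # every neighbor weighted equally
gauss_3x3 = gaussian_kernel(radius=1, sigma=1.0)    # center weighted most

print(np.round(mean_3x3, 3))
print(np.round(gauss_3x3, 3))
```

The mean kernel treats all nine pixels alike, while the Gaussian kernel gives the center the largest weight and the corners the smallest, which is why it blurs more gently for the same window size.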
Why this matters:
- Reduces random pixel noise before stronger feature extraction.
- Produces a coarse low-frequency map that separates broad cell regions from background.
- Makes downstream edge maps more stable.
Too much blur weakens the boundaries you want to preserve. Apply conservatively, and always before edge detection.
7.5 Median and Rank Filters
Unlike mean and Gaussian blur, a median filter does not multiply and sum — it sorts the values in a local window and picks the middle one. This single difference makes it remarkably effective at removing isolated noise spikes without blurring edges.
7.5.1 How it works — step by step
For every output pixel, the filter:
- Places a 3×3 window centred on that pixel — collecting 9 values.
- Sorts those 9 values from smallest to largest.
- Returns the 5th value (rank 5 of 9) — the median.
\[F(x,y)=\operatorname{median}\{I(i,j):(i,j)\in W_{x,y}\}\]
Example — a salt spike (value 9) buried in the nucleus (value 1):
| Step | Values |
|---|---|
| Window contents | 1, 1, 1, 1, 9, 1, 1, 1, 1 |
| Sorted | 1, 1, 1, 1, 1, 1, 1, 1, 9 |
| Output (rank 5) | 1 ✓ spike removed |
A mean filter would give \(\frac{8\times1+9}{9} \approx 1.9\) — a residual artifact. The median gives exactly 1, because the spike is pushed to the end of the sorted list and never reaches rank 5.
Example — a pepper spike (value 0) in the bright background (value 8):
| Step | Values |
|---|---|
| Window contents | 8, 8, 8, 8, 0, 8, 8, 8, 8 |
| Sorted | 0, 8, 8, 8, 8, 8, 8, 8, 8 |
| Output (rank 5) | 8 ✓ spike removed |
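The two worked examples can be checked with a short script that follows the same sort-and-pick steps. The helper name `median_of_window` is hypothetical; the window values are the ones from the tables above.

```python
import numpy as np

def median_of_window(window):
    """Sort the 9 values of a 3x3 window and return the 5th (index 4)."""
    ordered = np.sort(np.asarray(window).ravel())
    return ordered[len(ordered) // 2]

salt = [1, 1, 1, 1, 9, 1, 1, 1, 1]      # salt spike in the dark nucleus
pepper = [8, 8, 8, 8, 0, 8, 8, 8, 8]    # pepper spike in the bright background

salt_median, salt_mean = median_of_window(salt), np.mean(salt)
pepper_median, pepper_mean = median_of_window(pepper), np.mean(pepper)

print(salt_median, round(salt_mean, 2))
print(pepper_median, round(pepper_mean, 2))
```

In both cases the median returns the undisturbed region value, while the mean is pulled toward the spike.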
7.5.2 Interactive widget
The grid below uses values 0–9 (0 = black, 9 = white), matching the real staining convention: nucleus = 1 (dark), cytoplasm = 5 (medium), background = 8 (bright). Two noise spikes are planted: a salt spike at row 3, col 4 (value 9 in the dark nucleus — click it first) and a pepper spike at row 7, col 3 (value 0 in the bright background). Click any pixel to see its 3×3 window, the sorted values, and the output. Then click Apply filter to compute the full output image.
7.5.3 Rank filters — a generalisation
A rank filter returns the k-th ordered value rather than always the median:
\[F(x,y)=\operatorname{rank}_k\{I(i,j):(i,j)\in W_{x,y}\}\]
| k | Name | Effect |
|---|---|---|
| 1 | Erosion (minimum) | Shrinks and darkens bright regions |
| 5 of 9 | Median | Removes spikes; preserves edges |
| 9 | Dilation (maximum) | Expands and brightens bright regions |
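The three rows of the table can be tried with `scipy.ndimage.rank_filter`. One detail to keep in mind is that SciPy counts ranks from 0, so \(k = 1, 5, 9\) become `rank=0, 4, 8` for a 3×3 window. The 5×5 test image is an assumption chosen to make the effect easy to read.

```python
import numpy as np
from scipy.ndimage import rank_filter

# Hypothetical 5x5 image: one bright 3x3 block on a dark background
image = np.zeros((5, 5), dtype=int)
image[1:4, 1:4] = 9

# SciPy ranks are 0-based: k = 1, 5, 9 in the table become rank = 0, 4, 8
eroded = rank_filter(image, rank=0, size=3)    # minimum of each window
median = rank_filter(image, rank=4, size=3)    # middle value of each window
dilated = rank_filter(image, rank=8, size=3)   # maximum of each window

print(eroded)
print(median)
print(dilated)
```

Erosion shrinks the bright block, dilation grows it, and the median sits in between. Note that the median can still round off sharp corners of a small object, since a corner pixel has fewer bright neighbors than dark ones.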
Why this matters:
- Excellent at suppressing impulse noise (salt-and-pepper artifacts).
- More edge-preserving than mean blur — the output is always one of the actual neighborhood values, never a blend.
- Cleans isolated specks in the background while keeping cell boundaries sharp.
Unlike mean and Gaussian blur, the median filter cannot be written as a convolution \(K * I\) with a fixed kernel — it is inherently nonlinear. This is why it does not appear as a preset in the kernel widget above.
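A quick way to see the nonlinearity is to test additivity, which any convolution must satisfy: filtering a sum of two inputs should equal the sum of the filtered inputs. The two windows below are made-up values chosen so the difference is obvious.

```python
import numpy as np

# Two hypothetical 3x3 windows, flattened
a = np.array([9, 9, 9, 9, 0, 0, 0, 0, 0])
b = np.array([0, 0, 0, 0, 0, 9, 9, 9, 9])

# A linear filter satisfies f(a + b) == f(a) + f(b)
mean_is_additive = bool(np.isclose(np.mean(a + b), np.mean(a) + np.mean(b)))
median_is_additive = bool(np.isclose(np.median(a + b), np.median(a) + np.median(b)))

print(mean_is_additive, median_is_additive)
```

The mean passes the test and the median fails it, so no fixed set of kernel weights can reproduce the median for every input.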
7.6 Sobel / Scharr Gradient Magnitude

Sobel estimates local image derivatives using two fixed kernels:
\[K_x= \begin{bmatrix} -1&0&1\\ -2&0&2\\ -1&0&1 \end{bmatrix}, \qquad K_y= \begin{bmatrix} -1&-2&-1\\ 0&0&0\\ 1&2&1 \end{bmatrix}\]
Gradient components and magnitude: \[G_x = K_x * I,\quad G_y = K_y * I,\quad M(x,y)=\sqrt{G_x^2 + G_y^2}\]
Load Sobel X or Sobel Y in the widget to see how these kernels respond to edges in the example image.
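To see the numbers behind the formulas, the sketch below applies \(K_x\) and \(K_y\) to a tiny image with a single vertical edge. The 6×6 image and the border mode are assumptions for the example. `scipy.ndimage.correlate` applies the kernel without flipping it; true convolution would only change the sign of \(G_x\) and \(G_y\), not the magnitude \(M\).

```python
import numpy as np
from scipy.ndimage import correlate

K_x = np.array([[-1, 0, 1],
                [-2, 0, 2],
                [-1, 0, 1]], dtype=float)
K_y = K_x.T  # same values as K_y in the text

# Hypothetical 6x6 image with a vertical edge: dark left half, bright right half
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# mode="nearest" repeats border pixels so the image border adds no fake edges
G_x = correlate(image, K_x, mode="nearest")
G_y = correlate(image, K_y, mode="nearest")
M = np.sqrt(G_x**2 + G_y**2)

print(G_x[0])
print(G_y[0])
print(M[0])
```

Only the horizontal derivative responds, and only in the two columns next to the edge. A horizontal edge would show the opposite pattern, with the response in \(G_y\).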
Why this matters:
- Highlights nucleus boundaries and cell boundaries directly.
- Bright responses in \(M\) align with visible contours in the image.
- One of the most interpretable handcrafted maps for segmentation.
Derivative-based maps are noise-sensitive. Apply a Gaussian blur first.
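The effect of blurring first can be measured on a region that contains noise but no real edges. In this sketch the noise level, the image size, and \(\sigma = 1.5\) are arbitrary assumptions; the point is only the comparison between the two responses.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

rng = np.random.default_rng(0)
# Hypothetical flat region with additive noise: there are no true edges here
noisy = 0.5 + 0.1 * rng.standard_normal((64, 64))

def gradient_magnitude(img):
    """Sobel gradient magnitude from the horizontal and vertical derivatives."""
    return np.hypot(sobel(img, axis=1), sobel(img, axis=0))

raw_response = gradient_magnitude(noisy).mean()
blurred_response = gradient_magnitude(gaussian_filter(noisy, sigma=1.5)).mean()

print(raw_response, blurred_response)
```

Any gradient response in this image is a false alarm, and the blurred version produces far less of it. On a real image the same blur also softens true boundaries, so the choice of \(\sigma\) is a trade-off.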
7.7 Gabor Filters
A Gabor filter is a Gaussian-modulated sinusoid — it detects texture at a specific orientation and spatial frequency. Where Sobel detects any edge regardless of frequency, Gabor responds selectively to periodic patterns (e.g., chromatin texture in a nucleus).
7.7.1 From Formula to Matrix: Sampling a Continuous Function
A common point of confusion: the Gabor formula looks abstract, but it is just a recipe for filling in a kernel matrix. The same principle applies to the Gaussian kernel — both are continuous functions that you evaluate at discrete integer pixel offsets to produce the actual numbers in the sliding window.
For a 3×3 kernel, the offsets are \((a, b) \in \{-1, 0, +1\} \times \{-1, 0, +1\}\), where \(a\) is the row offset (negative = up, positive = down) and \(b\) is the column offset (negative = left, positive = right). Plug each pair into the formula to get the corresponding matrix entry:
\[K[a,\,b] = \underbrace{\exp\!\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right)}_{\text{Gaussian envelope}} \cdot \underbrace{\cos\!\left(\frac{2\pi\, x'}{\lambda}\right)}_{\text{sinusoidal wave}}\]
\[x' = a\cos\theta + b\sin\theta, \qquad y' = -a\sin\theta + b\cos\theta\]
Two pieces working together:
- The Gaussian envelope (\(\exp\) term) weights entries by distance from center — full weight at \((0,0)\), tapering toward the edges. This is exactly like a Gaussian blur kernel.
- The sinusoidal wave (\(\cos\) term) creates alternating positive and negative bands across the kernel. \(\lambda\) sets the band width; \(\theta\) controls which direction the bands run.
Multiply the two and you get a kernel that responds to a specific texture frequency at a specific orientation.
7.7.2 What θ Does: Rotating the Stripe Pattern
The angle \(\theta\) rotates the coordinate system before the cosine is applied. At θ = 0°, \(x' = a\) (the row direction), so the cosine varies across rows → horizontal bands. At θ = 90°, \(x' = b\) (the column direction), so the cosine varies across columns → vertical bands.
Here are the two kernels computed from the formula with \(\lambda=2\), \(\sigma=1\), \(\gamma=1\):
\[K_{0°} \approx \begin{bmatrix} -0.37 & -0.61 & -0.37 \\ 0.61 & 1.00 & 0.61 \\ -0.37 & -0.61 & -0.37 \end{bmatrix} \qquad K_{90°} \approx \begin{bmatrix} -0.37 & 0.61 & -0.37 \\ -0.61 & 1.00 & -0.61 \\ -0.37 & 0.61 & -0.37 \end{bmatrix}\]
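\(K_{0°}\) fires when the center row is bright and the rows above and below are dark, that is, when a horizontal band crosses the kernel. \(K_{90°}\) fires when the center column is bright and the columns to its left and right are dark: a vertical band. Notice that the two are transposes of each other. Because both kernels are symmetric under up-down and left-right flips, rotating by 90° here amounts to transposing the matrix; for a general kernel, a 90° rotation is a transpose followed by a flip.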
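The two matrices can be reproduced by evaluating the formula at the nine offsets. This is a sketch under the conventions stated above (\(a\) is the row offset, \(b\) the column offset); the function name `gabor_kernel` is hypothetical.

```python
import numpy as np

def gabor_kernel(theta_deg, lam, sigma, gamma, radius=1):
    """Evaluate the Gabor formula at integer (row, column) offsets (a, b)."""
    theta = np.deg2rad(theta_deg)
    offsets = np.arange(-radius, radius + 1)
    a, b = np.meshgrid(offsets, offsets, indexing="ij")  # a: rows, b: columns
    x_rot = a * np.cos(theta) + b * np.sin(theta)
    y_rot = -a * np.sin(theta) + b * np.cos(theta)
    envelope = np.exp(-(x_rot**2 + gamma**2 * y_rot**2) / (2 * sigma**2))
    wave = np.cos(2 * np.pi * x_rot / lam)
    return envelope * wave

K_0 = gabor_kernel(0, lam=2, sigma=1, gamma=1)
K_90 = gabor_kernel(90, lam=2, sigma=1, gamma=1)

print(np.round(K_0, 2))
print(np.round(K_90, 2))
```

Passing `radius=2` gives the 5×5 version, where diagonal orientations become easier to recognize.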
For diagonal orientations (45°, 135°), the stripe pattern tilts, but a 3×3 grid is too coarse to display the diagonal clearly. A 5×5 kernel makes the orientation far more legible — which is why the widget below uses 5×5.
7.7.3 The Other Three Parameters
| Parameter | Concrete meaning | Smaller value | Larger value |
|---|---|---|---|
| \(\lambda\) | Stripe width (wavelength) | Tight stripes — detects fine texture | Wide stripes — detects coarse texture |
| \(\sigma\) | Gaussian bell width | Small window, few stripes visible | Large window, more stripes captured |
| \(\gamma\) | Aspect ratio of the Gaussian ellipse | Elongated along the stripe direction | More circular envelope |
A useful rule of thumb: keep \(\sigma \approx \lambda / \pi\) so that roughly one full stripe cycle fits within the Gaussian envelope.
7.7.4 Interactive Gabor Explorer
Edit the 10×10 image (click any cell to change its value), then drag the sliders to see the 5×5 kernel and its output feature map update live.
The default image has horizontal stripes — try θ = 0° first, then rotate to 90° and watch the output respond differently. Red cells in the output mean a strong positive match; blue cells mean a strong negative response (the inverse of the target texture).
How to use: Drag θ from 0° to 90° and watch the kernel’s stripe pattern rotate — the output’s bright-red regions shift accordingly. Try λ = 2 for tight stripes vs λ = 6 for wide ones. Increase σ to widen the Gaussian envelope and let more rows contribute. Click any input cell to type a new value (0.0–1.0), then draw your own texture and observe how the output responds.
Load the Gabor (0°) preset in the kernel convolution widget at the top of the chapter to see this same 3×3 version applied to the fixed example image.
7.7.5 Gabor Filter Banks
In practice a filter bank is used: multiple Gabor filters at several orientations (0°, 45°, 90°, 135°) and scales. Each filter produces one feature map. Together they form a rich multi-channel texture descriptor.
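A small filter bank can be sketched with NumPy alone. Several choices below are assumptions made for the illustration: the parameters \(\lambda = 4\), \(\sigma = 2\), \(\gamma = 1\), the 5×5 kernel size, the striped test image, and subtracting the kernel mean so that flat regions give a response near zero. The helper name `gabor_kernel` is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def gabor_kernel(theta_deg, lam=4.0, sigma=2.0, gamma=1.0, radius=2):
    """Sample the Gabor formula on a (2*radius+1)^2 grid of (row, col) offsets."""
    theta = np.deg2rad(theta_deg)
    offsets = np.arange(-radius, radius + 1)
    a, b = np.meshgrid(offsets, offsets, indexing="ij")
    x_rot = a * np.cos(theta) + b * np.sin(theta)
    y_rot = -a * np.sin(theta) + b * np.cos(theta)
    kernel = (np.exp(-(x_rot**2 + gamma**2 * y_rot**2) / (2 * sigma**2))
              * np.cos(2 * np.pi * x_rot / lam))
    return kernel - kernel.mean()  # assumption: zero mean, so flat areas give ~0

# Hypothetical texture: horizontal stripes, two bright rows then two dark rows
image = np.zeros((16, 16))
image[(np.arange(16) % 4) < 2, :] = 1.0

windows = sliding_window_view(image, (5, 5))   # every 5x5 neighborhood ('valid' mode)
orientations = [0, 45, 90, 135]
feature_maps = np.stack([np.einsum("rcuv,uv->rc", windows, gabor_kernel(t))
                         for t in orientations])   # one 12x12 map per orientation

strength = np.abs(feature_maps).mean(axis=(1, 2))  # one summary number per filter
best = orientations[int(np.argmax(strength))]
print(dict(zip(orientations, np.round(strength, 2))), best)
```

The stack `feature_maps` is the multi-channel descriptor: each pixel now carries four texture responses instead of one intensity. For this striped image the 0° filter responds most strongly, matching the orientation convention used in this chapter.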
Why this matters for nucleus / cytoplasm / background:
- Nucleus has distinctive chromatin texture — fine-grained periodic patterns visible at specific orientations and frequencies.
- Cytoplasm is smoother and less structured.
- Background is largely uniform, responding weakly to all Gabor filters.
- A Gabor filter bank can distinguish these regions where Sobel or blur alone cannot.
Gabor filters are linear convolutions — they are fixed kernels applied via \(F = K_{\text{Gabor}} * I\). A CNN trained on texture data tends to learn filters that closely resemble Gabor kernels in its early layers.
7.8 Gray Level Co-Occurrence Matrix (GLCM)
The Gray Level Co-Occurrence Matrix (GLCM) is a fundamentally different kind of feature descriptor. It does not produce a feature map by convolution. Instead, it computes second-order statistics — how often specific pairs of intensity values appear together at a given spatial offset.
7.8.1 Building the GLCM
For a given offset \(\Delta = (\Delta r, \Delta c)\) and an image with \(N_g\) gray levels, the GLCM is a \(N_g \times N_g\) matrix:
\[C_{\Delta}(i,\,j) = \#\bigl\{(r,c) : I(r,c)=i \text{ and } I(r+\Delta r,\,c+\Delta c)=j\bigr\}\]
A small example with 4 gray levels and offset \(\Delta=(0,1)\) (one step to the right):
\[I = \begin{bmatrix}0&1&2\\1&2&3\\2&3&1\end{bmatrix} \qquad\Rightarrow\qquad C_{(0,1)} = \begin{bmatrix}0&1&0&0\\0&0&2&0\\0&0&0&2\\0&1&0&0\end{bmatrix}\]
Row \(i\), column \(j\) of \(C\) counts how many times intensity \(i\) is immediately to the left of intensity \(j\).
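The counting rule can be written as a direct loop over pixel pairs. This is a deliberately simple sketch (the function name `glcm` is hypothetical, and library implementations are vectorized), applied to the 3×3 example image with offset \((0,1)\).

```python
import numpy as np

def glcm(image, d_row, d_col, levels):
    """Count pairs (i, j) where value j sits at offset (d_row, d_col) from value i."""
    counts = np.zeros((levels, levels), dtype=int)
    rows, cols = image.shape
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:  # skip pairs that leave the image
                counts[image[r, c], image[r2, c2]] += 1
    return counts

I = np.array([[0, 1, 2],
              [1, 2, 3],
              [2, 3, 1]])

C = glcm(I, d_row=0, d_col=1, levels=4)
print(C)
print(C.sum())  # 3 rows x 2 horizontal neighbor pairs per row
```

A useful sanity check is the total count: a 3×3 image has six horizontally adjacent pairs, so the entries of the matrix must sum to 6.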
7.8.2 Haralick Features
The GLCM is rarely used directly. Instead, scalar Haralick features are extracted:
| Feature | Formula | What it captures |
|---|---|---|
| Contrast | \(\sum_{i,j}(i-j)^2\,\tilde{C}(i,j)\) | Local intensity variation |
| Energy | \(\sum_{i,j}\tilde{C}(i,j)^2\) | Textural uniformity |
| Homogeneity | \(\sum_{i,j}\frac{\tilde{C}(i,j)}{1+|i-j|}\) | Diagonal dominance |
| Correlation | \(\sum_{i,j}\frac{(i-\mu_i)(j-\mu_j)\tilde{C}(i,j)}{\sigma_i\sigma_j}\) | Linear gray-level dependency |
where \(\tilde{C}\) is the normalized GLCM (\(\sum_{i,j}\tilde{C}(i,j)=1\)).
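The four formulas translate almost directly into NumPy. The sketch below follows the definitions in the table; the function name `haralick_features` and the two 3-level count matrices are assumptions made for illustration.

```python
import numpy as np

def haralick_features(counts):
    """Contrast, energy, homogeneity, and correlation from a GLCM of raw counts."""
    P = counts / counts.sum()                  # normalized GLCM, sums to 1
    i, j = np.indices(P.shape)
    mu_i, mu_j = (i * P).sum(), (j * P).sum()
    sigma_i = np.sqrt(((i - mu_i) ** 2 * P).sum())
    sigma_j = np.sqrt(((j - mu_j) ** 2 * P).sum())
    return {
        "contrast": ((i - j) ** 2 * P).sum(),
        "energy": (P ** 2).sum(),
        "homogeneity": (P / (1 + np.abs(i - j))).sum(),
        "correlation": ((i - mu_i) * (j - mu_j) * P).sum() / (sigma_i * sigma_j),
    }

# Hypothetical GLCMs over 3 gray levels
smooth = np.array([[4, 0, 0], [0, 4, 0], [0, 0, 4]])     # neighbors always equal
textured = np.array([[0, 2, 2], [2, 0, 2], [2, 2, 0]])   # neighbors always differ

smooth_features = haralick_features(smooth)
textured_features = haralick_features(textured)
print(smooth_features)
print(textured_features)
```

Library definitions can differ slightly from the table. For example, scikit-image's `graycoprops` reports `energy` as the square root of the sum used here (the sum itself is called `ASM`) and uses a squared difference in the homogeneity denominator, so values from different tools should not be compared without checking the formulas.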
7.8.3 Using GLCM for Segmentation
A single GLCM describes the whole image. For pixel-level segmentation, compute the GLCM in a sliding window (e.g., 15×15 pixels) centered on each pixel. Each window produces one GLCM, yielding one Haralick feature value per pixel — a new scalar feature map.
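A per-pixel contrast map can be sketched as follows. The window here is 7×7 rather than 15×15 to keep the toy example fast, and the 8-level test image, the offset \((0,1)\), and the function names are all assumptions for illustration. The double loop is written for clarity, not speed.

```python
import numpy as np

def window_contrast(window, levels):
    """GLCM contrast for offset (0, 1) inside one window."""
    left, right = window[:, :-1].ravel(), window[:, 1:].ravel()
    counts = np.zeros((levels, levels))
    np.add.at(counts, (left, right), 1)        # build this window's GLCM
    P = counts / counts.sum()
    i, j = np.indices(P.shape)
    return ((i - j) ** 2 * P).sum()

def contrast_map(image, levels, half=3):
    """One contrast value per pixel from a (2*half+1) x (2*half+1) sliding window."""
    size = 2 * half + 1
    padded = np.pad(image, half, mode="reflect")   # so border pixels get a full window
    out = np.zeros(image.shape)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            out[r, c] = window_contrast(padded[r:r + size, c:c + size], levels)
    return out

rng = np.random.default_rng(0)
# Hypothetical 8-level image: flat background with a noisy, "textured" square
image = np.zeros((32, 32), dtype=int)
image[8:24, 8:24] = rng.integers(0, 8, size=(16, 16))

texture_map = contrast_map(image, levels=8)
print(texture_map[1, 1], texture_map[16, 16])
```

The output has the same shape as the input, so it can be stacked with intensity, gradient, and Gabor maps as one more channel. The window size sets a trade-off: larger windows give more stable statistics but blur the boundary between regions.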
Why this matters for nucleus / cytoplasm / background:
- Nucleus: high contrast (chromatin granules), low homogeneity, and typically low energy, because co-occurrences are spread over many gray-level pairs.
- Cytoplasm: lower contrast, more uniform, higher homogeneity.
- Background: very uniform, very high energy, very high homogeneity.
These differences make GLCM features among the most discriminative classical texture descriptors for cell segmentation.
GLCM features capture relationships between pixel pairs, not just individual pixel values, which is why they are called second-order statistics. Gabor and Sobel, by contrast, are linear filters: each output is a weighted sum of the individual pixel values in a neighborhood. The two kinds of feature are complementary.
7.9 From Feature Engineering to CNNs
The filters in this chapter form a clean progression:
- Raw channels — which intensity channel gives the best separation?
- Mean / Gaussian blur — the sliding kernel in its simplest form; smooths noise.
- Median filter — a nonlinear local statistic; impulse-noise resistant.
- Sobel — derivative kernels; detects where intensity changes rapidly.
- Gabor — Gaussian-modulated sinusoids; detects texture at specific orientations and frequencies.
- GLCM — second-order statistics; captures relationships between pixel pairs.
At that point the conceptual leap to CNNs is short:
A CNN layer still slides learned kernels over the image — but the kernels are trained from data rather than hand-designed. Early CNN layers learn filters that closely resemble Gaussian, Sobel, and Gabor kernels. Deeper layers combine these to detect higher-level structures.
This is exactly what Chapter 8 covers.
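As a toy illustration of "kernels trained from data", the sketch below recovers a 3×3 kernel purely from example input and output pairs. It is not how a CNN is actually trained: a CNN uses gradient descent on a task loss over many images, whereas this example uses a closed-form least-squares fit on one random image whose target feature map was produced by the Sobel \(K_x\) kernel. All names and the setup are assumptions for the illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

rng = np.random.default_rng(0)
image = rng.random((32, 32))                       # hypothetical training image

# Every 3x3 neighborhood of the image, shape (30, 30, 3, 3)
windows = sliding_window_view(image, (3, 3))

# The "ground truth" feature map: the Sobel kernel applied without flipping
target = np.einsum("rcuv,uv->rc", windows, sobel_x)

# Fit 9 unknown weights so that patch . weights matches the target map
patches = windows.reshape(-1, 9)
weights, *_ = np.linalg.lstsq(patches, target.ravel(), rcond=None)
learned_kernel = weights.reshape(3, 3)

print(np.round(learned_kernel, 3))
```

The fitted weights match the hand-designed kernel, even though the fit never saw the kernel itself, only images and the desired output.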
7.10 Domain Context: N/C Ratio and Segmentation
In urinary cytology, an elevated N/C ratio is a central criterion for assessing high-grade urothelial carcinoma. To estimate it computationally, the pipeline must separate nucleus from cytoplasm and both from background. These feature maps are tools for making those three regions more separable — not arbitrary filters.
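Once a label map is available, the ratio itself is a short area computation. The sketch below rests on two assumptions that should be checked against the convention used in a given project: the label encoding (0, 1, 2) is made up for the example, and the ratio is taken as nucleus area divided by whole-cell area (nucleus plus cytoplasm). Some definitions divide by cytoplasm area alone.

```python
import numpy as np

BACKGROUND, CYTOPLASM, NUCLEUS = 0, 1, 2   # hypothetical label encoding

def nc_ratio(labels):
    """Nucleus area over whole-cell area (assumed convention), in pixels."""
    nucleus_area = np.count_nonzero(labels == NUCLEUS)
    cytoplasm_area = np.count_nonzero(labels == CYTOPLASM)
    return nucleus_area / (nucleus_area + cytoplasm_area)

# Hypothetical segmentation of a single cell
labels = np.full((10, 10), BACKGROUND)
labels[2:8, 2:8] = CYTOPLASM     # 6x6 cell footprint
labels[4:7, 4:7] = NUCLEUS       # 3x3 nucleus inside it

ratio = nc_ratio(labels)
print(ratio)
```

With several cells in one image, the same computation would be applied per cell after separating the cells, for example with connected-component labeling.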
7.11 Coding Exercises
```python
#| exercise-id: ch7_ex_1.1
# Exercise 7.1: Mean filter
# Apply a 3x3 mean filter using scipy.ndimage.uniform_filter(size=3).
# Display the original and filtered images side-by-side.
import numpy as np
from scipy.ndimage import uniform_filter
import matplotlib.pyplot as plt
np.random.seed(42)
image = np.random.rand(64, 64) * 0.15
image[20:44, 20:44] += 0.75
image = np.clip(image, 0, 1)
# Write your code below:
```
```python
#| exercise-id: ch7_ex_1.2
# Exercise 7.2: Sobel gradient magnitude
# Compute Gx, Gy, and the gradient magnitude using scipy.ndimage.sobel.
# Display the three maps side-by-side.
import numpy as np
from scipy.ndimage import sobel
import matplotlib.pyplot as plt
np.random.seed(42)
image = np.random.rand(64, 64) * 0.05
image[20:44, 20:44] = 0.9
# Write your code below:
```
```python
#| exercise-id: ch7_ex_1.3
# Exercise 7.3: Median vs mean filter on impulse noise
# The setup below plants impulse ("salt") noise on about 5% of the pixels.
# Compare uniform_filter vs median_filter on the noisy image.
import numpy as np
from scipy.ndimage import uniform_filter, median_filter
import matplotlib.pyplot as plt
np.random.seed(0)
image = np.zeros((64, 64))
image[16:48, 16:48] = 0.8
noise_mask = np.random.rand(64, 64) < 0.05
image[noise_mask] = 1.0
# Write your code below:
```
```python
#| exercise-id: ch7_ex_1.4
# Exercise 7.4: Gabor filter bank
# Apply skimage.filters.gabor at orientations 0, 45, 90, 135 degrees (theta in radians).
# Use frequency=0.2. Display the real part of each response.
import numpy as np
from skimage.filters import gabor
import matplotlib.pyplot as plt
np.random.seed(1)
image = np.random.rand(64, 64) * 0.05
# Add a "nucleus" region with horizontal texture
for i in range(20, 44, 4):
    image[i, 20:44] = 0.85
# Write your code below:
```
```python
#| exercise-id: ch7_ex_1.5
# Exercise 7.5: GLCM texture features
# Use skimage.feature.graycomatrix and graycoprops to compute
# contrast, energy, and homogeneity for two regions:
# (a) the bright square (nucleus-like), (b) the dark background.
# Compare the values.
import numpy as np
from skimage.feature import graycomatrix, graycoprops
np.random.seed(42)
image = np.random.rand(64, 64) * 0.15
image[20:44, 20:44] = np.random.rand(24, 24) * 0.4 + 0.55
image = (image * 255).astype(np.uint8)
# Write your code below:
# Hint: graycomatrix expects integer-valued image.
# Use distances=[1], angles=[0], levels=256, symmetric=True, normed=True
```