7  Feature Engineering for Image Segmentation

7.1 Why Feature Maps Matter

In a bladder-cytology segmentation project, the task is to label each pixel as nucleus, cytoplasm, or background. Once the nucleus and cytoplasm are separated, one can estimate the nucleus-to-cytoplasm (N/C) ratio, a key morphologic criterion in urinary cytology for assessing high-grade urothelial carcinoma.

The filters below are also the right conceptual bridge to convolutional neural networks. Most of them take a small neighborhood — a kernel sliding across the image — and produce a new feature map emphasizing a particular signal: smoothness, boundaries, or texture. CNNs generalize this idea by learning the best kernels from data. Studying handcrafted filters first makes that leap concrete.

A 3×3 kernel hovering over an image, computing a dot product at each pixel position.

7.2 The Kernel Convolution Widget

Before exploring specific filters, build intuition for how any linear filter works. The widget shows a 3×3 kernel sliding over a small image of pixel values (0–1). Edit the kernel weights and watch the output feature map update live.

How to use: Click any pixel to see the dot-product breakdown. Edit the nine kernel values to define any 3×3 filter. Use the preset buttons to load common filters. Click Animate to sweep the kernel row by row.

The key insight: changing the kernel weights changes what the feature map emphasizes. CNNs learn these weights automatically from data.
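What the widget does at each position can be sketched in a few lines of NumPy. This is a minimal illustration, not the widget's actual code: the helper name `filter2d` and the 5×5 test image are assumptions made for the example, and only positions where the kernel fits entirely inside the image are computed.

```python
import numpy as np

def filter2d(image, kernel):
    """Slide `kernel` over `image` and take a dot product at each position.

    Only positions where the kernel fits entirely inside the image are
    computed, so the output is smaller than the input ("valid" mode).
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            patch = image[r:r + kh, c:c + kw]
            out[r, c] = np.sum(patch * kernel)  # the dot product
    return out

# Hypothetical 5x5 image: dark left half, bright right half.
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)

identity = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=float)
mean = np.full((3, 3), 1 / 9)

same = filter2d(image, identity)   # copies the center pixel of each patch
smooth = filter2d(image, mean)     # averages each 3x3 neighborhood
print(same)
print(smooth)
```

Swapping `identity` for `mean` changes nothing in the loop; only the nine weights differ, which is the point the widget makes interactively.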


7.3 Raw Intensity and Color-Channel Maps

Color-channel feature maps

Before filtering, split the image into one or more base channels. In a grayscale image the base map is simply \(I(x,y)\). In a color image: \[I(x,y) = \big(R(x,y),\,G(x,y),\,B(x,y)\big)\]

A standard grayscale projection: \[Y(x,y)=0.2126\,R(x,y)+0.7152\,G(x,y)+0.0722\,B(x,y)\]

Converting to HSV yields \((H,\,S,\,V)\), making hue, saturation, and brightness separate candidate feature maps.

Why this matters for nucleus / cytoplasm / background:

  • Nucleus often differs from cytoplasm by darkness or stain concentration.
  • Cytoplasm may separate more cleanly in one channel than another.
  • Background is often relatively uniform in value or saturation.

The first question in any segmentation pipeline: which channel already gives the cleanest visual separation?
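One way to start answering that question is to compute a few candidate channels and compare them side by side. The sketch below assumes an RGB array with values in [0, 1]; the 2×2 pixel values are invented for illustration and are not taken from real cytology slides.

```python
import numpy as np

# Hypothetical 2x2 RGB image, values in [0, 1]:
# a dark purple "nucleus" pixel, a pink "cytoplasm" pixel,
# and two near-white "background" pixels.
rgb = np.array([
    [[0.30, 0.10, 0.45], [0.85, 0.60, 0.75]],
    [[0.95, 0.95, 0.95], [0.97, 0.96, 0.98]],
])

R, G, B = rgb[..., 0], rgb[..., 1], rgb[..., 2]

# Grayscale projection with the weights given above.
Y = 0.2126 * R + 0.7152 * G + 0.0722 * B

# Two of the HSV channels follow directly from the channel extremes.
V = rgb.max(axis=-1)                                  # value: brightest channel
S = (V - rgb.min(axis=-1)) / np.maximum(V, 1e-12)     # saturation, 0 for gray pixels

print(np.round(Y, 3))
print(np.round(S, 3))
```

In this toy image the luminance map orders the pixels as nucleus < cytoplasm < background, while saturation is near zero only for the background.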


7.4 Gaussian Blur and Mean Blur

Example 3×3 Gaussian kernel

Example 3×3 mean kernel

These are the clearest examples of a kernel sliding across the image — exactly what the widget above demonstrates.

A mean blur is a linear filter. For any kernel \(K\) of radius \(r\) (here \(r=1\) for a \(3\times 3\) kernel), the output is: \[F(x,y)=(K * I)(x,y)=\sum_{u=-r}^{r}\sum_{v=-r}^{r} K(u,v)\,I(x-u,y-v)\]

For a \(3\times 3\) mean filter: \[K_{\text{mean}}=\frac{1}{9} \begin{bmatrix} 1&1&1\\ 1&1&1\\ 1&1&1 \end{bmatrix}\]

A Gaussian blur uses a kernel whose values follow a 2D normal distribution: \[G_\sigma(u,v)=\frac{1}{2\pi\sigma^2}\exp\!\left(-\frac{u^2+v^2}{2\sigma^2}\right)\]

Why this matters:

  • Reduces random pixel noise before stronger feature extraction.
  • Produces a coarse low-frequency map that separates broad cell regions from background.
  • Makes downstream edge maps more stable.

Caution

Too much blur weakens the boundaries you want to preserve. Apply conservatively, and always before edge detection.
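Both kernels can be built directly from the formulas above. This is a sketch under one assumption worth stating: the sampled Gaussian is rescaled so its weights sum to 1, because the continuous constant \(1/(2\pi\sigma^2)\) does not guarantee that on a coarse 3×3 grid, and a kernel that sums to 1 keeps overall brightness unchanged. The function name `gaussian_kernel` is illustrative.

```python
import numpy as np

def gaussian_kernel(size=3, sigma=1.0):
    """Sample G_sigma at integer offsets, then rescale so the weights sum to 1."""
    r = size // 2
    u, v = np.mgrid[-r:r + 1, -r:r + 1]
    g = np.exp(-(u**2 + v**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)
    return g / g.sum()

k_mean = np.full((3, 3), 1 / 9)          # every neighbor weighted equally
k_gauss = gaussian_kernel(3, sigma=1.0)  # center weighted most, corners least

print(np.round(k_mean, 3))
print(np.round(k_gauss, 3))
```

Increasing `sigma` flattens the Gaussian weights toward the mean kernel; decreasing it concentrates the weight on the center pixel.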


7.5 Median and Rank Filters

Unlike mean and Gaussian blur, a median filter does not multiply and sum — it sorts the values in a local window and picks the middle one. This single difference makes it remarkably effective at removing isolated noise spikes without blurring edges.

7.5.1 How it works — step by step

For every output pixel, the filter:

  1. Places a 3×3 window centred on that pixel — collecting 9 values.
  2. Sorts those 9 values from smallest to largest.
  3. Returns the 5th value (rank 5 of 9) — the median.

\[F(x,y)=\operatorname{median}\{I(i,j):(i,j)\in W_{x,y}\}\]

Example — a salt spike (value 9) buried in the nucleus (value 1):

| Step | Values |
|---|---|
| Window contents | 1, 1, 1, 1, 9, 1, 1, 1, 1 |
| Sorted | 1, 1, 1, 1, 1, 1, 1, 1, 9 |
| Output (rank 5) | 1 ✓ spike removed |

A mean filter would give \(\frac{8\times1+9}{9} \approx 1.9\) — a residual artifact. The median gives exactly 1, because the spike is pushed to the end of the sorted list and never reaches rank 5.

Example — a pepper spike (value 0) in the bright background (value 8):

| Step | Values |
|---|---|
| Window contents | 8, 8, 8, 8, 0, 8, 8, 8, 8 |
| Sorted | 0, 8, 8, 8, 8, 8, 8, 8, 8 |
| Output (rank 5) | 8 ✓ spike removed |
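The two worked examples can be checked numerically. The sketch below sorts each window, reads off rank 5, and prints the mean for comparison; the variable names are illustrative.

```python
import numpy as np

salt_window = np.array([1, 1, 1, 1, 9, 1, 1, 1, 1])    # spike in the nucleus
pepper_window = np.array([8, 8, 8, 8, 0, 8, 8, 8, 8])  # spike in the background

results = {}
for name, window in [("salt", salt_window), ("pepper", pepper_window)]:
    ordered = np.sort(window)
    median = int(ordered[4])        # rank 5 of 9 is index 4 (zero-based)
    mean = float(window.mean())
    results[name] = (median, mean)
    print(name, "median:", median, "mean:", round(mean, 2))
```

In both cases the median returns the undisturbed region value, while the mean is pulled toward the spike.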

7.5.2 Interactive widget

The grid below uses values 0–9 (0 = black, 9 = white), matching the real staining convention: nucleus = 1 (dark), cytoplasm = 5 (medium), background = 8 (bright). Two noise spikes are planted: a salt spike at row 3, col 4 (value 9 in the dark nucleus — click it first) and a pepper spike at row 7, col 3 (value 0 in the bright background). Click any pixel to see its 3×3 window, the sorted values, and the output. Then click Apply filter to compute the full output image.

7.5.3 Rank filters — a generalisation

A rank filter returns the k-th ordered value rather than always the median:

\[F(x,y)=\operatorname{rank}_k\{I(i,j):(i,j)\in W_{x,y}\}\]

| k | Name | Effect |
|---|---|---|
| 1 | Erosion (minimum) | Shrinks and darkens bright regions |
| 5 of 9 | Median | Removes spikes; preserves edges |
| 9 | Dilation (maximum) | Expands and brightens bright regions |
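The three rows of the table are the same operation with a different \(k\). The following sketch makes that explicit for a 3×3 window. The helper `rank_filter_3x3` and the test patch are assumptions made for illustration, and borders are handled by repeating edge pixels, which is one common choice among several.

```python
import numpy as np

def rank_filter_3x3(image, k):
    """Return the k-th smallest value (k = 1..9) of each 3x3 window."""
    padded = np.pad(image, 1, mode="edge")   # repeat edge pixels at the border
    h, w = image.shape
    out = np.empty_like(image)
    for r in range(h):
        for c in range(w):
            window = padded[r:r + 3, c:c + 3].ravel()
            out[r, c] = np.sort(window)[k - 1]
    return out

# Hypothetical 5x5 patch: dark nucleus (1) with one salt spike (9).
patch = np.ones((5, 5), dtype=int)
patch[2, 2] = 9

eroded = rank_filter_3x3(patch, k=1)    # minimum: spike disappears
median = rank_filter_3x3(patch, k=5)    # median: spike disappears
dilated = rank_filter_3x3(patch, k=9)   # maximum: spike grows to a 3x3 block
print(dilated)
```

Note how the maximum filter spreads the single bright pixel over its whole neighborhood, which is the "expands bright regions" effect in the table.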

Why this matters:

  • Excellent at suppressing impulse noise (salt-and-pepper artifacts).
  • More edge-preserving than mean blur — the output is always one of the actual neighborhood values, never a blend.
  • Cleans isolated specks in the background while keeping cell boundaries sharp.

Note

Unlike mean and Gaussian blur, the median filter cannot be written as a convolution \(K * I\) with a fixed kernel — it is inherently nonlinear. This is why it does not appear as a preset in the kernel widget above.


7.6 Sobel / Scharr Gradient Magnitude

Sobel kernels

Sobel estimates local image derivatives using two fixed kernels:

\[K_x= \begin{bmatrix} -1&0&1\\ -2&0&2\\ -1&0&1 \end{bmatrix}, \qquad K_y= \begin{bmatrix} -1&-2&-1\\ 0&0&0\\ 1&2&1 \end{bmatrix}\]

Gradient components and magnitude: \[G_x = K_x * I,\quad G_y = K_y * I,\quad M(x,y)=\sqrt{G_x^2 + G_y^2}\]

Load Sobel X or Sobel Y in the widget to see how these kernels respond to edges in the example image.

Why this matters:

  • Highlights nucleus boundaries and cell boundaries directly.
  • Bright responses in \(M\) align with visible contours in the image.
  • One of the most interpretable handcrafted maps for segmentation.

Caution

Derivative-based maps are noise-sensitive. Apply a Gaussian blur first.
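A small numerical example shows how the two kernels combine into \(M\). This sketch applies each kernel as a sliding dot product, as the widget does. A true convolution flips the kernel first, which changes the sign of \(G_x\) and \(G_y\) but not the magnitude. The helper `slide` and the test image are illustrative assumptions.

```python
import numpy as np

K_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
K_Y = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)

def slide(image, kernel):
    """Dot product of `kernel` with every 3x3 patch that fits in `image`."""
    h, w = image.shape
    out = np.zeros((h - 2, w - 2))
    for r in range(h - 2):
        for c in range(w - 2):
            out[r, c] = np.sum(image[r:r + 3, c:c + 3] * kernel)
    return out

# Hypothetical 6x6 image with a vertical boundary: dark left, bright right.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

gx = slide(image, K_X)                  # responds to the left-right change
gy = slide(image, K_Y)                  # no top-bottom change, so all zeros
magnitude = np.sqrt(gx**2 + gy**2)
print(magnitude)
```

The magnitude is nonzero only in the two columns whose 3×3 patch straddles the boundary, and zero in the flat regions on either side.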


7.7 Gabor Filters

A Gabor filter is a Gaussian-modulated sinusoid — it detects texture at a specific orientation and spatial frequency. Where Sobel detects any edge regardless of frequency, Gabor responds selectively to periodic patterns (e.g., chromatin texture in a nucleus).

7.7.1 From Formula to Matrix: Sampling a Continuous Function

A common point of confusion: the Gabor formula looks abstract, but it is just a recipe for filling in a kernel matrix. The same principle applies to the Gaussian kernel — both are continuous functions that you evaluate at discrete integer pixel offsets to produce the actual numbers in the sliding window.

For a 3×3 kernel, the offsets are \((a, b) \in \{-1, 0, +1\} \times \{-1, 0, +1\}\), where \(a\) is the row offset (negative = up, positive = down) and \(b\) is the column offset (negative = left, positive = right). Plug each pair into the formula to get the corresponding matrix entry:

\[K[a,\,b] = \underbrace{\exp\!\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right)}_{\text{Gaussian envelope}} \cdot \underbrace{\cos\!\left(\frac{2\pi\, x'}{\lambda}\right)}_{\text{sinusoidal wave}}\]

\[x' = a\cos\theta + b\sin\theta, \qquad y' = -a\sin\theta + b\cos\theta\]

Two pieces working together:

  • The Gaussian envelope (\(\exp\) term) weights entries by distance from center — full weight at \((0,0)\), tapering toward the edges. This is exactly like a Gaussian blur kernel.
  • The sinusoidal wave (\(\cos\) term) creates alternating positive and negative bands across the kernel. \(\lambda\) sets the band width; \(\theta\) controls which direction the bands run.

Multiply the two and you get a kernel that responds to a specific texture frequency at a specific orientation.

7.7.2 What θ Does: Rotating the Stripe Pattern

The angle \(\theta\) rotates the coordinate system before the cosine is applied. At θ = 0°, \(x' = a\) (the row direction), so the cosine varies across rows → horizontal bands. At θ = 90°, \(x' = b\) (the column direction), so the cosine varies across columns → vertical bands.

Here are the two kernels computed from the formula with \(\lambda=2\), \(\sigma=1\), \(\gamma=1\):

\[K_{0°} \approx \begin{bmatrix} -0.37 & -0.61 & -0.37 \\ 0.61 & 1.00 & 0.61 \\ -0.37 & -0.61 & -0.37 \end{bmatrix} \qquad K_{90°} \approx \begin{bmatrix} -0.37 & 0.61 & -0.37 \\ -0.61 & 1.00 & -0.61 \\ -0.37 & 0.61 & -0.37 \end{bmatrix}\]

\(K_{0°}\) fires when the center row is bright and the rows above and below are dark, that is, when a horizontal band crosses the kernel. \(K_{90°}\) fires when the center column is bright and the columns to its left and right are dark: a vertical band. Notice that the two matrices are transposes of each other. Because each kernel is symmetric about its center row and center column, rotating it by 90° is equivalent to transposing the matrix.
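The two matrices can be reproduced by evaluating the formula at the nine integer offsets. The sketch below follows the chapter's convention that \(a\) is the row offset and \(b\) the column offset; the function name `gabor_kernel` is illustrative.

```python
import numpy as np

def gabor_kernel(size, theta_deg, lam, sigma, gamma):
    """Evaluate the Gabor formula at integer (row, column) offsets."""
    r = size // 2
    a, b = np.mgrid[-r:r + 1, -r:r + 1]     # a = row offset, b = column offset
    theta = np.deg2rad(theta_deg)
    x_rot = a * np.cos(theta) + b * np.sin(theta)
    y_rot = -a * np.sin(theta) + b * np.cos(theta)
    envelope = np.exp(-(x_rot**2 + gamma**2 * y_rot**2) / (2 * sigma**2))
    wave = np.cos(2 * np.pi * x_rot / lam)
    return envelope * wave

k0 = gabor_kernel(3, 0, lam=2, sigma=1, gamma=1)
k90 = gabor_kernel(3, 90, lam=2, sigma=1, gamma=1)
print(np.round(k0, 2))
print(np.round(k90, 2))
```

Calling the same function with `size=5` gives the larger kernels in which diagonal orientations become easier to see.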

For diagonal orientations (45°, 135°), the stripe pattern tilts, but a 3×3 grid is too coarse to display the diagonal clearly. A 5×5 kernel makes the orientation far more legible — which is why the widget below uses 5×5.

7.7.3 The Other Three Parameters

| Parameter | Concrete meaning | Smaller value | Larger value |
|---|---|---|---|
| \(\lambda\) | Stripe width (wavelength) | Tight stripes; detects fine texture | Wide stripes; detects coarse texture |
| \(\sigma\) | Gaussian bell width | Small window, few stripes visible | Large window, more stripes captured |
| \(\gamma\) | Aspect ratio of the Gaussian ellipse | Elongated along the stripe direction | More circular envelope |

A useful rule of thumb: keep \(\sigma \approx \lambda / \pi\) so that roughly one full stripe cycle fits within the Gaussian envelope.

7.7.4 Interactive Gabor Explorer

Edit the 10×10 image (click any cell to change its value), then drag the sliders to see the 5×5 kernel and its output feature map update live.

The default image has horizontal stripes — try θ = 0° first, then rotate to 90° and watch the output respond differently. Red cells in the output mean a strong positive match; blue cells mean a strong negative response (the inverse of the target texture).

How to use: Drag θ from 0° to 90° and watch the kernel’s stripe pattern rotate — the output’s bright-red regions shift accordingly. Try λ = 2 for tight stripes vs λ = 6 for wide ones. Increase σ to widen the Gaussian envelope and let more rows contribute. Click any input cell to type a new value (0.0–1.0), then draw your own texture and observe how the output responds.

Load the Gabor (0°) preset in the kernel convolution widget at the top of the chapter to see this same 3×3 version applied to the fixed example image.

7.7.5 Gabor Filter Banks

In practice a filter bank is used: multiple Gabor filters at several orientations (0°, 45°, 90°, 135°) and scales. Each filter produces one feature map. Together they form a rich multi-channel texture descriptor.
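A filter bank is just a loop over orientations. The sketch below builds four 5×5 kernels with the chapter's formula and applies them to a synthetic striped texture. The parameter values (\(\lambda=4\), \(\sigma=2\)), the helper names, and the test image are assumptions chosen for illustration, not recommended settings.

```python
import numpy as np

def gabor_kernel(size, theta_deg, lam, sigma, gamma=1.0):
    """Gabor kernel sampled at integer offsets (a = row, b = column)."""
    r = size // 2
    a, b = np.mgrid[-r:r + 1, -r:r + 1]
    theta = np.deg2rad(theta_deg)
    x_rot = a * np.cos(theta) + b * np.sin(theta)
    y_rot = -a * np.sin(theta) + b * np.cos(theta)
    envelope = np.exp(-(x_rot**2 + gamma**2 * y_rot**2) / (2 * sigma**2))
    return envelope * np.cos(2 * np.pi * x_rot / lam)

def slide(image, kernel):
    """Dot product of `kernel` with every patch that fits inside `image`."""
    k = kernel.shape[0]
    h, w = image.shape
    out = np.zeros((h - k + 1, w - k + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(image[r:r + k, c:c + k] * kernel)
    return out

# Hypothetical 12x12 texture: horizontal stripes, two rows bright, two rows dark.
image = np.zeros((12, 12))
image[np.arange(12) % 4 < 2, :] = 1.0

# One kernel and one feature map per orientation.
bank = {theta: gabor_kernel(5, theta, lam=4, sigma=2) for theta in (0, 45, 90, 135)}
feature_maps = {theta: slide(image, k) for theta, k in bank.items()}

# Mean absolute response as a crude per-orientation texture strength.
strength = {theta: float(np.mean(np.abs(fm))) for theta, fm in feature_maps.items()}
print(strength)
```

For this image the 0° kernel, whose bands are horizontal under the chapter's convention, gives the strongest response, since its wavelength matches the stripe period.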

Why this matters for nucleus / cytoplasm / background:

  • Nucleus has distinctive chromatin texture — fine-grained periodic patterns visible at specific orientations and frequencies.
  • Cytoplasm is smoother and less structured.
  • Background is largely uniform, responding weakly to all Gabor filters.
  • A Gabor filter bank can distinguish these regions where Sobel or blur alone cannot.

Note

Gabor filters are linear convolutions: fixed kernels applied via \(F = K_{\text{Gabor}} * I\). A CNN trained on image data often learns early-layer filters that resemble Gabor kernels.


7.8 Gray Level Co-Occurrence Matrix (GLCM)

The Gray Level Co-Occurrence Matrix (GLCM) is a fundamentally different kind of feature descriptor. It does not produce a feature map by convolution. Instead, it computes second-order statistics — how often specific pairs of intensity values appear together at a given spatial offset.

7.8.1 Building the GLCM

For a given offset \(\Delta = (\Delta r, \Delta c)\) and an image with \(N_g\) gray levels, the GLCM is a \(N_g \times N_g\) matrix:

\[C_{\Delta}(i,\,j) = \#\bigl\{(r,c) : I(r,c)=i \text{ and } I(r+\Delta r,\,c+\Delta c)=j\bigr\}\]

A small example with 4 gray levels and offset \(\Delta=(0,1)\) (one step to the right):

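\[I = \begin{bmatrix}0&1&2\\1&2&3\\2&3&1\end{bmatrix} \qquad\Rightarrow\qquad C_{(0,1)} = \begin{bmatrix}0&1&0&0\\0&0&2&0\\0&0&0&2\\0&1&0&0\end{bmatrix}\]

Row \(i\), column \(j\) of \(C\) counts how many times intensity \(i\) is immediately to the left of intensity \(j\). The image has six horizontal neighbor pairs, \((0,1)\), \((1,2)\), \((1,2)\), \((2,3)\), \((2,3)\), and \((3,1)\), so the entries of \(C_{(0,1)}\) sum to 6.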
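The counting rule is short enough to write out directly. This is a minimal sketch for the 3×3 example image, using a hypothetical helper `glcm`. For real images, `skimage.feature.graycomatrix`, used in the exercises, does the same job.

```python
import numpy as np

def glcm(image, d_row, d_col, levels):
    """Count pairs (I[r, c], I[r + d_row, c + d_col]) that lie inside the image."""
    counts = np.zeros((levels, levels), dtype=int)
    h, w = image.shape
    for r in range(h):
        for c in range(w):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < h and 0 <= c2 < w:
                counts[image[r, c], image[r2, c2]] += 1
    return counts

I = np.array([[0, 1, 2],
              [1, 2, 3],
              [2, 3, 1]])

C = glcm(I, d_row=0, d_col=1, levels=4)
print(C)
# [[0 1 0 0]
#  [0 0 2 0]
#  [0 0 0 2]
#  [0 1 0 0]]
```

Each of the three rows contributes two left-right pairs, so the counts sum to 6.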

7.8.2 Haralick Features

The GLCM is rarely used directly. Instead, scalar Haralick features are extracted:

| Feature | Formula | What it captures |
|---|---|---|
| Contrast | \(\sum_{i,j}(i-j)^2\,\tilde{C}(i,j)\) | Local intensity variation |
| Energy | \(\sum_{i,j}\tilde{C}(i,j)^2\) | Textural uniformity |
| Homogeneity | \(\sum_{i,j}\frac{\tilde{C}(i,j)}{1+\lvert i-j\rvert}\) | Diagonal dominance |
| Correlation | \(\sum_{i,j}\frac{(i-\mu_i)(j-\mu_j)\tilde{C}(i,j)}{\sigma_i\sigma_j}\) | Linear gray-level dependency |

where \(\tilde{C}\) is the normalized GLCM (\(\sum_{i,j}\tilde{C}(i,j)=1\)).
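The four formulas translate almost term by term into NumPy. The sketch below uses two invented 3-level GLCMs to show how the features move; the helper name `haralick` is illustrative. One caveat when comparing with library output: scikit-image's `graycoprops` reports the sum of squares as `'ASM'` and its square root as `'energy'`, and its `'homogeneity'` uses \((i-j)^2\) in the denominator, so its numbers will not match these definitions exactly.

```python
import numpy as np

def haralick(counts):
    """Contrast, energy, homogeneity and correlation as defined in the table above."""
    p = counts / counts.sum()                 # normalized GLCM
    i, j = np.indices(p.shape)
    mu_i, mu_j = np.sum(i * p), np.sum(j * p)
    sigma_i = np.sqrt(np.sum((i - mu_i) ** 2 * p))
    sigma_j = np.sqrt(np.sum((j - mu_j) ** 2 * p))
    return {
        "contrast": float(np.sum((i - j) ** 2 * p)),
        "energy": float(np.sum(p ** 2)),
        "homogeneity": float(np.sum(p / (1 + np.abs(i - j)))),
        "correlation": float(np.sum((i - mu_i) * (j - mu_j) * p) / (sigma_i * sigma_j)),
    }

# Hypothetical GLCMs with 3 gray levels.
smooth = np.array([[8, 1, 0], [1, 0, 0], [0, 0, 0]])      # mass near one diagonal entry
textured = np.array([[1, 2, 1], [2, 1, 2], [1, 2, 1]])    # mass spread over many pairs

f_smooth = haralick(smooth)
f_textured = haralick(textured)
print(f_smooth)
print(f_textured)
```

The smooth matrix gives low contrast with high energy and homogeneity; the textured one gives the opposite pattern.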

7.8.3 Using GLCM for Segmentation

A single GLCM describes the whole image. For pixel-level segmentation, compute the GLCM in a sliding window (e.g., 15×15 pixels) centered on each pixel. Each window produces one GLCM, yielding one Haralick feature value per pixel — a new scalar feature map.
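The sliding-window idea can be sketched as follows. To stay short, this example uses a 5×5 window, 4 gray levels, a single offset (one pixel to the right), and only the contrast feature; all of these, along with the helper name `contrast_map` and the test image, are simplifying assumptions. Border pixels where the window does not fit are left at 0.

```python
import numpy as np

def contrast_map(image, window=5, levels=4):
    """GLCM contrast (offset: one pixel to the right) in a window around each pixel."""
    half = window // 2
    h, w = image.shape
    i, j = np.indices((levels, levels))
    out = np.zeros((h, w))
    for r in range(half, h - half):
        for c in range(half, w - half):
            patch = image[r - half:r + half + 1, c - half:c + half + 1]
            left, right = patch[:, :-1].ravel(), patch[:, 1:].ravel()
            counts = np.zeros((levels, levels))
            np.add.at(counts, (left, right), 1)   # build this window's GLCM
            p = counts / counts.sum()
            out[r, c] = np.sum((i - j) ** 2 * p)  # one scalar per pixel
    return out

# Hypothetical 12x12 image with 4 gray levels: flat "background" (level 3)
# on the left, a checkerboard "texture" of levels 0 and 2 on the right.
image = np.full((12, 12), 3)
rows, cols = np.indices((12, 6))
image[:, 6:] = 2 * ((rows + cols) % 2)

features = contrast_map(image)
print(np.round(features[6], 2))
```

Windows that lie entirely in the flat region have contrast 0, windows entirely in the checkerboard have contrast 4, and windows that straddle the boundary fall in between.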

Why this matters for nucleus / cytoplasm / background:

  • Nucleus: high contrast (chromatin granules), low homogeneity, and low energy, because co-occurrences are spread over many gray-level pairs.
  • Cytoplasm: lower contrast, more uniform, higher homogeneity.
  • Background: very uniform, very high energy, very high homogeneity.

These differences make GLCM features among the most discriminative classical texture descriptors for cell segmentation.

Note

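GLCM features capture relationships between pixel pairs, not just individual pixel values, which is why they are called second-order statistics. First-order statistics, such as the mean or variance of a region's histogram, ignore how pixels are arranged. Gabor and Sobel are neither: they are linear filters that respond to spatial structure through a weighted sum over a neighborhood. Filter responses and co-occurrence statistics are complementary.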


7.9 From Feature Engineering to CNNs

The filters in this chapter form a clean progression:

  1. Raw channels — which intensity channel gives the best separation?
  2. Mean / Gaussian blur — the sliding kernel in its simplest form; smooths noise.
  3. Median filter — a nonlinear local statistic; impulse-noise resistant.
  4. Sobel — derivative kernels; detects where intensity changes rapidly.
  5. Gabor — Gaussian-modulated sinusoids; detects texture at specific orientations and frequencies.
  6. GLCM — second-order statistics; captures relationships between pixel pairs.

At that point the conceptual leap to CNNs is short:

A CNN layer still slides learned kernels over the image — but the kernels are trained from data rather than hand-designed. Early CNN layers learn filters that closely resemble Gaussian, Sobel, and Gabor kernels. Deeper layers combine these to detect higher-level structures.

This is exactly what Chapter 8 covers.


7.10 Domain Context: N/C Ratio and Segmentation

In urinary cytology, an elevated N/C ratio is a central criterion for assessing high-grade urothelial carcinoma. To estimate it computationally, the pipeline must separate nucleus from cytoplasm and both from background. These feature maps are tools for making those three regions more separable — not arbitrary filters.


7.11 Coding Exercises

#| exercise-id: ch7_ex_1.1
# Exercise 7.1: Mean filter
# Apply a 3x3 mean filter using scipy.ndimage.uniform_filter(size=3).
# Display the original and filtered images side-by-side.

import numpy as np
from scipy.ndimage import uniform_filter
import matplotlib.pyplot as plt

np.random.seed(42)
image = np.random.rand(64, 64) * 0.15
image[20:44, 20:44] += 0.75
image = np.clip(image, 0, 1)

# Write your code below:

#| exercise-id: ch7_ex_1.2
# Exercise 7.2: Sobel gradient magnitude
# Compute Gx, Gy, and the gradient magnitude using scipy.ndimage.sobel.
# Display the three maps side-by-side.

import numpy as np
from scipy.ndimage import sobel
import matplotlib.pyplot as plt

np.random.seed(42)
image = np.random.rand(64, 64) * 0.05
image[20:44, 20:44] = 0.9

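# Write your code below:

#| exercise-id: ch7_ex_1.3
# Exercise 7.3: Median vs mean filter on impulse noise
# The setup below adds salt noise only (about 5% of pixels set to 1.0).
# Optionally add pepper noise (pixels set to 0.0) the same way,
# then compare uniform_filter vs median_filter.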

import numpy as np
from scipy.ndimage import uniform_filter, median_filter
import matplotlib.pyplot as plt

np.random.seed(0)
image = np.zeros((64, 64))
image[16:48, 16:48] = 0.8
noise_mask = np.random.rand(64, 64) < 0.05
image[noise_mask] = 1.0

# Write your code below:

#| exercise-id: ch7_ex_1.4
# Exercise 7.4: Gabor filter bank
# Apply skimage.filters.gabor at orientations 0, 45, 90, 135 degrees (theta in radians).
# Use frequency=0.2. Display the real part of each response.

import numpy as np
from skimage.filters import gabor
import matplotlib.pyplot as plt

np.random.seed(1)
image = np.random.rand(64, 64) * 0.05
# Add a "nucleus" region with horizontal texture
for i in range(20, 44, 4):
    image[i, 20:44] = 0.85

# Write your code below:

#| exercise-id: ch7_ex_1.5
# Exercise 7.5: GLCM texture features
# Use skimage.feature.graycomatrix and graycoprops to compute
# contrast, energy, and homogeneity for two regions:
# (a) the bright square (nucleus-like), (b) the dark background.
# Compare the values.

import numpy as np
from skimage.feature import graycomatrix, graycoprops

np.random.seed(42)
image = np.random.rand(64, 64) * 0.15
image[20:44, 20:44] = np.random.rand(24, 24) * 0.4 + 0.55
image = (image * 255).astype(np.uint8)

# Write your code below:
# Hint: graycomatrix expects integer-valued image.
# Use distances=[1], angles=[0], levels=256, symmetric=True, normed=True