2 Image Representation, Thresholding, and Noise

2.1 Introduction to Image Interpolation and Matplotlib Visualization

Image interpolation answers a fundamental question: How do we resize an image to different dimensions?

When enlarging an image (e.g., 10×10 → 20×20), new pixel values must be estimated. When reducing (e.g., 20×20 → 10×10), information from multiple pixels is summarized into single output pixels. Both tasks require interpolation—estimating values at positions where we don’t have direct measurements.

Several interpolation methods exist, each with different tradeoffs in quality, speed, and smoothness. This section explores the most commonly used approaches with visual demonstrations and hands-on code examples.

About the # @needs: comments in code cells

Some code cells begin with a comment like # @needs: ch2-setup. These are dependency hints — they identify which earlier cell must be run first. If you click Run Code on a cell whose dependency hasn’t been run yet, a yellow warning banner will appear instead of an error, telling you which cell to find and run first. You can safely ignore these comments when reading the code.

2.1.1 Creating and Inspecting Test Images

Let’s start with a simple artificial image that has clear structure, making it easy to see how different interpolation methods affect the output.

Let’s understand the structure of our test image:

The test image has a clear structure: concentric squares with decreasing values from the edge (white, 1.0) toward the center (black, 0.0). This structure makes it easy to visually evaluate how well each interpolation method preserves edges and transitions.

Figure 2.1.0: The 10×10 test image displayed as a heatmap with explicit pixel values shown in each cell. Notice that the image is very small—only 10 pixels wide and 10 pixels tall. When we want to enlarge this image, we need to estimate what values should exist at all the positions in between these discrete measurements.

Want to see how pixel values map to brightness? Click any cell in the grid below, type a new value between 0.0 and 1.0, and press Enter to see the color update live:

2.1.2 Understanding matplotlib’s Display Functions

In matplotlib, every visualization lives inside a figure—think of it as a blank canvas. The figure holds one or more axes objects, where each axis is an individual panel you can draw on. When working with images in a multi-panel layout, the standard pattern is to create both at once with plt.subplots():

fig, axes = plt.subplots(1, 2, figsize=(10, 4))

This creates a figure with a 1×2 grid of panels. figsize=(10, 4) sets the canvas width and height in inches. The function returns two things: fig (the canvas itself) and axes (an array of panel objects). With a single row, you access each panel as axes[0], axes[1], and so on. With a 2D grid, you use two indices: axes[0, 1] means row 0, column 1.

Each axis exposes a small set of methods you’ll call constantly when displaying images:

axes[0].imshow(image, cmap='gray')   # display the array as an image
axes[0].set_title("My Image")         # label the panel
axes[0].axis('off')                   # hide tick marks and borders

Here’s a complete working example using our test image:

plt.tight_layout() adjusts spacing between panels so that titles don’t overlap. Notice we call .imshow() and .set_title() on the axis object, not on plt directly. This is the object-oriented style and is standard practice whenever you have more than one subplot. The plt.imshow() shorthand still works for a single image, but once you start comparing things side by side—which comes up constantly in image processing—the axis-based approach scales naturally to any grid size.

Shorthand notation

Matplotlib also has a compact alternative for quick, exploratory work: plt.subplot(ABC). The three digits encode (rows, columns, position), so plt.subplot(121) means “1 row, 2 columns, panel 1.” Here’s the same example written as shorthand:

plt.figure(figsize=(8, 4))

plt.subplot(121)
plt.imshow(image, cmap='gray')
plt.title("Original")
plt.axis('off')

plt.subplot(122)
plt.imshow(1 - image, cmap='gray')
plt.title("Inverted")
plt.axis('off')

plt.tight_layout()
plt.show()

Both styles produce identical output. The shorthand saves a few keystrokes when sketching something quickly; plt.subplots() is easier to manage when your layout has more than two or three panels.

Quiz: Matplotlib Display Functions

Which call correctly creates a figure with 1 row and 2 columns of subplots and returns the figure and an array of axes?

fig, axes = plt.subplots(1, 2, figsize=(10, 4))

fig, axes = plt.subplot(1, 2, figsize=(10, 4))

fig = plt.figure(1, 2, figsize=(10, 4))

fig, axes = plt.subplots(figsize=(10, 4))

2.1.3 Data Types and Display Ranges

imshow() behaves differently depending on the data type of your array. Matplotlib understands two common conventions:

float arrays (float32, float64): values are expected to be in [0, 1]. Anything above 1.0 clips to white; anything below 0.0 clips to black.
uint8 arrays: values are expected to be in [0, 255].

This mismatch is one of the most common silent bugs in image code. If you load an image as uint8 (values 0–255) and then do some computation that returns a float array—but forget to divide by 255—you end up with floats like 150.0. Matplotlib clips everything above 1.0 to white with no error message, and your image appears as a blank white rectangle.

The safe rule: before calling imshow(), know your dtype and value range. If your array is float, ensure values are in [0, 1]. Use np.clip(arr, 0, 1) when in doubt—it’s inexpensive and prevents these silent display errors.

Quiz: Data Types and Display

A float32 array contains values ranging from 0 to 200. What will imshow() display?

A mostly white image — all values above 1.0 clip to white

A correctly scaled image — matplotlib auto-scales float arrays to their full range

A black image — values above 1.0 clip to black

An error — imshow() raises an exception for out-of-range floats

2.1.4 Array Shape and Color Channels

imshow() infers how to display your array from its shape:

Shape	Interpretation
`(H, W)`	Grayscale — displayed with a colormap
`(H, W, 3)`	RGB color — red, green, blue channels
`(H, W, 4)`	RGBA — RGB plus a per-pixel transparency channel

A shape of (H, W, 1) will raise an error. Matplotlib expects either no trailing dimension (grayscale) or exactly 3 or 4 channels. This trips people up often when working with deep learning frameworks, which frequently add a channel dimension to everything. If your array has this shape, squeeze it first with image.squeeze() or image[:, :, 0] before passing to imshow().

For grayscale images, always pass cmap='gray' explicitly. Without it, matplotlib applies its default colormap (viridis), which maps low values to dark purple and high values to yellow—visually misleading for any grayscale medical image.

Quiz: Array Shapes

What happens when you pass an array with shape (H, W, 1) directly to imshow()?

matplotlib raises an error — a trailing dimension of 1 is not accepted

It displays as a grayscale image — imshow() squeezes single-channel arrays automatically

It displays as an RGB image by duplicating the channel to R, G, and B

It displays as an RGBA image with the single channel used for the alpha component

2.1.5 Overlaying Segmentation Masks

One of the most important visualization patterns in medical image analysis is overlaying a segmentation mask on top of the original image. The technique is to display the image first, then call imshow() a second time on the same axis with the alpha parameter, which controls transparency: 0.0 is fully transparent (invisible), 1.0 is fully opaque.

You can call imshow() multiple times on the same axis—each call layers over the previous one. The base image is fully opaque; the mask sits on top at 50% transparency so the underlying cell structure shows through.

One issue with the overlay above: background pixels (where mask == 0) also receive a color from the Reds colormap, which partially washes out the original image. To show only the foreground while leaving background pixels completely transparent, use a masked array:

This pattern—grayscale base image with a colored semi-transparent mask on top—is standard in every segmentation workflow. You’ll use it throughout this course whenever you want to check that a model’s predicted cell boundaries actually align with the image.

Quiz: Segmentation Mask Overlay

You want to display a segmentation mask over a grayscale image so that only foreground pixels (mask == 1) are colored — background pixels should be completely transparent. Which approach achieves this?

Create a masked array with numpy.ma.masked_where(mask == 0, mask), then call imshow() on it

Call imshow(mask, alpha=0.5) directly on the original mask

Call imshow(mask, cmap=‘Reds’, vmin=1, vmax=1) to exclude background values

Subtract the mask from the original image before calling imshow()

2.1.6 Saving Figures

Once you’ve built a visualization you want to keep, plt.savefig() writes it to disk. The file format is inferred from the extension:

fig, ax = plt.subplots(figsize=(6, 6))
ax.imshow(image, cmap='gray')
ax.set_title("Saved Figure")
ax.axis('off')

plt.tight_layout()
plt.savefig('output.png', dpi=150, bbox_inches='tight')
plt.savefig('output.pdf')   # vector format — preferred for papers
plt.show()

Two parameters matter most:

dpi (dots per inch): controls output resolution. 72 dpi is screen quality; 150–300 dpi suits reports and print.
bbox_inches='tight': trims extra whitespace around the figure. Almost always what you want.

One ordering gotcha: call plt.savefig() before plt.show(). Once the figure is displayed and closed, matplotlib discards it, and a subsequent savefig() produces a blank image.

2.2 Interpolation Methods

2.2.1 The Core Problem: Filling the Gaps

When we resize an image, we’re asking: “What pixel values should exist at positions that weren’t in the original data?” The original 10×10 image contains discrete measurements at specific grid locations. When we enlarge it—say, from 10×10 to 40×40—we need to invent values for the 1,500 new pixels that sit between the original 100. When we shrink it, we need to summarize multiple pixels into one.

The same question arises in two different contexts you’ll encounter constantly:

Displaying an image: when matplotlib renders a small array on a large screen, it needs to fill in screen pixels between data pixels.
Resizing an array: when you preprocess images for a neural network with cv2.resize() or skimage.transform.resize(), you’re computing an entirely new array at a different resolution.

Both contexts use the same underlying algorithms. The difference between them matters and we’ll come back to it after working through the methods themselves.

Figure 2.1.0a: The 10×10 test image with explicit pixel values. Each cell is one pixel. Enlarging this image means inventing values for all the gaps between these discrete measurements.

2.2.2 What is a Kernel?

Every interpolation method works by sliding a small window—the kernel—to each position where we need an estimate. The kernel looks at some number of surrounding original pixels, assigns each a weight, and returns a weighted average as the estimated value. The methods differ only in how many neighbors they consult and how the weights are distributed.

To see this concretely, let’s pull out a 3×3 patch from the boundary region of our test image—where values actually change:

  (1,1)=0.8  (1,2)=0.8  (1,3)=0.6
  (2,1)=0.8  (2,2)=0.6  (2,3)=0.6
  (3,1)=0.8  (3,2)=0.6  (3,3)=0.4

Suppose we’re doubling this image and need to estimate a value for a new pixel that sits at the midpoint between the four pixels in the top-left 2×2 block: (1,1)=0.8, (1,2)=0.8, (2,1)=0.8, (2,2)=0.6. Here is how each method answers that question.

2.2.3 Nearest Neighbor Interpolation

The simplest approach: find the single closest original pixel and copy its value. No blending, no averaging—just a lookup.

For a new pixel at the midpoint between (1,1)=0.8 and (2,2)=0.6, nearest neighbor snaps to the closest grid point and returns that one value. The other three neighbors are ignored entirely.

This preserves exact original values and is the fastest method, but creates hard jumps at boundaries. When you enlarge an image this way, each original pixel expands into a solid block of identical values, producing the characteristic staircase or pixelated appearance.

Figure 2.1.1: Nearest neighbor interpolation on the 10×10 test image rendered at a larger display size. Each original pixel becomes a solid block, creating visible staircase edges at value boundaries.

The right time to use nearest neighbor is when your data represents discrete labels rather than continuous measurements—for example, a segmentation mask where value 1 means “cell” and value 2 means “nucleus.” Blending those labels would produce meaningless fractional values like 1.4.

Quiz: Nearest Neighbor Interpolation

A segmentation mask assigns labels 0 (background), 1 (cell), or 2 (nucleus) to each pixel. If you resize this mask using bilinear interpolation, what is the main problem?

Fractional values like 1.5 appear at label boundaries, which are meaningless as category labels

The resize will fail — bilinear interpolation only works with float arrays, not integer labels

Colors assigned to each label will shift because bilinear blending changes the colormap mapping

The mask shape will change from (H, W) to (H, W, 3)

2.2.4 Bilinear Interpolation

Instead of a single winner, bilinear blends the four nearest neighbors using distance-based weights: closer neighbors contribute more, farther neighbors contribute less.

For a new pixel sitting exactly halfway between all four corners, every distance is equal, so all weights are 0.25:

The answer is 0.75—a smooth blend rather than a hard snap. If the output position were closer to (2,2)=0.6, its weight would be larger and the result would shift toward 0.6. The general formula for a point at fractional offset \((dx, dy)\) from the top-left corner is:

\[\text{output} = (1-dx)(1-dy)\,p_{00} + dx(1-dy)\,p_{10} + (1-dx)dy\,p_{01} + dx \cdot dy\,p_{11}\]

Bilinear interpolation is the standard default for most image processing. It’s fast and produces smooth, artifact-free results without visible blocks.

2.2.5 Bicubic Interpolation

Bilinear looks at 4 neighbors and fits a linear surface through them. Bicubic extends this idea to 16 neighbors (a 4×4 grid) and fits a smooth cubic surface. The larger neighborhood means the output not only matches surrounding pixel values but also respects their rate of change—transitions through edges stay smooth in a way that linear blending can’t achieve.

The cubic weighting function gives small negative weights to the outermost ring of neighbors. This produces slight sharpening at edges and is why bicubic often preserves structural boundaries better than bilinear—particularly useful for medical images where cell edges matter.

The tradeoff is computation: 16 lookups and a more complex weight formula versus 4 for bilinear. For display and publication figures, bicubic is the better choice. For preprocessing pipelines that resize millions of images in a training loop, bilinear is usually the pragmatic default.

\[\text{output}(x,y) = \sum_{i=-1}^{2} \sum_{j=-1}^{2} w(i) \cdot w(j) \cdot \text{input}_{i,j}\]

where \(w(t)\) is a cubic polynomial chosen to be smooth and continuous at every point.

2.2.6 Gaussian and Lanczos

These two methods are available as imshow() options. Gaussian interpolation weights neighbors by a bell curve, producing very soft output—effectively upsampling plus blur. Lanczos uses a windowed sinc function that minimizes ringing artifacts and tends to produce the sharpest result of any standard method. Both are slower than bicubic and rarely worth the overhead for routine image processing work. Knowing they exist and what they do visually is enough for most purposes.

2.2.7 Summary

Method	Neighbors	Best for
Nearest	1	Discrete labels, segmentation masks
Bilinear	4 (2×2)	General images, ML preprocessing
Bicubic	16 (4×4)	Medical/scientific images, publication figures
Gaussian	~16	Soft visualization (display only)
Lanczos	~16–64	Highest-quality display (display only)

The fundamental tradeoff: more neighbors produce smoother, higher-quality results but require more computation.

2.2.8 `imshow()` vs. Actually Resizing an Array

This distinction is worth stating clearly because it trips people up constantly.

When you write plt.imshow(image, interpolation='bilinear'), you are giving matplotlib a rendering hint—a suggestion for how it should stretch the array’s pixels to fill your screen. The numpy array itself is not touched. Its shape is identical before and after:

To produce a new array at different dimensions, you use cv2.resize() or skimage.transform.resize(). The same three interpolation algorithms apply—now they’re computing real pixel values stored in memory rather than just rendering to a screen.

Note

cv2 (OpenCV) is not available in the browser environment. Run this code locally in Python to see the output.

import cv2

print(f"Original: {image.shape}\n")

# cv2.resize takes (width, height) — note: OPPOSITE of numpy's (height, width)
up_nearest  = cv2.resize(image, (40, 40), interpolation=cv2.INTER_NEAREST)
up_bilinear = cv2.resize(image, (40, 40), interpolation=cv2.INTER_LINEAR)
up_bicubic  = np.clip(cv2.resize(image, (40, 40), interpolation=cv2.INTER_CUBIC), 0, 1)

print(f"Upsampled (10×10 → 40×40): {up_nearest.shape}")

# INTER_AREA is recommended for downsampling — averages over source regions
# to avoid aliasing artifacts that bilinear/bicubic can introduce when shrinking
down = cv2.resize(image, (5, 5), interpolation=cv2.INTER_AREA)
print(f"Downsampled (10×10 → 5×5): {down.shape}")

Warning

cv2 axis order: cv2.resize(image, (width, height)) takes width first. NumPy arrays store data as (height, width), so image.shape returns (height, width). Passing image.shape directly to cv2.resize() swaps rows and columns—a common bug with non-square images.

# Compare actual array upsampling: display with interpolation='nearest'
# so matplotlib doesn't add a second layer of smoothing on top
fig, axes = plt.subplots(1, 4, figsize=(14, 4))

axes[0].imshow(image, cmap='gray', interpolation='nearest')
axes[0].set_title(f"Original\n{image.shape}")
axes[0].axis('off')

for ax, arr, label in zip(
    axes[1:],
    [up_nearest, up_bilinear, up_bicubic],
    ['INTER_NEAREST', 'INTER_LINEAR\n(bilinear)', 'INTER_CUBIC\n(bicubic)']
):
    ax.imshow(arr, cmap='gray', interpolation='nearest')
    ax.set_title(f"cv2.resize → {arr.shape}\n{label}")
    ax.axis('off')

plt.suptitle("Actual array upsampling: 10×10 → 40×40", fontsize=12, y=1.02)
plt.tight_layout()
plt.show()

2.2.9 Comparing Methods Visually

Now that you understand how each algorithm works, here is the same 10×10 array displayed at screen size by the three main imshow() interpolation options. All three panels represent identical data—only the screen rendering differs.

2.3 Colormaps for Data Visualization

A colormap is a mapping from numerical values to colors. It’s essential for visualizing grayscale, single-channel, or multi-valued data as color images. Matplotlib provides comprehensive built-in colormaps, each suited to different types of data and visualization goals.

2.3.1 Colormap Categories

Perceptually Uniform Sequential: These colormaps are designed to have an evenly changing luminance (brightness) throughout their range. They are good for representing data where continuity and order are important. Use these for scientific data where accurate perception of magnitude is critical.

viridis, plasma, inferno, magma, cividis, twilight, twilight_shifted, turbo

Sequential: These colormaps are also for ordered data, often representing quantities from low to high. They may not be perceptually uniform but are intuitive.

Warm to cold: Greys, Purples, Blues, Greens, Oranges, Reds
Multi-step: YlOrBr, YlOrRd, OrRd, PuBuGn, PuBu, GnBu, BuGn, YlGnBu, YlGn

Sequential (2): Another set of sequential colormaps, often with a slightly different aesthetic.

binary, gist_yarg, gist_gray, gray, bone, pink, spring, summer, autumn, winter, cool, Wistia, hot, afmhot, gist_heat, copper

Diverging: These colormaps are used when data has a meaningful mid-point (e.g., zero) and diverges in two directions (e.g., positive/negative, above/below average). Essential for showing deviations from a reference value.

PiYG, PRGn, BrBG, PuOr, RdGy, RdBu, RdYlBu, RdYlGn, Spectral, coolwarm, bwr, seismic

Cyclic: These colormaps are for data that wraps around, like phase angles or directions. The beginning and end of the colormap have the same color.

twilight, twilight_shifted, hsv

Qualitative: Used for discrete categories, where no ordering or relationship between categories is implied. Good for segmentation masks and categorical labels.

Pastel1, Pastel2, Paired, Accent, Dark2, Set1, Set2, Set3, tab10, tab20, tab20b, tab20c

Miscellaneous: Other colormaps that don’t fit neatly into the above categories.

flag, prism, ocean, gist_earth, terrain, gist_stern, gnuplot, gnuplot2, CMRmap, cubehelix, brg, gist_rainbow, rainbow, jet, nipy_spectral, gist_ncar

2.3.2 Choosing the Right Colormap

For scientific data and publication, prefer perceptually uniform colormaps:

For data with a meaningful center (like differences from zero), use diverging colormaps:

For discrete categories (segmentation masks), use qualitative colormaps:

Quiz: Choosing a Colormap

You are visualizing a difference image where pixel values range from −0.3 to +0.3, with zero meaning no change. Which colormap and settings are most appropriate?

A diverging colormap like coolwarm or RdBu with vmin=-0.3, vmax=0.3

A sequential colormap like viridis, since it is perceptually uniform

The jet colormap, because it spans the widest visual color range

The gray colormap, so that the visualization is colorblind-safe

2.3.3 Colormap Examples Gallery

2.3.4 Important Considerations

Always include a colorbar when publishing visualizations—it allows readers to interpret the actual values, not just relative differences.

Fix vmin and vmax when comparing multiple images so all images use the same color scale:

Avoid jet colormap for scientific data—it’s not perceptually uniform and can mislead viewers. Use viridis, plasma, or inferno instead.

Quiz: Colorbars and Comparison

You are displaying two images side by side and want a single shared colorbar. Which call places one colorbar that steals space equally from both axes?

fig.colorbar(im1, ax=axes, label=‘Intensity’) where axes is the list of both axis objects

plt.colorbar(im1) after both imshow() calls

axes[0].colorbar(im1) — colorbars are attached directly to individual axis objects

Call fig.colorbar() twice, once per axis, using the same im1

Interactive: Colormap & Display Range Explorer

vmax=1.00

vmin=0.00

Colormap: vmin: 0.00 vmax: 1.00

Try: raise vmin to 0.6 to clip dark regions to the colormap minimum. Lower vmax to 0.4 to saturate bright regions to the colormap maximum.

Python equivalent — updates as you adjust the controls:

fig, ax = plt.subplots(figsize=(5, 5))
im = ax.imshow(image, cmap='viridis', vmin=0.00, vmax=1.00)
plt.colorbar(im, ax=ax)
plt.show()

2.4 Grayscale Conversion

Color images store three intensity values per pixel — red, green, and blue. For many segmentation tasks, including urothelial cell analysis, color information isn’t what separates a cell from background: intensity structure is. Converting to grayscale reduces a 3-channel image to a single channel, cutting memory and computation by two-thirds while preserving the structural information algorithms depend on.

2.4.1 Manual Conversion: Luminance Weights

Human eyes are not equally sensitive to all colors — we perceive green as brightest, red as mid-bright, and blue as dim. A simple average of R, G, B channels would over-weight blue and under-weight green. The standard ITU-R BT.601 formula uses perceptually derived weights:

\[\text{gray} = 0.299 \cdot R + 0.587 \cdot G + 0.114 \cdot B\]

Let’s apply this manually to a 5×5 toy RGB image:

Notice the top-left pixel (0.9, 0.1, 0.1) — heavily red — becomes 0.299×0.9 + 0.587×0.1 + 0.114×0.1 ≈ 0.339. The bottom-right pixel (0.0, 0.1, 0.9) — heavily blue — becomes only 0.114×0.9 ≈ 0.121. Blue pixels appear much darker in grayscale than their visual brightness might suggest.

2.4.2 Using a Python Library

skimage.color.rgb2gray applies the BT.709 standard (used in modern HD displays), with slightly different weights:

\[\text{gray} = 0.2126 \cdot R + 0.7152 \cdot G + 0.0722 \cdot B\]

2.4.3 BT.601 vs BT.709 — Does It Matter?

Both standards agree on the essentials: green carries most of the luminance (~59–72%), red is secondary (~21–30%), blue contributes least (~7–11%). Pixel-level differences are typically under 0.03 — invisible to the eye.

Standard	Red	Green	Blue	Used in
BT.601	0.299	0.587	0.114	Standard-def video, older libraries
BT.709	0.2126	0.7152	0.0722	HD displays, skimage, modern pipelines

For urothelial cell images the choice is rarely critical. What matters is consistency: if you train a model on BT.709 grayscale and feed it BT.601 grayscale at inference time there’s a systematic mismatch across every pixel. Pick one standard and use it throughout your pipeline.

2.4.4 Why Grayscale for Urothelial Cell Images?

Urothelial cell microscopy images are often stained (hematoxylin and eosin), so color does carry some signal — nuclei stain purple, cytoplasm stains pink. Despite this:

Segmentation algorithms (thresholding, morphological operations, CNNs) typically operate on single-channel intensity maps
Grayscale reduces memory by 3× — important when batching hundreds of 512×512 images
Intensity, not hue, drives boundary detection — the sharp transition from dark nucleus to lighter cytoplasm is what matters

For this course we work in grayscale unless color is explicitly needed.

Quiz: Grayscale Conversion

A pixel has RGB values (0.0, 1.0, 0.0) — pure green at full brightness. Using BT.601 weights (0.299, 0.587, 0.114), what is its grayscale value?

0.333 — simple average of R, G, B

0.587 — the green weight alone

1.0 — green is displayed at full brightness

0.114 — the blue weight is applied to green

2.5 Thresholding

Thresholding is the act of applying a comparison to an image array — the result is a boolean mask: an array of True/False values with the same shape as the image. Each True marks a pixel that satisfied the condition; each False marks one that didn’t.

This is one of the most fundamental operations in image analysis. In segmentation, it’s how you isolate cells from background, measure areas, and compute statistics over specific regions.

2.5.1 Boolean Operations on Arrays

The simplest threshold asks: which pixels are exactly zero?

label == 0 produces a boolean array — the same shape as label — with True at every background pixel. Printing .astype(int) converts True→1 and False→0 so the grid is easy to read.

Notice fg_mask is the exact complement of bg_mask here because the only values are 0, 1, and 2. You can verify:

2.5.2 Visualizing Masks

Boolean masks display cleanly with imshow — True renders white, False black:

You can also select a single cell by combining conditions:

Quiz: Binary Masks

Given the label array above, what does np.sum(label == 0) return?

The sum of all pixel values in the label image

The number of unique labels (3)

The count of background pixels (where label == 0)

The total number of pixels in the image (42)

2.5.3 Aggregating and Computing Statistics on Masked Regions

A boolean mask used as an index extracts only the pixels where the mask is True, returning them as a flat array. This lets you compute statistics on a specific region of the image.

The foreground pixels cluster around 0.65–0.75 (bright cells) while the background sits near 0.05 (dark). This intensity separation is exactly what allows thresholding to work as a segmentation strategy.

You can also compute per-cell statistics by masking for each label individually:

Quiz: Thresholding Statistics

You have a grayscale image img and a boolean mask mask. What does np.mean(img[mask]) compute?

The mean pixel value of the entire image

The fraction of pixels where the mask is True

The number of True pixels in the mask

The mean pixel value of only the pixels where mask is True

2.6 Noise and Denoising Techniques

2.6.1 Why Noise Matters in Computer Vision

Real-world images are never perfect. Every sensor, camera, or imaging device introduces noise—random variations in pixel values that obscure the true underlying signal. Noise comes from multiple sources:

Sensor noise: Thermal variations in camera sensors (especially important in low-light conditions)
Quantization noise: Rounding errors when converting continuous signals to discrete pixel values
Transmission noise: Corruption during data transmission or storage
Environmental factors: Vibration, atmospheric interference, electromagnetic interference

In medical imaging, computer vision, and machine learning, understanding and handling noise is critical. Algorithms trained on noisy data may learn the noise patterns instead of the true signal, degrading their ability to generalize to new, clean data. Conversely, aggressive noise removal can destroy important details.

2.6.2 Types of Noise and Generation Methods

Different types of noise require different analysis and denoising approaches. Let’s explore the most common noise models:

Gaussian Noise

Gaussian (or white) noise is the most commonly used noise model. It assumes each pixel’s noise follows a normal distribution with mean 0 and standard deviation σ.

How it works: np.random.normal(mean, std, shape) generates values from a normal distribution. With mean=0, each pixel gets a random deviation centered around zero.

Use cases: - Simulating camera sensor noise - Thermal noise in electronics - Most common assumption in image processing

Now let’s add noise to the test image and visualize the effect:

Comparing noisy versions at different noise levels:

Interactive: Noise Explorer

Original (clean)

Noisy

Noise type: Strength: σ = 0.00

Python equivalent — updates as you adjust the controls:

noise = np.random.normal(0, 0.00, image.shape)
noisy = np.clip(image + noise, 0, 1)

plt.imshow(noisy, cmap='gray')
plt.show()

Other Noise Generation Methods

1. Poisson Noise (photon noise) - Inherent in photon counting processes - Variance equals the mean intensity - More realistic for low-light images - Generated with np.random.poisson()

2. Salt-and-Pepper Noise (impulse noise) - Random pixels set to min (0) or max (255) values - Common in data transmission errors - Created by randomly replacing pixels with extreme values

3. Uniform Noise - All random values equally likely within a range - Less realistic than Gaussian but simpler - Generated with np.random.uniform()

4. Speckle Noise (multiplicative noise) - Multiplies each pixel value by a random factor - Common in radar and ultrasound imaging - Characterized as signal-dependent

2.6.3 Summary of Noise Models

Noise Type	Generation	Characteristics	Common Applications
Gaussian	`np.random.normal()`	Independent, signal-independent	General camera noise, simulations
Poisson	`np.random.poisson()`	Signal-dependent variance	Low-light photography, counting detectors
Salt-and-Pepper	Random extremes	Discrete impulses	Transmission errors, corrupted data
Uniform	`np.random.uniform()`	All values equally likely	Quantization effects, theoretical models
Speckle	Multiplicative	Signal-dependent, correlated	Radar, ultrasound, SAR imagery

Quiz: Noise Types

Speckle noise is described as “multiplicative.” What does this mean for a bright region (pixel value ≈ 1.0) compared to a dark region (pixel value ≈ 0.1)?

The bright region experiences much larger absolute noise variation than the dark region

Only bright pixels are corrupted; pixels close to zero remain unchanged

The noise magnitude is identical for all pixels regardless of their brightness

Dark regions experience larger noise; bright regions are close to their max and are unaffected

2.6.4 Denoising Methods

Now that we understand how noise is generated, let’s explore methods to remove it. Different denoising techniques have different strengths, and the choice depends on the noise type and desired preservation of image details.

Gaussian Filtering (Blur-based Denoising)

The simplest denoising approach is Gaussian filtering, which smooths the image by averaging neighboring pixels with a Gaussian-weighted kernel. This works well for Gaussian noise but will blur fine details.

Advantages: Fast, simple, works reasonably well for Gaussian noise
Disadvantages: Blurs edges and fine details significantly
Best for: Mild Gaussian noise where edge preservation is not critical

Median Filtering

Median filtering replaces each pixel with the median value of its neighbors. This is particularly effective for salt-and-pepper noise and impulse noise, while preserving edges better than Gaussian filtering.

Advantages: Preserves edges better, excellent for impulse noise
Disadvantages: Can remove fine structures, computational cost increases with filter size
Best for: Salt-and-pepper noise, impulse noise

Quiz: Median Filtering

Which noise type is median filtering most effective against, and why?

Salt-and-pepper noise — extreme outlier values (0 or 255) are never the median, so they get replaced

Gaussian noise — the median equals the mean for symmetric distributions, making it equivalent to averaging

Speckle noise — multiplicative patterns are cancelled out when taking the median across neighbors

Poisson noise — because photon counts are integers and the median of integers is always an integer

Non-Local Means Denoising

Non-local means (NLM) is a state-of-the-art denoising method that searches for similar patches across the image and averages them. This preserves edges and fine details much better than simple filtering.

Advantages: Excellent edge preservation, highest quality results, works for most noise types
Disadvantages: Slower (though fast_mode=True helps), requires parameter tuning
Best for: High-quality denoising when computational cost is acceptable

2.6.5 Comparison of Denoising Methods

Method	Speed	Quality	Edge Preservation	Best For
Gaussian Filter	⚡⚡⚡	⭐⭐	⭐	Mild noise, speed-critical
Median Filter	⚡⚡	⭐⭐⭐	⭐⭐⭐	Impulse noise, salt-and-pepper
Non-Local Means	⚡	⭐⭐⭐⭐⭐	⭐⭐⭐⭐	High-quality results, any noise type

2.7 Unsharp Masking: Enhancing Image Contrast and Sharpness

Video Tutorial by DigitalScreeni: What is unsharp mask?

While denoising removes unwanted noise, unsharp masking is a complementary technique that enhances image sharpness by amplifying edges and fine details. Despite its counterintuitive name, unsharp masking doesn’t actually “unsharpen”—instead, it creates a blurred copy of the image and subtracts it from the original, emphasizing high-frequency details (edges).

2.7.1 What Unsharp Masking Does

Unsharp masking works by the following principle:

Create a blurred version of the original image (typically using Gaussian blur)
Subtract the blurred version from the original
Add back a scaled amount of this difference to the original

This creates an edge enhancement effect without introducing noise artifacts like simple high-pass filters. It’s particularly useful in medical imaging where edge contrast is diagnostically important.

2.7.2 Libraries and Basic Parameters

Unsharp masking is available in multiple libraries:

Key parameters to adjust:

radius (σ): Standard deviation of the Gaussian blur. Larger values sharpen broader features; smaller values sharpen fine details. Typical range: 0.5-3.0
amount: Strength of the sharpening effect. Controls how much of the edge information is added back. Typical range: 0.5-2.0
preserve_range: If True, keeps output in the same range as input (important for uint8 images)

2.7.3 Implementation on Toy Images

Let’s start with simple test images to understand the effect:

Now let’s apply unsharp masking with different parameters:

Quiz: Unsharp Masking

With amount=1.0, what does unsharp masking compute, and what is the visual effect?

original + 1.0 × (original − blurred) — edges and fine details are amplified

original × blurred — multiplying by the blurred version smooths the image

original − 1.0 × blurred — subtracting the blurred version softens the image

blurred + 1.0 × original — adding the original on top of the blur sharpens low frequencies

3 Exercises

3.1 Exercise 2.1: Image Interpolation and Colormaps

Objective: Understand how interpolation methods and colormaps affect image visualization.

3.1.1 Problem Setup

You have a small 6×6 image showing a circular gradient pattern:

import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage import gaussian_filter

# Create a 6x6 circular gradient image
image = np.array([
    [0.0, 0.2, 0.4, 0.4, 0.2, 0.0],
    [0.2, 0.5, 0.8, 0.8, 0.5, 0.2],
    [0.4, 0.8, 1.0, 1.0, 0.8, 0.4],
    [0.4, 0.8, 1.0, 1.0, 0.8, 0.4],
    [0.2, 0.5, 0.8, 0.8, 0.5, 0.2],
    [0.0, 0.2, 0.4, 0.4, 0.2, 0.0]
], dtype=np.float32)

print(f"Original image shape: {image.shape}")
print(f"Value range: [{image.min():.2f}, {image.max():.2f}]")

Original image shape: (6, 6)
Value range: [0.00, 1.00]

3.1.2 Tasks

Part A: Display with Different Interpolation Methods

Create a 1×3 grid of subplots displaying the image using three interpolation methods: 'nearest', 'bilinear', and 'bicubic'. Use figsize=(12, 4).

Which method shows the smoothest transitions?
Which method preserves the sharpest edges?
What visual artifacts appear with nearest neighbor?

Part B: Explore Colormaps

Display the same image (use 'bilinear' interpolation) using three different colormaps: 'viridis', 'hot', and 'coolwarm'. Create a 1×3 subplot grid.

Which colormap makes the bright center most visually prominent?
Which colormap is better for scientific publication (typically grayscale-friendly)?

Part C: Combine Interpolation and Denoising

Apply Gaussian smoothing to the image, then display both the original and smoothed versions side-by-side using 'bicubic' interpolation and the 'gray' colormap.

from scipy.ndimage import gaussian_filter

smoothed = gaussian_filter(image, sigma=0.5)

Does the smoothing enhance or reduce the circular pattern?
How would you choose the smoothing parameter (sigma) in a real application?

3.2 Exercise 2.2: Adding and Removing Noise

Objective: Understand noise models and denoising techniques.

3.2.1 Problem Setup

You have a clean 8×8 synthetic image (a simple checkerboard pattern):

import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage import median_filter, gaussian_filter

# Create a simple checkerboard pattern
image = np.array([
    [1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0],
    [1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0],
    [1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0],
    [1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
], dtype=np.float32)

print(f"Clean image shape: {image.shape}")
print(f"Value range: [{image.min():.1f}, {image.max():.1f}]")

Clean image shape: (8, 8)
Value range: [0.0, 1.0]

3.2.2 Tasks

Part A: Add Gaussian Noise

Add Gaussian noise with mean=0 and standard deviation=0.15 to the image. Display the clean and noisy versions side-by-side.

np.random.seed(42)
gaussian_noise = np.random.normal(0, 0.15, image.shape)
noisy_image = np.clip(image + gaussian_noise, 0, 1)

What percentage of pixels are clipped to [0, 1]?
How does the noise affect the visual appearance of the checkerboard pattern?

Part B: Denoise with Gaussian Filter

Apply Gaussian filtering (σ=0.5) to remove the noise. Compare the denoised image with the original clean image.

Does the denoised image match the original checkerboard pattern?
What details are lost or blurred?

Part C: Denoise with Median Filter

Apply median filtering (kernel size 3×3) to the noisy image. Compare with the Gaussian filter result.

denoised_median = median_filter(noisy_image, size=3)

Which denoising method preserves the sharp edges of the checkerboard better?
Why might median filtering be preferred for this type of noise?

📌 Solution: Exercise 2.2

Solution 2.3: Adding and Removing Noise

Part A: Add Gaussian Noise

np.random.seed(42)
gaussian_noise = np.random.normal(0, 0.15, image.shape)
noisy_image = np.clip(image + gaussian_noise, 0, 1)

# Display clean and noisy
fig, axes = plt.subplots(1, 2, figsize=(10, 4))

axes[0].imshow(image, cmap='gray', interpolation='nearest')
axes[0].set_title("Clean Checkerboard")
axes[0].axis('off')

axes[1].imshow(noisy_image, cmap='gray', interpolation='nearest')
axes[1].set_title("With Gaussian Noise (σ=0.15)")
axes[1].axis('off')

plt.tight_layout()
plt.show()

# Check clipping
clipped_count = np.sum((image + gaussian_noise < 0) | (image + gaussian_noise > 1))
clipped_pct = 100 * clipped_count / image.size
print(f"Clipped pixels: {clipped_count} ({clipped_pct:.2f}%)")

Clipped pixels: 40 (62.50%)

Answers: - Percentage clipped: ~0-2% (most noise stays within [-0.45, 0.45], so rarely clips) - Visual effect: Sharp checkerboard becomes fuzzy; edges blur; high-frequency pattern partially obscured

Part B: Denoise with Gaussian Filter

from scipy.ndimage import gaussian_filter

denoised_gaussian = gaussian_filter(noisy_image, sigma=0.5)

fig, axes = plt.subplots(1, 3, figsize=(12, 4))

axes[0].imshow(image, cmap='gray', interpolation='nearest')
axes[0].set_title("Original Clean")
axes[0].axis('off')

axes[1].imshow(noisy_image, cmap='gray', interpolation='nearest')
axes[1].set_title("Noisy")
axes[1].axis('off')

axes[2].imshow(denoised_gaussian, cmap='gray', interpolation='nearest')
axes[2].set_title("Gaussian Denoised (σ=0.5)")
axes[2].axis('off')

plt.tight_layout()
plt.show()

# Quantify difference
mse_gaussian = np.mean((denoised_gaussian - image)**2)
print(f"MSE vs original: {mse_gaussian:.4f}")

MSE vs original: 0.1045

Answers: - Match to original: Reasonably close, but edges are noticeably blurred - Details lost: Sharp transitions between black and white checkerboard are softened into gray gradients

Part C: Denoise with Median Filter

from scipy.ndimage import median_filter

denoised_median = median_filter(noisy_image, size=3)

fig, axes = plt.subplots(1, 3, figsize=(12, 4))

axes[0].imshow(denoised_gaussian, cmap='gray', interpolation='nearest')
axes[0].set_title("Gaussian Filtered")
axes[0].axis('off')

axes[1].imshow(denoised_median, cmap='gray', interpolation='nearest')
axes[1].set_title("Median Filtered")
axes[1].axis('off')

axes[2].imshow(image, cmap='gray', interpolation='nearest')
axes[2].set_title("Original")
axes[2].axis('off')

plt.tight_layout()
plt.show()

# Compare both
mse_median = np.mean((denoised_median - image)**2)
print(f"Gaussian MSE: {mse_gaussian:.4f}")
print(f"Median MSE: {mse_median:.4f}")

Gaussian MSE: 0.1045
Median MSE: 0.3075

Answers: - Edge preservation: Median filter preserves sharp checkerboard edges much better than Gaussian - Why median is preferred: Median filter is a non-linear filter that preserves discontinuities (edges) while removing noise. For images with sharp boundaries (like medical images, text), median is superior to Gaussian blur which smooths everything indiscriminately. Median is especially effective for impulse/salt-and-pepper noise

3.3 Exercise 2.3: Thresholding and Binary Masks

Objective: Understand how boolean comparisons on image arrays produce binary masks, and how masks can be used to isolate regions of interest.

3.3.1 Problem Setup

You have a 10×10 grayscale image with pixel intensities in the range [0, 1]:

import numpy as np
import matplotlib.pyplot as plt

image = np.array([
    [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
    [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 0.9],
    [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 0.9, 0.8],
    [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 0.9, 0.8, 0.7],
    [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 0.9, 0.8, 0.7, 0.6],
    [0.6, 0.7, 0.8, 0.9, 1.0, 0.9, 0.8, 0.7, 0.6, 0.5],
    [0.7, 0.8, 0.9, 1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4],
    [0.8, 0.9, 1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3],
    [0.9, 1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2],
], dtype=np.float32)

print(f"Image shape: {image.shape}")
print(f"Value range: [{image.min():.1f}, {image.max():.1f}]")

Image shape: (10, 10)
Value range: [0.0, 1.0]

The intensities peak at 1.0 along the main diagonal and decrease toward the corners.

3.3.2 Tasks

Part A: Boolean Comparisons

Run each of the following three comparisons and print the result:

print(image == 1)
print(image == 0)
print(image < 0.3)

For each result, answer:

What dtype does the resulting array have?
How many pixels satisfy the condition? (Hint: use .sum())
What geometric pattern do the True values form?

Part B: Build a Binary Mask with np.zeros

Create a binary mask that flags all pixels with intensity > 0.7:

mask = np.zeros(image.shape, dtype=np.float32)
mask[image > 0.7] = 1.0
print(mask)

Then display the original image and the mask side-by-side:

fig, axes = plt.subplots(1, 2, figsize=(8, 4))
axes[0].imshow(image, cmap='gray', vmin=0, vmax=1)
axes[0].set_title("Original Image")
axes[0].axis('off')
axes[1].imshow(mask, cmap='gray', vmin=0, vmax=1)
axes[1].set_title("Mask (intensity > 0.7)")
axes[1].axis('off')
plt.tight_layout()
plt.show()

What shape/region does the mask highlight?
How many pixels are set to 1.0?

Part C: Apply the Mask

Multiply the original image by the mask to keep only the bright region:

masked_image = image * mask

Display all three side-by-side: original, mask, and masked image.

What happens to pixels where the mask is 0?
How could this technique help isolate a cell from a microscopy image?

📌 Solution: Exercise 2.3

Solution 2.3: Thresholding and Binary Masks

Part A: Boolean Comparisons

print("image == 1:")
print(image == 1)
print(f"  dtype: {(image == 1).dtype},  True count: {(image == 1).sum()}\n")

print("image == 0:")
print(image == 0)
print(f"  dtype: {(image == 0).dtype},  True count: {(image == 0).sum()}\n")

print("image < 0.3:")
print(image < 0.3)
print(f"  dtype: {(image < 0.3).dtype},  True count: {(image < 0.3).sum()}")

image == 1:
[[False False False False False False False False False False]
 [False False False False False False False False False  True]
 [False False False False False False False False  True False]
 [False False False False False False False  True False False]
 [False False False False False False  True False False False]
 [False False False False False  True False False False False]
 [False False False False  True False False False False False]
 [False False False  True False False False False False False]
 [False False  True False False False False False False False]
 [False  True False False False False False False False False]]
  dtype: bool,  True count: 9

image == 0:
[[ True False False False False False False False False False]
 [False False False False False False False False False False]
 [False False False False False False False False False False]
 [False False False False False False False False False False]
 [False False False False False False False False False False]
 [False False False False False False False False False False]
 [False False False False False False False False False False]
 [False False False False False False False False False False]
 [False False False False False False False False False False]
 [False False False False False False False False False False]]
  dtype: bool,  True count: 1

image < 0.3:
[[ True  True  True False False False False False False False]
 [ True  True False False False False False False False False]
 [ True False False False False False False False False False]
 [False False False False False False False False False False]
 [False False False False False False False False False False]
 [False False False False False False False False False False]
 [False False False False False False False False False False]
 [False False False False False False False False False False]
 [False False False False False False False False False False]
 [False False False False False False False False False  True]]
  dtype: bool,  True count: 7

Answers:

dtype: bool — numpy boolean comparisons always return a boolean array.
image == 1: 10 True pixels (one per row, along the anti-diagonal). Equality checks find only exact matches.
image == 0: 1 True pixel (top-left corner, value 0.0). All other corners have values > 0.
image < 0.3: pixels with values 0.0, 0.1, or 0.2 — the low-intensity corner region.
Pattern: True values trace the diagonal intensity structure of the image.

Part B: Build a Binary Mask

mask = np.zeros(image.shape, dtype=np.float32)
mask[image > 0.7] = 1.0
print(mask)
print(f"\nPixels set to 1.0: {int(mask.sum())}")

fig, axes = plt.subplots(1, 2, figsize=(8, 4))
axes[0].imshow(image, cmap='gray', vmin=0, vmax=1)
axes[0].set_title("Original Image")
axes[0].axis('off')
axes[1].imshow(mask, cmap='gray', vmin=0, vmax=1)
axes[1].set_title("Mask (intensity > 0.7)")
axes[1].axis('off')
plt.tight_layout()
plt.show()

[[0. 0. 0. 0. 0. 0. 0. 0. 1. 1.]
 [0. 0. 0. 0. 0. 0. 0. 1. 1. 1.]
 [0. 0. 0. 0. 0. 0. 1. 1. 1. 1.]
 [0. 0. 0. 0. 0. 1. 1. 1. 1. 1.]
 [0. 0. 0. 0. 1. 1. 1. 1. 1. 0.]
 [0. 0. 0. 1. 1. 1. 1. 1. 0. 0.]
 [0. 0. 1. 1. 1. 1. 1. 0. 0. 0.]
 [0. 1. 1. 1. 1. 1. 0. 0. 0. 0.]
 [1. 1. 1. 1. 1. 0. 0. 0. 0. 0.]
 [1. 1. 1. 1. 0. 0. 0. 0. 0. 0.]]

Pixels set to 1.0: 43

Answers:

The mask highlights a diagonal band of bright pixels in the upper-right region of the image.
np.zeros initialises every pixel to 0; boolean indexing mask[image > 0.7] = 1.0 then sets only the matching pixels to 1.

Part C: Apply the Mask

masked_image = image * mask

fig, axes = plt.subplots(1, 3, figsize=(12, 4))
axes[0].imshow(image, cmap='gray', vmin=0, vmax=1)
axes[0].set_title("Original")
axes[0].axis('off')
axes[1].imshow(mask, cmap='gray', vmin=0, vmax=1)
axes[1].set_title("Mask")
axes[1].axis('off')
axes[2].imshow(masked_image, cmap='gray', vmin=0, vmax=1)
axes[2].set_title("Masked Image")
axes[2].axis('off')
plt.tight_layout()
plt.show()

Answers:

Pixels where mask == 0 become exactly 0 (black) in the result — they are completely suppressed.
In medical imaging this is the standard way to isolate a region of interest: threshold on intensity (or a predicted probability map), build a binary mask, then multiply to zero out background pixels before further analysis.

3.4 Exercise 2.4: Unsharp Masking and Real Medical Images (Urothelial Cells)

Objective: Apply denoising and sharpening techniques to real microscopy images, comparing multiple methods for edge preservation and contrast enhancement.

3.4.1 Problem Setup

In this exercise, you’ll work with real urothelial cell images from the Cedar Sinai dataset. The images contain natural noise from the microscope acquisition and would benefit from both denoising and contrast enhancement.

Before starting: Follow these setup steps in your terminal:

# Clone the repo
!git clone https://github.com/emilsar/VocEd.git
%cd VocEd

Then load the data in your notebook:

import glob
import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage import gaussian_filter, median_filter
from skimage.filters import unsharp_mask
from skimage.restoration import denoise_nl_means


# ── Load data (already float32, channels-first) ──────────────────────────────

N = len(glob.glob("imagedata/X/*.npy"))

X = np.stack([np.load(f"imagedata/X/{i}.npy") for i in range(N)])   # (N, 3, 256, 256), float32
y = np.stack([np.load(f"imagedata/y/{i}.npy") for i in range(N)])   # (N, 256, 256), int64

print(f"X  shape : {X.shape}   dtype: {X.dtype}")
print(f"y  shape : {y.shape}   dtype: {y.dtype}")
print(f"Pixel range in X : [{X.min():.3f}, {X.max():.3f}]")
print(f"Unique labels in y: {np.unique(y)}")

3.4.2 Tasks

Part A: Visual Comparison of Denoising Methods

Apply five different image processing techniques to the original image:

Gaussian Filter (σ=1.0): Simple blur-based denoising
Median Filter (size=5): Edge-preserving denoising
Non-Local Means (h=0.1): High-quality patch-based denoising
Unsharp Masking (radius=1.0, amount=1.0): Edge enhancement
Combined (NLM + Unsharp): Denoise first, then sharpen

Create a 2×3 subplot grid displaying: - Row 0: Original, Gaussian, Median - Row 1: Non-Local Means, Unsharp Masking, Combined

Use cmap='gray' for all images.

Which method produces the sharpest cellular features?
Which method removes noise most effectively?
How does combining denoising + sharpening compare to each individual method?

Part B: Quantitative Comparison

For each of the five denoised/processed images, compute and print: - Mean pixel value - Standard deviation - Min and max values - Estimate of “sharpness” using the Laplacian variance (high variance = sharp, low variance = blurry)

from scipy.ndimage import laplace

# Compute Laplacian variance as a sharpness metric
laplacian = laplace(image)
sharpness = np.var(laplacian)
print(f"Sharpness (Laplacian variance): {sharpness:.4f}")

Sharpness (Laplacian variance): 0.0168

Create a summary table comparing all methods.

Part C: Tuning Unsharp Masking Parameters

Apply unsharp masking with three different parameter sets: - Mild: radius=0.5, amount=0.5 - Moderate: radius=1.0, amount=1.0 - Strong: radius=1.5, amount=2.0

Display these three versions side-by-side. Then answer:

How does increasing the radius parameter affect the result?
How does increasing the amount parameter affect edge enhancement?
Which parameter set best preserves cellular detail while enhancing contrast?

Submit your Google Colab notebook link below.

Before submitting, make your notebook accessible:
Share → Restricted (dropdown) → Anyone with the link
Then set role: Viewer (dropdown) → Commenter
This allows your instructor to comment on your work.

📌 Solution: Exercise 2.4

Solution 2.4: Unsharp Masking and Real Medical Images

Part A: Visual Comparison

# Apply different processing methods
denoised_gaussian = gaussian_filter(original_image, sigma=1.0)
denoised_median = median_filter(original_image, size=5)
denoised_nlm = denoise_nl_means(original_image, h=0.1, fast_mode=True, patch_size=5, patch_distance=7)
sharpened_only = unsharp_mask(original_image, radius=1.0, amount=1.0)
denoised_combined = denoise_nl_means(original_image, h=0.1, fast_mode=True, patch_size=5, patch_distance=7)
sharpened_combined = unsharp_mask(denoised_combined, radius=0.8, amount=1.0)

# Create 2x3 comparison grid
fig, axes = plt.subplots(2, 3, figsize=(15, 10))
fig.suptitle('Denoising and Sharpening Comparison: Urothelial Cell Image', 
             fontsize=14, fontweight='bold')

axes[0, 0].imshow(original_image, cmap='gray')
axes[0, 0].set_title("Original (Noisy)")
axes[0, 0].axis('off')

axes[0, 1].imshow(denoised_gaussian, cmap='gray')
axes[0, 1].set_title("Gaussian Filter\n(σ=1.0)")
axes[0, 1].axis('off')

axes[0, 2].imshow(denoised_median, cmap='gray')
axes[0, 2].set_title("Median Filter\n(size=5)")
axes[0, 2].axis('off')

axes[1, 0].imshow(denoised_nlm, cmap='gray')
axes[1, 0].set_title("Non-Local Means\n(Best Quality)")
axes[1, 0].axis('off')

axes[1, 1].imshow(sharpened_only, cmap='gray')
axes[1, 1].set_title("Unsharp Masking\n(radius=1.0, amount=1.0)")
axes[1, 1].axis('off')

axes[1, 2].imshow(sharpened_combined, cmap='gray')
axes[1, 2].set_title("NLM Denoised\n+ Sharpened")
axes[1, 2].axis('off')

plt.tight_layout()
plt.show()

Answers: - Sharpest cellular features: Combined method (NLM + Unsharp) provides best detail with enhanced contrast - Best noise removal: Non-Local Means removes noise most effectively while preserving edges - Combined approach advantage: First denoising removes noise artifacts, then sharpening enhances edges without amplifying residual noise

Part B: Quantitative Comparison

from scipy.ndimage import laplace

# Compute statistics for each method
methods = {
    'Original': original_image,
    'Gaussian': denoised_gaussian,
    'Median': denoised_median,
    'Non-Local Means': denoised_nlm,
    'Unsharp Masking': sharpened_only,
    'NLM + Sharp': sharpened_combined
}

print("\nStatistical Comparison:")
print(f"{'Method':<20} {'Mean':<10} {'Std':<10} {'Min':<10} {'Max':<10} {'Sharpness':<12}")
print("-" * 70)

for name, image in methods.items():
    mean_val = np.mean(image)
    std_val = np.std(image)
    min_val = np.min(image)
    max_val = np.max(image)
    
    # Compute sharpness via Laplacian variance
    laplacian = laplace(image)
    sharpness = np.var(laplacian)
    
    print(f"{name:<20} {mean_val:<10.4f} {std_val:<10.4f} {min_val:<10.4f} {max_val:<10.4f} {sharpness:<12.6f}")

Answers: The combined method typically shows highest sharpness while maintaining reasonable noise levels. Non-Local Means has lowest standard deviation (noise reduced), while unsharp masking increases sharpness metric but may amplify noise.

Part C: Tuning Unsharp Masking

# Apply with different parameters
unsharp_mild = unsharp_mask(original_image, radius=0.5, amount=0.5)
unsharp_moderate = unsharp_mask(original_image, radius=1.0, amount=1.0)
unsharp_strong = unsharp_mask(original_image, radius=1.5, amount=2.0)

fig, axes = plt.subplots(1, 3, figsize=(15, 5))
fig.suptitle('Unsharp Masking Parameter Tuning', fontsize=12, fontweight='bold')

axes[0].imshow(unsharp_mild, cmap='gray')
axes[0].set_title("Mild\n(radius=0.5, amount=0.5)")
axes[0].axis('off')

axes[1].imshow(unsharp_moderate, cmap='gray')
axes[1].set_title("Moderate\n(radius=1.0, amount=1.0)")
axes[1].axis('off')

axes[2].imshow(unsharp_strong, cmap='gray')
axes[2].set_title("Strong\n(radius=1.5, amount=2.0)")
axes[2].axis('off')

plt.tight_layout()
plt.show()

Answers: - Radius effect: Larger radius sharpens broader features; smaller radius enhances fine details. For cellular images, radius=0.8-1.0 is optimal - Amount effect: Higher amounts create more dramatic edge enhancement; too high causes halo artifacts and noise amplification - Optimal for cells: Moderate parameters (radius=1.0, amount=0.8-1.0) provide good contrast without artifacts

2.1 Introduction to Image Interpolation and Matplotlib Visualization

2.1.1 Creating and Inspecting Test Images

2.1.2 Understanding matplotlib’s Display Functions

Shorthand notation

Quiz: Matplotlib Display Functions

2.1.3 Data Types and Display Ranges

Quiz: Data Types and Display

2.1.4 Array Shape and Color Channels

Quiz: Array Shapes

2.1.5 Overlaying Segmentation Masks

Quiz: Segmentation Mask Overlay

2.1.6 Saving Figures

2.2 Interpolation Methods

2.2.1 The Core Problem: Filling the Gaps

2.2.2 What is a Kernel?

2.2.3 Nearest Neighbor Interpolation

Quiz: Nearest Neighbor Interpolation

2.2.4 Bilinear Interpolation

2.2.5 Bicubic Interpolation

2.2.6 Gaussian and Lanczos

2.2.7 Summary

2.2.8 imshow() vs. Actually Resizing an Array

2.2.9 Comparing Methods Visually

2.3 Colormaps for Data Visualization

2.3.1 Colormap Categories

2.3.2 Choosing the Right Colormap

Quiz: Choosing a Colormap

2.3.3 Colormap Examples Gallery

2.3.4 Important Considerations

Quiz: Colorbars and Comparison

Interactive: Colormap & Display Range Explorer

2.4 Grayscale Conversion

2.4.1 Manual Conversion: Luminance Weights

2.4.2 Using a Python Library

2.4.3 BT.601 vs BT.709 — Does It Matter?

2.4.4 Why Grayscale for Urothelial Cell Images?

Quiz: Grayscale Conversion

2.5 Thresholding

2.5.1 Boolean Operations on Arrays

2.5.2 Visualizing Masks

Quiz: Binary Masks

2.5.3 Aggregating and Computing Statistics on Masked Regions

Quiz: Thresholding Statistics

2.6 Noise and Denoising Techniques

2.6.1 Why Noise Matters in Computer Vision

2.6.2 Types of Noise and Generation Methods

Gaussian Noise

Interactive: Noise Explorer

Other Noise Generation Methods

2.6.3 Summary of Noise Models

Quiz: Noise Types

2.6.4 Denoising Methods

Gaussian Filtering (Blur-based Denoising)

Median Filtering

Quiz: Median Filtering

Non-Local Means Denoising

2.6.5 Comparison of Denoising Methods

2.7 Unsharp Masking: Enhancing Image Contrast and Sharpness

2.7.1 What Unsharp Masking Does

2.7.2 Libraries and Basic Parameters

2.7.3 Implementation on Toy Images

Quiz: Unsharp Masking

3 Exercises

3.1 Exercise 2.1: Image Interpolation and Colormaps

3.1.1 Problem Setup

3.1.2 Tasks

Solution 2.2: Image Interpolation and Colormaps

3.2 Exercise 2.2: Adding and Removing Noise

3.2.1 Problem Setup

3.2.2 Tasks

Solution 2.3: Adding and Removing Noise

3.3 Exercise 2.3: Thresholding and Binary Masks

3.3.1 Problem Setup

3.3.2 Tasks

Solution 2.3: Thresholding and Binary Masks

3.4 Exercise 2.4: Unsharp Masking and Real Medical Images (Urothelial Cells)

3.4.1 Problem Setup

3.4.2 Tasks

Solution 2.4: Unsharp Masking and Real Medical Images

📚 Gradebook

2.2.8 `imshow()` vs. Actually Resizing an Array