6  Bayesian Optimization of Segmentation Thresholds

6.1 Introduction

After preparing and partitioning our dataset in Chapter 5, we now face a critical challenge: finding the optimal threshold values for cell segmentation. Rather than using fixed, predetermined thresholds or exhaustively searching through all possibilities, we’ll employ Bayesian Optimization to intelligently search the threshold parameter space.

In this chapter, we:

  1. Learn how Bayesian Optimization works through intuitive examples and visualizations
  2. Define the optimization problem: minimizing segmentation error on the validation set
  3. Implement Bayesian Optimization to find optimal threshold values (nucleus minimum and cytoplasm maximum)
  4. Apply optimized thresholds to the test set
  5. Comprehensively evaluate results with metrics and visualizations

6.2 Parameters, Pre-parameters, and Hyperparameters

Before diving into the implementation, let’s clarify the different types of parameters we’ll encounter in this pipeline. Understanding this distinction is crucial for proper experimental design.

6.2.1 Pre-parameters: Image Processing Pipeline

These are parameters that control the preprocessing pipeline—the transformation applied to images before they reach the prediction model. They affect the quality and characteristics of the predictions themselves.

Pre-parameters in a typical segmentation pipeline:

  1. Denoising (Non-Local Means / NLM)
    • h: Filter strength (higher = more denoising, but loses detail)
    • templateWindowSize: Size of template patch
    • searchWindowSize: Size of search area
  2. Unsharp Masking
    • sigma: Gaussian blur sigma for the unsharp mask
    • strength: Amount of sharpening (0.5 - 2.0)
    • threshold: Pixel difference threshold to apply sharpening
  3. Gaussian Blur
    • sigma: Standard deviation of the Gaussian kernel
    • kernelSize: Size of the convolution kernel
  4. Morphological Operations
    • kernelSize: Size of the structural element
    • kernelShape: Shape (rectangular, elliptical, cross)
    • iterations: Number of dilation/erosion iterations
  5. Contrast Enhancement
    • clipLimit: Threshold for contrast limiting (CLAHE)
    • tileGridSize: Size of grid for local histogram equalization
  6. Normalization
    • targetMean: Desired mean pixel intensity
    • targetStd: Desired standard deviation

Key point: These pre-parameters affect the input to the thresholding step. Poor preprocessing choices here can make threshold optimization impossible—garbage in, garbage out.
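The last item above, normalization, is easy to make concrete. Here is a minimal numpy sketch of mapping an image to a target mean and standard deviation; the function name and the example target values are ours, purely for illustration:

```python
import numpy as np

def normalize_intensity(img, target_mean=0.5, target_std=0.15):
    """Shift and scale pixel intensities to a target mean and std.

    Expects a float-valued image. A constant image has zero standard
    deviation, so it is only shifted to the target mean.
    """
    img = img.astype(float)
    std = img.std()
    if std == 0:
        return img - img.mean() + target_mean
    return (img - img.mean()) / std * target_std + target_mean

# Example: a synthetic image that is too bright and too low-contrast
rng = np.random.default_rng(0)
img = rng.normal(0.8, 0.05, size=(64, 64))
out = normalize_intensity(img, target_mean=0.5, target_std=0.15)
```

After the call, `out` has mean 0.5 and standard deviation 0.15 regardless of the input's original statistics, so downstream thresholds see comparable intensity ranges across images.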

6.2.2 Parameters: Threshold Optimization

These are the parameters being directly optimized by Bayesian Optimization. They are applied after predictions are generated.

Parameters in Chapter 6:

  1. Nucleus Minimum Threshold (nucleus_min)
    • Range: [0.3, 0.7]
    • Meaning: Minimum probability to classify a pixel as nucleus
  2. Cytoplasm Maximum Threshold (cytoplasm_max)
    • Range: [0.6, 0.95]
    • Meaning: Maximum probability threshold for cytoplasm classification

Terminology note: should we call these “parameters”?

  • Yes, “parameters” is appropriate here: they parameterize the decision boundary (the threshold function).
  • They’re distinct from “hyperparameters” because they’re not part of the training procedure; they’re part of the inference (prediction application) procedure.
  • They’re distinct from “pre-parameters” because they operate on already-generated predictions, not on raw images.
  • Alternative names: “post-processing parameters” or “decision threshold parameters”.
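To make the role of these decision thresholds concrete, here is a minimal sketch of one possible decision rule applied to per-pixel probability maps. The function and the exact rule are illustrative assumptions on our part, not necessarily the rule used later in the pipeline:

```python
import numpy as np

def apply_thresholds(p_nucleus, p_cytoplasm, nucleus_min=0.5, cytoplasm_max=0.8):
    """Illustrative decision rule: 0=background, 1=cytoplasm, 2=nucleus.

    A pixel is called nucleus when its nucleus probability reaches
    nucleus_min; otherwise it is called cytoplasm when its cytoplasm
    probability is above 0.5 but does not exceed cytoplasm_max.
    """
    labels = np.zeros(p_nucleus.shape, dtype=int)
    cyto = (p_cytoplasm > 0.5) & (p_cytoplasm <= cytoplasm_max)
    labels[cyto] = 1
    labels[p_nucleus >= nucleus_min] = 2   # nucleus overrides cytoplasm
    return labels

# Tiny 2x2 example: one clear nucleus pixel, one mid-range cytoplasm pixel
p_nucleus = np.array([[0.9, 0.1], [0.2, 0.0]])
p_cytoplasm = np.array([[0.1, 0.7], [0.95, 0.2]])
labels = apply_thresholds(p_nucleus, p_cytoplasm, nucleus_min=0.5, cytoplasm_max=0.8)
```

Changing `nucleus_min` and `cytoplasm_max` moves the decision boundary without retraining anything, which is exactly what makes these two values cheap to optimize.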

6.2.3 Hyperparameters: Bayesian Optimization Configuration

These parameters control the optimization procedure itself—how Bayesian Optimization searches the parameter space. They don’t directly affect segmentation quality, but rather control the search strategy.

Hyperparameters for Bayesian Optimization:

  1. Gaussian Process Kernel Parameters
    • length_scale: Controls how quickly correlation decays with distance
      • Low (0.05): Assumes function changes rapidly
      • High (0.5): Assumes function is smooth
    • kernel_variance: Overall scale of the kernel
  2. Acquisition Function Parameters
    • kappa: Exploration vs. exploitation trade-off
      • Low (0.5): Exploit known good regions
      • High (3.0): Explore uncertain regions
    • Higher kappa early on, lower kappa later is common
  3. Search Strategy
    • n_initial: Number of random initial evaluations (default: 5)
    • n_iterations: Number of Bayesian optimization iterations (default: 15)
    • n_restarts: Number of L-BFGS-B restarts when optimizing acquisition function (default: 5)
  4. Optimization Algorithm Parameters
    • optimizer: Which algorithm to use for acquisition maximization (L-BFGS-B, differential evolution, etc.)
    • tolerance: Convergence tolerance for the optimizer

Key distinction: Changing kappa doesn’t change what thresholds are optimal—it only changes how fast Bayesian Optimization finds them.


6.3 Summary: The Three Parameter Types

  Type            | Controls                      | Example                            | Optimized By                              | Affects
  ----------------|-------------------------------|------------------------------------|-------------------------------------------|------------------------------------------------
  Pre-parameters  | Image preprocessing           | Gaussian blur sigma                | Manual tuning on validation set           | Prediction quality
  Parameters      | Threshold decision boundaries | nucleus_min, cytoplasm_max         | Bayesian Optimization (on validation set) | How predictions are converted to labels
  Hyperparameters | Optimization procedure        | kappa, length_scale, n_iterations  | Manual or grid search                     | Speed/efficiency of finding optimal parameters

6.4 Introduction to Bayesian Optimization

6.4.1 The Problem

We have a black-box function that we want to optimize:

\[f(\theta) = \text{Segmentation Error given threshold parameters } \theta\]

where \(\theta = (\theta_{\text{nucleus}}, \theta_{\text{cytoplasm}})\) are our two threshold values.

Key challenges:

  • Each function evaluation is expensive (it requires processing all validation images)
  • The function landscape is unknown (black-box)
  • We cannot compute gradients directly
  • We want to find optimal thresholds with as few evaluations as possible

Traditional approaches fail:

  • Grid search: exhaustive but inefficient (testing 20 × 20 = 400 combinations, each requiring full dataset processing)
  • Random search: unguided and wasteful
  • Gradient descent: requires continuous derivatives (not suitable here)

Bayesian Optimization succeeds because it learns from each evaluation and intelligently selects the next point to test, balancing exploration (testing uncertain regions) with exploitation (refining promising areas).

6.4.2 How Bayesian Optimization Works

Bayesian Optimization operates in a loop:

  1. Build a probabilistic model (Gaussian Process) from past observations
  2. Use an acquisition function (Upper Confidence Bound) to decide where to sample next
  3. Evaluate the function at the selected point
  4. Repeat until convergence

Gaussian Process (GP): Our Belief about the Function

The GP provides two pieces of information at any point:

  • Mean \(\mu(x)\): Our best estimate of \(f(x)\)
  • Variance \(\sigma^2(x)\): Our uncertainty about \(f(x)\)

  • Near observed points: low uncertainty (we know what the function is)
  • Far from observations: high uncertainty (complete ignorance)

The GP uses an RBF (Radial Basis Function) kernel to correlate nearby points:

\[k(x, x') = \exp\left(-\frac{\|x - x'\|^2}{2\ell^2}\right)\]

where \(\ell\) is the length scale controlling how quickly correlation decays with distance.
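The kernel formula above translates directly into numpy; a minimal sketch (the function name is ours):

```python
import numpy as np

def rbf_kernel(X1, X2, length_scale=0.2):
    """k(x, x') = exp(-||x - x'||^2 / (2 * ell^2)) for all pairs.

    X1: (n, d) array, X2: (m, d) array -> (n, m) kernel matrix.
    """
    sq_dists = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * length_scale ** 2))

# Three 1-D points: two close together, one far away
X = np.array([[0.0], [0.1], [1.0]])
K = rbf_kernel(X, X, length_scale=0.2)
# The diagonal is exactly 1, and correlation decays with distance,
# so K[0, 1] (nearby points) is much larger than K[0, 2] (distant points)
```

Shrinking `length_scale` makes the correlation fall off faster, which is precisely the "function changes rapidly" assumption listed among the hyperparameters above.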

Acquisition Function: Where to Sample Next?

The Upper Confidence Bound (UCB) acquisition function balances two competing goals:

\[\text{UCB}(x) = \mu(x) + \kappa \cdot \sigma(x)\]

  • First term (\(\mu(x)\)): Exploitation—favors regions with high predicted function values
  • Second term (\(\kappa \cdot \sigma(x)\)): Exploration—favors uncertain regions where we might find something better

The parameter \(\kappa\) controls the trade-off:

  • Low \(\kappa\): exploit known good regions
  • High \(\kappa\): explore uncertain areas (especially useful early on)
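UCB itself is a one-liner on top of the GP's mean and standard deviation. A small numpy sketch with toy numbers (ours) shows how \(\kappa\) shifts the choice:

```python
import numpy as np

def ucb(mu, sigma, kappa=2.0):
    """Upper Confidence Bound: predicted mean plus kappa times std."""
    return mu + kappa * sigma

# Toy GP posterior over 5 candidate points
mu = np.array([0.2, 0.8, 0.5, 0.1, 0.4])      # predicted means
sigma = np.array([0.05, 0.05, 0.30, 0.50, 0.10])  # predicted uncertainties

# Low kappa exploits the best mean; high kappa chases uncertainty
best_exploit = np.argmax(ucb(mu, sigma, kappa=0.5))  # index 1 (highest mean)
best_explore = np.argmax(ucb(mu, sigma, kappa=3.0))  # index 3 (highest sigma)
```

The same data yields different "next sample" choices purely as a function of \(\kappa\), which is why it is a hyperparameter of the search rather than of the segmentation.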

6.4.3 Illustrative Example: The Step Function

To build intuition, imagine optimizing a continuous step function with multiple plateaus and a peak around x* ≈ 1.08.

Initial observations (at x = -2.5, -2.0):

  • Very limited information about the function
  • The GP is highly uncertain everywhere except at the observed points
  • The uncertainty “sausage” is very wide

Iteration 1: First exploration

  • UCB is maximized around x = 0.246 (shown with a green star)
  • Why x = 0.246 and not x = 1.0 or beyond?
    • At x = 1.0: the mean has decayed nearly to zero (far from observations)
    • At x = 0.246: the mean is still positive (the RBF kernel maintains correlation from nearby observations)
  • The “sweet spot” balances the mean (which is extrapolating toward zero) with the variance (which is high everywhere)

Result: we discover f(0.246) = 3.398, a huge improvement over the previous best of roughly 0.5.

Key insight: Found the optimum in just 6 evaluations instead of blindly sampling hundreds of combinations.
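The loop from Section 6.4.2 can be sketched end to end in a few dozen lines of numpy: fit the GP posterior to past observations, maximize UCB over a candidate grid, evaluate, repeat. Everything here (the toy objective with its peak at x = 1, the grid standing in for L-BFGS-B, kappa, and the iteration count) is an illustrative assumption, not the chapter's actual objective:

```python
import numpy as np

def rbf(a, b, ell=0.5):
    """RBF kernel between 1-D point arrays a (n,) and b (m,) -> (n, m)."""
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2.0 * ell ** 2))

def gp_posterior(X_obs, y_obs, X_star, ell=0.5, noise=1e-6):
    """Standard GP regression: posterior mean and std at the points X_star."""
    K = rbf(X_obs, X_obs, ell) + noise * np.eye(len(X_obs))
    K_s = rbf(X_obs, X_star, ell)
    mu = K_s.T @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mu, np.sqrt(np.maximum(var, 0.0))

def f(x):
    """Toy black-box objective with a single peak at x = 1."""
    return np.exp(-(x - 1.0) ** 2)

grid = np.linspace(-3.0, 3.0, 301)   # candidate points (stand-in for L-BFGS-B)
X_obs = np.array([-2.5, -2.0])       # two initial observations, as in the example
y_obs = f(X_obs)

for _ in range(15):                  # the Bayesian optimization loop
    mu, sigma = gp_posterior(X_obs, y_obs, grid)
    x_next = grid[np.argmax(mu + 2.0 * sigma)]   # UCB with kappa = 2
    X_obs = np.append(X_obs, x_next)
    y_obs = np.append(y_obs, f(x_next))

x_best = X_obs[np.argmax(y_obs)]     # lands near the true peak at x = 1
```

The loop visibly alternates between probing uncertain regions and refining around the emerging peak, using far fewer evaluations than a dense grid over the same interval.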

6.4.4 Why Bayesian Optimization Works for Threshold Tuning

In our cell segmentation problem:

  • Evaluation cost: Each threshold combination requires processing ~100-1000 validation images and computing segmentation metrics
  • Dimension: Only 2 parameters (nucleus min, cytoplasm max)—perfect for Bayesian Optimization
  • Smoothness: The error landscape is likely smooth and unimodal (one dominant peak)
  • Efficiency goal: We want to find optimal thresholds with minimal computational cost

Bayesian Optimization will strategically explore this 2D parameter space, converging to optimal thresholds much faster than grid search.


6.5 Complete Implementation

Let’s implement Bayesian Optimization using scikit-optimize’s gp_minimize() function. This section is executable: run it cell by cell in Colab or Jupyter.

6.5.1 Preliminary Functions

We will need several groups of helper functions for the pipeline: metrics, data-cleaning functions, and the main pipeline functions.

Loading of Libraries and the Data


from skimage.io import imread
import matplotlib.pyplot as plt
import numpy as np, pandas as pd
from skimage.color import rgb2gray
from skimage.filters import sobel, scharr, apply_hysteresis_threshold, gaussian
from skimage.feature import canny
from skimage.measure import regionprops, regionprops_table
from skimage.morphology import binary_opening, disk, binary_closing, binary_dilation, binary_erosion
from skimage.draw import rectangle_perimeter
from scipy.ndimage import label as scilabel
import cv2
import matplotlib; matplotlib.rcParams['figure.dpi']=300
from scipy.ndimage import gaussian_filter
from skimage.restoration import denoise_nl_means
from skimage.filters import unsharp_mask

from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C
from scipy.optimize import minimize
from sklearn.model_selection import train_test_split


# ============================================================================
# LOAD CEDARS SINAI UROTHELIAL CELL DATA
# ============================================================================

# Note: If running in Jupyter/Colab, uncomment and run:
# !git clone https://github.com/emilsar/Cedars.git
# %cd Cedars/Project3
# !python 1_prepdata.py

# Load the preprocessed data — tries Windows path first, falls back to local Cedars clone
try:
    urothelial_cells = pd.read_pickle("C:/Cedars/Project3/urothelial_cell_toy_data.pkl")
except FileNotFoundError:
    urothelial_cells = pd.read_pickle("Cedars/Project3/urothelial_cell_toy_data.pkl")

# Convert to uint8 image format (0-255)
# Original shape: (batch, channels, height, width)
# Target shape: (batch, height, width, channels)
images = np.transpose(urothelial_cells["X"].numpy() * 255, (0, 2, 3, 1)).astype(np.uint8)
labels = urothelial_cells["y"]

Metrics

import numpy as np

def dice_coefficient(pred, target, class_label):
    """
    Calculate Dice coefficient for a specific class.
    
    Parameters:
    -----------
    pred : np.ndarray
        Predicted segmentation (integer labels)
    target : np.ndarray
        Ground truth segmentation (integer labels)
    class_label : int
        The class to evaluate (0, 1, or 2)
    
    Returns:
    --------
    dice : float
        Dice coefficient for this class (0 to 1)
    """
    pred_binary = (pred == class_label).astype(float)
    target_binary = (target == class_label).astype(float)
    
    intersection = np.sum(pred_binary * target_binary)
    sum_pred = np.sum(pred_binary)
    sum_target = np.sum(target_binary)
    
    # Handle edge case where both pred and target are empty
    # (vacuously perfect prediction)
    if sum_pred == 0 and sum_target == 0:
        return 1.0
    
    dice = 2 * intersection / (sum_pred + sum_target)
    return dice

def iou_coefficient(pred, target, class_label):
    """
    Calculate Intersection over Union (IoU) for a specific class.
    
    Parameters:
    -----------
    pred : np.ndarray
        Predicted segmentation (integer labels)
    target : np.ndarray
        Ground truth segmentation (integer labels)
    class_label : int
        The class to evaluate (0, 1, or 2)
    
    Returns:
    --------
    iou : float
        IoU for this class (0 to 1)
    """
    pred_binary = (pred == class_label).astype(float)
    target_binary = (target == class_label).astype(float)
    
    intersection = np.sum(pred_binary * target_binary)
    union = np.sum(np.logical_or(pred_binary, target_binary))
    
    # Handle edge case: an empty union means both masks are empty
    # (and hence the intersection is empty too), a vacuously perfect prediction
    if union == 0:
        return 1.0
    
    iou = intersection / union
    return iou

def calculate_all_metrics(pred, target, class_labels=[0, 1, 2]):
    """
    Calculate Dice and IoU for all classes plus mean metrics.
    
    Returns:
    --------
    metrics : dict
        Dictionary containing per-class and mean metrics
    """
    metrics = {}
    
    dice_scores = []
    iou_scores = []
    
    for class_label in class_labels:
        dice = dice_coefficient(pred, target, class_label)
        iou = iou_coefficient(pred, target, class_label)
        
        class_name = ['Background', 'Cytoplasm', 'Nucleus'][class_label]
        metrics[f'Dice_{class_name}'] = dice
        metrics[f'IoU_{class_name}'] = iou
        
        dice_scores.append(dice)
        iou_scores.append(iou)
    
    metrics['Dice_Mean'] = np.mean(dice_scores)
    metrics['IoU_Mean'] = np.mean(iou_scores)
    
    return metrics

def print_metrics(metrics):
    """Pretty-print metrics table."""
    print("=" * 70)
    print(f"{'Class':<15} {'Dice':<15} {'IoU':<15}")
    print("=" * 70)
    
    class_names = ['Background', 'Cytoplasm', 'Nucleus']
    for i, class_name in enumerate(class_names):
        dice = metrics[f'Dice_{class_name}']
        iou = metrics[f'IoU_{class_name}']
        print(f"{class_name:<15} {dice:<15.4f} {iou:<15.4f}")
    
    print("-" * 70)
    print(f"{'Mean':<15} {metrics['Dice_Mean']:<15.4f} {metrics['IoU_Mean']:<15.4f}")
    print("=" * 70)

def compute_aggregate_metrics(all_metrics):
    """
    Compute mean and standard deviation of metrics across all images.
    
    Parameters:
    -----------
    all_metrics : list of dict
        Metrics for each image
    
    Returns:
    --------
    aggregate : dict
        Mean and std for each metric
    """
    
    aggregate = {}
    
    # Get all metric keys from first image
    if not all_metrics:
        return aggregate
    
    metric_keys = all_metrics[0].keys()
    
    for key in metric_keys:
        values = [m[key] for m in all_metrics]
        aggregate[f'{key}_mean'] = np.mean(values)
        aggregate[f'{key}_std'] = np.std(values)
    
    return aggregate

def print_aggregate_metrics(aggregate_metrics):
    """Pretty-print aggregate metrics."""
    print("\n" + "="*85)
    print(f"{'Metric':<25} {'Mean':<20} {'Std Dev':<20}")
    print("="*85)
    
    class_names = ['Background', 'Cytoplasm', 'Nucleus']
    
    # Print per-class Dice
    for class_name in class_names:
        key = f'Dice_{class_name}'
        mean = aggregate_metrics.get(f'{key}_mean', 0)
        std = aggregate_metrics.get(f'{key}_std', 0)
        print(f"{key:<25} {mean:<20.4f} {std:<20.4f}")
    
    print("-"*85)
    
    # Print Mean Dice
    key = 'Dice_Mean'
    mean = aggregate_metrics.get(f'{key}_mean', 0)
    std = aggregate_metrics.get(f'{key}_std', 0)
    print(f"{key:<25} {mean:<20.4f} {std:<20.4f}")
    
    print("-"*85)
    
    # Print per-class IoU
    for class_name in class_names:
        key = f'IoU_{class_name}'
        mean = aggregate_metrics.get(f'{key}_mean', 0)
        std = aggregate_metrics.get(f'{key}_std', 0)
        print(f"{key:<25} {mean:<20.4f} {std:<20.4f}")
    
    print("-"*85)
    
    # Print Mean IoU
    key = 'IoU_Mean'
    mean = aggregate_metrics.get(f'{key}_mean', 0)
    std = aggregate_metrics.get(f'{key}_std', 0)
    print(f"{key:<25} {mean:<20.4f} {std:<20.4f}")
    
    print("="*85)

def visualize_metric_distributions(all_metrics):
    """
    Create histograms showing distribution of metrics across dataset.
    
    Parameters:
    -----------
    all_metrics : list of dict
        Metrics for each image
    """
    
    fig, axes = plt.subplots(2, 3, figsize=(16, 10))
    fig.suptitle('Distribution of Segmentation Metrics Across Dataset', 
                 fontsize=14, fontweight='bold')
    
    # Extract metric values
    dice_bg = [m['Dice_Background'] for m in all_metrics]
    dice_cyto = [m['Dice_Cytoplasm'] for m in all_metrics]
    dice_nuc = [m['Dice_Nucleus'] for m in all_metrics]
    
    iou_bg = [m['IoU_Background'] for m in all_metrics]
    iou_cyto = [m['IoU_Cytoplasm'] for m in all_metrics]
    iou_nuc = [m['IoU_Nucleus'] for m in all_metrics]
    
    # Dice histograms
    axes[0, 0].hist(dice_bg, bins=30, alpha=0.7, color='blue', edgecolor='black')
    axes[0, 0].set_title('Dice - Background', fontweight='bold')
    axes[0, 0].set_xlabel('Dice Score')
    axes[0, 0].set_ylabel('Frequency')
    axes[0, 0].grid(alpha=0.3)
    axes[0, 0].axvline(np.mean(dice_bg), color='red', linestyle='--', linewidth=2, 
                       label=f'Mean: {np.mean(dice_bg):.3f}')
    axes[0, 0].legend()
    
    axes[0, 1].hist(dice_cyto, bins=30, alpha=0.7, color='orange', edgecolor='black')
    axes[0, 1].set_title('Dice - Cytoplasm', fontweight='bold')
    axes[0, 1].set_xlabel('Dice Score')
    axes[0, 1].set_ylabel('Frequency')
    axes[0, 1].grid(alpha=0.3)
    axes[0, 1].axvline(np.mean(dice_cyto), color='red', linestyle='--', linewidth=2,
                       label=f'Mean: {np.mean(dice_cyto):.3f}')
    axes[0, 1].legend()
    
    axes[0, 2].hist(dice_nuc, bins=30, alpha=0.7, color='green', edgecolor='black')
    axes[0, 2].set_title('Dice - Nucleus', fontweight='bold')
    axes[0, 2].set_xlabel('Dice Score')
    axes[0, 2].set_ylabel('Frequency')
    axes[0, 2].grid(alpha=0.3)
    axes[0, 2].axvline(np.mean(dice_nuc), color='red', linestyle='--', linewidth=2,
                       label=f'Mean: {np.mean(dice_nuc):.3f}')
    axes[0, 2].legend()
    
    # IoU histograms
    axes[1, 0].hist(iou_bg, bins=30, alpha=0.7, color='blue', edgecolor='black')
    axes[1, 0].set_title('IoU - Background', fontweight='bold')
    axes[1, 0].set_xlabel('IoU Score')
    axes[1, 0].set_ylabel('Frequency')
    axes[1, 0].grid(alpha=0.3)
    axes[1, 0].axvline(np.mean(iou_bg), color='red', linestyle='--', linewidth=2,
                       label=f'Mean: {np.mean(iou_bg):.3f}')
    axes[1, 0].legend()
    
    axes[1, 1].hist(iou_cyto, bins=30, alpha=0.7, color='orange', edgecolor='black')
    axes[1, 1].set_title('IoU - Cytoplasm', fontweight='bold')
    axes[1, 1].set_xlabel('IoU Score')
    axes[1, 1].set_ylabel('Frequency')
    axes[1, 1].grid(alpha=0.3)
    axes[1, 1].axvline(np.mean(iou_cyto), color='red', linestyle='--', linewidth=2,
                       label=f'Mean: {np.mean(iou_cyto):.3f}')
    axes[1, 1].legend()
    
    axes[1, 2].hist(iou_nuc, bins=30, alpha=0.7, color='green', edgecolor='black')
    axes[1, 2].set_title('IoU - Nucleus', fontweight='bold')
    axes[1, 2].set_xlabel('IoU Score')
    axes[1, 2].set_ylabel('Frequency')
    axes[1, 2].grid(alpha=0.3)
    axes[1, 2].axvline(np.mean(iou_nuc), color='red', linestyle='--', linewidth=2,
                       label=f'Mean: {np.mean(iou_nuc):.3f}')
    axes[1, 2].legend()
    
    plt.tight_layout()
    plt.show()

def calculate_nc_ratio(label_image):
    """
    Calculate the nucleus-to-cytoplasm area ratio.
    
    Parameters:
    -----------
    label_image : np.ndarray
        Segmented label image (0=background, 1=cytoplasm, 2=nucleus)
    
    Returns:
    --------
    nc_ratio : float
        Ratio of nucleus area to cytoplasm area
        Returns 0 if cytoplasm area is 0 (to avoid division by zero)
    """
    nucleus_area = np.sum(label_image == 2)
    cytoplasm_area = np.sum(label_image == 1)
    
    if cytoplasm_area == 0:
        return 0.0
    
    nc_ratio = nucleus_area / cytoplasm_area
    return nc_ratio

def compute_nc_ratios(labels_list):
    """
    Compute nucleus-to-cytoplasm ratios for all images.
    
    Parameters:
    -----------
    labels_list : list of np.ndarray
        List of label images
    
    Returns:
    --------
    nc_ratios : np.ndarray
        Array of NC ratios, one per image
    """
    nc_ratios = np.array([calculate_nc_ratio(label_img) for label_img in labels_list])
    return nc_ratios

def analyze_nc_ratios(labels_true, labels_pred, dataset_name="Dataset"):
    """
    Compute and visualize N/C ratio metrics.
    
    Parameters:
    -----------
    labels_true : list of np.ndarray
        Ground truth label images
    labels_pred : list of np.ndarray
        Predicted segmentation images
    dataset_name : str
        Name for output printing
    """
    
    # Compute N/C ratios
    nc_ratio_true = compute_nc_ratios(labels_true)
    nc_ratio_pred = compute_nc_ratios(labels_pred)
    
    # Calculate errors
    absolute_errors = np.abs(nc_ratio_pred - nc_ratio_true)
    relative_errors = np.abs(nc_ratio_pred - nc_ratio_true) / (nc_ratio_true + 1e-6)
    
    # Print statistics
    print(f"N/C Ratio Statistics:")
    print(f"  True N/C Ratio - Mean: {np.mean(nc_ratio_true):.4f}, Std: {np.std(nc_ratio_true):.4f}")
    print(f"  Pred N/C Ratio - Mean: {np.mean(nc_ratio_pred):.4f}, Std: {np.std(nc_ratio_pred):.4f}")
    print(f"  Absolute Error - Mean: {np.mean(absolute_errors):.4f}, Std: {np.std(absolute_errors):.4f}")
    print(f"  Relative Error - Mean: {np.mean(relative_errors):.4f}, Std: {np.std(relative_errors):.4f}")
    
    # Create scatter plot
    fig, ax = plt.subplots(figsize=(4, 4))
    max_val = max(nc_ratio_pred.max(), nc_ratio_true.max()) * 1.1
    ax.scatter(nc_ratio_pred, nc_ratio_true, alpha=0.6, s=8)
    # Draw the identity line over the full plotted range
    ax.plot([0, max_val], [0, max_val], ls="--", color='red', linewidth=2, label='Perfect Prediction')
    ax.set_xlabel("Predicted N/C Ratio", fontsize=5, fontweight='bold')
    ax.set_ylabel("True N/C Ratio", fontsize=5, fontweight='bold')
    ax.set_title(f"N/C Ratio: Predicted vs. True ({dataset_name})", fontsize=5, fontweight='bold')
    ax.grid(True, alpha=0.3)
    ax.legend(fontsize=11)
    ax.set_xlim([0, max_val])
    ax.set_ylim([0, max_val])
    plt.tight_layout()
    plt.show()

Data Cleaning Functions

import numpy as np
from scipy.ndimage import label as scilabel

def count_nuclei(label_image, nucleus_value=2):
    """
    Count the number of distinct nuclei (connected components) in a labeled image.
    
    Parameters:
    -----------
    label_image : np.ndarray
        2D array where each pixel is labeled with integer values
        (0=background, 1=cytoplasm, 2=nucleus)
    nucleus_value : int
        The intensity value representing nucleus pixels (default: 2)
    
    Returns:
    --------
    num_nuclei : int
        The number of connected components with nucleus label
    component_map : np.ndarray
        2D array where each nucleus component is assigned a unique ID
        (starting from 1; background is 0)
    """
    
    # Extract only the nucleus pixels (create binary mask)
    nucleus_mask = (label_image == nucleus_value).astype(int)
    
    # Apply connected component labeling
    # Create 2-connectivity structure (8-connected neighborhood)
    from scipy.ndimage import generate_binary_structure
    structure = generate_binary_structure(2, 2)
    component_map, num_nuclei = scilabel(nucleus_mask, structure=structure)
    
    return num_nuclei, component_map

def analyze_nucleus_distribution(labels_list):
    """
    Analyze the distribution of nucleus counts across the entire dataset.
    
    Parameters:
    -----------
    labels_list : list of np.ndarray
        List of all ground truth label images in the dataset
    
    Returns:
    --------
    nucleus_counts : np.ndarray
        Array where index i contains the count of images with i nuclei
    image_nucleus_info : list of tuples
        List of (image_index, num_nuclei) for each image
    """
    
    image_nucleus_info = []
    max_nuclei = 0
    
    for idx, label_image in enumerate(labels_list):
        num_nuclei, _ = count_nuclei(label_image, nucleus_value=2)
        image_nucleus_info.append((idx, num_nuclei))
        max_nuclei = max(max_nuclei, num_nuclei)
    
    # Create histogram
    nucleus_counts = np.zeros(max_nuclei + 1, dtype=int)
    for idx, num_nuclei in image_nucleus_info:
        nucleus_counts[num_nuclei] += 1
    
    return nucleus_counts, image_nucleus_info

def filter_images_by_nucleus_count(images, labels, min_nuclei=1, max_nuclei=1):
    """
    Filter images to keep only those with an acceptable number of nuclei.
    
    Parameters:
    -----------
    images : list of np.ndarray
        List of raw cell images
    labels : list of np.ndarray
        List of corresponding label images
    min_nuclei : int
        Minimum acceptable number of nuclei (default: 1)
    max_nuclei : int
        Maximum acceptable number of nuclei (default: 1)
    
    Returns:
    --------
    filtered_images : list of np.ndarray
        Filtered image list
    filtered_labels : list of np.ndarray
        Filtered label list
    valid_indices : list of int
        Indices of images that passed the filter in original dataset
    filtering_report : dict
        Detailed statistics about the filtering
    """
    
    valid_indices = []
    excluded_indices = []
    excluded_reasons = {
        'no_nucleus': [],
        'too_few_nuclei': [],
        'too_many_nuclei': []
    }
    
    for idx, label_img in enumerate(labels):
        num_nuclei, _ = count_nuclei(label_img, nucleus_value=2)
        
        if num_nuclei == 0:
            excluded_indices.append(idx)
            excluded_reasons['no_nucleus'].append(idx)
        elif num_nuclei < min_nuclei:
            excluded_indices.append(idx)
            excluded_reasons['too_few_nuclei'].append(idx)
        elif num_nuclei > max_nuclei:
            excluded_indices.append(idx)
            excluded_reasons['too_many_nuclei'].append(idx)
        else:
            valid_indices.append(idx)
    
    filtered_images = [images[i] for i in valid_indices]
    filtered_labels = [labels[i] for i in valid_indices]
    
    filtering_report = {
        'original_count': len(images),
        'filtered_count': len(filtered_images),
        'excluded_count': len(excluded_indices),
        'no_nucleus': len(excluded_reasons['no_nucleus']),
        'too_few_nuclei': len(excluded_reasons['too_few_nuclei']),
        'too_many_nuclei': len(excluded_reasons['too_many_nuclei']),
        'exclusion_rate': len(excluded_indices) / len(images) * 100
    }
    
    return filtered_images, filtered_labels, valid_indices, filtering_report

def visualize_nucleus_distribution(nucleus_counts, labels_list=None):
    """
    Visualize the distribution of nucleus counts in the dataset.
    
    Parameters:
    -----------
    nucleus_counts : np.ndarray
        Array where index i contains count of images with i nuclei
    labels_list : list of np.ndarray, optional
        If provided, compute distribution from scratch
    """
    
    if labels_list is not None:
        nucleus_counts, _ = analyze_nucleus_distribution(labels_list)
    
    fig, ax = plt.subplots(figsize=(5,3))
    
    nuclei_indices = np.arange(len(nucleus_counts))
    colors = ['red' if i != 1 else 'green' for i in nuclei_indices]
    
    bars = ax.bar(nuclei_indices, nucleus_counts, color=colors, alpha=0.7, edgecolor='black')
    
    ax.set_xlabel('Number of Nuclei per Image', fontsize=5, fontweight='bold')
    ax.set_ylabel('Count of Images', fontsize=5, fontweight='bold')
    ax.set_title('Distribution of Nucleus Counts in Dataset', fontsize=5, fontweight='bold')
    ax.set_xticks(nuclei_indices)
    ax.grid(axis='y', alpha=0.3)
    
    # Add count labels on bars
    for bar in bars:
        height = bar.get_height()
        if height > 0:
            ax.text(bar.get_x() + bar.get_width()/2., height,
                   f'{int(height)}',
                   ha='center', va='bottom', fontweight='bold')
    
    # Add legend
    from matplotlib.patches import Patch
    legend_elements = [
        Patch(facecolor='green', alpha=0.7, edgecolor='black', label='Keep (1 nucleus)'),
        Patch(facecolor='red', alpha=0.7, edgecolor='black', label='Exclude (not 1 nucleus)')
    ]
    ax.legend(handles=legend_elements, fontsize=11, loc='upper right')
    
    plt.tight_layout()
    plt.show()
    
    # Print statistics
    print("\n" + "="*70)
    print("NUCLEUS DISTRIBUTION STATISTICS")
    print("="*70)
    print(f"{'# Nuclei':<15} {'Count':<15} {'Percentage':<15}")
    print("-"*70)
    total = np.sum(nucleus_counts)
    for i, count in enumerate(nucleus_counts):
        percentage = count / total * 100 if total > 0 else 0
        status = "[KEEP]" if i == 1 else "[EXCLUDE]"
        print(f"{i:<15} {count:<15} {percentage:<15.2f}%  {status}")
    print("="*70)

Pipeline Functions

def apply_stage4_pipeline(img, labels_true, params):
    """
    Apply the complete 4-stage preprocessing pipeline to a single image.
    
    Parameters:
    -----------
    img : np.ndarray
        RGB image (H, W, 3)
    labels_true : np.ndarray
        Ground truth segmentation labels
    params : dict
        Pipeline parameters:
        - nucleus_max, cytoplasm_min, cytoplasm_max: thresholds
        - nl_means_strength: denoising parameter h
        - unsharp_radius, unsharp_amount: edge enhancement
        - blur_sigma: Gaussian blur standard deviation
        - morph_disk_size: morphological opening kernel
        - closing_disk_size: morphological closing kernel
    
    Returns:
    --------
    segmented : np.ndarray
        Final predicted segmentation
    metrics : dict
        Calculated metrics (Dice, IoU for each class)
    """
    
    # STAGE 1: Grayscale Conversion
    img_uint8 = img.astype(np.uint8)
    img_gray = cv2.cvtColor(img_uint8, cv2.COLOR_RGB2GRAY).astype(np.float32) / 255.0
    
    # STAGE 2: Non-Local Means Denoising
    img_denoised = denoise_nl_means(
        img_gray,
        h=params['nl_means_strength'],
        fast_mode=True,
        patch_size=10,
        patch_distance=10
    )
    
    # STAGE 3: Edge Enhancement and Blur
    img_enhanced = unsharp_mask(
        img_denoised,
        radius=params['unsharp_radius'],
        amount=params['unsharp_amount']
    )
    img_blurred = gaussian_filter(img_enhanced, sigma=params['blur_sigma'])
    
    # Intensity-Based Segmentation
    nucleus_mask = (img_blurred < params['nucleus_max']).astype(int)
    cytoplasm_mask = np.logical_and(
        img_blurred >= params['cytoplasm_min'],
        img_blurred <= params['cytoplasm_max']
    ).astype(int)
    
    # STAGE 4: Morphological Operations
    nucleus_opened = binary_opening(nucleus_mask, disk(params['morph_disk_size'])).astype(int)
    cytoplasm_opened = binary_opening(cytoplasm_mask, disk(params['morph_disk_size'])).astype(int)
    
    nucleus_closed = binary_closing(nucleus_opened, disk(params['closing_disk_size'])).astype(int)
    cytoplasm_closed = binary_closing(cytoplasm_opened, disk(params['closing_disk_size'])).astype(int)
    
    # Combine into 3-class segmentation
    segmented = np.zeros_like(img_blurred, dtype=int)
    segmented[nucleus_closed == 1] = 2
    segmented[cytoplasm_closed == 1] = 1
    
    # Calculate metrics
    metrics = calculate_all_metrics(segmented, labels_true, class_labels=[0, 1, 2])
    
    return segmented, metrics

def apply_pipeline_to_dataset(images, labels, params, verbose=True):
    """
    Apply the preprocessing pipeline to all images in the dataset.
    
    Parameters:
    -----------
    images : list of np.ndarray
        List of RGB images
    labels : list of np.ndarray
        List of ground truth segmentations
    params : dict
        Pipeline parameters
    verbose : bool
        Print progress updates
    
    Returns:
    --------
    segmented_images : list of np.ndarray
        Preprocessed segmentations for all images
    all_metrics : list of dict
        Metrics for each image
    aggregate_metrics : dict
        Mean and std of metrics across dataset
    """
    
    segmented_images = []
    all_metrics = []
    
    total_images = len(images)
    
    for idx, (img, label_true) in enumerate(zip(images, labels)):
        if verbose and (idx + 1) % 10 == 0:
            print(f"Processing image {idx + 1}/{total_images}...")
        
        segmented, metrics = apply_stage4_pipeline(img, label_true, params)
        segmented_images.append(segmented)
        all_metrics.append(metrics)
    
    # Compute aggregate metrics
    aggregate_metrics = compute_aggregate_metrics(all_metrics)
    
    if verbose:
        print(f"Completed processing all {total_images} images.")
    
    return segmented_images, all_metrics, aggregate_metrics

6.5.2 Filter Data

from skopt import gp_minimize
from skopt.space import Real
from sklearn.model_selection import train_test_split
import warnings
warnings.filterwarnings('ignore')

print("✓ All Bayesian Optimization libraries loaded")

# ============================================================================
# FILTER DATA: Keep only images with exactly 1 nucleus
# ============================================================================

print("\n" + "="*70)
print("FILTERING DATA: Removing images with 0 or >1 nuclei")
print("="*70)

filtered_images, filtered_labels, valid_indices, report = filter_images_by_nucleus_count(
    images, labels, min_nuclei=1, max_nuclei=1
)

print(f"\nFiltering Report:")
print(f"  Original images: {report['original_count']}")
print(f"  Filtered images: {report['filtered_count']}")
print(f"  Excluded: {report['excluded_count']} ({report['exclusion_rate']:.1f}%)")

6.5.3 Train/Validation/Test Split

print("\n" + "="*70)
print("TRAIN/VALIDATION/TEST SPLIT (60/20/20)")
print("="*70)

# Split cleaned data
train_idx, temp_idx = train_test_split(
    np.arange(len(filtered_images)), 
    test_size=0.4, 
    random_state=42
)

val_idx, test_idx = train_test_split(
    temp_idx, 
    test_size=0.5, 
    random_state=42
)

train_images = np.array([filtered_images[i] for i in train_idx])
train_labels = np.array([filtered_labels[i] for i in train_idx])

val_images = np.array([filtered_images[i] for i in val_idx])
val_labels = np.array([filtered_labels[i] for i in val_idx])

test_images = np.array([filtered_images[i] for i in test_idx])
test_labels = np.array([filtered_labels[i] for i in test_idx])

print(f"Total cleaned images: {len(filtered_images)}")
print(f"Train set: {len(train_images)} ({len(train_images)/len(filtered_images)*100:.1f}%)")
print(f"Validation set: {len(val_images)} ({len(val_images)/len(filtered_images)*100:.1f}%)")
print(f"Test set: {len(test_images)} ({len(test_images)/len(filtered_images)*100:.1f}%)")

6.5.4 Bayesian Optimization Objective Function

# ============================================================================
# BAYESIAN OPTIMIZATION OBJECTIVE FUNCTION
# ============================================================================

def evaluate_pipeline_loss(params):
    """
    Objective function for Bayesian Optimization.
    
    Takes a list [nucleus_max, cytoplasm_max] and returns negative Dice.
    cytoplasm_min is automatically set to nucleus_max.
    
    Evaluates on TRAINING set to find optimal parameters.
    """
    nucleus_max, cytoplasm_max = params
    cytoplasm_min = nucleus_max  # Automatically set
    
    # Apply pipeline with these parameters
    pipeline_params = {
        'nucleus_max': nucleus_max,
        'cytoplasm_min': cytoplasm_min,
        'cytoplasm_max': cytoplasm_max,
        'nl_means_strength': 1.0,
        'unsharp_radius': 2.0,
        'unsharp_amount': 1.0,
        'blur_sigma': 2.0,
        'morph_disk_size': 5,
        'closing_disk_size': 8
    }
    
    segmented, _, agg_metrics = apply_pipeline_to_dataset(
        train_images, train_labels, pipeline_params, verbose=False
    )
    
    # Loss is negative mean Dice (to minimize)
    loss = -agg_metrics['Dice_Mean_mean']
    return loss

6.5.5 Callback Function for Iteration Display

# ============================================================================
# CALLBACK FUNCTION TO DISPLAY ITERATION RESULTS
# ============================================================================

iteration_count = [0]  # Use list to allow modification in nested function

def callback(result):
    """Called after each iteration to display results."""
    iteration_count[0] += 1
    current_loss = result.func_vals[-1]
    best_loss = result.fun
    current_params = result.x_iters[-1]
    best_params = result.x
    
    print(f"Iter {iteration_count[0]}: nucleus_max={current_params[0]:.4f}, "
          f"cytoplasm_max={current_params[1]:.4f}, loss={current_loss:.4f}")
    print(f"  → Best so far: nucleus_max={best_params[0]:.4f}, "
          f"cytoplasm_max={best_params[1]:.4f}, loss={best_loss:.4f}")

print("✓ Callback function defined")

6.5.6 Run Bayesian Optimization

print("\n" + "="*70)
print("STEP 1: BAYESIAN OPTIMIZATION OF PIPELINE PARAMETERS")
print("="*70)

# Define parameter space
space = [
    Real(0.2, 0.5, name='nucleus_max'),
    Real(0.5, 1.0, name='cytoplasm_max')
]

# Run Bayesian Optimization
print("\nRunning Bayesian Optimization...")
print("(This may take a few minutes)")
print("-" * 70)

opt_results = gp_minimize(
    func=evaluate_pipeline_loss,
    dimensions=space,
    n_calls=20,              # 5 initial + 15 BO iterations
    n_initial_points=5,
    acq_func='LCB',          # Lower Confidence Bound (minimizes loss)
    kappa=2.0,
    random_state=42,
    callback=callback,
    verbose=0
)

# Extract best parameters
best_nucleus_max = opt_results.x[0]
best_cytoplasm_max = opt_results.x[1]
best_cytoplasm_min = best_nucleus_max  # Automatically set
best_loss = opt_results.fun
best_dice = -best_loss

print("\n" + "="*70)
print("OPTIMIZATION COMPLETE")
print("="*70)
print(f"\nBest Parameters Found:")
print(f"  Nucleus Max:      {best_nucleus_max:.6f}")
print(f"  Cytoplasm Min:    {best_cytoplasm_min:.6f} (= Nucleus Max)")
print(f"  Cytoplasm Max:    {best_cytoplasm_max:.6f}")
print(f"  Training Loss:    {best_loss:.6f}")
print(f"  Training Dice:    {best_dice:.6f}")
print(f"\nTotal Evaluations: {len(opt_results.func_vals)}")

6.5.7 Evaluate on All Sets

# ============================================================================
# EVALUATE OPTIMIZED PARAMETERS ON ALL SETS
# ============================================================================

print("\n" + "="*70)
print("STEP 2: EVALUATING OPTIMIZED PARAMETERS ON ALL SETS")
print("="*70)

best_pipeline_params = {
    'nucleus_max': best_nucleus_max,
    'cytoplasm_min': best_cytoplasm_min,
    'cytoplasm_max': best_cytoplasm_max,
    'nl_means_strength': 1.0,
    'unsharp_radius': 2.0,
    'unsharp_amount': 1.0,
    'blur_sigma': 2.0,
    'morph_disk_size': 5,
    'closing_disk_size': 8
}

print("\nApplying optimized parameters to training set...")
train_segmented, train_metrics, train_agg = apply_pipeline_to_dataset(
    train_images, train_labels, best_pipeline_params, verbose=False
)

print("Applying optimized parameters to validation set...")
val_segmented, val_metrics, val_agg = apply_pipeline_to_dataset(
    val_images, val_labels, best_pipeline_params, verbose=False
)

print("Applying optimized parameters to test set...")
test_segmented, test_metrics, test_agg = apply_pipeline_to_dataset(
    test_images, test_labels, best_pipeline_params, verbose=False
)

print("✓ Evaluation complete")

print("\n" + "="*70)
print("RESULTS WITH OPTIMIZED PARAMETERS")
print("="*70)

print("\nTRAINING SET:")
for class_name in ['Background', 'Cytoplasm', 'Nucleus']:
    dice_mean = train_agg[f'Dice_{class_name}_mean']
    dice_std = train_agg[f'Dice_{class_name}_std']
    print(f"  Dice_{class_name}: {dice_mean:.4f} ± {dice_std:.4f}")
print(f"  Dice_Mean: {train_agg['Dice_Mean_mean']:.4f}")

print("\nVALIDATION SET:")
for class_name in ['Background', 'Cytoplasm', 'Nucleus']:
    dice_mean = val_agg[f'Dice_{class_name}_mean']
    dice_std = val_agg[f'Dice_{class_name}_std']
    print(f"  Dice_{class_name}: {dice_mean:.4f} ± {dice_std:.4f}")
print(f"  Dice_Mean: {val_agg['Dice_Mean_mean']:.4f}")

print("\nTEST SET:")
for class_name in ['Background', 'Cytoplasm', 'Nucleus']:
    dice_mean = test_agg[f'Dice_{class_name}_mean']
    dice_std = test_agg[f'Dice_{class_name}_std']
    print(f"  Dice_{class_name}: {dice_mean:.4f} ± {dice_std:.4f}")
print(f"  Dice_Mean: {test_agg['Dice_Mean_mean']:.4f}")

6.6 Visualization and Evaluation

6.6.1 Convergence Plot

import matplotlib.pyplot as plt

print("\n" + "="*70)
print("STEP 3: VISUALIZATION AND RESULTS")
print("="*70)

# Create convergence plot
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Plot 1: Loss convergence
axes[0].plot(opt_results.func_vals, 'o-', linewidth=2, markersize=6, color='steelblue')
axes[0].axhline(opt_results.fun, color='red', linestyle='--', linewidth=2, label='Best loss')
axes[0].set_xlabel('Iteration', fontsize=12, fontweight='bold')
axes[0].set_ylabel('Loss (Negative Dice)', fontsize=12, fontweight='bold')
axes[0].set_title('Bayesian Optimization Convergence', fontsize=12, fontweight='bold')
axes[0].grid(True, alpha=0.3)
axes[0].legend()

# Plot 2: Parameter trajectory
nucleus_max_vals = [opt_results.x_iters[i][0] for i in range(len(opt_results.x_iters))]
cytoplasm_max_vals = [opt_results.x_iters[i][1] for i in range(len(opt_results.x_iters))]

axes[1].scatter(nucleus_max_vals, cytoplasm_max_vals, c=range(len(nucleus_max_vals)), 
               cmap='viridis', s=100, alpha=0.7, edgecolors='black', linewidth=1)
axes[1].scatter(best_nucleus_max, best_cytoplasm_max, color='red', s=300, marker='*', 
               edgecolors='black', linewidth=2, label='Best', zorder=5)
axes[1].set_xlabel('Nucleus Max', fontsize=12, fontweight='bold')
axes[1].set_ylabel('Cytoplasm Max', fontsize=12, fontweight='bold')
axes[1].set_title('Parameter Search Trajectory', fontsize=12, fontweight='bold')
axes[1].grid(True, alpha=0.3)
axes[1].legend()
cbar = plt.colorbar(axes[1].collections[0], ax=axes[1])
cbar.set_label('Iteration', fontweight='bold')

plt.tight_layout()
plt.savefig('images/chapter6/chapter6_optimization_convergence.png', dpi=150, bbox_inches='tight')
plt.show()

print("✓ Convergence plot saved")

6.6.2 Metrics Comparison

# Create comprehensive metrics comparison
fig, axes = plt.subplots(3, 1, figsize=(12, 10))

datasets = ['Train', 'Validation', 'Test']
agg_metrics = [train_agg, val_agg, test_agg]
colors = ['#2E86AB', '#A23B72', '#F18F01']

# Plot 1: Per-class Dice
class_names = ['Background', 'Cytoplasm', 'Nucleus']
for i, (dataset_name, agg, color) in enumerate(zip(datasets, agg_metrics, colors)):
    dice_values = [agg[f'Dice_{cn}_mean'] for cn in class_names]
    dice_stds = [agg[f'Dice_{cn}_std'] for cn in class_names]
    axes[0].errorbar([j + i*0.25 for j in range(3)], dice_values, yerr=dice_stds, 
                     label=dataset_name, marker='o', capsize=5, color=color, linewidth=2)

axes[0].set_ylabel('Dice Score', fontsize=11, fontweight='bold')
axes[0].set_title('Per-Class Dice Scores with Optimized Parameters', fontsize=12, fontweight='bold')
axes[0].set_xticks([0.25, 1.25, 2.25])
axes[0].set_xticklabels(class_names)
axes[0].legend()
axes[0].grid(True, alpha=0.3, axis='y')
axes[0].set_ylim([0, 1])

# Plot 2: Mean Dice comparison
mean_dices = [agg['Dice_Mean_mean'] for agg in agg_metrics]
mean_stds = [agg['Dice_Mean_std'] for agg in agg_metrics]
bars = axes[1].bar(datasets, mean_dices, yerr=mean_stds, capsize=10, 
                   color=colors, edgecolor='black', linewidth=2, alpha=0.8)
axes[1].set_ylabel('Mean Dice Score', fontsize=11, fontweight='bold')
axes[1].set_title('Overall Performance Across Train/Val/Test Sets', fontsize=12, fontweight='bold')
axes[1].set_ylim([0, 1])
axes[1].grid(True, alpha=0.3, axis='y')

# Add value labels on bars
for bar, val in zip(bars, mean_dices):
    height = bar.get_height()
    axes[1].text(bar.get_x() + bar.get_width()/2., height,
                f'{val:.4f}', ha='center', va='bottom', fontweight='bold')

# Plot 3: Generalization gap
train_mean = train_agg['Dice_Mean_mean']
val_mean = val_agg['Dice_Mean_mean']
test_mean = test_agg['Dice_Mean_mean']

gaps = [0, train_mean - val_mean, train_mean - test_mean]
gap_labels = ['Baseline', 'Train→Val', 'Train→Test']
axes[2].bar(gap_labels, gaps, color=['gray', '#FF6B6B', '#FF6B6B'], edgecolor='black', linewidth=2)
axes[2].set_ylabel('Dice Difference', fontsize=11, fontweight='bold')
axes[2].set_title('Generalization Gap (Overfitting Check)', fontsize=12, fontweight='bold')
axes[2].axhline(0, color='black', linestyle='-', linewidth=1)
axes[2].grid(True, alpha=0.3, axis='y')

# Add value labels
for i, (label, gap) in enumerate(zip(gap_labels, gaps)):
    axes[2].text(i, gap, f'{gap:.4f}', ha='center', va='bottom' if gap >= 0 else 'top', fontweight='bold')

plt.tight_layout()
plt.savefig('images/chapter6/chapter6_metrics_comparison.png', dpi=150, bbox_inches='tight')
plt.show()

print("✓ Metrics comparison saved")

6.6.3 Summary Statistics

print("\n" + "="*70)
print("GENERALIZATION ANALYSIS")
print("="*70)

print("\nTraining Performance:")
print(f"  Mean Dice: {train_agg['Dice_Mean_mean']:.4f} ± {train_agg['Dice_Mean_std']:.4f}")

print("\nValidation Performance:")
print(f"  Mean Dice: {val_agg['Dice_Mean_mean']:.4f} ± {val_agg['Dice_Mean_std']:.4f}")
print(f"  Drop from training: {(train_agg['Dice_Mean_mean'] - val_agg['Dice_Mean_mean']):.4f}")

print("\nTest Performance:")
print(f"  Mean Dice: {test_agg['Dice_Mean_mean']:.4f} ± {test_agg['Dice_Mean_std']:.4f}")
print(f"  Drop from training: {(train_agg['Dice_Mean_mean'] - test_agg['Dice_Mean_mean']):.4f}")

if abs(train_agg['Dice_Mean_mean'] - val_agg['Dice_Mean_mean']) < 0.05:
    print("\n✓ GOOD: Validation performance is close to training (minimal overfitting)")
else:
    print("\n⚠ WARNING: Large gap between training and validation (possible overfitting)")

if abs(val_agg['Dice_Mean_mean'] - test_agg['Dice_Mean_mean']) < 0.05:
    print("✓ GOOD: Test performance matches validation (parameters generalize)")
else:
    print("⚠ WARNING: Test performance differs from validation")

6.6.4 Testing the Optimized Parameters

Let’s now test the optimized thresholds, together with the fixed pre-parameters, on the entire dataset just as we did in Chapter 4, and see what has improved in the histogram of Dice coefficients.

print("\nFiltering dataset...")
filtered_images, filtered_labels, valid_indices, report = filter_images_by_nucleus_count(
    images, 
    labels, 
    min_nuclei=1, 
    max_nuclei=1
)

# Example workflow
print("Applying pipeline to entire dataset...")

# Define parameters: optimized thresholds from above, pre-parameters fixed as in Chapter 4
pipeline_params = {
    'nucleus_max':  0.417168,
    'cytoplasm_min': 0.417168,
    'cytoplasm_max': 0.915328,
    'nl_means_strength': 1,
    'unsharp_radius': 2.0,
    'unsharp_amount': 1.0,
    'blur_sigma': 2,
    'morph_disk_size': 5,
    'closing_disk_size': 8
}

# Apply pipeline; returns segmented_images, all_metrics, aggregate_metrics
segmented_images, all_metrics, aggregate_metrics = apply_pipeline_to_dataset(
    filtered_images, 
    filtered_labels, 
    pipeline_params,
    verbose=True
)

print_aggregate_metrics(aggregate_metrics)
visualize_metric_distributions(all_metrics)
analyze_nc_ratios(filtered_labels, segmented_images)

6.7 Final Summary

print("\n" + "="*70)
print("FINAL SUMMARY")
print("="*70)

print(f"\n✓ OPTIMIZED PARAMETERS:")
print(f"  Nucleus Max:       {best_nucleus_max:.6f}")
print(f"  Cytoplasm Min:     {best_cytoplasm_min:.6f} (= Nucleus Max)")
print(f"  Cytoplasm Max:     {best_cytoplasm_max:.6f}")

print(f"\n✓ PERFORMANCE ACROSS SETS:")
print(f"  Training Dice:     {train_agg['Dice_Mean_mean']:.6f}")
print(f"  Validation Dice:   {val_agg['Dice_Mean_mean']:.6f}")
print(f"  Test Dice:         {test_agg['Dice_Mean_mean']:.6f}")

print(f"\n✓ GENERALIZATION:")
train_val_gap = abs(train_agg['Dice_Mean_mean'] - val_agg['Dice_Mean_mean'])
train_test_gap = abs(train_agg['Dice_Mean_mean'] - test_agg['Dice_Mean_mean'])
print(f"  Train→Validation gap: {train_val_gap:.6f}")
print(f"  Train→Test gap:       {train_test_gap:.6f}")

if train_val_gap < 0.05 and train_test_gap < 0.05:
    print(f"\n✓ EXCELLENT: Model generalizes well across all sets!")
elif train_val_gap < 0.10 and train_test_gap < 0.10:
    print(f"\n✓ GOOD: Acceptable generalization")
else:
    print(f"\n⚠ WARNING: Potential overfitting detected")

print(f"\n✓ EFFICIENCY:")
print(f"  Total Evaluations: {len(opt_results.func_vals)}")
print(f"  Grid Equivalence:  ~400 (20×20 grid)")
print(f"  Speedup:           {400/len(opt_results.func_vals):.1f}x")

print(f"\n✓ VISUALIZATIONS SAVED:")
print(f"  - chapter6_optimization_convergence.png")
print(f"  - chapter6_metrics_comparison.png")

print("\n" + "="*70)
print("Chapter 6 Complete!")
print("="*70)

6.8 Key Takeaways

6.8.1 Why Bayesian Optimization Wins

  1. Efficiency: 20 evaluations vs. ~400 for grid search (20x speedup)
  2. Principled: Uses Gaussian Processes for probabilistic modeling
  3. Intelligent: Balances exploration vs. exploitation via acquisition functions
  4. Practical: Works in minutes on standard hardware
  5. Scalable: Handles 2-5 parameters efficiently (grid search becomes prohibitive)
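
To make point 3 concrete, here is a minimal NumPy sketch (not the chapter's pipeline code) of how a Lower Confidence Bound acquisition trades off a Gaussian Process's posterior mean (exploitation) against its posterior standard deviation (exploration). The RBF kernel, length scale, and toy observations below are illustrative assumptions:

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.1):
    # Squared-exponential kernel between two sets of 1-D points
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_cand, noise=1e-6, length_scale=0.1):
    # Standard zero-mean GP regression posterior at candidate points
    K = rbf_kernel(x_obs, x_obs, length_scale) + noise * np.eye(len(x_obs))
    Ks = rbf_kernel(x_obs, x_cand, length_scale)
    Kss = rbf_kernel(x_cand, x_cand, length_scale)
    K_inv = np.linalg.inv(K)
    mu = Ks.T @ K_inv @ y_obs
    cov = Kss - Ks.T @ K_inv @ Ks
    sigma = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    return mu, sigma

def lcb_pick(x_obs, y_obs, x_cand, kappa=2.0):
    # LCB = mu - kappa * sigma: low predicted loss (exploit)
    # combined with high uncertainty (explore)
    mu, sigma = gp_posterior(x_obs, y_obs, x_cand)
    return x_cand[np.argmin(mu - kappa * sigma)]

# Toy 1-D loss observed at three points
x_obs = np.array([0.1, 0.5, 0.9])
y_obs = (x_obs - 0.4) ** 2
x_cand = np.linspace(0.0, 1.0, 101)

next_x = lcb_pick(x_obs, y_obs, x_cand, kappa=2.0)
# With kappa=2 the pick favours candidates far from the observed points,
# where posterior uncertainty is large.
```

Uncertainty collapses at observed points and grows with distance from them, which is exactly what makes the second term of the LCB drive exploration.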

6.8.2 The Optimization Framework

We used gp_minimize() from scikit-optimize with:

  - Parameter space: nucleus_max ∈ [0.2, 0.5], cytoplasm_max ∈ [0.5, 1.0]
  - Objective function: Negative mean Dice (loss to minimize)
  - Acquisition function: Lower Confidence Bound (LCB) for intelligent sampling
  - Initial points: 5 random samples + 15 Bayesian iterations

6.8.3 Validation & Generalization

The complete workflow:

  1. Optimize on training set - find parameters that maximize training performance
  2. Validate on validation set - verify parameters generalize to new data
  3. Test on test set - final held-out evaluation

Small gaps between training→validation→test metrics indicate good generalization.
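
The gap check used earlier in the chapter can be packaged as a small helper; the function name generalization_report is our own, but the 0.05 tolerance mirrors the chapter's threshold:

```python
def generalization_report(train_dice, val_dice, test_dice, tol=0.05):
    """Summarize train/val/test Dice gaps and flag potential overfitting."""
    train_val_gap = train_dice - val_dice
    train_test_gap = train_dice - test_dice
    return {
        'train_val_gap': train_val_gap,
        'train_test_gap': train_test_gap,
        # Small absolute gaps on both held-out sets => parameters generalize
        'generalizes': abs(train_val_gap) < tol and abs(train_test_gap) < tol,
    }

# Hypothetical mean Dice scores for illustration
report = generalization_report(0.90, 0.88, 0.87)
```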

6.8.4 Next Steps

Use the optimized parameters in Chapter 7 to refine your segmentation model with proven threshold values.


6.9 Exercises

Exercise 6.1: Modify the acquisition function to ‘EI’ (Expected Improvement) instead of ‘LCB’. How does the search trajectory change?

Exercise 6.2: Expand the parameter bounds: nucleus_max ∈ [0.1, 0.6] and cytoplasm_max ∈ [0.4, 1.0]. Do you find better parameters?

Exercise 6.3: Create a 2D heatmap showing Dice as a function of both parameters (sample ~50 points). Compare with BO trajectory.

Exercise 6.4: Add a third parameter (e.g., nl_means_strength) and re-run optimization. How many additional evaluations are needed?

Exercise 6.5: Compare validation Dice with test Dice for all three classes. Is the generalization gap similar for all classes?

Exercise 6.6: Reduce n_calls to 10 total (3 initial + 7 BO). How does performance suffer compared to 20 evaluations?
