Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

  • Image segmentation is a fundamental task in image analysis and clinical practice.
  • Current state-of-the-art techniques are based on U-shape type encoder-decoder networks with skip connections (U-Net).
  • U-Net has limitations such as hard coding of the receptive field size, not accounting for inherent noise in the data, problems associated with discrete layers, and no theoretical underpinning.
  • Continuous U-Net is a novel family of networks for image segmentation that introduces dynamic blocks modelled by second order ordinary differential equations.

Paper Content

Introduction

  • Image segmentation is a fundamental task in medical image analysis and clinical practice
  • Manual segmentation is time-consuming and costly
  • Deep learning and FCNNs have enabled automatic segmentation
  • U-Net is a popular model for biomedical image segmentation
  • U-Net has impressive performance but suffers from limitations
  • Continuous U-Net is proposed to address these limitations
  • Continuous U-Net is faster, more robust, less sensitive to noise, and has a theoretical underpinning
  • Continuous U-Net outperforms existing U-type blocks and competes with other mechanisms
  • Image segmentation is widely applied to medical data
  • State-of-the-art techniques rely on U-type architectures
  • These existing techniques are reviewed in the sections that follow

U-type nets: a block based perspective

  • U-Net is a model for biomedical image segmentation
  • U-Net consists of four components: neural network layers, downsampling, upsampling and concatenation operations
  • Different U-type variants use different types of blocks
  • ResUNet uses residual blocks
  • DenseUNet uses densely connected blocks
  • Inception blocks use convolutions with varying kernel sizes
  • Pyramid pooling blocks use pooling and convolutions before upsampling
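
The block-based view above can be sketched in a few lines. This is a toy illustration rather than the paper's implementation: feature maps are plain lists of 1-D channels, `toy_block` is a hypothetical stand-in for whichever block type a variant plugs in (residual, dense, inception, and so on), and max pooling and nearest-neighbour upsampling are assumed choices for the resampling steps.

```python
import math
from typing import Callable, List

FeatureMap = List[List[float]]  # channels x length
Block = Callable[[FeatureMap], FeatureMap]


def downsample(fm: FeatureMap) -> FeatureMap:
    """Halve the length of every channel with max pooling over pairs."""
    return [[max(ch[i], ch[i + 1]) for i in range(0, len(ch) - 1, 2)] for ch in fm]


def upsample(fm: FeatureMap) -> FeatureMap:
    """Double the length of every channel by repeating each sample."""
    return [[v for v in ch for _ in range(2)] for ch in fm]


def concatenate(skip: FeatureMap, up: FeatureMap) -> FeatureMap:
    """Stack the skip-connection channels and the upsampled channels."""
    return skip + up


def toy_block(fm: FeatureMap) -> FeatureMap:
    """Hypothetical block: average the channels, then apply tanh."""
    length = len(fm[0])
    mixed = [sum(ch[i] for ch in fm) / len(fm) for i in range(length)]
    return [[math.tanh(v) for v in mixed]]


def u_shape(x: FeatureMap, block: Block, depth: int) -> FeatureMap:
    """Encoder, bottleneck and decoder with skip connections."""
    skips = []
    for _ in range(depth):           # encoder path
        x = block(x)
        skips.append(x)
        x = downsample(x)
    x = block(x)                     # bottleneck
    for skip in reversed(skips):     # decoder path
        x = upsample(x)
        x = concatenate(skip, x)
        x = block(x)
    return x
```

Swapping `toy_block` for another callable changes the variant while the U-shaped wiring stays the same. The input length must be divisible by 2 to the power of `depth` for the skip connections to line up.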

U-type nets with additional mechanisms

  • Zhang et al. introduced ResUNet, which uses residual blocks, atrous convolutions, pyramid scene parsing pooling, and multi-task inference
  • Attention U-Net uses attention gates to filter features before concatenating the upsampled input and the skip connection
  • DynUNet combines the heuristic rules and settings from nnU-Net with the optimisation scheme of Futrega et al.
  • TransUNet combines a transformer encoder, with a self-attention mechanism, and a classical convolutional neural network decoder
  • UNeXt is a convolutional MLP based network, which uses tokenised MLP blocks with axial shifts

Existing techniques & comparison to ours

  • Continuous U-Net and existing U-type nets are compared in Table 1
  • Existing U-type networks discretise the solution layer by layer, leading to high computational cost
  • Continuous architecture can be solved via the adjoint method, resulting in constant memory cost
  • Dynamic blocks are modelled by second-order neural ODEs and are not restricted to homeomorphic transformations
  • Dynamic blocks are at least twice continuously differentiable, which makes them robust to noise
  • Continuous U-Net does not use any additional mechanism

Proposed technique

  • Proposed continuous U-Net with dynamic blocks
  • Modeling network dynamics using second order Neural ODEs
  • Quality measure for tailor-made segmentation tasks
  • Robustness and noise properties of network

Unboxing continuous U-Net

  • Continuous U-Net is a continuous approach that avoids computing predictions layer-by-layer.
  • It uses higher order neural ODEs to learn complex flows.
  • Dynamic blocks are based on second order neural ODEs.
  • They can be solved by using the first order adjoint method.
  • This means that only a single point is needed to reconstruct the entire trajectory.
  • This offers a constant memory cost.
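
A minimal sketch of two of the ideas above, under stated assumptions: a second-order ODE x'' = a(t, x, v) is rewritten as a first-order system in (x, v), and the trajectory is recovered from its end point alone by integrating backwards in time. The acceleration `damped_spring` is a hypothetical stand-in for the learned network, and the full adjoint method also propagates gradient information backwards, which is omitted here.

```python
def second_order_as_system(accel):
    """Turn x'' = accel(t, x, v) into the system (x, v)' = (v, accel)."""
    def rhs(t, state):
        x, v = state
        return [v, accel(t, x, v)]
    return rhs


def rk4_solve(rhs, state, t0, t1, steps):
    """Integrate from t0 to t1 with RK4, keeping only the current state.

    The step size is negative when t1 < t0, which integrates backwards.
    """
    h = (t1 - t0) / steps
    t, y = t0, list(state)
    for _ in range(steps):
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, [a + h / 2 * b for a, b in zip(y, k1)])
        k3 = rhs(t + h / 2, [a + h / 2 * b for a, b in zip(y, k2)])
        k4 = rhs(t + h, [a + h * b for a, b in zip(y, k3)])
        y = [a + h / 6 * (p + 2 * q + 2 * r + s)
             for a, p, q, r, s in zip(y, k1, k2, k3, k4)]
        t += h
    return y


def damped_spring(t, x, v):
    """Hypothetical acceleration; a trained network would take this role."""
    return -x - 0.1 * v


rhs = second_order_as_system(damped_spring)
start = [1.0, 0.0]                              # initial position and velocity
end = rk4_solve(rhs, start, 0.0, 1.0, 200)      # forward pass, no history stored
recovered = rk4_solve(rhs, end, 1.0, 0.0, 200)  # backward pass from the end point
```

Because no intermediate states are stored in either pass, memory use does not grow with the number of solver steps. The recovered start matches the original only up to the solver's numerical error.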

Opening the ODE-solver box

  • Neural ODEs require a choice of ODE solver
  • Qualitative measures are derived to help with the choice of ODE solver
  • Global Error (GE) is used to measure the accuracy of the numerical solution
  • GE depends on the step size (h): for a method of order p it scales as h^p, so it is proportional to h for a first order method
  • Higher order methods converge faster
  • Stability and consistency are required for convergence
  • Euler’s method is simple and convergent, but only first order accurate
  • Linear Multistep Methods (LMM) have a higher convergence rate than Euler’s method
  • LMM require both consistency and zero-stability for convergence
  • Runge-Kutta (RK) methods are one-step methods and very stable
  • RK4 is found to be the best solver for continuous U-Net for segmentation
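
The link between solver order and global error can be checked numerically. The sketch below is an illustration on an assumed test problem, y' = -y with y(0) = 1, and is not taken from the paper. It compares Euler's method, the two-step Adams-Bashforth method (one example of an LMM) and RK4, and estimates each method's order from how the global error at t = 1 shrinks when the step size is halved.

```python
import math


def decay(t, y):
    """Assumed test problem y' = -y, with exact solution exp(-t)."""
    return -y


def euler(f, y0, t1, n):
    h = t1 / n
    t, y = 0.0, y0
    for _ in range(n):
        y += h * f(t, y)
        t += h
    return y


def adams_bashforth2(f, y0, t1, n):
    """Two-step explicit LMM; the first step uses the midpoint rule."""
    h = t1 / n
    t, y = 0.0, y0
    f_prev = f(t, y)
    y = y + h * f(t + h / 2, y + h / 2 * f_prev)  # bootstrap step
    t += h
    for _ in range(n - 1):
        f_curr = f(t, y)
        y = y + h * (1.5 * f_curr - 0.5 * f_prev)
        f_prev = f_curr
        t += h
    return y


def rk4(f, y0, t1, n):
    h = t1 / n
    t, y = 0.0, y0
    for _ in range(n):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h / 2 * k1)
        k3 = f(t + h / 2, y + h / 2 * k2)
        k4 = f(t + h, y + h * k3)
        y += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return y


def observed_order(solver, n=20):
    """Estimate the convergence order from the errors at n and 2n steps."""
    exact = math.exp(-1.0)
    err_coarse = abs(solver(decay, 1.0, 1.0, n) - exact)
    err_fine = abs(solver(decay, 1.0, 1.0, 2 * n) - exact)
    return math.log2(err_coarse / err_fine)
```

On this problem `observed_order` should come out close to 1 for `euler`, 2 for `adams_bashforth2` and 4 for `rk4`, matching the statement that higher order methods converge faster.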

Continuous U-Net: greater and noiseless

  • Second order Neural ODEs are more robust than CNNs
  • Trajectories of second order ODEs do not intersect, so the output stays within a bounded range
  • Second order ODEs are smoother and less sensitive to noise than first order ODEs
  • Second order ODEs can learn smooth homeomorphisms and better capture the nature of segmentation
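
The non-intersection argument can be illustrated in the simplest setting, a scalar first-order ODE; the same uniqueness argument applies to a second-order system once it is written in (position, velocity) space. The dynamics x' = sin(x) are an assumption chosen for this sketch because their Lipschitz constant is 1, which gives the classical bound that a perturbation of the input grows by a factor of at most e over unit time.

```python
import math


def flow(x0, t1=1.0, steps=200):
    """Integrate the assumed scalar ODE x' = sin(x) from 0 to t1 with RK4."""
    h = t1 / steps
    x = x0
    for _ in range(steps):
        k1 = math.sin(x)
        k2 = math.sin(x + h / 2 * k1)
        k3 = math.sin(x + h / 2 * k2)
        k4 = math.sin(x + h * k3)
        x += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return x


def perturbation_growth(x0, noise, t1=1.0):
    """Distance between the outputs of a clean and a noisy input."""
    return abs(flow(x0 + noise, t1) - flow(x0, t1))


clean = flow(1.0)
lower, upper = flow(0.9), flow(1.1)  # outputs of the neighbouring trajectories
```

Any input between 0.9 and 1.1 yields an output between `lower` and `upper`, because its trajectory cannot cross its neighbours. This is the sense in which the output of a noisy input is bounded by a range.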

Experiments

  • Experiments were conducted to validate the continuous U-Net model
  • The experimental results support the effectiveness of the model

Data description & evaluation protocol

  • Evaluated continuous U-Net using six medical imaging datasets
  • Datasets varied in image sizes, segmentation masks and dataset sizes
  • Evaluated performance using Dice score, accuracy and Hausdorff distance
  • Used shared code-base for all experiments
  • Used learning rate of 1 × 10⁻³, step-based learning rate scheduler, RK4 solver, batch size of 16 and trained networks for 500 epochs
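
For reference, the three evaluation metrics can be sketched for binary masks as follows. These are common textbook definitions and an assumption on our part: the summary does not say which exact variants the paper uses (for example a percentile-based Hausdorff distance), and the function names and the smoothing term `eps` are hypothetical.

```python
import math
from typing import List, Tuple

Mask = List[List[int]]  # rows of 0/1 pixel labels


def dice_score(pred: Mask, target: Mask, eps: float = 1e-7) -> float:
    """Dice = 2 |P and T| / (|P| + |T|), smoothed by eps to avoid 0/0."""
    inter = sum(p * t for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    total = sum(map(sum, pred)) + sum(map(sum, target))
    return (2 * inter + eps) / (total + eps)


def pixel_accuracy(pred: Mask, target: Mask) -> float:
    """Fraction of pixels whose predicted label matches the target."""
    pairs = [(p, t) for pr, tr in zip(pred, target) for p, t in zip(pr, tr)]
    return sum(p == t for p, t in pairs) / len(pairs)


def _foreground(mask: Mask) -> List[Tuple[int, int]]:
    return [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]


def hausdorff_distance(pred: Mask, target: Mask) -> float:
    """Symmetric Hausdorff distance between the foreground pixel sets."""
    a, b = _foreground(pred), _foreground(target)
    if not a or not b:
        return math.inf  # undefined for an empty mask; inf is a convention

    def directed(src, dst):
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(a, b), directed(b, a))


pred = [[0, 1, 1],
        [0, 1, 0],
        [0, 0, 0]]
target = [[0, 1, 0],
          [0, 1, 0],
          [0, 1, 0]]
```

For this small example the two masks share 2 of their 3 foreground pixels each, so the Dice score is 2/3, the accuracy is 7/9, and the Hausdorff distance is 1 pixel.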

Results & discussion

  • Our work is a stand-alone continuous network
  • We achieve significant improvement over most SOTA techniques
  • Our dynamic blocks report a stable and high performance across all datasets
  • Our continuous U-Net requires fewer iterations for the solution than other existing U-type blocks
  • Our continuous U-Net is less sensitive to noise than other CNN-based U-Net architectures
  • Runge-Kutta method outperforms other ODE solvers for segmentation tasks
  • Our continuous U-Net is able to outperform at least two methods per dataset

Conclusion

  • Continuous U-Net is a network modelled by dynamic blocks using second order neural ODEs
  • Continuous U-Net outperforms existing U-Net blocks on six benchmarking datasets
  • Continuous U-Net competes with or outperforms well-known U-Net architectures with additional mechanisms
  • Continuous U-Net is less sensitive to noise than other block types
  • Fourth-order Runge-Kutta (RK4) solver performs best for BUCSI dataset
  • Implicit Adams-Bashforth-Moulton (ABM) solver performs best for GlaS dataset
  • Continuous U-Net performs as well as or better than state-of-the-art U-Net architectures with additional mechanisms
  • Results are displayed in figures 2, 3 and 4 and tables 7, 8 and 9