Link to paper
The full paper is available here.
You can also find the paper on PapersWithCode here.
Abstract
- Image segmentation is a fundamental task in image analysis and clinical practice.
- Current state-of-the-art techniques are based on U-shape type encoder-decoder networks with skip connections (U-Net).
- U-Net has limitations such as hard coding of the receptive field size, not accounting for inherent noise in the data, problems associated with discrete layers, and no theoretical underpinning.
- Continuous U-Net is a novel family of networks for image segmentation that introduces dynamic blocks modelled by second order ordinary differential equations.
Paper Content
Introduction
- Image segmentation is a fundamental task in medical image analysis and clinical practice
- Manual segmentation is time-consuming and costly
- Deep learning and FCNNs have enabled automatic segmentation
- U-Net is a popular model for biomedical image segmentation
- U-Net has impressive performance but suffers from limitations
- Continuous U-Net is proposed to address these limitations
- Continuous U-Net is faster, more robust, noiseless and has underpinning theory
- Continuous U-Net outperforms existing U-type blocks and competes with other mechanisms
Related work
- Image segmentation is a task used in medical data
- State-of-the-art techniques rely on U-type architectures
- Existing techniques are reviewed in this section
U-type nets: a block based perspective
- U-Net is a model for biomedical image segmentation
- U-Net consists of four components: neural network layers, downsampling, upsampling and concatenation operations
- Different U-type variants use different types of blocks
- ResUNet uses residual blocks
- DenseUNet uses densely connected blocks
- Inception blocks use convolutions with varying kernel sizes
- Pyramid pooling blocks use pooling and convolutions before upsampling
U-type nets with additional mechanisms
- Zhang et al. introduced Re-sUNet, which uses residual blocks, atrous convolutions, pyramid scene parse pooling, and multi-task inference
- Attention U-Net uses attention gates to filter features before concatenating the upsampled input and the skip connection
- DynUNet combines heuristic rules and setting from nnU-Net and the optimisation scheme of Futrega et al.
- TransUNet combines a transformer encoder, with a self-attention mechanism, and a classical convolutional neural network decoder
- UNeXt is a convolutional MLP based network, which uses tokenised MLP blocks with axial shifts
Existing techniques & comparison to ours
- Continuous U-Net and existing U-Type nets compared in Table 1
- Existing U-type networks discretise the solution layer by layer, leading to high computational cost
- Continuous architecture can be solved via the adjoint method, resulting in constant memory cost
- Dynamic blocks modelled by second-order neural ODEs, not restricted to homomorphic transformations
- Dynamic blocks are at least twice continuously differentiable, robust to noise
- Continuous U-Net does not use any additional mechanism
Proposed technique
- Proposed continuous U-Net with dynamic blocks
- Modeling network dynamics using second order Neural ODEs
- Quality measure for tailor-made segmentation tasks
- Robustness and noise properties of network
Unboxing continuous u-net
- Continuous U-Net is a continuous approach that avoids computing predictions layer-by-layer.
- It uses higher order neural ODEs to learn complex flows.
- Dynamic blocks are based on second order neural ODEs.
- They can be solved by using the first order adjoint method.
- This means that only a single point is needed to reconstruct the entire trajectory.
- This offers a constant memory cost.
Opening the ode-solver box
- Neural ODEs require a choice of ODE solver
- Qualitative measures are derived to help with the choice of ODE solver
- Global Error (GE) is used to measure the accuracy of the numerical solution
- GE is proportional to the step size (h)
- Higher order methods converge faster
- Stability and consistency are required for convergence
- Euler’s method is simple and always convergent, but not accurate
- Linear Multistep Method (LMM) has higher convergent rate than Euler’s
- LMM requires both consistency and zero-stability for convergence
- Runge-Kutta (RK) method is one-step and very stable
- RK4 is found to be the best solver for continuous U-Net for segmentation
Continuous u-net: greater and noiseless
- Second order Neural ODEs are more robust than CNNs
- Second order ODEs do not intersect, leading to output being bounded by a range
- Second order ODEs are smoother and less sensitive to noise than first order ODEs
- Second order ODEs can learn smooth homeomorphisms and better capture the nature of segmentation
Experiments
- Conducted experiments to validate continuous U-Net model
- Results of experiments showed model was successful
Data description & evaluation protocol
- Evaluated continuous U-Net using six medical imaging datasets
- Datasets varied in image sizes, segmentation masks and dataset sizes
- Evaluated performance using Dice score, accuracy and Hausdorff distance
- Used shared code-base for all experiments
- Used learning rate of 1 x 10-3, step-based learning rate scheduler, RK4 solver, batch size of 16 and trained networks for 500 epochs
Results & discussion
- Our work is a stand-alone continuous network
- We achieve significant improvement over most SOTA techniques
- Our dynamic blocks report a stable and high performance across all datasets
- Our continuous U-Net requires fewer iterations for the solution than other existing U-type blocks
- Our continuous U-Net is less sensitive to noise than other CNN-based U-Net architectures
- Runge-Kutta method outperforms other ODE solvers for segmentation tasks
- Our continuous U-Net is able to outperform at least two methods per dataset
Conclusion
- Continuous U-Net is a network modelled by dynamic blocks using second order neural ODEs
- Continuous U-Net outperforms existing U-Net blocks on six benchmarking datasets
- Continuous U-Net competes or outperforms famous U-Net architectures with additional mechanisms
- Continuous U-Net is less sensitive to noise than other block types
- Fourth-order Runge-Kutta (RK4) solver performs best for BUCSI dataset
- Implicit Adams-Bashforth-Moulton (ABM) solver performs best for GlaS dataset
- Continuous U-Net performs as good as or better than state-of-the-art U-Net architectures with additional mechanisms
- Results are displayed in figures 2, 3 and 4 and tables 7, 8 and 9