Abstract
- Neuromorphic computing uses lessons from biology to design computer architectures.
- Event-based scalable learning has been an elusive goal in large-scale systems.
- The EventProp algorithm is implemented on the BrainScaleS-2 analog neuromorphic hardware.
- The information efficiency of the gradient estimate improves by one order of magnitude.
- Experiments verify the correctness of the gradient estimate and apply it to a low-dimensional classification task.
- The EventProp algorithm could enable scalable, energy-efficient, event-based learning in large-scale analog neuromorphic hardware.
Paper Content
Neuromorphic hardware
- Variety of neuromorphic hardware architectures proposed
- Digital systems rely on numerical calculations, physical models use analog or physical properties of a substrate
- BrainScaleS-2 (BSS-2) is a research platform built around a mixed-signal neuromorphic system in CMOS technology
- Emulates 512
Advantages of event-based training
- Membrane voltage of neuron circuits is digitized and stored with a temporal resolution of 2 µs
- Energy consumption and memory use are a significant burden
- Computational overhead affects training speed
- External memory bandwidth and capacity can be a limiting factor
- Proposed algorithm is more information-efficient than the surrogate gradient approach
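The difference between dense sampling and event-based recording can be made concrete with a little arithmetic. The sketch below is illustrative only: it combines the 2 µs sampling resolution with the 38 µs emulation time per batch entry quoted elsewhere in this summary, and the spike counts are made-up values, not measurements from the paper.

```python
def dense_samples(n_neurons, duration_us, resolution_us):
    """Number of membrane samples when every neuron is sampled on a fixed grid."""
    return n_neurons * (duration_us // resolution_us)


def event_records(spikes_per_neuron):
    """Number of records when only spike events are stored."""
    return sum(spikes_per_neuron)


# Hypothetical layer: 120 neurons, each spiking at most once (an assumption).
n_dense = dense_samples(n_neurons=120, duration_us=38, resolution_us=2)
n_events = event_records([1] * 120)
print(n_dense, n_events, n_dense / n_events)
```

Under these assumptions the dense trace needs 19 samples per neuron and batch entry, while the event-based record grows only with the number of spikes.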
Hardware training results on the Yin-Yang dataset
- Yin-Yang dataset used to demonstrate learning algorithm on BrainScaleS-2
- Dataset is a two-dimensional classification task with three classes
- Results reported in Table 1 and Fig. 3
- Hyperparameters and training details in Section 6.4 and Table 3
- Comparable performance in simulation; 1.5% higher accuracy with EventProp in the hardware-in-the-loop approach
- 6.2 ± 0.5 improvement in memory efficiency using EventProp
Hardware gradient estimation
- Previous work showed that an analytical formula for the spike time of LIF neurons can be used for hardware-in-the-loop training
- Tested whether gradient estimate using this method would match analytical estimate
- Experiment setup with one LIF neuron receiving one input spike with weight w at time t = 0
- Mean of estimated gradient of loss function agrees well with analytical prediction
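The analytical reference for this single-neuron experiment can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's code: the neuron is an ideal LIF with a current-based exponential synapse, `tau_m != tau_s`, `w > 0`, the loss is the spike time itself, and all function names and parameter values are hypothetical. The spike time is found by bisection on the closed-form membrane trace, and its derivative follows from the implicit function theorem, dt*/dw = -(∂V/∂w) / (∂V/∂t) at the threshold crossing.

```python
import math


def membrane(t, w, tau_m, tau_s):
    """Membrane voltage at time t after one input spike of weight w at t = 0."""
    scale = w * tau_s / (tau_m - tau_s)
    return scale * (math.exp(-t / tau_m) - math.exp(-t / tau_s))


def membrane_slope(t, w, tau_m, tau_s):
    """Time derivative of the membrane voltage."""
    scale = w * tau_s / (tau_m - tau_s)
    return scale * (math.exp(-t / tau_s) / tau_s - math.exp(-t / tau_m) / tau_m)


def spike_time(w, theta, tau_m, tau_s):
    """First threshold crossing, or infinity if the neuron never spikes."""
    t_peak = tau_m * tau_s / (tau_m - tau_s) * math.log(tau_m / tau_s)
    if membrane(t_peak, w, tau_m, tau_s) <= theta:
        return math.inf
    lo, hi = 0.0, t_peak  # the voltage rises monotonically on [0, t_peak]
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if membrane(mid, w, tau_m, tau_s) < theta:
            lo = mid
        else:
            hi = mid
    return hi


def spike_time_gradient(w, theta, tau_m, tau_s):
    """d(spike time)/dw via the implicit function theorem."""
    t_star = spike_time(w, theta, tau_m, tau_s)
    if math.isinf(t_star):
        return 0.0
    # At the crossing V = theta, and V is linear in w, so dV/dw = theta / w.
    return -(theta / w) / membrane_slope(t_star, w, tau_m, tau_s)
```

A gradient estimated from hardware spike observations could then be compared against `spike_time_gradient` for the same weight.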
Discussion
- Gradient estimation algorithm for analog neuromorphic hardware requires only spike observations
- No assumptions on network topology or loss function
- Enables scalable gradient estimation in large-scale neuromorphic hardware
- EventProp algorithm used with a PyTorch implementation and time-discretized forward and adjoint dynamics
- Suitable for digital neuromorphic hardware
- Future work: demonstrate the algorithm on further tasks, study its scalability, and cover other neuron configurations
- Fully on-device implementation possible
Adjoint sensitivity analysis with jumps
- BrainScaleS-2 system can emulate AdEx neuron model
- Adjoint-sensitivity analysis with jumps used to derive exact gradients
- Loss function can depend on spike times and voltage traces
- Dynamical equations of the model are given in matrix form
- Adjoint equations and parameter gradients can be computed
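As a worked illustration of adjoint equations with jumps, consider the simpler LIF case with a current-based exponential synapse (not the AdEx model mentioned above): one input spike of weight $w$ at $t = 0$, one output spike at $t^*$, and a loss $L = l(t^*)$ that depends only on that spike time. The sign and time conventions below are one possible choice and are an assumption here, not taken from this summary.

```latex
\begin{aligned}
\text{forward:}\quad
  & \tau_m \dot V = -V + I, \qquad \tau_s \dot I = -I, \qquad I(0^+) = w, \qquad V(t^*) = \vartheta \\
\text{adjoint } (t < t^*):\quad
  & \tau_m \dot\lambda_V = \lambda_V, \qquad \tau_s \dot\lambda_I = \lambda_I - \lambda_V \\
\text{jump at } t^*:\quad
  & \lambda_V(t^{*-}) = \frac{1}{\tau_m \dot V(t^{*-})}\,\frac{\partial l}{\partial t^*}, \qquad \lambda_I(t^*) = 0 \\
\text{gradient:}\quad
  & \frac{\mathrm{d}L}{\mathrm{d}w} = -\tau_s\,\lambda_I(0)
    = -\frac{\vartheta}{w\,\dot V(t^{*-})}\,\frac{\partial l}{\partial t^*}
\end{aligned}
```

The last equality follows from integrating the adjoint equations backward from $t^*$ to $0$ and using $V(t^*) = \vartheta$. It agrees with the implicit function theorem, since $\partial V / \partial w = \vartheta / w$ at the threshold crossing in this single-input case.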
Software framework
- Software stack translates high-level SNN experiment description to data flow graph representation
- Calibration routines consider user-defined model parameters
- Training affects digital weight parameters
- BrainScaleS-2 hardware substrate supports fixed-sign 6-bit synapses
- Batched input spikes injected into BrainScaleS-2 and SNN emulated for 38 µs per batch entry
- Spike events and neuron membrane traces recorded to FPGA DRAM
- Membrane samples expressed on equidistant time grid by linear interpolation
- Spike recordings mapped to boolean tensor on same time grid
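The last two steps above, interpolating membrane samples onto an equidistant grid and mapping spikes to a boolean tensor, can be sketched as follows. This is a dependency-free illustration in which nested Python lists stand in for tensors. The function names, the clamping at the trace boundaries, and the rounding of spike times to the nearest grid point are assumptions, not details of the actual software stack.

```python
def interpolate_to_grid(sample_times, sample_values, dt, n_steps):
    """Linearly interpolate samples onto the grid t_k = k * dt.

    sample_times must be strictly increasing. Grid points outside the
    sampled range take the nearest boundary value.
    """
    grid = []
    j = 0
    for k in range(n_steps):
        t = k * dt
        if t <= sample_times[0]:
            grid.append(sample_values[0])
            continue
        if t >= sample_times[-1]:
            grid.append(sample_values[-1])
            continue
        while sample_times[j + 1] < t:  # advance to the enclosing interval
            j += 1
        t0, t1 = sample_times[j], sample_times[j + 1]
        frac = (t - t0) / (t1 - t0)
        grid.append(sample_values[j] + frac * (sample_values[j + 1] - sample_values[j]))
    return grid


def spikes_to_raster(spike_times, spike_neurons, dt, n_steps, n_neurons):
    """Boolean raster[k][n] is True if neuron n spiked closest to grid point k."""
    raster = [[False] * n_neurons for _ in range(n_steps)]
    for t, n in zip(spike_times, spike_neurons):
        k = int(round(t / dt))
        if 0 <= k < n_steps:
            raster[k][n] = True
    return raster
```

For example, `interpolate_to_grid([0.0, 2.0, 4.0], [0.0, 2.0, 0.0], dt=1.0, n_steps=5)` fills in the midpoints of a triangular trace.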
Numerical gradient estimate
- Discretize forward and adjoint dynamics using explicit Euler integration scheme
- Dynamics of LIF neurons with exponential-shaped, current-based synapses computed in simulation or injected from observations
- Custom torch.autograd.Function handles dataflow and backpropagation to synaptic weights
- Gradient estimation for a layer of LIF neurons described in Algorithm 1
- Estimate synaptic currents by numerically integrating equation while assuming ideal dynamics on hardware
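The explicit Euler discretization of forward and adjoint dynamics can be illustrated for the simplest case: one LIF neuron, one input spike of weight `w` at t = 0, and the spike time itself as the loss. This is a hedged sketch, not the paper's Algorithm 1. The function name, the parameter values in the usage note, and the sign conventions of the adjoint variables are assumptions. The forward loop here simulates the neuron, whereas in a hardware-in-the-loop setting the spike step would come from an observation.

```python
def eventprop_spike_time_gradient(w, theta, tau_m, tau_s, dt, n_steps):
    """Return (spike time, d(spike time)/dw), or (None, 0.0) without a spike."""
    # Forward pass: the input spike at t = 0 makes the synaptic current jump by w.
    v, i = 0.0, w
    spike_step = None
    v_dot_pre = 0.0
    for k in range(n_steps):
        v_dot = (i - v) / tau_m
        v_next = v + dt * v_dot
        i = i - dt * i / tau_s
        if v_next >= theta:
            spike_step = k + 1
            v_dot_pre = v_dot  # membrane slope just before the spike
            break
        v = v_next
    if spike_step is None:
        return None, 0.0

    # Backward pass: the adjoint variables are zero after the spike. The jump
    # at the spike sets lam_v, with d(loss)/d(spike time) = 1 for this loss.
    lam_v = 1.0 / (tau_m * v_dot_pre)
    lam_i = 0.0
    for _ in range(spike_step):  # integrate from the spike back to t = 0
        lam_i, lam_v = (
            lam_i + dt * (lam_v - lam_i) / tau_s,
            lam_v - dt * lam_v / tau_m,
        )

    # The weight gradient is read out at the input spike time t = 0.
    return spike_step * dt, -tau_s * lam_i
```

With, for instance, `w=1.0`, `theta=0.2`, `tau_m=10e-6`, `tau_s=5e-6` and `dt=1e-9`, the result should approach the closed-form derivative of the spike time as `dt` shrinks.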
Training details
- Translated four values into spike times
- Added bias spike
- Mapped events onto boolean tensor
- Feed-forward network with 120 LIF neurons and 3 LI neurons
- Loss composed of two terms
- Repeated input per sample five times
- Scaled weight gradients with factor 1/τ_s
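The input encoding steps above can be sketched as follows. The choice of the four values as (x, y, 1 - x, 1 - y), the linear value-to-time mapping, and the parameters `t_early`, `t_late`, `t_bias` and `dt` are assumptions made for illustration, not the hyperparameters of the paper. Nested lists stand in for boolean tensors.

```python
def encode_sample(x, y, t_early, t_late, t_bias):
    """Map a 2D point in [0, 1]^2 to four input spike times plus a bias spike."""
    values = [x, y, 1.0 - x, 1.0 - y]
    times = [t_early + v * (t_late - t_early) for v in values]
    times.append(t_bias)  # bias spike at a fixed time
    return times


def spike_times_to_raster(spike_times, dt, n_steps):
    """Boolean raster[k][n]: input n spikes at grid step k."""
    raster = [[False] * len(spike_times) for _ in range(n_steps)]
    for n, t in enumerate(spike_times):
        k = int(round(t / dt))
        if 0 <= k < n_steps:
            raster[k][n] = True
    return raster


# Times in microseconds, all values hypothetical.
times = encode_sample(0.25, 0.5, t_early=2.0, t_late=26.0, t_bias=2.0)
raster = spike_times_to_raster(times, dt=0.5, n_steps=76)
```

Each input channel contributes exactly one `True` entry per sample, so the raster stays sparse.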
Spike time decoder
- Custom torch.autograd.Function converts boolean tensors holding spikes into spike times
- Spike count sets how many of each neuron's spikes are retrieved
- If a neuron spikes fewer times than specified, the missing spike times are set to floating-point infinity
- Backpropagation happens by injecting the gradient with respect to a spike time at the corresponding position
- EventPropSynapse returns product of input spikes and weight w
- EventPropNeuron returns hardware observations or computes forward trajectories in simulation
- In backward pass, adjoint dynamics are computed using stored spikes from forward-call
- Stacked tensors consisting of τ_s λ_I and (λ_I − λ_V) are returned
- EventPropSynapse backpropagates τ_s λ_I z_pre and (λ_I − λ_V) w
- Figure 1 illustrates ITL gradient optimisation approach for analog neuromorphic hardware
- Figure 2 shows forward and adjoint dynamics of a LIF neuron
- Figure 3 shows example points of Yin-Yang dataset
- Figure 4 shows experiment setup, spike time and estimated gradient
- Figure 5 shows spike time and estimated gradient for a loss function
- Figure 6 shows dataflow occurring during hardware-in-the-loop training
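The spike time decoder described in this section can be sketched without any framework. The source implements it as a custom torch.autograd.Function. The version below only mirrors the forward and backward logic with nested lists, and all function names are hypothetical.

```python
import math


def decode_spike_times(raster, dt, spike_count=1):
    """Forward: turn raster[k][n] into the first spike_count spike times per neuron.

    Missing spikes are reported as infinity. The grid index of every retrieved
    spike is returned as well so the backward step can find its position.
    """
    n_neurons = len(raster[0])
    times = [[math.inf] * spike_count for _ in range(n_neurons)]
    steps = [[None] * spike_count for _ in range(n_neurons)]
    for n in range(n_neurons):
        found = 0
        for k, row in enumerate(raster):
            if row[n]:
                times[n][found] = k * dt
                steps[n][found] = k
                found += 1
                if found == spike_count:
                    break
    return times, steps


def scatter_spike_time_gradients(grad_times, steps, n_steps):
    """Backward: inject each spike-time gradient at the spike's raster position."""
    n_neurons = len(grad_times)
    grad_raster = [[0.0] * n_neurons for _ in range(n_steps)]
    for n in range(n_neurons):
        for g, k in zip(grad_times[n], steps[n]):
            if k is not None:  # no gradient flows to missing spikes
                grad_raster[k][n] += g
    return grad_raster
```

Gradients that belong to missing spikes (infinite spike times) are dropped, so they do not affect the weight update in this sketch.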