Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

  • Neuromorphic computing uses lessons from biology to design computer architectures.
  • Event-based scalable learning has been an elusive goal in large-scale systems.
  • EventProp algorithm is used with BrainScaleS-2 analog neuromorphic hardware.
  • Gradient estimation is improved by one order of magnitude.
  • Results verify correctness of estimation and are used in a low-dimensional classification task.
  • EventProp algorithm could enable scalable, energy-efficient, event-based learning in large-scale analog neuromorphic hardware.

Paper Content

Neuromorphic hardware

  • Variety of neuromorphic hardware architectures proposed
  • Digital systems rely on numerical calculations, physical models use analog or physical properties of a substrate
  • BSS-2 is a research platform based on mixed-signal neuromorphic system using CMOS technology
  • Emulates 512

Advantages of event-based training

  • Membrane voltage of neuron circuits is digitized and stored with a temporal resolution of 2 µs
  • Energy consumption and memory use is significant burden
  • Computational overhead affects training speed
  • External memory bandwidth and capacity can be limiting factor
  • Proposed algorithm is more information efficient than surrogate gradient approach

Hardware training results on the yin-yang dataset

  • Yin-Yang dataset used to demonstrate learning algorithm on BrainScaleS-2
  • Dataset is two-dimensional classification task with three classes
  • Results reported in Table 1 and Fig. 3
  • Hyperparameters and training details in Section 6.4 and Table 3
  • Comparable performance in simulation and 1.5% higher accuracy with EventProp in hardware-in-the-loop approach
  • 6.2 ± 0.5 improvement in memory efficiency using EventProp

Hardware gradient estimation

  • Previous work showed that analytical formula for spike time of LIF neurons can be used for hardware in-the-loop training
  • Tested whether gradient estimate using this method would match analytical estimate
  • Experiment setup with one LIF neuron receiving one input spike with weight w at time t = 0
  • Mean of estimated gradient of loss function agrees well with analytical prediction

Discussion

  • Gradient estimation algorithm for analog neuromorphic hardware requires only spike observations
  • No assumptions on network topology or loss function
  • Enables scalable gradient estimation in large-scale neuromorphic hardware
  • EventProp algorithm used with PyTorch implementation and time-discretized forward-and adjoint dynamics
  • Suitable for digital neuromorphic hardware
  • Future work to demonstrate algorithm on tasks, scalability, and other neuron configurations
  • Fully on-device implementation possible

Adjoint sensitivity analysis with jumps

  • BrainScaleS-2 system can emulate AdEx neuron model
  • Adjoint-sensitivity analysis with jumps used to derive exact gradients
  • Loss function can depend on spike times and voltage traces
  • Dynamical equations of model given by matrix form
  • Adjoint equations and parameter gradients can be computed

Software framework

  • Software stack translates high-level SNN experiment description to data flow graph representation
  • Calibration routines consider user-defined model parameters
  • Training affects digital weight parameters
  • BrainScaleS-2 hardware substrate supports fixed-sign 6 bit synapses
  • Batched input spikes injected into BrainScaleS-2 and SNN emulated for 38 µs per batch entry
  • Spike events and neuron membrane traces recorded to FPGA DRAM
  • Membrane samples expressed on equidistant time grid by linear interpolation
  • Spike recordings mapped to boolean tensor on same time grid

Numerical gradient estimate

  • Discretize forward and adjoint dynamics using explicit Euler integration scheme
  • Dynamics of LIF neurons with exponential-shaped, current-based synapses computed in simulation or injected from observations
  • Custom torch.autograd.Function handles dataflow and backpropagation to synaptic weights
  • Gradient estimation for a layer of LIF neurons described in Algorithm 1
  • Estimate synaptic currents by numerically integrating equation while assuming ideal dynamics on hardware

Training details

  • Translated four values into spike times
  • Added bias spike
  • Mapped events onto boolean tensor
  • Feed-forward network with 120 LIF neurons and 3 LI neurons
  • Loss composed of two terms
  • Repeated input per sample five times
  • Scaled weight gradients with factor 1/τ s

B. spike time decoder

  • Custom torch.autograd.Function converts boolean tensors holding spikes into spike times
  • Spike count adjusts how many of each neurons spikes are retrieved
  • If a neuron does spike fewer times than specified, spike times are set to floating-point infinity
  • Backpropagation happens by injecting the gradient with respect to a spike time at the corresponding position
  • EventPropSynapse returns product of input spikes and weight w
  • EventPropNeuron returns hardware observations or computes forward trajectories in simulation
  • In backward pass, adjoint dynamics are computed using stored spikes from forward-call
  • Stacked tensors consisting of τsλI and (λI − λV) are returned
  • EventPropSynapse backpropagates τsλ I zpre and (λI − λV)w
  • Figure 1 illustrates ITL gradient optimisation approach for analog neuromorphic hardware
  • Figure 2 shows forward and adjoint dynamics of a LIF neuron
  • Figure 3 shows example points of Yin-Yang dataset
  • Figure 4 shows experiment setup, spike time and estimated gradient
  • Figure 5 shows spike time and estimated gradient for a loss function
  • Figure 6 shows dataflow occurring during hardware-in-the-loop training