Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

Characterization of group equivariant neural networks with layers as tensor power of $\mathbb{R}^{n}$ for 3 symmetry groups
Spanning set of matrices for learnable, linear, equivariant layer functions in standard/symplectic basis of $\mathbb{R}^{n}$
Circumvents requirement of decomposing tensor power spaces of $\mathbb{R}^{n}$ into irreducible representations
Results based on Schur-Weyl dualities established by Brauer in 1937 paper

Finding neural networks that are equivariant to a symmetry group is an active area of research
Requirement for the overall network to be equivariant restricts the form of the neural network
Parameter sharing within each layer results in simpler, more interpretable models
Data from physical processes often comes with a certain type of symmetry
Constructing equivariant neural networks typically involves decomposing tensor product representations
Paper takes a different approach to constructing equivariant neural networks for 3 groups
Approach is motivated by mathematical concept not seen in machine learning literature
Combinatorics underlying Brauer and Brauer-Grood vector spaces provides theoretical background
Finds a spanning set of matrices for learnable, linear, equivariant layer functions
Neural networks are simple to implement
Schur-Weyl duality is a powerful mathematical concept to characterise group equivariant neural networks

Group equivariant neural networks are constructed by alternately composing linear and non-linear G-equivariant maps between representations of a group G.
G-equivariance is defined as the set of all linear G-equivariant maps between two representations of a group G.
G-invariance is a special case of G-equivariance where the map is said to be G-invariant if the representation in the final layer is the 1-dimensional trivial representation of G.
An L-layer G-equivariant neural network is a composition of layer functions such that the lth layer function is a map of representations of G and acts pointwise.
The entire neural network is G-equivariant because the composition of any number of G-equivariant functions is itself G-equivariant.
The groups O(n), SO(n), and Sp(n) are subgroups of GL(n).
The space (R n ) ⊗k is a representation of G and each form on R n induces a non-degenerate bilinear form on (R n ) ⊗k.
Brauer’s 1937 paper focused on calculating a spanning set of matrices for End G ((R n ) ⊗k ) in the standard basis of R n for the groups G = O(n) and SO(n), and in the symplectic basis of R n for Sp(n).
We want to find a spanning set of matrices for Hom G ((R n ) ⊗k , (R n ) ⊗l ) in the standard/symplectic basis of R n.
Brauer associated with each element of the spanning set of invariants a diagram, which is a (k, l)-Brauer partition.
The second type of diagram is a (l + k)\n-diagram.

Assumption that feature dimension for all layers in neural network is one
Results in Section 6 can be adapted for feature dimension greater than one
Substitutions needed to find spanning set for each case
Bias terms can be included in layer functions while keeping them G-equivariant
Need to find spanning set for Hom G (R, (R n ) ⊗l ) using Theorems 6.5, 6.6, and 6.7

Linear layer functions can be constructed to be equivariant to local symmetries.
A spanning set for a Hom-space can be found by taking the image of each basis diagram under its appropriate map and calculating the Kronecker product of the resulting matrices.

Richard Brauer developed the combinatorial representation theory of the Brauer algebra in 1937.
Brown published two papers in the 1950s showing that the Brauer algebra is semisimple if and only if n ≥ k − 1.
Weyl showed that the Brauer algebra is semisimple if and only if n ≥ 2k.
Hanlon and Wales provided an isomorphism between two versions of the Brauer algebra.
Grood investigated the representation theory of the Brauer-Grood algebra.
Lehrer and Zhang studied the kernel of the surjection of algebras.

We showed how the combinatorics of Brauer and Brauer-Grood vector spaces can be used to construct group equivariant neural networks for the orthogonal, special orthogonal, and symplectic groups.
We looked at the problem of calculating the matrix form of the linear layer functions between such spaces in the standard/symplectic basis for R n .
We found a spanning set of matrices for the layer functions in the standard/symplectic basis for R n for each of the three groups.
Each diagram provided the parameter sharing scheme for its image in the spanning set.
We used Schur-Weyl duality between the symmetric group, S n , and the partition algebra, P k (n), to fully characterise all of the possible permutation equivariant neural networks.
We used Theorem 6.5, noting that l + k = 4, which is even, to find a basis of End O(2) ((R 2 ) ⊗2 ).
We used Theorem 6.5, noting that l + k = 6, which is even, to find a basis of End O(3) ((R 3 ) ⊗3 ).
We used Theorem 6.7 to find the basis elements of D 3 3 (3) ⊗ D 2 1 (3).
We used Ψ 3 3,3 ⊗ Ψ 2 1,3 to find a spanning set for Hom SO(3)×SO(3) ((R 3 ) ⊗3 R 3 , (R 3 ) ⊗3 (R 3 ) ⊗2 ).
We used a red demarcation line to separate the vertices of the respective diagrams.
We used the symplectic basis for R n .
We used a table to show the images of the six 4\2-diagrams in D 1 3 (2).