Link to paper
The full paper is available here.
You can also find the paper on PapersWithCode here.
Abstract
- Category theory has been applied to various scientific domains.
- DisCoPyro combines categorical structures with amortized variational inference.
- DisCoPyro can be applied to program learning for variational autoencoders.
- DisCoPyro provides mathematical foundations and concrete applications.
- DisCoPyro is compared to other models (e.g. neuro-symbolic models).
- The authors speculate that DisCoPyro could ultimately contribute to the development of artificial general intelligence.
Paper Content
Introduction
- Category theory has been applied to various mathematical sciences and machine learning.
- DisCoPyro is a probabilistic generative model with amortized variational inference.
- The authors present DisCoPyro as a possible step toward human-level artificial general intelligence.
Notation
- Symmetric monoidal categories (SMCs) are built from objects and sets of morphisms between them
- Operads are built from types and sets of morphisms between them
- SMCs have a monoidal product operation and a unit for that product
- Categories support composition of morphisms
- Operads support indexed composition of morphisms
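The difference between these two kinds of composition can be sketched in a few lines of Python. This is only an illustration of the definitions above, not DisCoPyro's API: the class names `Morphism` and `Operation`, the helper `compose_at`, and the string-valued types are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Morphism:
    """A morphism in a toy SMC: typed tuple in, typed tuple out."""
    dom: Tuple[str, ...]
    cod: Tuple[str, ...]
    fn: Callable[[tuple], tuple]

    def then(self, other: "Morphism") -> "Morphism":
        # Sequential composition: codomain must match the next domain.
        if self.cod != other.dom:
            raise TypeError(f"cannot compose {self.cod} with {other.dom}")
        return Morphism(self.dom, other.cod, lambda xs: other.fn(self.fn(xs)))

    def tensor(self, other: "Morphism") -> "Morphism":
        # Monoidal product: run both morphisms side by side.
        n = len(self.dom)
        return Morphism(self.dom + other.dom, self.cod + other.cod,
                        lambda xs: self.fn(xs[:n]) + other.fn(xs[n:]))


# Unit of the product: the identity on the empty tuple of types.
UNIT = Morphism((), (), lambda xs: ())


@dataclass(frozen=True)
class Operation:
    """An operad operation: several typed inputs, one typed output."""
    inputs: Tuple[str, ...]
    output: str
    fn: Callable[..., object]


def compose_at(outer: Operation, i: int, inner: Operation) -> Operation:
    """Indexed composition: nest `inner` into the i-th input of `outer`."""
    if outer.inputs[i] != inner.output:
        raise TypeError(f"input {i} expects {outer.inputs[i]}, got {inner.output}")
    k = len(inner.inputs)

    def fn(*args):
        nested = inner.fn(*args[i:i + k])
        return outer.fn(*args[:i], nested, *args[i + k:])

    inputs = outer.inputs[:i] + inner.inputs + outer.inputs[i + 1:]
    return Operation(inputs, outer.output, fn)


double = Morphism(("int",), ("int",), lambda xs: (xs[0] * 2,))
to_str = Morphism(("int",), ("str",), lambda xs: (str(xs[0]),))
add = Operation(("int", "int"), "int", lambda a, b: a + b)
neg = Operation(("int",), "int", lambda a: -a)
minus = compose_at(add, 1, neg)  # computes a + (-b)
```

In the categorical case a morphism has a single place where another can be attached, so `then` needs no index. In the operadic case the index `i` picks which input slot receives the nested operation.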
Foundations of DisCoPyro
- The foundations rest on two definitions: a finitary signature in an SMC, given over objects in the SMC, and the free operad over such a signature
- The free operad over a signature consists of trees with finitely many branches and leaves
- Operads let one reason about composition as nesting rather than just transitive combination
- These structures are represented as directed acyclic hypergraphs
- Algorithm 1 produces a hypergraph whose vertices correspond to non-product types
- A transition distance is defined between two indexed vertices
- The probabilistic generative model is defined over morphisms in the free operad
- Acyclicity requires that connections between boxes form no cycles
- A user-specified wiring diagram is used to write the prior distribution over latent variables
- Inference from data proceeds by Bayesian inversion
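As a rough illustration of sampling a morphism by walking a graph of types, the sketch below treats generators as edges between types and the target type as an absorbing state. The generator names, the types `A` to `D`, and the uniform choice among edges are all assumptions made for this example. The summary does not say how DisCoPyro turns the transition distance into transition probabilities, so that part is not modeled here.

```python
import random

# Hypothetical generators: name -> (source type, target type).
# The graph is acyclic, so every walk terminates.
GENERATORS = {
    "f": ("A", "B"),
    "g": ("B", "C"),
    "h": ("A", "C"),
    "k": ("C", "D"),
}


def reachable_from(generators, target):
    """Return the set of types from which `target` can be reached."""
    reach = {target}
    changed = True
    while changed:
        changed = False
        for src, dst in generators.values():
            if dst in reach and src not in reach:
                reach.add(src)
                changed = True
    return reach


def sample_path(generators, source, target, rng):
    """Sample a composable list of generator names from source to target.

    The walk stops when it hits `target` (the absorbing state). Only edges
    that can still lead to the target are considered, and one of them is
    picked uniformly at random (a simplifying assumption).
    """
    ok = reachable_from(generators, target)
    if source not in ok:
        raise ValueError(f"no path from {source} to {target}")
    path, current = [], source
    while current != target:
        choices = sorted(name for name, (src, dst) in generators.items()
                         if src == current and dst in ok)
        name = rng.choice(choices)
        path.append(name)
        current = generators[name][1]
    return path


path = sample_path(GENERATORS, "A", "D", random.Random(0))
```

Each sampled path lists generators whose types line up end to end, so it can be read as one composite morphism from `A` to `D`.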
Model learning and variational Bayesian inference
- Bayesian inversion relies on evaluating the model evidence
- The model evidence has no closed-form solution
- The high-dimensional integral is transformed into an expectation
- DisCoPyro provides two methods for constructing this expectation
- Jensen's inequality provides a lower bound on the true model evidence
- Maximizing the evidence lower bound (ELBO) estimates the model parameters
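The bound in question is the standard one: writing the evidence as an expectation under a proposal q gives log p(x) = log E_q[p(x, z) / q(z)], and Jensen's inequality gives log p(x) >= E_q[log p(x, z) - log q(z)]. The sketch below checks this numerically on a toy Gaussian model chosen because its evidence is known in closed form. The model, the function names, and the sample sizes are assumptions for illustration and are not taken from the paper.

```python
import math
import random


def log_normal(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def elbo_estimate(x, q_mean, q_var, n, rng):
    """Monte Carlo estimate of E_q[log p(x, z) - log q(z)].

    Toy model (an assumption): z ~ N(0, 1), x | z ~ N(z, 1).
    Proposal: q(z) = N(q_mean, q_var).
    """
    total = 0.0
    for _ in range(n):
        z = rng.gauss(q_mean, math.sqrt(q_var))
        log_joint = log_normal(z, 0.0, 1.0) + log_normal(x, z, 1.0)
        total += log_joint - log_normal(z, q_mean, q_var)
    return total / n


x = 1.0
# For this model the evidence is available exactly: x ~ N(0, 2).
log_evidence = log_normal(x, 0.0, 2.0)

rng = random.Random(0)
# A mismatched proposal (here, the prior) gives a strictly looser bound.
loose = elbo_estimate(x, 0.0, 1.0, 5000, rng)
# The exact posterior N(x / 2, 1 / 2) makes the bound tight.
tight = elbo_estimate(x, x / 2, 0.5, 5000, rng)
```

The gap between `log_evidence` and `loose` is the KL divergence from the proposal to the true posterior, which is why improving the proposal and maximizing the bound go together.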
Example application and training
- The framework connects a morphism to data via a likelihood with intermediate latent random variables
- This allows for a broad variety of applications
- The example application is deep probabilistic program learning for generative modeling
- The application's performance as a generative model is described in Section 3.2
Deep probabilistic program learning with DisCoPyro
- The authors constructed an operad O whose generators implement Pyro building blocks for deep generative models
- They specified a one-box wiring diagram to parameterize the DisCoPyro generative model
- They trained the free operad model on MNIST and a downsampled Omniglot dataset
- DisCoPyro provides amortized variational inference over its own random variables
- The generative model provides a proposal over morphisms in the free operad
- The neural network design for the proposal is specified as $q_\phi(z \mid x, f_\theta)$
- The application has a complete proposal density
Experimental results and performance comparison
- The authors' free operad model outperforms other structured deep generative models in terms of log model evidence.
- Older baselines fix a composition structure ahead of time, while the free operad model learns it from data.
- The Omniglot challenge requires a model to be usable for classification, latent feature recognition, concept generation, and exemplar generation.
Discussion
- DisCoPyro is a generative Bayesian structure learning system
- It uses category theory, operad theory, and variational Bayesian inference
- It is competitive against other models on a challenge dataset
- The authors aim to model human intelligence across more domains than handwritten characters
- Possible application areas include chemical reaction networks, natural language processing, and the systematicity of human intelligence
- It uses an absorbing Markov chain to sample paths between types