Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

Language models are used as mutation and crossover operators for an evolutionary neural architecture search algorithm
Combining evolutionary prompt engineering with soft prompt-tuning (EvoPrompting) produces diverse and high-performing models
EvoPrompting is successful at designing accurate and efficient neural network architectures across a variety of machine learning tasks

Scaling of Transformers has produced language models with impressive performance
Language models can learn how to code, do math, and solve reasoning problems
Language models remain limited in solving complex problems and creating novel solutions
EVOPROMPTING improves ability to propose novel and diverse solutions to complex reasoning problems
EVOPROMPTING uses evolutionary search to create and curate data to improve LM in-context prompting examples
Few-shot prompting with EVOPROMPTING enables LMs to create architectures that outperform those designed by human experts
EVOPROMPTING discovers novel graph neural network architectures that outperform the current state of the art

Transformer models are popular for natural language systems
Transformer models can be used to write code, do math, and solve reasoning problems
Brown et al. (2020) demonstrated that LMs can be prompted with in-context examples
Numerous works have used prompting to unlock latent LM abilities
Prompts can be tuned using probabilistic inference techniques
Evolutionary search can be used to design prompts for in-context learning
LM replaces mutation and crossover functions in evolutionary search
LM used as crossover operator to produce variations of text-based genotypes
Deep sequence models have been used to improve machine learning workflows
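
The idea of an LM acting as a crossover operator can be illustrated with a small prompt builder. This is only a sketch under assumptions: the `Parent` record, the `build_crossover_prompt` helper, the `# val_error=... num_params=...` header format, and the use of target metrics in the final header are all hypothetical choices made for illustration, not the format used in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Parent:
    """One in-context example: model source code plus its measured metrics."""
    code: str
    val_error: float
    num_params: int


def build_crossover_prompt(parents, target_error, target_params):
    """Concatenate parents as few-shot examples, then open a slot for a child.

    Each parent is preceded by a comment header holding its metrics
    (an assumed, illustrative format).
    """
    blocks = [
        f"# val_error={p.val_error} num_params={p.num_params}\n{p.code.strip()}\n"
        for p in parents
    ]
    # The last header states the metrics we would like the child to reach;
    # an LM would be asked to continue the text with the child's source code.
    blocks.append(f"# val_error={target_error} num_params={target_params}\n")
    return "\n".join(blocks)
```

Because the LM sees several parents at once and continues the text, the sample it produces can mix and vary their code, which is the role crossover and mutation play in a conventional evolutionary search.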

The target task is denoted by T
D is a dataset consisting of input-output pairs
π_θ is the LM with parameters θ, a probability distribution over vocabulary V
Code segments c can be sampled from π_θ
EVAL_T is an evaluation function that trains the model architecture given by code c on D
The goal is to identify code samples that maximize the reward EVAL_T
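
The definitions above can be written compactly as follows. The notation V^{*} for sequences of vocabulary tokens, and writing D as an explicit second argument of the evaluation function, are assumptions made here for readability.

```latex
\[
  c \sim \pi_\theta, \qquad c \in V^{*}, \qquad
  \text{find } c \text{ that maximizes } \mathrm{EVAL}_T(c, D).
\]
```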

Goal of algorithm is to generate set of k neural network architectures that maximize reward
Use black-box evolutionary approach to generate, score, and select best architectures
Evolution works well in this domain because high-quality solutions are sparse
Algorithm uses LM for crossover and mutation operations
Search space includes any neural network architecture that can be represented in Python
LM is pre-trained on massive datasets containing source code files
LM can be used as self-adaptive crossover operator
Scoring function is the negative product of validation error and model size, so lower error and smaller models both score higher
Initialize with seed architectures that are known to be well-designed
Create few-shot prompts using source code and evaluation metrics
Use LM to generate n samples per prompt
Apply fitness-based selection to identify top candidate models
Train LM for the next round using child models that were not selected as top candidates
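
The generate, score, and select steps above can be sketched as a small loop. This is a minimal illustration, not the paper's implementation: `propose` is a hypothetical stand-in for sampling from the LM, `evaluate` is a hypothetical stand-in for training a model, the toy "architectures" are just comma-separated layer widths, selection here draws from every candidate scored so far (an assumption), and the LM training step between rounds is omitted.

```python
import random


def fitness(val_error, num_params):
    """Negative product of validation error and model size (higher is better)."""
    return -(val_error * num_params)


def evolve(seeds, propose, evaluate, rounds=3, samples_per_prompt=4, top_k=2, seed=0):
    """Generic generate-score-select loop.

    seeds:    code strings for known-good starting architectures
    propose:  hypothetical LM stand-in; maps (parents, rng) -> child code
    evaluate: hypothetical training stand-in; maps code -> (val_error, num_params)
    """
    rng = random.Random(seed)
    scored = {code: fitness(*evaluate(code)) for code in seeds}
    parents = list(seeds)
    for _ in range(rounds):
        children = [propose(parents, rng) for _ in range(samples_per_prompt)]
        for child in children:
            if child not in scored:
                scored[child] = fitness(*evaluate(child))
        # Fitness-based selection: the best candidates seen so far become parents.
        parents = sorted(scored, key=scored.get, reverse=True)[:top_k]
        # The LM training step between rounds is intentionally left out.
    return [(code, scored[code]) for code in parents]


def toy_evaluate(code):
    """Toy scorer: pretend wider bottlenecks reduce error but add parameters."""
    widths = [int(w) for w in code.split(",")]
    num_params = sum(a * b for a, b in zip(widths, widths[1:]))
    val_error = 1.0 / (1.0 + min(widths))
    return val_error, num_params


def toy_propose(parents, rng):
    """Toy 'LM': copy a random parent and nudge one layer width."""
    widths = [int(w) for w in rng.choice(parents).split(",")]
    i = rng.randrange(len(widths))
    widths[i] = max(1, widths[i] + rng.choice([-8, 8]))
    return ",".join(str(w) for w in widths)


if __name__ == "__main__":
    for code, score in evolve(["32,32", "64,16"], toy_propose, toy_evaluate, rounds=5):
        print(code, round(score, 3))
```

Keeping `propose` and `evaluate` as plain callables makes the loop independent of any particular LM or training setup, so either can be swapped out without touching the selection logic.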

Pre-trained language models (LMs) can be embedded in evolutionary algorithms to improve performance on neural architecture design tasks
EVOPROMPTING can optimize convolutional architectures for MNIST-1D and develop new GNNs for CLRS algorithmic benchmark
EVOPROMPTING can discover novel, competitive, and state-of-the-art architectures that optimize for accuracy and model size
EVOPROMPTING is general enough to be adapted to other kinds of reasoning tasks
EVOPROMPTING is more efficient than other search methods, reaching a maximum fitness with fewer samples
EVOPROMPTING combines evolutionary search with soft-prompt tuning
EVOPROMPTING uses evaluation function to train model and return lowest validation error
EVOPROMPTING uses four seed models for MNIST-1D model search and nine seed models for CLRS model search
EVOPROMPTING uses triplet representations such as MAXMEAN, CONCATREP, TANHEXPANDTRIPLETS, DIV2MEAN, and others
EVOPROMPTING uses fully-connected multi-layer perceptrons and linear layers for node and edge representations
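
The evaluation function mentioned above, which trains a model and returns the lowest validation error, might look roughly like this. The function name, the `train_step` and `val_error_fn` callables, and the fixed evaluation interval are hypothetical; the sketch only shows the "track the minimum over training" pattern.

```python
def eval_lowest_val_error(train_step, val_error_fn, num_steps, eval_every):
    """Run training and return the lowest validation error observed.

    train_step:   callable performing one training update (hypothetical)
    val_error_fn: callable returning the current validation error (hypothetical)
    """
    best = float("inf")
    for step in range(1, num_steps + 1):
        train_step()
        # Validate periodically and remember the best value seen so far.
        if step % eval_every == 0:
            best = min(best, val_error_fn())
    return best
```

Returning the minimum rather than the final value means a candidate is not penalized if its validation error rises late in training.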