Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

Pre-trained language models are effective for natural language processing tasks, but less so in low-resource domains because of the domain gap between general pre-training data and the target domain.
SwitchPrompt is a novel and lightweight prompting methodology to bridge the domain gap.
SwitchPrompt uses domain-specific keywords with a trainable gated prompt to offer domain-oriented prompting.
Few-shot experiments on three text classification benchmarks demonstrate the efficacy of general-domain pre-trained language models when used with SwitchPrompt.
SwitchPrompt can increase accuracy by up to 10.7%, reducing the need for domain-specific language model pre-training.

Paper Content

Introduction

Pre-trained language models (LMs) have been shown to be effective for natural language processing tasks, especially in low-resource settings
Most publicly available LMs are trained on general-domain corpora, which can lead to a domain gap when they are applied to tasks from a specialized domain
Pre-training deep language models requires large amounts of text data, which may not be available in low-resource domains
Traditional prompting techniques may not be effective in low-resource settings
SwitchPrompt is a novel and lightweight method to effectively retrieve domain-specific knowledge from pre-trained LMs
SwitchPrompt outperforms several state-of-the-art prompting methods and reduces the domain gap
SwitchPrompt is especially suitable for low-resource settings as it does not require pre-training domain-specific LMs or fine-tuning LMs for the downstream task

Method

This section introduces SwitchPrompt
It also gives an example of an architecture in which SwitchPrompt can be applied
The underlying pre-trained language model stays fixed

Domain-specific soft prompts

Proposed prompts allow model to switch between general-domain and domain-specific prompts
Sigmoid-based gating function used to control switching
General-domain prompt is a sequence of randomly initialized vectors
Domain-specific prompt incorporates sequence of vectors representing domain-specific keywords
Second gate used to control order of concatenation of general and domain-specific prompts
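
The gating described above can be illustrated with a small sketch. It is not the paper's exact formulation: it assumes each gate is a single trainable scalar passed through a sigmoid, that the first gate blends keyword vectors with the general-domain vectors, and that the second gate softly interpolates between the two concatenation orders. The names `switch_prompt`, `mix`, `switch_logit`, and `order_logit` are hypothetical, and random vectors stand in for real keyword embeddings.

```python
import math
import random


def sigmoid(x):
    """Logistic function mapping a real-valued gate parameter into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def mix(a, b, gate):
    """Return gate * a + (1 - gate) * b for two equally shaped vector sequences."""
    return [
        [gate * x + (1.0 - gate) * y for x, y in zip(vec_a, vec_b)]
        for vec_a, vec_b in zip(a, b)
    ]


def switch_prompt(general, keywords, switch_logit, order_logit):
    """Build a gated soft prompt (assumed formulation, not the paper's exact one)."""
    switch_gate = sigmoid(switch_logit)  # first gate: keyword vs. general vectors
    order_gate = sigmoid(order_logit)    # second gate: concatenation order

    # Domain-specific prompt: keyword vectors blended with the general prompt.
    domain = mix(keywords, general, switch_gate)

    # Soft choice between the two concatenation orders.
    general_first = general + domain
    domain_first = domain + general
    return mix(general_first, domain_first, order_gate)


# Toy sizes (hypothetical): 3 prompt positions, 4-dimensional embeddings.
rng = random.Random(0)
general = [[rng.gauss(0.0, 0.02) for _ in range(4)] for _ in range(3)]
# Random stand-ins for the embeddings of domain-specific keywords.
keywords = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(3)]

prompt = switch_prompt(general, keywords, switch_logit=0.0, order_logit=0.0)
print(len(prompt), len(prompt[0]))  # 6 positions, each of dimension 4
```

In a real setup the two logits and the general-domain vectors would be trained by gradient descent, so the model can learn how much keyword information to use and where to place it.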

Prompting architecture

Proposed method is a new definition of soft prompts that can be integrated into any existing model.
Experiments use P-Tuning v2 architecture due to its high efficacy.
P-Tuning v2 is an adaptation of deep prompt tuning.
Soft prompts are injected at every layer of the pre-trained LM.
During training, the prompts are tuned but the LM stays fixed.
Classification head is added on top of the pre-trained LM.
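
To give a feel for why tuning only the prompts is lightweight, the following sketch counts trainable parameters under stated assumptions. It assumes each layer's prompt is stored as one key vector and one value vector per prompt position (as in prefix-style deep prompt tuning) and that the classification head is a single linear layer. The sizes are illustrative only: a BERT-base-like model with 12 layers and hidden size 768, a hypothetical prompt length of 20, and 6 classes. None of these values are taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeepPromptConfig:
    num_layers: int      # transformer layers that receive a prompt
    hidden_size: int     # model hidden dimension
    prompt_length: int   # soft prompt positions per layer
    num_classes: int     # output classes of the linear head


def trainable_parameters(cfg):
    """Count parameters updated during training; the LM itself stays frozen."""
    # Assumption: one key and one value vector per prompt position per layer.
    prompt_params = 2 * cfg.num_layers * cfg.prompt_length * cfg.hidden_size
    # Linear classification head: weight matrix plus bias.
    head_params = cfg.hidden_size * cfg.num_classes + cfg.num_classes
    return prompt_params + head_params


# Illustrative sizes only (assumed, not from the paper).
cfg = DeepPromptConfig(num_layers=12, hidden_size=768, prompt_length=20, num_classes=6)
frozen_lm_params = 110_000_000  # rough size of a BERT-base-like model

tuned = trainable_parameters(cfg)
print(f"trainable: {tuned:,} ({100.0 * tuned / frozen_lm_params:.2f}% of the frozen LM)")
```

Under these assumptions the tuned parameters amount to well under one percent of the frozen model.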

Experiments

Describes the datasets, training details, and baselines
Presents the experimental results

Datasets

Used classification benchmark datasets from different domains: TREC, GARD, SOFC-Exp
Constructed few-shot datasets by randomly sampling N shots per class
Created few-shot development sets with the same number of shots per class as the training sets
Used accuracy (%) as evaluation metric
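
A minimal sketch of this few-shot construction follows. It assumes that examples are `(text, label)` pairs and that the development shots are drawn from the same pool without overlapping the training shots; the summary does not confirm either detail. The function names `sample_few_shot` and `accuracy` are hypothetical.

```python
import random
from collections import defaultdict


def sample_few_shot(examples, n_shots, seed=0):
    """Sample n_shots per class for training and n_shots per class for development.

    `examples` is a list of (text, label) pairs. The two splits are disjoint
    (an assumption; the summary only states that the shot counts match).
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))

    train, dev = [], []
    for label in sorted(by_label):
        pool = by_label[label]
        if len(pool) < 2 * n_shots:
            raise ValueError(f"class {label!r} has fewer than {2 * n_shots} examples")
        picked = rng.sample(pool, 2 * n_shots)  # sampling without replacement
        train.extend(picked[:n_shots])
        dev.extend(picked[n_shots:])
    return train, dev


def accuracy(predictions, gold):
    """Accuracy in percent, the evaluation metric used in the experiments."""
    correct = sum(p == g for p, g in zip(predictions, gold))
    return 100.0 * correct / len(gold)


# Toy data (made up for illustration): 3 classes with 10 examples each.
toy = [(f"sentence {i} of class {c}", c) for c in ("a", "b", "c") for i in range(10)]
train_set, dev_set = sample_few_shot(toy, n_shots=2)
print(len(train_set), len(dev_set))
```

Fixing the seed keeps the sampled shots reproducible, which matters when results are averaged over several runs with different samples.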

Training details

Used open-sourced HuggingFace language models
Trained models with batch size of 32, max sequence length of 128, dropout rate of 0.1
Used ExponentialLR learning rate scheduler with gamma value of 0.95 and Adam optimizer
Performed experiments on a V100 GPU
Reported results are average of five runs
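
The effect of the exponential schedule can be sketched in a few lines. With one decay step per epoch, the learning rate after `epoch` steps is the initial rate multiplied by `gamma ** epoch`, which is what PyTorch's `torch.optim.lr_scheduler.ExponentialLR` computes. The initial learning rate below is a hypothetical placeholder, since the summary does not state it.

```python
def exponential_lr(initial_lr, gamma, epoch):
    """Learning rate after `epoch` decay steps: initial_lr * gamma ** epoch."""
    return initial_lr * gamma ** epoch


INITIAL_LR = 1e-3  # hypothetical value; not given in the summary
GAMMA = 0.95       # decay factor reported in the training details

schedule = [exponential_lr(INITIAL_LR, GAMMA, epoch) for epoch in range(5)]
for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch}: lr = {lr:.6f}")
```

A gamma of 0.95 shrinks the learning rate by 5% per epoch, so it falls to roughly 60% of its initial value after ten epochs.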

Baselines

Compared method to different baselines
Used general-domain and domain-specific language models

Results

Prompting methods outperform fine-tuning in low-resource domains
Domain-specific LMs outperform general-domain LMs
SwitchPrompt outperforms other prompting methods
SwitchPrompt reduces the performance gap between general-domain and domain-specific LMs
P-Tuning outperforms SwitchPrompt in very-few-shot settings
SwitchPrompt outperforms fine-tuning and other prompting methods in general domain

Analysis

Ablation study shows importance of components of prompting function
Domain-specific keywords are automatically computed
Training time is reduced compared to alternative approaches
Qualitative error analysis shows errors when input sentences convey little domain-specific information

Conclusion

Proposed a new methodology called SwitchPrompt
Uses domain-specific keywords and gates to retrieve domain-specific knowledge
Outperforms baseline methods in few-shot and all-data settings
Reduces performance gap between general-domain and domain-specific language models
