Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

Language model-based KG embeddings are usually static and difficult to modify after deployment.
A new task of editing language model-based KG embeddings is proposed.
Four new datasets are built to evaluate existing models and a new model, KGEditor.
KGEditor can update specific facts without affecting the rest with low training resources.

Paper Content

Introduction

Knowledge Graphs (KGs) are multi-relational graphs with massive symbolic facts
KGs can provide back-end support for knowledge-intensive tasks
KG embedding approaches represent KGs in low-dimension vector spaces
Traditional KG embedding models are structure-based
Recent trend is to apply text descriptions with an expressive black-box model
New task of editing language model-based KG embeddings proposed
Three principles to evaluate performance of task: knowledge reliability, locality, efficiency
Four new datasets built for EDIT and ADD sub-tasks
Existing approaches suffer from limited ability to efficiently edit KG embeddings
Proposed approach KGEditor can modify incorrect knowledge or add new knowledge while preserving the rest

Editing factual knowledge

Editable training is an early model-agnostic attempt to quickly edit a trained model.
There are two different kinds of methodologies: external model-based editor and additional parameter-based editor.
Current approaches mainly aim to modify the knowledge in pre-trained language models.

Task definition

KG can be represented as (entities, relations, triples)
EDIT sub-task: change an outdated or incorrect triple to a new one, given as (head, relation, tail, target)
ADD sub-task: insert new triple into KG embeddings
Evaluation metrics: knowledge reliability, knowledge locality, knowledge efficiency
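
The task definition above can be made concrete with a small sketch. The names `EditRequest`, `success_at_k`, and `retained_at_k` are hypothetical and chosen for illustration; the paper's exact metric definitions may differ. The sketch assumes reliability is measured as the share of edited triples whose target entity ranks within the top k, and locality as the share of untouched triples that stay within the top k after editing.

```python
from dataclasses import dataclass
from typing import List, Tuple

# A KG triple as (head, relation, tail).
Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class EditRequest:
    """EDIT sub-task input: replace the outdated tail with the target entity."""
    head: str
    relation: str
    outdated_tail: str
    target: str

    def new_triple(self) -> Triple:
        # The triple the edited model is expected to predict.
        return (self.head, self.relation, self.target)


def success_at_k(target_ranks: List[int], k: int) -> float:
    """Reliability (assumed form): fraction of targets ranked in the top k (ranks start at 1)."""
    if not target_ranks:
        return 0.0
    return sum(1 for r in target_ranks if r <= k) / len(target_ranks)


def retained_at_k(ranks_before: List[int], ranks_after: List[int], k: int) -> float:
    """Locality (assumed form): of the untouched triples in the top k before editing,
    the fraction that are still in the top k afterwards."""
    kept = [(b, a) for b, a in zip(ranks_before, ranks_after) if b <= k]
    if not kept:
        return 1.0
    return sum(1 for _, a in kept if a <= k) / len(kept)
```

Efficiency, the third principle, would be measured separately (for example as tuned parameter count or editing time) and is not modeled here.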

Datasets construction

Built four datasets for the EDIT and ADD sub-tasks based on two benchmark datasets
Trained KG embedding models with language models and sampled hard triples as candidates
Removed triples that can be inferred from language models
Sampled long-tailed triples to handle skewed distribution
For EDIT sub-task, replaced entities in training set of FB15k-237 to build pre-train dataset
For ADD sub-task, used original training set of FB15k-237 to build pre-train dataset
Built reference dataset based on link prediction performance with rank value below k
All data reviewed by five human experts for ethical issues
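
Two of the construction steps above, selecting candidates by link prediction rank and sampling long-tailed triples, can be sketched as follows. This is an illustration under assumptions, not the authors' pipeline: the function names, the `hard_threshold` parameter, the inclusive boundary at k, and the use of inverse tail-entity frequency as the sampling weight are all hypothetical choices.

```python
from collections import Counter
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def split_by_rank(
    ranks: Dict[Triple, int], hard_threshold: int, k: int
) -> Tuple[List[Triple], List[Triple]]:
    """Split triples using the rank a trained KGE model assigns to the gold entity.

    Triples ranked worse than hard_threshold become hard candidates for editing;
    triples ranked at or within k form the reference set used to check locality.
    """
    hard = [t for t, r in ranks.items() if r > hard_threshold]
    reference = [t for t, r in ranks.items() if r <= k]
    return hard, reference


def long_tail_weights(triples: List[Triple]) -> Dict[Triple, float]:
    """Sampling weights that favor triples with rare tail entities.

    Each triple gets a weight proportional to 1 / frequency of its tail entity,
    normalized so the weights sum to 1.
    """
    freq = Counter(t[2] for t in triples)
    raw = {t: 1.0 / freq[t[2]] for t in triples}
    total = sum(raw.values())
    return {t: w / total for t, w in raw.items()}
```

The resulting weights could be passed to `random.choices` to draw a sample that is less dominated by frequent entities.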

Language model-based KGE

FT-KGE methods treat triples in KGs as textual sequences and use the representation of the [CLS] token to conduct binary classification
PT-KGE methods use natural language prompts to elicit knowledge from pre-trained language models for KG embeddings
Link prediction can be reformulated as a masked entity prediction task
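
The difference between the two input styles can be illustrated with plain string templates. The templates below are assumptions for illustration only; the actual templates used in the paper may differ, and in practice a BERT tokenizer inserts the special tokens itself rather than having them written by hand.

```python
def ft_kge_input(head: str, relation: str, tail: str) -> str:
    """FT-KGE style: the whole triple as one text sequence.

    A classifier on top of the [CLS] representation would then score
    the triple as plausible or not.
    """
    return f"[CLS] {head} [SEP] {relation} [SEP] {tail} [SEP]"


def pt_kge_prompt(head: str, relation: str) -> str:
    """PT-KGE style: a prompt with the tail entity masked.

    Link prediction becomes predicting the entity at the [MASK] position.
    """
    return f"[CLS] {head} {relation} [MASK] [SEP]"


if __name__ == "__main__":
    print(ft_kge_input("Barack Obama", "born in", "Honolulu"))
    print(pt_kge_prompt("Barack Obama", "born in"))
```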

Editing KGE baselines

KG embeddings can be edited using external model-based editors and additional parameter-based editors
KE uses a hyper network to predict weight updates during inference
MEND uses a low-rank decomposition of gradients to edit language models
CALINET extends the FFN with additional parameters for knowledge editing
KGEditor combines external model-based and additional parameter-based editors to edit KG embeddings
KGEditor uses an additional FFN layer and a hyper network to generate shift parameters
The hyper network is a bidirectional LSTM that encodes input triples
The FFN predicts vectors and a scalar to generate shift parameters
KGEditor is efficient in terms of computational resources and time
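
How vectors and a scalar might combine into shift parameters can be sketched as a gated rank-1 update. This is a deliberately simplified assumption and not the paper's exact formulation: the bidirectional LSTM and the FFN are omitted, and the vectors `u`, `v` and the scalar `eta` simply stand in for whatever the hyper network would predict. All function names are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def shift_parameters(u: List[float], v: List[float], eta: float) -> Matrix:
    """Build a weight shift as sigmoid(eta) * outer(u, v).

    u and v are the predicted vectors; eta is the predicted scalar, squashed
    to (0, 1) so it acts as a gate on the size of the update.
    """
    gate = 1.0 / (1.0 + math.exp(-eta))
    return [[gate * ui * vj for vj in v] for ui in u]


def apply_shift(weights: Matrix, delta: Matrix) -> Matrix:
    """Return a new matrix weights + delta without modifying the inputs."""
    if len(weights) != len(delta) or any(
        len(w_row) != len(d_row) for w_row, d_row in zip(weights, delta)
    ):
        raise ValueError("weights and delta must have the same shape")
    return [
        [w + d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weights, delta)
    ]
```

Because the shift is built from two vectors and one scalar rather than a full matrix, the editor only has to predict a small number of values per edited layer, which is one plausible reason such designs are cheap to train.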

Evaluation

Settings

Utilize four datasets for evaluation
Initialize KG embeddings with BERT base
Use train and test datasets for evaluation
Evaluate the ADD sub-task based on performance on the training set
Leverage PT-KGE as default setting in main experiments

Main results

KGEditor is compared to baselines and variants
Experiments assess effectiveness of various approaches to edit KG embeddings
Performance is verified with different numbers of target triples
The effect of different KGE initializations is explored
KGE ZSL fails to infer any facts on both datasets
Tuning all or partial parameters does not achieve satisfactory performance
KGEditor outperforms almost all baselines
KGE ZSL still fails to infer any facts on both datasets
Tuning all or partial parameters yields best performance but loses previously learned knowledge
KGEditor utilizes hyper network to guide parameter editing of FFN
KGEditor obtains better performance than baselines
Number of edits affects all models’ knowledge reliability
Visualization of entities before and after editing shows effectiveness of editing KG embeddings

Discussion and conclusion

Propose two sub-tasks of EDIT and ADD for KG embeddings
Aim to enable fast updates to KG embeddings after deployment
Related to studies of language models as knowledge bases
Contribute a new task with four datasets
Propose a simple yet strong baseline KGEditor
Future plans to develop more efficient approaches to edit more knowledge in one stage
