Link to paper
The full paper is available here.
You can also find the paper on PapersWithCode here.
Abstract
- Language models contain a significant body of factual knowledge.
- There is currently no mechanism for representing this knowledge.
- We propose a procedure for extracting a knowledge-graph from a language model.
- The procedure is composed of sub-tasks and designed prompts.
- Evaluation shows high precision (82-92%) and reasonable number of facts.
Paper Content
Introduction
- Modern language models are trained on vast amounts of text that captures human knowledge
- Language models can be viewed as knowledge-bases
- Representing the knowledge in a language model as an explicit knowledge graph is the challenge addressed in this paper
- A knowledge graph is a graph of entities and relations between them
- The goal is to uncover the knowledge-base of a given language model
- The approach decomposes the problem into multiple sub-tasks
- The approach is evaluated with GPT-3, leading to high-precision graphs
- The approach can generate facts outside the schema of WIKIDATA
- Contributions are formulating the problem, presenting a prompt-based approach, and evaluating the approach with GPT-3
Crawling kgs via prompting
- Generate relations for an entity
- Find objects for each relation
- Improve recall with paraphrasing
- Pool results to construct final graph
Relation generation
- Our task is to generate a set of relations for a given subject entity.
- We leverage WIKIDATA to construct in-context examples.
- We extract the set of WIKIDATA relations for each entity.
- We concatenate the target entity to the in-context examples.
- The output of the language model is used as the set of relations.
Relation paraphrasing
- A relation can be described in multiple ways.
- A procedure is used to obtain a set of paraphrases of the relation.
- In-context examples are not necessary for relation paraphrasing, an instruction prompt is sufficient.
Object generation
- Goal is to generate a set of objects for each relation in a knowledge graph
- Table 1 provides a list of sub-tasks and their corresponding query, prompt, and expected output
- “Pure Object Generation” prompt does not include “Don’t Know” output
- Entities and relations are taken from WIKIDATA
- Target entity-relation pair is concatenated to the examples
- High precision is desired, so “Don’t Know” is prompted when the model is likely to make an error
- Half of the examples in the “DK Object Generation” prompt are failure cases with “Don’t Know” and the other half are correct predictions
Experimental setup
- Used WIKIDATA to construct in-context prompts and select seed entities
- Split seed entities into validation and test sets
- Validation set used to make design choices
- Test set used for final evaluation
- Development set manually chosen from WIKIDATA
- 25 specific world-entity categories used to construct test set
- Evaluation metrics used to measure precision and recall
- OpenAI text-davinci-002 model used as LM
- Greedy decoding and sampling used
- Model variants include ‘Relation Generation’, ‘Pure Object Generation’, ‘DK Object Generation’, ‘Subject Paraphrasing’ and ‘Relation Paraphrasing’
Results
- Expansion method generates meaningful knowledge subgraphs
- Example graph shows all facts are sensible
- Results on main test set show precision of Pure-Greedy is too low to be useful, precision with LMCRAWL is much higher
- Results on head test set show precision of 91.5% with LMCRAWL for 1-hop graphs, 90.9% for 2-hop
- Don’t Know generation improves precision but reduces number of generated facts
- Paraphrasing increases coverage with minimal hit to precision
- Frequency of entities on the web is reflected in number of facts extracted
- Automatic approach for evaluating precision uses Google search
- 4.1% of triplets labeled as correct are actually wrong, 22% of triplets labeled as incorrect are true
- Pretrained LMs contain rich factual knowledge
- Knowledge-base construction typically involves manual and automated aspects
- Quantifying uncertainty in LMs is a key requirement for LM applicability
- Understanding large LMs is important to avoid factual errors and biases
- Extracting a structured KB from an LM is a key step towards understanding LMs
- Challenges include cost, error propagation, and measuring knowledge coverage