Zero-Shot Relation Extraction via Reading Comprehension

Omer Levy, Minjoon Seo, Eunsol Choi, and Luke Zettlemoyer. CoNLL 2017.

Task

Relation extraction systems populate knowledge bases with facts from an unstructured text corpus. When the types of facts (relations) are predefined, one can use crowdsourcing or distant supervision to collect examples and train an extraction model for each relation type. However, these approaches are incapable of extracting relations that were not specified in advance or observed during training. Our task focuses on extracting facts of new types that were neither specified nor observed a priori.

Full Paper (PDF)

In our paper, we show that it is possible to reduce relation extraction to the problem of answering simple reading comprehension questions. We map each relation type R(x,y) to at least one parametrized natural-language question q_x whose answer is y. For example, the relation educated_at(x,y) can be mapped to "Where did x study?" and "Which university did x graduate from?". Given a particular entity x ("Turing") and a text that mentions x ("Turing obtained his PhD from Princeton"), a non-null answer to any of these questions ("Princeton") asserts the fact and also fills the slot y. Here are a few examples:

Relation            Question Templates
educated_at(x,y)    Where did x graduate from?
                    In which university did x study?
                    What is x's alma mater?
occupation(x,y)     What did x do for a living?
                    What is x's job?
                    What is the profession of x?
spouse(x,y)         Who is x's spouse?
                    Who did x marry?
                    Who is x married to?

This reduction allows us to perform zero-shot learning: define new relations "on the fly", after the model has already been trained. More specifically, the zero-shot scenario assumes access to labeled data for N relation types. This data is used to train a reading comprehension model through our reduction. However, at test time, we are asked about a previously unseen relation type RN+1. Rather than providing labeled data for the new relation, we simply list questions that define the relation's slot values. Assuming we learned a good reading comprehension model, the correct values should be extracted.
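
To make the reduction concrete, here is a minimal sketch of the zero-shot extraction loop. The answer(question, sentence) function is a hypothetical stand-in for a trained reading comprehension model (our actual model is in the BitBucket repository below); the template dictionary simply lists questions for a relation that was never seen during training.

    import re

    def answer(question, sentence):
        """Hypothetical trained QA model: returns an answer span or None."""
        raise NotImplementedError

    # A previously unseen relation is defined at test time purely by listing
    # question templates for its slot; no labeled examples are needed.
    new_relation_templates = {
        "educated_at": [
            "Where did x graduate from?",
            "In which university did x study?",
            "What is x's alma mater?",
        ],
    }

    def extract(relation, x, sentence):
        """Return the slot value y for relation(x, y), or None if not found."""
        for template in new_relation_templates[relation]:
            question = re.sub(r"\bx\b", x, template)  # fill in the entity x
            y = answer(question, sentence)
            if y is not None:  # a non-null answer asserts relation(x, y)
                return y
        return None

    # e.g. extract("educated_at", "Turing", "Turing obtained his PhD from Princeton")
    # should return "Princeton" if the model generalizes to the unseen relation.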

From the perspective of reading comprehension, the goal of our task is to promote the development of new question answering models that are able to generalize to new relations.

Data

All our data is publicly available. Once unpacked, the files are tab-delimited UTF-8 text files. The first four columns are: the Wikidata relation, the Wikipedia entity (x), the question template, and the sentence. All subsequent columns (fifth onward) are answer spans within the sentence. If there are only four columns, the instance is a negative example; i.e. the question cannot be answered from the sentence.
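
As an illustration, here is a minimal sketch of reading one of these files under the column layout described above (nothing here is specific to our code):

    def read_examples(path):
        """Yield (relation, entity, template, sentence, answers) tuples.

        An empty answers list marks a negative example, i.e. the question
        cannot be answered from the sentence.
        """
        with open(path, encoding="utf-8") as f:
            for line in f:
                cols = line.rstrip("\n").split("\t")
                relation, entity, template, sentence = cols[:4]
                answers = cols[4:]  # fifth column onward: answer spans
                yield relation, entity, template, sentence, answers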

Zero-Shot Benchmark (493MB) - Tests whether a system can generalize to unseen relations. Contains 10 folds of train/dev/test sets, split and stratified by relation. This is the main benchmark.

All Benchmarks (1.1GB) - Contains two other (easier) benchmarks: unseen entities, unseen question templates.

Full Dataset (1.3GB) - The original set of positive and negative examples.

Code

Evaluation Script - python evaluate.py <test_set> <answer_file> reads the test set and the model's answers, and reports the precision, recall, and F1 scores. The answer file should contain one answer (a UTF-8 string) per row, corresponding to the rows of the test set.
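
For example, here is a minimal sketch of producing an answer file in this format. The predict(question, sentence) function is a hypothetical stand-in for a trained model, and writing an empty string for unanswerable questions is an assumption, not part of the script's documented interface.

    import re

    def write_answers(test_path, answer_path, predict):
        """Write one answer per row, aligned with the rows of the test set."""
        with open(test_path, encoding="utf-8") as test_file, \
             open(answer_path, "w", encoding="utf-8") as answer_file:
            for line in test_file:
                relation, entity, template, sentence = line.rstrip("\n").split("\t")[:4]
                question = re.sub(r"\bx\b", entity, template)  # fill the x slot
                # assumption: an empty string marks "no answer"
                answer_file.write(predict(question, sentence) + "\n")

    # Then score the predictions with:
    #     python evaluate.py <test_set> <answer_file>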

BitBucket Repository - Contains our reading comprehension model.