Linguistic Features for Relation Characterization
Almost all feature-based relation extraction methods extract features through syntactic and semantic analysis, which is usually performed at the sentence level. Syntactic analysis focuses on the grammatical representation of a sentence, whereas semantic analysis emphasizes understanding its meaning.
Syntactic dependency expresses the grammatical relationships among the words in a sentence. Moreover, the syntactic dependency path between two related words indicates the structure through which a relation is expressed.
Usually, a relation between two entities is expressed in a short context. Therefore, the shortest dependency path has proven effective for kernel-based relation extraction (Bunescu and Mooney, 2005; Zhang et al., 2006). Neural network based relation classification methods (Cai, Zhang, and Wang, 2016; Liu et al., 2015) have used syntactic dependency labels to capture features along the shortest path automatically.
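As an illustration of the idea, the following minimal sketch finds the shortest dependency path between two entity mentions by breadth-first search over an undirected view of the parse. The parse is written by hand and is hypothetical, the function name is illustrative, and words are assumed to be unique within the sentence; it is not the procedure of any of the cited systems.

```python
from collections import deque

# Hypothetical dependency parse of "Obama was born in Hawaii",
# written by hand as (head, label, dependent) triples.
# Assumption: each word occurs once, so the word form identifies the node.
EDGES = [
    ("born", "nsubjpass", "Obama"),
    ("born", "auxpass", "was"),
    ("born", "prep", "in"),
    ("in", "pobj", "Hawaii"),
]

def shortest_dependency_path(edges, source, target):
    """Return the words on the shortest undirected path, or None if unreachable."""
    neighbours = {}
    for head, _label, dep in edges:
        neighbours.setdefault(head, []).append(dep)
        neighbours.setdefault(dep, []).append(head)
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in neighbours.get(path[-1], []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None

sdp = shortest_dependency_path(EDGES, "Obama", "Hawaii")
print(sdp)
```

In this toy sentence the auxiliary "was" lies off the path, which shows how the shortest path discards words that do not connect the two arguments.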
However, Zhou et al. (2007) argued that in many cases shortest path trees cannot capture enough information for extracting relations. They proposed a context-sensitive shortest path that includes necessary information outside the shortest path. To capture useful context, Culotta and Sorensen (2004) proposed the smallest common subtree and Chowdhury, Lavelli, and Moschitti (2011) proposed a minimal subtree for extracting relations.
The consecutive dependency labels on the shortest path between two related entities form a pattern of a relation. Such patterns could be useful for trigger-independent relation extraction. Several patterns have been studied for extracting relations from text. Pershina et al. (2014) extracted dependency patterns of different relations and found a maximum pattern length of 3 to be most effective. An SVO pattern was used by Snow, Jurafsky, and Ng (2005) for extracting hypernym relations.
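A dependency pattern of this kind can be sketched as the sequence of labels met on the shortest path, each marked with the direction of travel. The parse below is hand-written and hypothetical, the "up:"/"down:" notation and the function name are assumptions made for illustration, and the default length cap of 3 only mirrors the value reported above rather than reproducing any cited method.

```python
from collections import deque

# Hypothetical hand-written parse of "Google acquired YouTube",
# given as (head, label, dependent) triples.
EDGES = [
    ("acquired", "nsubj", "Google"),
    ("acquired", "dobj", "YouTube"),
]

def dependency_pattern(edges, source, target, max_len=3):
    """Return the labels on the shortest path, or None if there is no path
    or the pattern has more than max_len labels."""
    steps = {}
    for head, label, dep in edges:
        steps.setdefault(dep, []).append((head, "up:" + label))     # dependent to head
        steps.setdefault(head, []).append((dep, "down:" + label))   # head to dependent
    queue = deque([(source, [])])
    visited = {source}
    while queue:
        node, labels = queue.popleft()
        if node == target:
            return labels if len(labels) <= max_len else None
        for nxt, step in steps.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, labels + [step]))
    return None

pattern = dependency_pattern(EDGES, "Google", "YouTube")
print(pattern)
```

The resulting two-label pattern contains no word forms, so it can match sentences that express the same structure with a different verb.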
The expression of a relation often includes a verb between the subject and the object. Part-of-speech (POS) tags provide useful information for identifying the syntactic roles of the words in a sentence. Therefore, POS tags have been widely used in pattern-based (Hearst, 1992; Fader, Soderland, and Etzioni, 2011), kernel-based (Nguyen, Moschitti, and Riccardi, 2009) and feature-based (Mintz et al., 2009; Surdeanu et al., 2012) relation extraction. Mintz et al. (2009) used windows of POS tags between the two arguments, before the first argument and after the second argument. Subgraphs of syntactic parse trees with POS tags were explored by Jiang and Zhai (2007) for extracting relations at the sentence level.
Syntactic analysis provides useful information for learning the grammatical structures in which relations are expressed. However, in most cases such information is not sufficient to characterize the semantic type of a relation.
Semantic analysis facilitates understanding the meaning of a text. Since a relation is usually expressed in a sentence, semantic analysis is important to interpret what type of relation it mentions.
Words between and around the mentions hold useful information for characterizing the relation type, especially for trigger-dependent relations. Therefore, such lexical information has been widely used for learning relation types. Kambhatla (2004), Jiang and Zhai (2007) and Mintz et al. (2009) used the words between the arguments of a relation. In addition to the words between the arguments, GuoDong et al. (2005) took into account the first word before the first argument and the first word after the second argument. Mintz et al. (2009) included windows of k words before the first argument and after the second argument in their distant supervision based relation learning model. Such windows of words have been adopted in several other studies (Riedel, Yao, and McCallum, 2010; Hoffmann et al., 2011; Surdeanu et al., 2012).
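The lexical windows described above can be sketched as follows. The sentence, the token spans and the function name are hypothetical; the sketch only shows how the words before, between and after two arguments might be collected for a window size k, and does not reproduce the feature set of any cited system.

```python
def window_features(tokens, arg1, arg2, k=2):
    """Collect lexical context around two arguments.

    arg1 and arg2 are (start, end) token spans with an exclusive end,
    and arg1 is assumed to precede arg2 in the sentence.
    """
    (s1, e1), (s2, e2) = arg1, arg2
    return {
        "before": tokens[max(0, s1 - k):s1],   # k words before the first argument
        "between": tokens[e1:s2],              # words between the arguments
        "after": tokens[e2:e2 + k],            # k words after the second argument
    }

tokens = "In 1998 , Larry Page founded Google in California .".split()
features = window_features(tokens, arg1=(3, 5), arg2=(6, 7), k=2)
print(features)
```

Here the word between the arguments, "founded", is the trigger that signals the relation type, while the outer windows add surrounding context.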
Traditional systems learn which words are useful for identifying the type of a relation. These systems cannot handle words that are unseen or sparse in the training data. In contrast, neural network based relation classification methods (Zeng et al., 2014; Nguyen and Grishman, 2015) take the sequence of words as input and perform semantic analysis based on word embeddings. Word embeddings facilitate characterizing the semantic type of a relation by computing the semantic similarity between words.
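Semantic similarity between embeddings is commonly measured with cosine similarity. The sketch below uses invented three-dimensional vectors in place of learned embeddings; the vocabulary, the vector values and the helper names are all assumptions made for illustration.

```python
import math

# Toy vectors invented for illustration; real embeddings are learned from
# large corpora and have far more dimensions.
EMBEDDINGS = {
    "founded": [0.9, 0.1, 0.0],
    "established": [0.8, 0.2, 0.1],
    "married": [0.0, 0.9, 0.3],
}

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def most_similar_trigger(word, triggers):
    """Return the known trigger word whose vector is closest to that of word."""
    return max(triggers, key=lambda t: cosine(EMBEDDINGS[word], EMBEDDINGS[t]))

best = most_similar_trigger("established", ["founded", "married"])
print(best)
```

With such a measure, a word absent from a list of known triggers can still be related to the closest known trigger, which is how embeddings soften the sparsity problem noted above.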
Collective and Statistical Analysis for Relation Extraction
Linguistic analysis is important for extracting relations at the sentence level. However, the relationship between two entities also depends on their co-existence and on the resources they have in common. Such information cannot be explored by linguistic analysis. In the relation validation task, corpus-level evidence, e.g. the co-occurrences of two entities and the resources they share, can be taken into account; we call this collective analysis.
Collection-level information has been explored for improving the performance of relation extraction by learning the boundaries of relation arguments (Rosenfeld and Feldman, 2007b). For web relation extraction, Augenstein (2016) took into account global information about the object of a relation, such as the occurrence of the object, markup links to the object and the title of the document containing the object.
Statistical analysis becomes important when relations are extracted in a collective manner. Niu et al. (2012) performed statistical inference on diverse data for learning relations. A probabilistic model of inference has also been explored by Fang and Chang (2011). Such a model counts the co-occurrences of subject-object pairs and the frequencies of relational tuples and patterns, and estimates their probabilities. Co-occurrence context has also been quantified by measuring mutual information for extracting relations between entities on the web (Xu et al., 2014).
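One common way to quantify co-occurrence is pointwise mutual information (PMI), PMI(x, y) = log2(p(x, y) / (p(x) p(y))), estimated from document-level counts. The sketch below assumes this measure and a hypothetical toy collection; the exact measures used in the cited works may differ.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical collection: each document is reduced to the set of entities it mentions.
DOCS = [
    {"Obama", "Hawaii"},
    {"Obama", "Hawaii", "Chicago"},
    {"Obama", "Chicago"},
    {"Paris", "France"},
]

def pmi_table(docs):
    """Map each co-occurring entity pair (alphabetically ordered) to its PMI in bits."""
    n = len(docs)
    single = Counter(entity for doc in docs for entity in doc)
    pair = Counter(p for doc in docs for p in combinations(sorted(doc), 2))
    return {
        (a, b): math.log2((count / n) / ((single[a] / n) * (single[b] / n)))
        for (a, b), count in pair.items()
    }

pmi = pmi_table(DOCS)
for entities, score in sorted(pmi.items()):
    print(entities, round(score, 3))
```

A positive score means the two entities co-occur more often than expected under independence, while a score near zero means their co-occurrence carries little evidence of a relation.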
Ensemble Learning for Relation Validation
The output of a single learning system is affected by statistical, computational and representational problems that can be partly overcome by ensemble learning (Dietterich, 2002). Therefore, the outputs of different relation extraction systems can be aggregated in an ensemble manner for further processing.
An ensemble method makes a decision or generates an output by analyzing the outcomes of multiple systems. Ensemble methods have been well studied in machine learning. Bagging (Breiman, 1996), stacking (Wolpert, 1992) and boosting (Freund and Schapire, 1995) are widely used ensemble learning algorithms. Bagging and boosting combine, by plain or weighted voting, the outcomes of learners trained on different samples of the training data: bootstrapped samples in bagging and successively reweighted data in boosting. In stacking, by contrast, the outputs of the first-round learners are used to train a second-round learner that produces the final outcome.
Voting is another kind of ensemble method, which takes into account the agreement among the outputs of different systems. Such methods have been studied in many decision-making tasks (Polikar, 2006; Boroushaki and Malczewski, 2010; Morais and Almeida, 2012; Cao et al., 2012).
Ensemble methods have been successfully used in many information extraction tasks. Yang et al. (2010) explored ensemble methods for solving various bioinformatics problems, such as identifying interactions between genes and predicting regulatory elements from DNA and protein sequences, and for identifying effective features. To recognize spoken emotions, Morrison, Wang, and De Silva (2007) employed both stacking and voting based ensemble methods. Moreover, Jean-Louis, Besançon, and Ferret (2011) performed majority voting on the outputs of three different methods for a template filling task. Their voting method obtained better results than any individual method.
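Majority voting over the outputs of several systems can be sketched in a few lines. The relation labels and the function name below are hypothetical, and ties are broken in favour of the label seen first, which is an arbitrary choice made for this illustration.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most frequent label and the share of systems that voted for it.

    predictions holds one label per system. On a tie, the label that appears
    first in the list wins (Counter.most_common keeps first-seen order).
    """
    counts = Counter(predictions)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(predictions)

# Hypothetical outputs of three systems for the same entity pair.
label, agreement = majority_vote(["per:spouse", "per:spouse", "per:siblings"])
print(label, round(agreement, 2))
```

The agreement share can serve as a simple confidence score, for example to accept an output only when more than half of the systems agree.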
Slot Filler Validation
Each year, the TAC KBP workshop series runs several tracks focusing on different problems. Slot filler validation (SFV) is one of these tracks; it focuses on validating relations in order to improve slot filling scores. Basically, an SFV system examines whether a response of a slot filling system holds any evidence that justifies the relation claimed in the response. Thus, SFV amounts to the task of relation validation.
In the slot filling task, a system generates outputs in response to given queries. When multiple systems respond to the same set of queries, their outputs can be analyzed to select the correct responses by validating them in an ensemble fashion. Viswanathan et al. (2015) used a stacking classifier to improve KBP slot filling performance. They trained the classifier on the offsets and confidence scores of the responses of different slot filling systems. A comparative study of different models for slot filling by Adel, Roth, and Schütze (2016) found that a combination of state-of-the-art and neural network models achieves a higher score than any single model. Accordingly, the outcomes of different slot filling systems were taken into account by Wang et al. (2013) and Sammons et al. (2014) for the SFV task. Sammons et al. simply applied majority voting over the fillers and achieved F-scores between 45.70 and 48.0. In contrast, Wang et al. used the confidence scores of the responses of different slot filling systems to solve a constraint optimization problem for validating the responses. Moreover, the contributions of the systems, the referenced documents and the filler values corresponding to a slot filling query were taken into account by Yu et al. (2014). They employed a multi-dimensional truth finding model to compute the credibility of a system, a document and a filler value. Their method also used some linguistic indicators, such as the filler type, the inspection of trigger words and the dependency path length in the text justifying the claimed relation.
A bipartite graph-based consensus maximization (BGCM) method was proposed by Rodriguez, Goldberg, and Wang (2015) that combines the outputs of supervised stacked ensemble methods and slot filling runs. This method outperforms all other ensemble methods and the best slot filling run on the 2015 KBP slot filling dataset. It was extended in Rodríguez and Wang (2016), where the consensus maximization technique is employed over multiple knowledge bases. These two SFV methods did not use any linguistic information but achieved better scores than the best SF systems.
The performances of some top-ranked SFV ensemble systems are shown in Table 2.5. The highest F-score of 61.72 was achieved on the KBP-2013 dataset. However, the best SFV F-scores on the KBP-2014, KBP-2015 and KBP-2016 datasets were 48.00, 34.83 and 32.42, respectively. On the other hand, the best F-scores of the SF systems on the same datasets (KBP-2013 to KBP-2016) were 37.28, 36.72, 28.75 and 27.03, respectively. Although SFV systems improve on the scores of the SF systems, SFV scores have also decreased over the years, which indicates that SFV is also a very challenging task.
Graph based Methods for Relation Validation
Mentions of entities in a collection of texts can be represented in a graph according to their co-existence in texts and the resources they share. Such a graph makes it possible to explore how related entities are associated and what common information they share.
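A graph of this kind can be sketched as a weighted adjacency map in which two entities are linked whenever they are mentioned in the same document. The documents, entity names and helper functions below are hypothetical and only illustrate the representation; they do not reproduce a particular graph construction.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical collection: document identifier mapped to the entities it mentions.
DOCS = {
    "d1": ["Obama", "Hawaii"],
    "d2": ["Obama", "Michelle", "Chicago"],
    "d3": ["Michelle", "Chicago"],
}

def build_entity_graph(docs):
    """Undirected graph; each edge weight counts the documents where both entities occur."""
    graph = defaultdict(lambda: defaultdict(int))
    for entities in docs.values():
        for a, b in combinations(sorted(set(entities)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph

def common_neighbours(graph, a, b):
    """Entities linked to both a and b, i.e. the information the two entities share."""
    return set(graph.get(a, {})) & set(graph.get(b, {}))

graph = build_entity_graph(DOCS)
print(graph["Chicago"]["Michelle"], common_neighbours(graph, "Obama", "Michelle"))
```

Edge weights capture how often two entities co-exist, and shared neighbours give a simple view of what a candidate pair has in common.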
Table of contents:
1.1 Research Objective
2 Literature Review
2.1 Slot Filling Task
2.2 Slot Filling Systems
2.3 Relation Extraction
2.3.1 Relation Extraction Methods
2.3.2 Linguistic Features for Relation Characterization
2.3.3 Collective and Statistical Analysis for Relation Extraction
2.4 Relation Validation
2.4.1 Ensemble Learning for Relation Validation
2.4.2 Graph based Methods for Relation Validation
3 Entity Graph and Measurements for Relation Validation
3.1 Graph Definition
3.2 Entity Graph and Graph Database
3.3 Graph Construction
3.4 Measurements on Graph
3.4.1 Node Centrality
3.4.2 Mutual Information
3.4.3 Network Density
3.4.4 Network Similarity
3.5 Relation Validation by Graph Analysis
4 Linguistic Characteristics of Expressing and Validating Relations
4.1 Linguistically Motivated Classification of Relation
4.2 Syntactic Modeling
4.2.1 Syntactic Dependency Analysis
4.2.2 Dependency Patterns and Edit Distance
4.3 Lexical Analysis
4.3.1 Trigger Word Collection
4.3.2 Word Embeddings
4.3.3 Recognition of Trigger Words
4.4 Syntactic-Semantic Fusion
4.5 Evaluation of Word-embeddings
5 Relation Validation Framework
5.1 Relation Validation Model
5.1.1 Relation Validation Features
5.1.2 Relation Validation System Overview
5.2 Corpus and Preprocessing
5.2.1 KBP Slot Filling Corpora
5.2.2 KBP Slot Filling Responses and Snippet Assessments
5.3 Evaluation Metrics
6 Experiments and Results
6.1 Participation to TAC KBP-2016 SFV Task
6.1.1 Evaluation of Different Feature Groups
6.1.2 Relation Validation Models for KBP-2016 SFV Task
6.2 System Investigation
6.2.1 Statistical Difference Between TAC KBP Evaluation Datasets in 2015 and 2016
6.2.2 Impact of the Trustworthy Features
6.2.3 Impact of Trigger Words in the Slot Filling Responses
6.2.4 Identifying the Reason of Failure to Compute Graph Features
6.2.5 Conclusion and Plans for Improving the System
6.3 Supervised Relation Validation and Knowledge Base Population
6.3.1 Enlarging the Training and Testing Datasets
6.3.2 Relation Validation Models
6.3.3 Knowledge Base Population by Employing Relation Validation Models
6.4 An Experiment of Unsupervised Relation Validation and Knowledge Base Population
6.4.1 PageRank Algorithm
6.4.2 Graph Modeling
7 Conclusion and Future Work
7.2 Future Work