Natural Language Processing


Linguistic Notion of Entailment

The application-oriented notion of textual entailment is related, of course, to the classical logic-based notion of entailment in linguistics. A common definition of entailment in formal semantics specifies that a text T entails another text H (hypothesis, in our terminology) if H is true in every circumstance (possible world) in which T is true. However, the TE definition allows for cases in which the truth of the hypothesis is, for most practical purposes, highly plausible (“most likely true”) rather than certain.
In propositional and predicate logic, entailment (or logical implication) describes a relation between one sentence or a set of sentences – the entailing expressions – represented as formulae of a formal language, and another sentence that is entailed. Formally, given a set of formulae Γ = {A1, …, An} and a formula B, we say that Γ semantically entails B (Γ ⊨ B) if and only if every model (or interpretation) of A1, …, An is also a model of B. The Venn diagram of this relationship is shown in Figure 1.1.
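To make this model-theoretic definition concrete, here is a minimal sketch that checks propositional entailment by enumerating all truth assignments; the encoding of formulae as Boolean functions and the atom names p and q are illustrative choices, not part of any cited formalism.

```python
# Gamma |= B holds iff every truth assignment satisfying all premises
# also satisfies the conclusion. Formulae are Python functions over a
# dict mapping atom names to booleans (names are hypothetical).
from itertools import product

def entails(premises, conclusion, atoms):
    """Return True iff every model of the premises is a model of the conclusion."""
    for values in product([False, True], repeat=len(atoms)):
        world = dict(zip(atoms, values))          # one candidate model
        if all(p(world) for p in premises):       # world satisfies A1..An
            if not conclusion(world):             # ...but falsifies B
                return False
    return True

# {p, p -> q} |= q (modus ponens), while {q} does not entail p.
p = lambda w: w["p"]
q = lambda w: w["q"]
p_implies_q = lambda w: (not w["p"]) or w["q"]

print(entails([p, p_implies_q], q, ["p", "q"]))  # True
print(entails([q], p, ["p", "q"]))               # False
```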

Variants of the Entailment

As already mentioned in this thesis, we argue that there are several types of entailment. For example, in the study by Pazienza et al. (2005), three types of entailment are defined:

  1. Semantic Subsumption – T and H express the same fact, but the situation described in T is more specific than the situation in H. The specificity of T is expressed through one or more semantic operations. For example, in the sentential pair:
  • H: The cat eats the mouse.
  • T: The cat devours the mouse.

T is more specific than H, as eat is a semantic generalization of devour.

  2. Syntactic Subsumption – T and H express the same fact, but the situation described in T is more specific than the situation in H. The specificity of T is expressed through one or more syntactic operations. For example, in the pair:
  • H: The cat eats the mouse.
  • T: The cat eats the mouse in the garden.

T contains a specializing prepositional phrase.

  3. Direct Implication – H expresses a fact that is implied by a fact in T. For example:
  • H: The cat killed the mouse.
  • T: The cat devours the mouse.

H is implied by T, as killing is supposed to be a precondition for devouring. In (Dagan & Glickman, 2004), syntactic subsumption roughly corresponds to the restrictive extension rule, while direct implication and semantic subsumption correspond to the axiom rule.
According to Pazienza et al. (2005), unlike the two types of subsumption entailment, direct implication requires deeper semantic and discourse analysis. In most cases, as implication concerns two distinct facts in T and H, and as facts are usually expressed through verbs, it follows that the implication phenomenon is strictly tied to the relationship between the T and H verbs. In particular, it is interesting to notice the temporal relation between the T and H verbs, as described in (Miller, 1995). Two verbs are said to be in temporal inclusion when the action of one verb is temporally included in the action of the other (e.g. snore → sleep). Backward-presupposition holds when the H verb happens before the T verb (win entails play). In causation, a stative verb in H necessarily follows a verb of change in T (e.g. give → have); in this case, the temporal relation is thus inverted with respect to backward-presupposition. Such considerations leave room for the application of temporal and verb analysis techniques in both the acquisition and recognition tasks.
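As a small illustration of how such verb relations can be probed in practice, the sketch below queries WordNet through NLTK, whose verb synsets expose entailment and cause links; WordNet's coverage of these relations is sparse, so this is an illustrative probe rather than a recognition component from the cited works.

```python
# Probing verb-to-verb relations (temporal inclusion, causation) in WordNet.
# Requires the WordNet data: nltk.download('wordnet')
from nltk.corpus import wordnet as wn

# Temporal inclusion: snoring is temporally included in sleeping.
print(wn.synset('snore.v.01').entailments())   # [Synset('sleep.v.01')]

# Causation: a change-of-state verb linked to the resulting state.
print(wn.synset('kill.v.01').causes())         # [Synset('die.v.01')]
```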
Ultimately, we want to regard entailment by generality as a relation between utterances (that is, sentences in context), where the context is relevant to understanding the meaning. Then, considering the study in (Pazienza et al., 2005), we understand that the entailment by generality relation can be compared with one of three relations:

  • Semantic Subsumption;
  • Syntactic Subsumption;
  • or a combination of the two – Semantic Subsumption + Syntactic Subsumption.

For us, in the most common definition, Entailment by Generality can be defined as the entailment from a specific sentence towards a more general sentence (Dias et al., 2011; Pais et al., 2011).

Context of Textual Entailment

Natural languages allow the same meaning to be expressed in many different ways, making automatic understanding particularly challenging. Almost all computational linguistics tasks, such as IR, QA, IE, text summarization and MT, have to cope with this phenomenon.
Within the TE framework, a text T is said to entail a textual hypothesis H if the truth of H can be inferred from T. This means that most people would agree that the meaning of T implies that of H. Somewhat more formally, we say that T entails H when some representation of H can be “matched” with some (or part of a) representation of T, at some level of granularity and abstraction.
Dagan & Glickman (2004) define TE as a relationship between a coherent textual fragment T and a language expression, which is considered as a hypothesis H. Entailment holds (i.e. T → H) if the meaning of H can be inferred from the meaning of T, as interpreted by a typical language user. This relationship is directional and asymmetric, since the meaning of one expression usually entails the other, while entailment in the other direction is less certain.
This definition of textual entailment captures quite broadly the reasoning about language variability needed by different applications aimed at natural language understanding and processing (Androutsopoulos & Malakasiotis, 2010; Dagan et al., 2009). For instance, a QA system has to identify texts that entail the expected answer. Given the question “Who painted the Mona Lisa?”, the text “Among the works created by Leonardo da Vinci in the 16th century is the small portrait known as the Mona Lisa or la ‘Gioconda’” entails the expected answer “Leonardo da Vinci painted the Mona Lisa”. Similarly, in IR relevant documents should entail the combination of semantic concepts and relations denoted by the query. In IE, entailment holds between different text variants expressing the same target relation (Romano et al., 2006). In text summarization, an important processing stage is sentence extraction, which identifies the most important sentences of the texts to be summarized; especially when generating a single summary from several documents (Barzilay & McKeown, 2005), it is important to avoid selecting sentences that convey the same information as other sentences that have already been selected (i.e. that entail such sentences). Also in MT, an entailment relation should hold:


  1. between machine-generated translations and human-authored ones that may use different phrasings in the evaluation phase (Padó et al., 2009b), or
  2. in the translation phase, between source language words and longer phrases that have not been encountered in training corpora (Mirkin et al., 2009).

Other applications that could benefit from such an inference model are reading comprehension systems (Nielsen et al., 2009).
Below, we give a few variants of informal definitions for textual entailment.

  • Dagan et al. (2005) […] a text T entails a hypothesis H if, typically, a human reading T would infer that H is most likely true;
  • A definition of entailment in formal semantics (Chierchia & McConnell-Ginet, 2000) reads – A text T entails another text H if H is true in every circumstance (possible world) in which T is true.

Several definitions are given by the participants in various RTE challenges:

  • T entails H if we have a sequence of transformations applied to T such that we can obtain H with an overall cost below a certain threshold, empirically estimated on the training data (Kouylekov & Magnini, 2005) – see the sketch after this list;
  • If BLEU’s output is higher than a threshold value, the entailment is marked as TRUE, otherwise as FALSE (Pérez & Alfonseca, 2006);
  • T entails H if we succeed in extracting a maximal subgraph of the dependency graph of T that is in an isomorphism relation with a subgraph of the dependency graph of H (Pazienza & Pennacchiotti, 2005);
  • In the guidelines of the RTE-4 Challenge – T entails H if the truth of H can be inferred from T within the context induced by T.
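To illustrate the cost-threshold style of decision shared by several of the definitions above, here is a minimal sketch that substitutes word-level edit distance for the tree edit distance of Kouylekov & Magnini (2005); the threshold value is an assumed placeholder that would, as they describe, be estimated on training data.

```python
# Cost-threshold entailment decision: transform T into H and accept the
# pair if the total transformation cost stays below a learned threshold.
def word_edit_cost(t_words, h_words):
    """Dynamic-programming edit distance over word sequences."""
    m, n = len(t_words), len(h_words)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if t_words[i - 1] == h_words[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # delete a word of T
                          d[i][j - 1] + 1,        # insert a word of H
                          d[i - 1][j - 1] + sub)  # substitute
    return d[m][n]

def entails(t, h, threshold=4):  # threshold: assumed, not a reported value
    return word_edit_cost(t.split(), h.split()) <= threshold

print(entails("The cat eats the mouse in the garden", "The cat eats the mouse"))
# True: three deletions suffice, cost 3 <= 4
```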

Probabilistic Textual Entailment

In many intuitive cases, the textual entailment recognition task may be perceived as being deterministic (Glickman & Dagan, 2005). For example, given the hypothesis h1 = “Harry was born in Iowa” and a candidate text t1 that includes the sentence “Harry’s birthplace is Iowa”, it is clear that t1 does (deterministically) entail h1, and humans are likely to have high agreement regarding this decision. In many other cases, though, entailment inference is uncertain and has a probabilistic nature. For example, a text t2 that includes the sentence “Harry is returning to his Iowa hometown to get married.” does not deterministically entail the above h1, since Harry might have moved to Iowa as a child. Yet, it is clear that t2 does add substantial information about the correctness of h1. In other words, the probability that h1 is indeed true given the text t2 ought to be significantly higher than the prior probability of h1 being true. More specifically, we might say that the probability p of h1 being true should be estimated based on the percentage of cases in which someone’s reported hometown is indeed his or her birthplace. Accordingly, we would not accept t2 as a definite assessment of the truth of h1. However, in the absence of other definite information, t2 may partly satisfy our information need for an assessment of the probable truth of h1, with p providing a confidence probability for this inference.
Meanings are captured in the Glickman & Dagan (2005) model by hypotheses and their truth values. Let T denote the space of possible texts and t ∈ T a specific text, and let H denote the set of all possible hypotheses. A hypothesis h ∈ H is a propositional statement which can be assigned a truth value. For now it is assumed that h is represented as a textual statement, but in principle other representations for h may fit their framework as well. A semantic state of affairs is captured by a possible world w : H → {0, 1}, defined as a mapping from H to {0 = False, 1 = True} and representing w’s concrete truth-value assignments for all possible propositions. Accordingly, W denotes the set of all possible worlds.
Glickman & Dagan (2005) present a first attempt to define a generative probabilistic setting for TE, which allows a clear formulation of probability spaces and concrete probabilistic models for this task. According to their definition, a text t probabilistically entails a hypothesis h (t → h) if t increases the likelihood of h being true, i.e. if P(Trh = 1|t) > P(Trh = 1), where Trh is the random variable whose value is the truth value assigned to h in a given world.
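The following sketch illustrates this probabilistic criterion in the spirit of the lexical model of Glickman & Dagan (2005), approximating P(Trh = 1|t) as a product, over the words of h, of the best lexical entailment probability offered by any word of t; the probability table, smoothing value and prior are toy stand-ins, not their corpus-based estimates.

```python
# Toy lexical entailment probabilities P(word_h is true | word_t);
# real values would be estimated from corpus co-occurrence statistics.
LEX_PROB = {
    ("harry", "harry"): 1.0,
    ("born", "hometown"): 0.7,   # a reported hometown is often the birthplace
    ("iowa", "iowa"): 1.0,
}

def lexical_entailment_prob(t_words, h_words):
    """Product over h-words of the best probability offered by any t-word."""
    prob = 1.0
    for u in h_words:
        best = max(LEX_PROB.get((u, v), 0.01) for v in t_words)  # 0.01: smoothing
        prob *= best
    return prob

t = ["harry", "is", "returning", "to", "his", "iowa", "hometown"]
h = ["harry", "born", "iowa"]
p_h_given_t = lexical_entailment_prob(t, h)   # 0.7 under the toy table
prior = 0.001                                 # assumed prior P(Tr_h = 1)
print(p_h_given_t > prior)                    # True: t probabilistically entails h
```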
From this applied, empirical perspective, textual entailment therefore represents an uncertain – but highly plausible – relation of a probabilistic nature.


Recognizing Textual Entailment

The RTE task, as defined by Dagan et al. (2005), and established in the RTE Challenges, is formulated as follows:
Textual entailment is defined as a directional relationship between pairs of text expressions, denoted by T (the entailing “Text”) and H (the entailed “Hypothesis”). We say that T entails H if humans reading T would typically infer that H is most likely true.
As noted by Dagan et al. (2005), this definition is based on common human understanding of language, much like the definition of any other language understanding task. Accordingly, it enables the creation of gold-standard evaluation data sets for the task, where humans can judge whether the entailment relation holds for given Text-Hypothesis pairs. This setting is analogous to the creation of gold standards for other text understanding applications like QA and IE, where human annotators are asked to judge whether the target answer or relation can indeed be inferred from a candidate text. The distinguishing characteristic of the textual entailment task is that it captures textual inference in a generic, application-independent manner. This allows research to focus on core inference issues, while making the results applicable across application areas.
Similar to other semantic annotation tasks, such as those mentioned above, the RTE judgment criterion has some fuzziness with respect to “what a person would typically infer”, particularly in boundary cases. However, the various RTE annotation efforts have shown that sufficiently consistent human judgments can be obtained, allowing research progress on this task (Dagan et al., 2013).
Also, this task captures generically a broad range of inferences that are relevant for multiple applications. For example, a QA system has to identify texts that entail the expected answer. Given the question “Who is John Lennon’s widow?”, the text “Yoko Ono unveiled a bronze statue of her late husband, John Lennon, to complete the official renaming of England’s Liverpool Airport as Liverpool John Lennon Airport” entails the expected answer “Yoko Ono is John Lennon’s widow”. Similarly, the semantic inference needs of other text-understanding applications such as IR, IE and MT evaluation can be cast as entailment recognition (Candela et al., 2006). A necessary step in transforming textual entailment from a theoretical idea into an active empirical research field was the introduction of benchmarks and an evaluation forum for entailment systems.

Our Proposal for RTE by Generality

We introduce the paradigm of TE by Generality, which can be defined as the entailment from a specific sentence towards a more general sentence. For example, from sentences (1) and (2) extracted from RTE-1, we would easily state that (1) → (2) as their meaning is roughly the same although sentence (2) is more general than sentence (1).

  • (1) Mexico City has a very bad pollution problem because the mountains around the city act as walls and block in dust and smog.
  • (2) Poor air circulation out of the mountain-walled Mexico City aggravates pollution.

To understand how Textual Entailment by Generality can be modeled for two sentences, we propose a new paradigm based on a new Informative Asymmetric Measure (IAM), called the Asymmetric InfoSimba Similarity (AIS) measure. Instead of relying on exact word matches between texts, we propose that one sentence infers the other in terms of generality if and only if two constraints hold:

  • (a) both sentences share many related words; and
  • (b) most of the words of one sentence are more general than the words of the other sentence.

As far as we know, we are the first to propose an unsupervised, language-independent, threshold-free methodology in the context of TE by Generality, although the approach of Glickman & Dagan (2005) is based on similar assumptions. This new proposal is exhaustively evaluated against the first five RTE datasets by testing different Asymmetric Association Measures (AAM) in combination with the AIS, in particular on RTE-1, as it is the only dataset for which there exist comparable results with linguistics-free methodologies (Bayer et al., 2005; Glickman & Dagan, 2005; Perez et al., 2005).
Finally, we propose to avoid the definition of a “hard” threshold and to study asymmetry in language exhaustively, i.e. not just through the conditional probability as done in Glickman & Dagan (2005). For that purpose, we propose a new IAM, the AIS, combined with different Association Measures.
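As a rough illustration of this threshold-free, directional comparison (and emphatically not the exact AIS formula developed later in this thesis), the sketch below aggregates a toy asymmetric word association table into sentence-level scores and compares the two directions; real association scores would come from corpus co-occurrence counts.

```python
# Toy Asymmetric Association Measure (AAM) scores between words; the
# asymmetry encodes generality: "devour" strongly suggests "eat", but
# "eat" only weakly suggests "devour".
AAM = {
    ("eat", "devour"): 0.9,
    ("devour", "eat"): 0.3,
    ("cat", "cat"): 1.0,
    ("mouse", "mouse"): 1.0,
}

def asymmetric_similarity(s1, s2):
    """Average, over words of s1, of the best association with a word of s2."""
    return sum(max(AAM.get((w1, w2), 0.0) for w2 in s2) for w1 in s1) / len(s1)

def entails_by_generality(t, h):
    # Threshold-free decision: T entails the more general H when the
    # H-to-T direction scores at least as high as the T-to-H direction.
    return asymmetric_similarity(h, t) >= asymmetric_similarity(t, h)

t = ["cat", "devour", "mouse"]      # more specific sentence
h = ["cat", "eat", "mouse"]         # more general sentence
print(entails_by_generality(t, h))  # True under the toy scores
```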

Table of Contents:

List of Figures
List of Tables
List of Abbreviations
1 Introduction 
1.1 Natural Language Processing
1.1.1 Historical Review
1.1.2 Applications of Natural Language Processing
1.2 Motivation and Rationale
1.3 Linguistic Notion of Entailment
1.3.1 Variants of the Entailment
1.4 Context of Textual Entailment
1.4.1 Probabilistic Textual Entailment
1.4.2 Recognizing Textual Entailment
1.5 Our Proposal for RTE by Generality
1.6 Structure of the Thesis
2 Related Work 
2.1 Overview of the First Five RTE Challenges
2.1.1 Datasets and Annotations
2.1.1.1 RTE-1
2.1.1.2 RTE-2
2.1.1.3 RTE-3
2.1.1.4 RTE-4
2.1.1.5 RTE-5
2.1.1.6 Summary
2.1.2 Relevant Resources and Tools
2.1.2.1 Evaluation Measures
2.1.2.2 First Challenge
2.1.2.3 Second Challenge
2.1.2.4 Third Challenge
2.1.2.5 Fourth Challenge
2.1.2.6 Fifth Challenge
2.1.2.7 Summary
2.2 Unsupervised Language-Independent Methodologies for RTE
3 Corpus construction
3.1 Crowdsourcing
3.2 Quality Control of Crowdsourced Data
3.3 Building Methodology
3.4 Quantitative Analysis
4 Our Methodology for RTE by Generality 
4.1 Contextual Word Similarity
4.1.1 Applications of Word Similarity
4.1.2 Co-occurrence relations
4.1.2.1 Non-grammatical relations
4.1.3 Asymmetric Word Similarities
4.1.3.1 Asymmetric Association Measures
4.1.3.2 Asymmetric Attributional Word Similarities
4.2 Asymmetry between Words
4.3 Asymmetry between Sentences
4.4 Three Levels of Pre-Processing
4.4.1 Multiword Units Identification
4.5 Sample Calculation for Identifying Entailment by Generality
5 Evaluating the Performance of our Methodology 
5.1 Evaluation Scheme
5.1.1 Measures to Evaluate the Performance
5.2 All pairs of the Test Set of the first five RTE Challenges
5.2.1 All Words
5.2.2 Without Stop Words
5.2.3 With Multiword Units
5.2.4 Summary
5.3 Corpus TE by Generality
5.3.1 All Words
5.3.2 Without Stop Words
5.3.3 With Multiword Units
5.3.4 Summary
5.4 Corpus TE by Generality translated into Portuguese
5.4.1 All Words
5.4.2 Without Stop Words
5.4.3 With Multiword Units
5.4.4 Summary
5.5 Qualitative Analysis
6 Conclusion and Future Work 
6.1 Recapitulation
6.2 Future Research
References 
Appendices 

