Strengthened interpretations in child language


A psycholinguistic study of the exhaustive readings of embedded questions


What is the semantic content of a question? As pointed out by Karttunen (1977), declarative sentences that embed interrogative complements (such as “John knows which students called”) can provide relatively easy access to the semantics of questions. Recent theories attribute different readings to such sentences and their predictions depend in various ways on the embedding verb (‘know’ in this example). Through a series of four experiments, we provide quantitative offline data to evaluate critical judgments from the literature. We show that the so-called strongly exhaustive reading is not the only available reading for ‘know’, providing an argument against approaches inspired by Groenendijk and Stokhof (1982, 1984). We also describe processing data which may further constrain theories, provided hypotheses about the derivation processes are made explicit.


The meaning of questions

Our goal is to provide quantitative data to help decide between different semantics which have been proposed for questions.
We cannot study the meaning of questions, such as (1), as we traditionally study the meaning of declarative sentences, such as (2). Declarative sentences can be described by their truth conditions (Frege, 1892; Tarski, 1935, 1956). Knowing the meaning of (2) may be reduced to knowing which situations make this sentence true and which situations make it false. For questions, we cannot define truth conditions. It would not make any sense to say that (1) is true or false in a given situation.
(1) Which students called?
(2) Mary called.
Hamblin (1958, 1973) proposed that the meaning of a question resides in its answerhood conditions. Knowing the meaning of (1) is equivalent to knowing what counts as an answer to (1). Because answers are propositions, with truth conditions, questions can be studied within the general framework of truth-conditional semantics. The problem is then reduced to identifying the set of possible answers to a question. This tells us what to look for, but not how to look for it.

Embedding questions as a way to study them

Some verbs such as know can embed questions, as in (3), just as well as declarative sentences, as in (4).
(3) John knows which students called.
(4) John knows that Mary called.
As Karttunen (1977) pointed out, the meaning of (3) must be related to the meaning of the question (1) (and similarly (4) must be related to (2)). More specifically, (3) seems to be true in situations where John is able to answer the question (1). Because (3) is a declarative sentence, it can receive a truth value and we know how to study it. Thereby, sentences like (3) provide some access to the semantics of the question (1). As we will see, this provides an entry point into understanding questions both from a theoretical and from a psycholinguistic point of view.
Among all the predicates that can embed questions, we will focus on those like know, which are called veridical responsive predicates. As we saw, these verbs can embed both interrogative and declarative complements. They are interesting because the meaning of (3) is usually assumed to be reducible to something of the form “John knows that p” for some proposition p (namely, an answer to the question). If this is the case, everything we know about the declarative-embedding know can be applied to the question-embedding know.

Different readings for embedded questions

Let us first review the bare facts. Several readings have been associated with sentences such as (3) in the literature. We can distinguish between readings which are called exhaustive and those which are called non-exhaustive. In the first part of this section we will present the different exhaustive readings, which are the focus of our study. Their distribution has been strongly debated in the theoretical literature, and our main goal will be to provide quantitative data that speak to their respective distributions. The second part of this section presents an example of a non-exhaustive reading. Non-exhaustive readings will not be given a prominent role here (although we will briefly discuss their potential role in our first experiment).
Exhaustive readings: The different exhaustive readings of (3) described in the literature are presented in (5). They are all called exhaustive readings because they require that John have complete knowledge about the students who actually called. They differ on the required knowledge about students who did not call. The Strongly Exhaustive (SE) reading described in (5a) requires complete knowledge about both callers and non-callers. It entails the Intermediate Exhaustive (IE) reading (5b), which merely requires absence of false beliefs about non-callers. The IE reading in turn entails the Weakly Exhaustive (WE) reading (5c), which does not require anything beyond ‘exhaustivity’, that is, true beliefs about the students who actually called.
(5) Exhaustive readings of (3):
a. Strongly exhaustive (SE): For each student who called, John knows that she called, and he knows that no other student called.
b. Intermediate exhaustive (IE): For each student who called, John knows that she called, and John does not have false beliefs about students who didn’t call.
c. Weakly exhaustive (WE): For each student who called, John knows that she called.
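The three definitions can be made concrete with a small sketch. It is only an illustration under simplifying assumptions that are not part of the original text: knowledge is modeled as true belief, John's attitude towards each student is one of three values (believes she called, believes she did not, no opinion), and all names and function signatures are hypothetical.

```python
from typing import Dict, Optional, Set

# Hypothetical encoding of John's attitude towards each student:
# True = believes she called, False = believes she did not call, None = no opinion.
Beliefs = Dict[str, Optional[bool]]


def weakly_exhaustive(callers: Set[str], beliefs: Beliefs) -> bool:
    """WE: John correctly believes of every actual caller that she called."""
    return all(beliefs.get(s) is True for s in callers)


def intermediate_exhaustive(callers: Set[str], students: Set[str], beliefs: Beliefs) -> bool:
    """IE: WE, plus no false belief that some non-caller called."""
    no_false_belief = all(beliefs.get(s) is not True for s in students - callers)
    return weakly_exhaustive(callers, beliefs) and no_false_belief


def strongly_exhaustive(callers: Set[str], students: Set[str], beliefs: Beliefs) -> bool:
    """SE: WE, plus John believes of every non-caller that she did not call."""
    knows_non_callers = all(beliefs.get(s) is False for s in students - callers)
    return weakly_exhaustive(callers, beliefs) and knows_non_callers


students = {"Mary", "Sue", "Ann"}
callers = {"Mary"}
# John knows Mary called, has no opinion about Sue, and knows Ann did not call.
john: Beliefs = {"Mary": True, "Sue": None, "Ann": False}

print(weakly_exhaustive(callers, john))                  # True
print(intermediate_exhaustive(callers, students, john))  # True
print(strongly_exhaustive(callers, students, john))      # False
```

In this toy situation John's missing opinion about Sue is what separates the IE reading (true) from the SE reading (false).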
The existence and co-existence of these readings are debated: judgments vary across theoreticians and certainly across verbs. Klinedinst and Rothschild (2011) present the only quantitative survey of the issue, whose results suggest that different native speakers may provide responses coherent with any of the three exhaustive readings for questions embedded under the verb predict.
Non-exhaustive readings: Although they are not our focus, non-exhaustive readings have also been discussed. At least for know, they only seem to be available under specific circumstances. As an example, (6), repeated from B. R. George (2011), is usually considered true as soon as William is able to name at least one place where Rupert could buy an Italian newspaper. Any exhaustive reading seems too strong, since it would require that William know all the places where Italian newspapers can be purchased. This non-exhaustive reading is usually called the mention-some reading. It will not be at the foreground of our inquiry.
(6) William knows where Rupert can buy an Italian newspaper.

Architecture of recent theories of questions and embedded questions

We cannot offer a complete introduction to the theories of questions, but in this section we try to give a sense of what they are made of and how they may vary.
The main material at the basis of our experimental inquiries is to be found in the empirical predictions of the relevant theories. We introduce these empirical aspects independently of their formal and conceptual basis in the following section.
In this section, we focus on theories which follow the general architecture originally defined by Hamblin (1973).4 These theories are based on three essential ingredients: (a) a denotation of questions which corresponds to a set of answers/propositions, (b) some mechanism to combine such sets of propositions with verbs which normally combine with propositions and possibly (c) some form(s) of strengthening mechanism.
(a) Answer sets. A theory of questions first owes a basic denotation for a question. This is typically a set of answers. We can define two groups of theories, depending on which set of answers is taken as the denotation of the question.
On the first group of theories, following Hamblin (1973) and Karttunen (1977), the denotation of a question consists of so-called ‘weak answers’, which can be defined through an example:
(7) Weak answers for a question such as Who called?: Mary called, Peter called, Mary and Peter called…
As a result of the combination of this denotation with the compositional mechanism (see (b) below), the WE reading is primitive. The other readings may then be derived by applying strengthening mechanisms (see (c) below).
Most recent theories fall into this group (Beck & Rullmann, 1999; Lahiri, 2002; Sharvit, 2002; Guerzoni, 2003, 2007; Guerzoni & Sharvit, 2007; Klinedinst & Rothschild, 2011; Spector & Egré, 2015).5
On the second group of theories, the denotation of a question consists of so-called ‘strong answers’, see (8), as in Groenendijk and Stokhof (1982, 1984, 1993) and more recently B. R. George (2011).
(8) Strong answers for a question such as Who called?: Only Mary called, Only Peter called, Only Mary and Peter called…
As a result, the SE reading is primitive. If we allow only strengthening mechanisms, as opposed to weakening mechanisms, the other exhaustive readings are simply not available.6
4 We do not discuss theories which strongly diverge from the work of Hamblin (1973). Some approaches, such as Ginzburg (1996), put less stringent constraints on the set of possible readings and focus their inquiry more on the role of context in the selection of a reading.
The inquisitive semantics framework has also given rise to much work on questions, partly in reaction to the limitations of partition semantics (see Mascarenhas, 2009), but we are not aware of any explicit implementation of question embedding in an inquisitive framework. Nevertheless, the main conclusion from our study applies to these theories as well: they must all be able to derive some form of the intermediate exhaustive reading.
5 Strictly speaking, Spector and Egré (2015) use a rule which quantifies over strong answers, so they should fall into the second group. However, they also refer to the weak answers in order to derive the IE reading. Because weak answers cannot be retrieved from the strong ones, it makes sense to consider their theory as a member of the first group.
6 B. R. George (2011) makes use of the weak answers to derive mention-some readings (non-exhaustive) but his theory cannot derive WE readings. He argues that alleged cases of WE readings, such as the examples of Berman (1991) with the verb surprise, are in fact mention-some readings.
(b) Combination with responsive verbs. This is usually the most technical part of a theory: it formalizes how compositional semantics combines the denotation of a question (set of answers, see (a) above) with the semantics of the verb with which it merges. A simple example would be the following rule for the verb know: “knowing Q is equivalent to knowing the conjunction of all true answers to Q”. So, if Q = {a1, a2, a3} and a1 and a2 are true but a3 is false, then ⟦know Q⟧ = ⟦know a1 ∧ a2⟧.
Actual theories usually provide a uniform treatment of all responsive verbs by means of a generic abstract rule. These rules are somewhat complex and there is room for variation. For instance, Lahiri (2002) considers that the verb is lexically responsible for restricting (or not) the set of relevant answers, e.g., to true answers; Spector and Egré (2015) do not use universal quantification but rather existential quantification (roughly: ⟦know Q⟧ = ∃a ∈ Q : ⟦know a⟧), and they also treat the assertive and presuppositional dimensions independently.
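The simple universal rule for know, and the bare existential schema quoted above, can be sketched as follows. This is a toy possible-worlds model and not the formalization used by any of the cited theories: worlds are identified with the set of people who called, propositions are sets of worlds, and an agent's belief state is the set of worlds compatible with what the agent believes. All function names are hypothetical, and the existential variant is applied to weak answers purely for illustration.

```python
from itertools import combinations
from typing import FrozenSet, List

World = FrozenSet[str]   # a world = the set of people who called in it
Prop = FrozenSet[World]  # a proposition = the set of worlds where it is true


def all_worlds(domain: List[str]) -> List[World]:
    """Every subset of the domain is one possible world."""
    return [frozenset(c) for r in range(len(domain) + 1) for c in combinations(domain, r)]


def weak_answers(domain: List[str], worlds: List[World]) -> List[Prop]:
    """One weak answer 'X called' for each non-empty group X."""
    groups = [g for g in all_worlds(domain) if g]
    return [frozenset(w for w in worlds if g <= w) for g in groups]


def knows(belief_state: Prop, prop: Prop, actual: World) -> bool:
    """Veridical 'know that p': p is true and holds throughout the belief state."""
    return actual in prop and belief_state <= prop


def knows_q_universal(belief_state: Prop, question: List[Prop], actual: World) -> bool:
    """The agent knows every true answer (equivalently, their conjunction)."""
    return all(knows(belief_state, a, actual) for a in question if actual in a)


def knows_q_existential(belief_state: Prop, question: List[Prop], actual: World) -> bool:
    """Bare existential schema: the agent knows at least one answer."""
    return any(knows(belief_state, a, actual) for a in question)


domain = ["Mary", "Peter", "Sue"]
worlds = all_worlds(domain)
question = weak_answers(domain, worlds)
actual = frozenset({"Mary", "Peter"})

# Sure that Mary and Peter called, undecided about Sue.
sure_of_both = frozenset(w for w in worlds if {"Mary", "Peter"} <= w)
# Sure that Mary called, undecided about everybody else.
sure_of_mary = frozenset(w for w in worlds if "Mary" in w)

print(knows_q_universal(sure_of_both, question, actual))    # True
print(knows_q_universal(sure_of_mary, question, actual))    # False
print(knows_q_existential(sure_of_mary, question, actual))  # True
```

With weak answers as the question denotation, the universal rule yields the WE truth conditions, which is why that reading counts as primitive for the first group of theories.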
(c) Strengthening mechanisms. Various semantic and pragmatic mechanisms have been described for strengthening the meaning of an utterance.
One example is exhaustification, through a grammaticalized EXH-operator as in, e.g., Fox (2007) or as a proxy for Gricean strengthening. For our purposes, the potential application of exhaustification can help derive stronger readings from weaker ones, even though weaker readings remain primitives.
Another example more specific to the theory of questions is given by Heim (1994), who showed that the set of strong answers could be recovered from the set of weak answers (but not vice versa). (Note that this role can also be supported by a more general EXH operator as described above, provided that it can be applied at the level of individual answers in the denotation of the question.) Such a mechanism effectively makes the first group of theories as described in (a) above capable of encompassing the predictions of the second group of theories.
Let us present a semi-concrete theory that puts together all of these ingredients. We could consider a theory which (a) makes use of weak answers, (b) relies on the ‘universal’ embedding rule and (c) allows for the EXH-operator to produce stronger readings (as a first approximation, this operator strengthens a proposition by conjoining it with the negation of all its non-weaker alternatives). Such a theory would be very close to the approach of Klinedinst and Rothschild (2011) and would be able to derive all of the exhaustive readings that we discussed. First, WE readings would be primitives (using weak answers and no exhaustification yields a weak reading). Second, IE readings would be obtained by exhaustification at the matrix level (see Klinedinst and Rothschild, 2011 for details). Finally, SE readings would be obtained by local exhaustification, the reason being roughly that this would turn weak answers into strong answers, in the sense of (7) and (8), as suggested below:
(9) EXH(‘Peter and Mary called’, alternatives={‘Sue called’, ‘Jack called’, …}) = ⟦Peter and Mary called⟧ ∧ ¬⟦Sue called⟧ ∧ ¬⟦Jack called⟧ ∧ …
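The computation just illustrated can be reproduced in a minimal sketch. It follows the first approximation stated in the text (conjoin a proposition with the negation of every alternative it does not entail) and is not a faithful implementation of any published EXH operator. Propositions are modeled as sets of worlds, a world being the set of people who called; the four-person domain and all function names are assumptions made for the example.

```python
from itertools import combinations


def all_worlds(domain):
    """Every subset of the domain is one possible world (the set of callers)."""
    return [frozenset(c) for r in range(len(domain) + 1) for c in combinations(domain, r)]


def called(group, worlds):
    """The weak answer 'group called': true in every world that includes the group."""
    return frozenset(w for w in worlds if frozenset(group) <= w)


def exh(prop, alternatives):
    """Conjoin prop with the negation of each alternative that prop does not entail."""
    strengthened = set(prop)
    for alt in alternatives:
        if not prop <= alt:      # alt is a non-weaker alternative
            strengthened -= alt  # keep only the worlds where alt is false
    return frozenset(strengthened)


domain = ["Jack", "Mary", "Peter", "Sue"]
worlds = all_worlds(domain)
alternatives = [called([person], worlds) for person in domain]

weak = called(["Peter", "Mary"], worlds)  # 'Peter and Mary called'
strong = exh(weak, alternatives)          # 'Only Peter and Mary called'

print(len(weak))                           # 4
print(sorted(sorted(w) for w in strong))   # [['Mary', 'Peter']]
```

The weak answer is compatible with four worlds, whereas the exhaustified proposition holds only in the world where Peter and Mary are the sole callers, which is the corresponding strong answer.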
The combinations of various versions of the three ingredients we presented in this section lead to a variety of possible theories. In the next section, we will focus on the empirical predictions of the current implementations found in the literature. In short, starting with weak readings allows for more flexibility, because various tools can be called upon to obtain strengthened meanings from weaker ones (e.g., exhaustification, maximization) and less so in the other direction. Although we will not enter into the details of which theories rely on which tools, we will describe the predictions they make.

Predictions of recent theories for know

Let us put aside the technical differences between the different theories and the way they derive the different readings. Here we merely want to provide the set of readings each theory predicts to be available, so that we can later evaluate whether these predictions are borne out. Let us also focus on the verb know, which has been discussed in greater detail than any other responsive verb. We thus want to summarize which readings are available for know, according to some recent theories from the two groups identified in 1.4(a). This is done in Table 1.1, which indicates whether a given theory can derive a given reading or not.
Let us note that theories also vary as to whether they offer different predictions for different verbs. The comparison between different verbs is in fact crucial to put constraints on the available compositional mechanisms between the verb and the embedded question (see 1.4b). However, this cross-verb variation will not be the focus of our inquiry, which focusses on know, and we therefore do not report a complete set of predictions across verbs. Let us immediately note however that in Experiment 2 we also tested the verb predict, and that the results did not reveal important differences with know.


Goals of our study

The goal of our study will be to provide constraints on what readings an accurate theory should derive and how. We will test the availability of the different exhaustive readings and compare these results with the predictions in Table 1.1. We will also gather online data which may yield extra constraints on possible strengthening mechanisms (for theories à la Heim, 1994) when hypotheses about the cognitive processes are made explicit.
In Experiment 1, we show that know can give rise not only to SE readings but also to weaker readings. In Experiment 2, using a different paradigm, we show that these weaker readings consist mostly of IE readings (and in fact both for know and predict). The results of Experiment 3 confirm that we cannot reliably establish the availability of the WE reading for know. Finally, in Experiment 4 we compare the processing properties of the IE and SE readings.

Experiment 1: Existence of WE/IE readings under know


Most theories derive various readings for sentences containing questions embedded under know (see Table 1.1). But most theorists report judgments that these configurations only give rise to SE readings. The goal of this experiment was to test this empirical claim. We did not aim to distinguish between WE and IE readings for the time being.
We also collected online data in order to test a prediction derived from Heim (1994). Her theory derives the SE reading from the WE reading. Hence, the SE reading requires an extra derivation step and we may thus be able to detect an extra processing cost.

Methods and Materials

The experiment consisted of a truth-value judgment task: Participants read pairs of sentences (which we call context and utterance, respectively) and had to judge whether the utterance was true or false in the given context. All experiments were conducted using American English.
Participants were directed from Amazon Mechanical Turk to an online experiment hosted on Alex Drummond’s Ibex Farm. After filling out a consent form they received instructions (see details below), then completed a short training phase (8 items with feedback) followed by the experimental phase. After completing the experiment, participants filled in a questionnaire (including demographic questions about participants’ age, gender and native language). The final step was to validate their participation on Mechanical Turk.
The instructions provided a story repeated in Fig. 1.1. The goal of this story was to provide a general context for the experiment such that we could easily manipulate what the truth-value of the various hypothesized readings would be. In a nutshell, it was set up so that various characters (e.g., the male baker) would have total (exhaustive) knowledge about some events (e.g., men buying bread) and partial or no knowledge about other events.
A small town in a remote area has two shopping malls: one for women and one for men.
Men shop at the men’s mall, where all shopkeepers are men.
Women shop at the women’s mall, where all shopkeepers are women.
In each mall there is one of each facility: there is one male and one female baker, one male and one female mechanic, and so on. Sometimes the shopping malls close.
Men do not always know whether the women mall is open or not.
Women do not always know whether the men mall is open or not.
Each shopkeeper knows what he/she sells to whom, but knows nothing about what happens in other shops.
The instructions also included 4 example items, which are given in (10). (10a,c) are true and (10b,d) are false given the story in Figure 1.1. This was explained to the participants.9
(10) a. Situation: Today, the men shopping mall was open. “John-the-baker knows whether Bill bought bread.”
b. Situation: Today, both shopping malls were open. “Sandra-the-pharmacist knows who bought medicine.”
c. Situation: Today the men shopping mall was closed and everybody, including the women, knows about this. “Lisa-the-tailor knows who bought a coat.”
d. Situation: Today, both shopping malls were open. “Henry-the-hairdresser knows if Peter got an oil change.”
The stimuli were built from 3 types of contexts, as exemplified in (11), and from 3 types of utterances, as exemplified in (12).
(11) a. Today both shopping malls were open.
b. Today the women shopping mall was closed. Everybody knows about this.
c. Today the women shopping mall was closed. Men do not know about this.
(12) a. John the baker knows who bought bread.
b. John the baker knows which men bought bread.
c. John the baker knows which men bought meat.
9 There was a recurring typo in our materials: in the second part of the instructions and in context descriptions (not in target sentences) we dropped the genitive marker in (wo)men’s shopping mall. Thanks to an anonymous reviewer for catching this mistake.
The crucial utterance was of the form of (12a). The truth value of its different possible readings varies with context as follows (see Table 1.2 for a summary):
In context (11a), (12a) is false under any exhaustive reading because John does not know which women bought bread (assuming some women did, as soon as their mall is open). The items obtained from the combination of this utterance and this context were false controls.
In context (11b), (12a) is true under all readings: For each man, John can tell whether he bought bread or not, and he knows that no woman bought bread that day. Therefore for each person John can tell whether he or she bought bread. The items obtained from the combination of this utterance and this context were true controls.
In context (11c), (12a) is true under the WE reading because only men bought bread, therefore for each person who bought bread it is true that John knows that this person bought bread. It should also be true under the IE reading, because John has no reason to falsely believe that a given woman bought bread. However, it is false under the SE reading, because John does not know that no woman bought bread. The items obtained from the combination of this utterance and this context formed our targets. If only the SE reading is available, we shouldn’t expect differences between (12a) in contexts (11a) and (11c): both items would be false.
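The truth values described for the three contexts can be recomputed with a short sketch. The shopper names, the choice of who bought bread in each context, and the set-based encoding of John's beliefs are all assumptions introduced for illustration; only the resulting pattern of truth values comes from the discussion above.

```python
# Hypothetical shoppers in the town.
MEN = {"Bill", "Peter"}
WOMEN = {"Lisa", "Sandra"}
PEOPLE = MEN | WOMEN


def readings(buyers, believed_buyers, believed_non_buyers):
    """Truth value of 'John the baker knows who bought bread' under each reading."""
    we = buyers <= believed_buyers                         # all actual buyers are known
    ie = we and not (believed_buyers - buyers)             # and no false belief about buyers
    se = we and (PEOPLE - buyers) <= believed_non_buyers   # and all non-buyers are known
    return {"WE": we, "IE": ie, "SE": se}


# John always knows what happened in his own shop: Bill bought bread, Peter did not.
contexts = {
    # (11a) Both malls open; assume Lisa bought bread, which John cannot know.
    "11a": readings({"Bill", "Lisa"}, {"Bill"}, {"Peter"}),
    # (11b) Women's mall closed and John knows it: he knows no woman bought bread.
    "11b": readings({"Bill"}, {"Bill"}, {"Peter"} | WOMEN),
    # (11c) Women's mall closed but John does not know it: no opinion about women.
    "11c": readings({"Bill"}, {"Bill"}, {"Peter"}),
}

for name, values in contexts.items():
    print(name, values)
# 11a {'WE': False, 'IE': False, 'SE': False}
# 11b {'WE': True, 'IE': True, 'SE': True}
# 11c {'WE': True, 'IE': True, 'SE': False}
```

Only context (11c) separates the SE reading from the two weaker readings, which is why it serves as the target condition.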
We created variations of each item by using the 16 characters presented in Table 1.3 (8 jobs × 2 genders) and varying the words ‘men/women’ accordingly in the contexts and utterances. All 16 possible targets, along with true and false controls, were included. 24 true and 24 false fillers for which the truth value did not depend on the context were also created using the utterances (12b) and (12c), respectively. They ensured that participants could not guess the truth value of an item using only the context.
As for the display, context sentences were displayed for 3s and utterances were displayed word-by-word (250ms per word). Each word replaced the previous one, and there was no delay between them to avoid flicker effects. The last word remained on the screen until a response was given.
40 participants were recruited on Mechanical Turk. 39 of them completed the task (17 females, 22 males). Their age ranged from 19 to 66 years (mean: 33). All of them reported English as their native language.


Data treatment and Statistical methods
Responses made in less than 100ms or more than 10s were discarded (0.7% of the data). All mixed models were built with a maximal random effect structure based on participants and items as random variables (in the sense of Barr, Levy, Scheepers, and Tily, 2013). Our item factor was a value between 1 and 8 corresponding to one line in Table 1.3. This means that two items involving the same shop (e.g., ‘baker’) were not considered independent measures. For each model we give the estimate of the fixed effects (β) and a p-value with its associated statistics. For linear mixed models we used the t-statistics given in lme4 and for GLMM the χ2-statistics with one degree of freedom obtained by comparing the models with and without a given fixed effect.
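The response-time exclusion criterion can be sketched as a simple filter. The trial records below are invented for the example and the field name rt_ms is hypothetical; treating responses at exactly 100ms or 10s as kept is also an assumption, since the text only says that faster or slower responses were discarded.

```python
MIN_RT_MS = 100     # responses faster than this are discarded
MAX_RT_MS = 10_000  # responses slower than this are discarded


def keep_trial(rt_ms: float) -> bool:
    """Keep a response unless it is faster than 100ms or slower than 10s."""
    return MIN_RT_MS <= rt_ms <= MAX_RT_MS


# Invented trials, for illustration only.
trials = [
    {"participant": 1, "rt_ms": 80},
    {"participant": 1, "rt_ms": 950},
    {"participant": 2, "rt_ms": 1500},
    {"participant": 2, "rt_ms": 12_000},
    {"participant": 3, "rt_ms": 100},
]

kept = [t for t in trials if keep_trial(t["rt_ms"])]
discarded_share = 1 - len(kept) / len(trials)

print(len(kept))  # 3
```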
Analysis of responses
Fig. 1.2a shows the proportions of True responses to targets, true controls, and false controls. Overall, the task was well-understood and executed as reflected by high accuracy on true (88%) and false (73%) controls.
On targets, participants gave on average 43% True responses and individual rates varied from 0% to 100% (while there was little between-item variation: 36% to 48%). Two logit mixed models were fitted to compare the responses to targets with true or false controls respectively. The results showed that the differences were significant between targets and both true controls (β = 2.7, χ2(1) = 32, p < .001) and false controls (β = −1.4, χ2(1) = 8.4, p = .003).
The fact that participants gave fewer True responses on targets than true controls suggests that some participants had an SE reading and treated the utterance as false on the target trials. More surprising is the fact that they treated targets differently from false controls. According to Table 1.2, this high rate of true responses reveals the presence of a reading other than the SE reading. The fact that the intermediate rate of true responses corresponded to individual rates from 0% to 100% confirms that participants differed in the way they understood the same sentence. If they were simply at chance because the target sentences were too complicated we might have observed a more homogeneous pattern around 50% across the participants; alternatively, some participants would have been at one end of the spectrum (those that managed to do the task) and others would have been at 50%.

Table of contents:

A very short history of questions in formal semantics
And now?
Goal for this dissertation
I Psycholinguistic investigations 
1 A psycholinguistic study of the exhaustive readings of embedded questions
1.1 Introduction
1.1.1 The meaning of questions
1.1.2 Embedding questions as a way to study them
1.1.3 Different readings for embedded questions
1.1.4 Architecture of recent theories of questions and embedded questions
1.1.5 Predictions of recent theories for know
1.1.6 Goals of our study
1.2 Experiment 1: Existence of WE/IE readings under know
1.2.1 Goal
1.2.2 Methods and Materials
1.2.3 Results
Data treatment and Statistical methods
Analysis of responses
Analysis of response times (no clear result)
1.2.4 Discussion
1.3 Experiment 2 : non-SE readings for know and predict
1.3.1 Goals
1.3.2 Methods and Materials
1.3.3 Stimuli
Item generation
1.3.4 Results
Data treatment and Statistical methods
Analysis of responses
Analysis of response times
1.3.5 Discussion
1.4 Experiment 3: WE reading or domain restriction?
1.4.1 Goal
1.4.2 Methods and Materials
1.4.3 Results
Data treatment and Statistical methods
Analysis of responses
Comparison with Experiment
1.4.4 Discussion
1.5 Experiment 4: IE vs. SE readings of know
1.5.1 Goal
1.5.2 Methods and Materials
Course of the experiment
1.5.3 Results
Data treatment and statistical methods
Analysis of responses
Response times
1.5.4 Discussion
1.6 General discussion
1.6.1 Summary of the results
1.6.2 Consequences for the theories
2 Children’s exhaustive readings of questions 
2.1 Introduction
2.1.1 Strengthened interpretations in child language
2.1.2 The different interpretations of embedded questions
Theoretical background
A concrete implementation
2.2 Experiment
2.2.1 Method
2.2.2 Results
Control conditions
Test conditions
Follow-up justifications
2.3 Discussion
2.4 Conclusion
3 Experiments on the acceptability and possible readings of questions embedded under emotive-factives 
3.1 Emotive-factive predicates and Questions
3.1.1 Two puzzles regarding questions and emotive-factives
Puzzle 1: Whether-questions
Puzzle 2: Exhaustive readings
3.1.2 Monotonicity as a key to Puzzles 1 and 2?
3.1.3 Summary
3.2 Experiment 1: Selectional properties of different attitude predicates
3.2.1 Goals
Selectional properties
Degrees of unacceptability
Quantificational variability
3.2.2 Methods
Task and Instructions
Design and Stimuli
Statistical methods
3.2.3 Results
Predicate categories
Specific questions
3.2.4 Discussion
3.2.5 Conclusions for Experiment 1
3.3 Experiment 2: On the monotonicity of responsive predicates
3.3.1 Goal
3.3.2 Methods
Task and Instructions
Design and Stimuli
Statistical methods
3.3.3 Results
Control items
Attitude predicates
3.3.4 Discussion
3.4 Experiment 3: Strongly exhaustive readings
3.4.1 Motivations and additional background
3.4.2 Methods
Instructions and training phase
Design and Stimuli
Analytical and statistical methods
3.4.3 Results
3.4.4 Discussion
SE readings for all predicates
SE readings with surprise, alternative interpretations?
3.4.5 Conclusion/summary for Experiment 3
3.5 Conclusion
3.5.1 Summary of the results
3.5.2 Conclusions
A Lists for Experiment 1
B Lists for Experiment 2
II Theoretical contribution 
4 Plurality effects and exhaustive readings of embedded questions
4.1 Plurality effects with definite descriptions and questions
4.2 Embedded questions as definite plurals in the literature
4.2.1 Plurals and definite descriptions
4.2.2 Some previous theories of plurality in questions
4.2.3 Answer operators
4.3 A possible implementation
4.3.1 Hypotheses
4.3.2 Application
Quantificational Variability
Cumulative readings
Alternative theories
4.4 Incorporating stronger exhaustivity
4.4.1 Different exhaustive readings of embedded questions
4.4.2 Klinedinst and Rothschild (2011)
4.5 An exhaustification theory for plural questions
4.5.1 Hypotheses
Generating alternative answers
Exhaustivity operator
Extra assumptions on specific lexical items
4.5.2 First application
WE reading
SE reading
IE reading
4.5.3 False answer-sensitive readings beyond know
4.6 Application to new cases
4.6.1 QVE sentences
4.6.2 Cumulative readings
4.6.3 Mention-some questions
4.6.4 Forget
4.6.5 Multiple wh-questions
4.6.6 Primary implicatures
4.6.7 A few unwelcome predictions
Quantificational subjects
Negative sentences
4.7 Conclusion
5 Homogeneity and Quantificational variability with Embedded Questions
5.1 Homogeneity effects
5.1.1 Plural definite descriptions and homogeneity effects
5.1.2 Homogeneity with embedded questions?
5.1.3 Summary
5.2 Quantificational variability effects and questions as pluralities
5.2.1 What is Quantificational variability?
5.2.2 Lahiri (2002)
5.2.3 Beck and Sharvit (2002)
5.3 A unified theory of plural embedded questions
5.3.1 Ingredients of the theory
5.3.2 Application to simple cases
Simple affirmative sentence with know
Simple negative sentence with know
Sentence with an adverb of quantity with know
5.3.3 New predictions and puzzles
Rogative verbs
Other types of questions
5.4 More on lexical restrictors
5.4.1 On the necessity of lexical restrictors
5.4.2 Complex factive verbs
5.4.3 Communication verbs
5.4.4 Other non-veridical predicates
5.4.5 Believe and embedded questions
5.5 Conclusion

