Statistical and algorithmic aspects of structured output prediction applied to opinion structures

Linguistic and multimodal models of appraisal in English

The cognitive models of emotion do not necessarily provide tools for formalizing opinion expressions in language. The work of MARTIN and WHITE [2003] provides the linguistic tools to characterize the expression of appraisals, based on an extensive analysis of the linguistic phenomena at hand. Their structural description of appraisal, as defined in systemic functional linguistics, can be empirically decomposed over observable linguistic patterns and relies on a three-level decomposition: at the top, the Attitude broadly defines the valence of the evaluation; the Gradation corresponds to the intensity of the underlying Attitude; and the Engagement describes the level of involvement of the speaker in the evaluation expressed. The notion of Attitude is closely linked to the Liking / Disliking emotion pair defined in the work of ROSEMAN [1984], since both describe the valence of an emotional response provoked by the evaluation of an object. MARTIN and WHITE [2003] add a level of description and define three types of attitudes:
• Affects cover polarized expressions oriented towards the speaker herself. They can focus on external objects, but their function is to describe an inner state without any judgement.
• Judgements correspond to ethics- and rule-based evaluations. They result from the application of a moral principle (itself corresponding to a socio-cultural legacy) and do not reflect a personal reaction but rather a behavior dictated by social norms.
• Appreciations cover the rest of the evaluations and attach themselves to the target of the evaluation, in contrast to affects, which describe the inner state of a human subject.
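
As a rough illustration of the three-level decomposition and the three attitude types described above, an appraisal could be stored in a small record such as the one below. This is only a sketch: the class names, the field names, the +1/-1 valence encoding, and the [0, 1] ranges for intensity and involvement are assumptions made for the example, not part of the appraisal framework itself.

```python
from dataclasses import dataclass
from enum import Enum


class AttitudeType(Enum):
    AFFECT = "affect"              # inner state of the speaker, no judgement
    JUDGEMENT = "judgement"        # ethics- or rule-based evaluation
    APPRECIATION = "appreciation"  # evaluation attached to the target


@dataclass(frozen=True)
class Appraisal:
    attitude: AttitudeType  # which of the three attitude types applies
    valence: int            # assumed encoding: +1 positive, -1 negative
    gradation: float        # assumed intensity scale in [0, 1]
    engagement: float       # assumed speaker-involvement scale in [0, 1]

    def __post_init__(self):
        # Reject values outside the (assumed) encodings above.
        if self.valence not in (-1, 1):
            raise ValueError("valence must be +1 or -1")
        if not 0.0 <= self.gradation <= 1.0:
            raise ValueError("gradation must lie in [0, 1]")
        if not 0.0 <= self.engagement <= 1.0:
            raise ValueError("engagement must lie in [0, 1]")


# Hypothetical example: a fairly strong positive evaluation of a target.
example = Appraisal(AttitudeType.APPRECIATION, valence=1,
                    gradation=0.8, engagement=0.5)
```

Keeping the attitude type separate from the valence reflects the distinction drawn above: the type says what kind of evaluation is made, while the valence only says whether it is positive or negative.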

A natural language processing formal model of opinions

Whereas the linguistic structures previously described resulted from years of study in psychology and linguistics, the recent availability of large amounts of opinionated data from the web created the need for computationally efficient models. Computational models relying on appraisal theory, such as NEVIAROUSKAYA and collab. [2010], are based on handcrafted rules provided by the theory and cannot easily be applied to large vocabularies or to spontaneous language containing disfluencies. The domain of sentiment analysis and opinion mining emerged from the work of practitioners whose aim was to build accurate sentiment analyzers able to work with crawled data. Such data is noisy and does not offer the level of control of psychological studies, in which participants are carefully selected. The need for simpler and more practical sentiment models gave rise to a Natural Language Processing literature with new opinion definitions, which in turn led to new machine-learning-based models. The most famous framework, detailed by LIU [2012], is a cornerstone for understanding modern work in sentiment analysis. We recall the most important definitions and key problems below. The first step is to provide a set of definitions that specify an opinion mathematically and how it can be linked with empirical data.

Towards a general computational model of opinions

Following the previous definitions, we propose to represent the opinion expressed by an opinion holder as a hierarchical tree in which each non-leaf node represents a part of its parent and the leaf nodes represent the valence of the underlying opinion. Figure 2.1 displays an example of such a structure. The entity-aspect-subaspect decomposition of LIU [2012] is represented by the green nodes, which denote finer-grained objects as the color gets darker. For each aspect mentioned in an opinion, the corresponding valence is represented by a blue node. From a mathematical point of view, the aspect structure can be represented by a binary hierarchical tree, i.e. a set of binary-labeled nodes such that a node can be labeled 1 only if its (unique) parent is also labeled 1. This constraint encodes the part-of relation, which implies that an opinion on a part of an object also applies to the object itself. Under this representation, a nonzero valence value indicates the valence of an opinion expressed on its ancestors and in particular on its direct parent. The choice of the structure is a modeling decision: it can be made arbitrarily deep so as to cover all the subparts of an entity, but this complexity makes the data collection process difficult. In practice, the structure presented above remains general, and this thesis explores the problem of predicting this type of labeled graph under some additional hypotheses. We studied different specifications of this structure:
1. In Chapter 3, we present the problem of learning preference functions. Such objects can be used to handle continuous-valued valences with a fixed set of aspects. This model is motivated by the difficulty of gathering reliable continuous valence labels: preferences over a set of objects cannot be reliably recovered from the continuous ratings given by reviewers. Previous studies have shown that asking directly for preferences leads to better agreement [YANNAKAKIS and HALLAM, 2011; YANNAKAKIS and MARTÍNEZ, 2015].
2. In Chapter 4, we explore the case of categorical-valued valences and model the structure above as a fully binary hierarchical graph. In this setting, we present the statistical and computational properties of the resulting predictors and study the question of building an abstention mechanism, i.e. a way to abstain from predicting the difficult parts of the graphs.
3. Finally, we consider multiple granularity levels: instead of only predicting a structure at a fixed granularity level, we predict multiple structures at different granularities while taking advantage of the relations that link them. The intuition is that if an opinion is found in a sentence, the representation predicted at the review level should depend on this prediction.
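
The hierarchical constraint on the binary aspect tree (a node may carry the label 1 only when its parent also carries the label 1) can be checked with a few lines of code. The sketch below is illustrative only: the function name, the dictionary-based tree encoding, and the movie-review aspect names are assumptions made for the example and do not come from the thesis.

```python
from typing import Dict, Optional


def is_valid_hierarchical_labeling(
    parent: Dict[str, Optional[str]],
    label: Dict[str, int],
) -> bool:
    """Return True if every node labeled 1 has a parent labeled 1.

    `parent` maps each node to its unique parent (None for the root);
    `label` maps each node to a binary label in {0, 1}.
    """
    for node, value in label.items():
        if value not in (0, 1):
            return False
        p = parent[node]
        # A part can only be active if the object it belongs to is active.
        if value == 1 and p is not None and label[p] != 1:
            return False
    return True


# Hypothetical aspect hierarchy for a movie review.
parent = {
    "movie": None,
    "actors": "movie",
    "lead_actor": "actors",
    "screenplay": "movie",
}

# Opinion on the lead actor: all of its ancestors are labeled 1 as well.
valid = {"movie": 1, "actors": 1, "lead_actor": 1, "screenplay": 0}

# Violates the constraint: "lead_actor" is 1 while its parent "actors" is 0.
invalid = {"movie": 1, "actors": 0, "lead_actor": 1, "screenplay": 0}

print(is_valid_hierarchical_labeling(parent, valid))
print(is_valid_hierarchical_labeling(parent, invalid))
```

A predictor that outputs such labelings can use a check of this kind to discard candidate structures that break the part-of relation.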

Table of contents:

Contents
List of figures
List of tables
1 Introduction
1.1 Research questions
1.2 Contributions and detailed thesis organisation
1.3 References
I Definitions and framework
2 Models of opinion-related phenomena
2.1 Appraisal theory
2.2 A natural language processing formal model of opinions
2.3 Practical aspects of structured opinion prediction
2.4 Towards a general computational model of opinions
2.5 References
3 Structured output prediction
3.1 Supervised Machine learning setting
3.2 Structured Output Prediction
3.3 Graphical Model based Structured output prediction
3.4 Output kernel regression
3.5 The problem of building consistent predictors
3.6 References
II Statistical and algorithmic aspects of structured output prediction applied to opinion structures
4 A unifying regression based framework for preference function prediction
4.1 From continuous valence opinion models to preference functions
4.2 General background on label ranking
4.3 Preliminaries
4.4 Structured prediction for label ranking
4.5 Output embeddings for rankings
4.6 Statistical analysis of the regression label ranking based predictors
4.7 Numerical Experiments
4.8 References
5 Structured Output Learning with Abstention: Application to Accurate Opinion Prediction
5.1 Motivation
5.2 Structured Output Labeling with Abstention
5.3 Geometric interpretation of prediction with abstention
5.4 Estimation of the conditional density E_{y|x} ψ_wa(y) from training data
5.5 Learning guarantee for structured losses with abstention
5.6 Pre-image for hierarchical structures with Abstention
5.7 Numerical Experiments
5.8 References
III A multimodal deep learning approach for hierarchical opinion prediction
6 A multimodal movie review corpus for fine-grained opinion mining
6.1 Introduction
6.2 Related work
6.3 The video opinion movie corpus
6.4 Annotation
6.5 Annotation challenges and guidelines
6.6 Validation of the annotation
6.7 References
7 From the Token to the Review: A Hierarchical Multimodal approach to Opinion Mining
7.1 Introduction
7.2 Data description and model
7.3 Learning strategies for multitask objectives
7.4 Architecture
7.5 Experiments
7.6 References
8 Conclusion and future work
8.1 Contributions
8.2 Perspectives
8.3 References
A Annexes
A.1 Proofs and additional experimental results and details for chapter
A.2 Proofs and technical derivations for chapter
A.3 Additional experiments: Hierarchical classification fMRI images
A.4 Annotation guide for the POM Dataset
A.5 References