Goals of Recommendation Systems
From a business point of view, the primary goal of an RS is of course to increase the company's profit through a higher number of sales. To reach this objective, the system has to meet several requirements:
• Know what the user wants / Relevance: the RS has to suggest items that are relevant to the user's tastes, because users are more likely to buy or consume items they are interested in. However, relevance alone is not sufficient to satisfy a user, as explained in the next point.
• Diversity, Novelty and Serendipity: firstly, if the RS always suggests items of the same sort to a user, there is a risk that this user gets bored or does not like any of them. Thus, the system has to take care to include items of different types, bringing diversity. Recommending more diverse items also helps the system gather feedback about a wider set of products, which can benefit future recommendations.
Secondly, even if it is relevant, the recommended item also has to be something the user has not already bought or experienced in the past: some novelty is required. Lastly, serendipity means surprising the user by recommending items he does not expect. Compared to novelty, the suggested item belongs to a category the user did not expect at all. This can sometimes lead the user to widen his area of interest, and helps to increase sales diversity. Serendipity is closely related to the exploration/exploitation dilemma addressed in Chapter 3.
• User satisfaction and fidelity: finally, another goal of the RS is to increase user satisfaction and fidelity. A good user interface and accurate recommendations might encourage the user to come back and use the site again. Explaining why a specific recommendation has been provided is also usually appreciated by users. User fidelity to an RS implies that more feedback is received from this user, leading to a more refined knowledge of his tastes and consequently to better recommendations for this particular user.
These key goals are common to all RS, whatever the application, and they need to be integrated carefully during the implementation of the system.
The range of applications where recommendations can be made is wide and diverse (Schafer et al., 2001). Here are the main fields in which recommendation is applied nowadays:
1. E-commerce: the system recommends to consumers products they are likely to buy, like CDs or books (Linden et al., 2003).
2. Entertainment: the system recommends items like movies (Miller et al., 2003) or music. The profit is usually made from advertising or subscription to the website.
3. Services: for example, the recommendation of apartments to rent, doctors, or travel packages. The service usually makes money by taking a fee out of the payment.
4. Social: recommending possible friends and online dating are the two main uses of social recommendation. The profit is usually made through advertising or subscription.
5. Content: examples are online news (Das et al., 2007), recommendation of web pages, tweets (Chen et al., 2012), or display of ads.
Table 2.1 displays a list of examples, with famous companies where RS are used and the categories of items they recommend.
Approaches to Recommendation
A large range of methods has been proposed to make accurate recommendations. In this section we give a brief overview of the main categories of algorithms, except the Collaborative Filtering (CF) methods using Matrix Factorization. A specific section is dedicated to these methods, as they are the most successful and are also used in our approaches in the following chapters. As this is only a short overview of the classical approaches, the reader interested in more details can refer to (Adomavicius and Tuzhilin, 2005; Shapira et al., 2011).
Content-Based Filtering
Content-Based Filtering methods use external information about the items, such as keywords, tags, or profiles written in natural language, to perform recommendation (Lops et al., 2011). By using item features, the system can recommend items the user has not experienced yet that have a similar theme to those he liked in the past. For example, if a user gives positive feedback about a musical artist whose description contains the tags « rock » and « 90s », the RS is able to recommend other artists with the exact same tags (or similar ones), because the user is likely to be interested in them too.
Content-based approaches analyze the set of items for which the user expressed a preference in the past. Based on the description of these items, they build a profile of the user's interests. This user profile is then matched against the descriptions of other items to decide which item to recommend. These approaches have several advantages:
• They only use the ratings of the target user to perform recommendation; they do not need ratings from other users.
• A new item in the RS can be recommended based on its features, even if it has not been rated by any user. This solves the issue called item cold-start (cf. Section 2.4).
• It is easy to explain to the user why an item has been recommended, since the content about the items, like keywords or textual profiles, is accessible to the system.
However, Content-Based recommendation techniques also come with important drawbacks:
• These methods cannot make good recommendations for new users who have provided little to no feedback, as they need information about which items the user likes before the first recommendation. This is referred to as user cold-start.
• The serendipity goal cannot be reached with such approaches, as they tend to over-specialize on a set of items with similar content for a given user.
• Finally, the biggest issue is that a good knowledge of the item domain is required to create the content used in the recommendation process. Sometimes only partial knowledge can be captured by the content, or even worse, no information may be available at all.
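The profile-building and matching steps described above can be illustrated with a minimal sketch. It assumes that items are described only by sets of tags, that the user profile is the sum of the tag counts of liked items, and that matching uses cosine similarity; the catalogue `ITEM_TAGS` and the function names are hypothetical and are not taken from any particular system.

```python
from collections import Counter
from math import sqrt

# Hypothetical catalogue: each item is described by a set of tags.
ITEM_TAGS = {
    "artist_a": {"rock", "90s"},
    "artist_b": {"rock", "90s", "grunge"},
    "artist_c": {"jazz", "60s"},
    "artist_d": {"rock", "indie"},
}


def build_profile(liked_items, item_tags):
    """Build a user profile by counting the tags of the items the user liked."""
    profile = Counter()
    for item in liked_items:
        profile.update(item_tags[item])
    return profile


def cosine(profile, tags):
    """Cosine similarity between a tag-count profile and a binary tag set."""
    dot = sum(profile[t] for t in tags)
    norm_profile = sqrt(sum(v * v for v in profile.values()))
    norm_tags = sqrt(len(tags))
    if norm_profile == 0 or norm_tags == 0:
        return 0.0
    return dot / (norm_profile * norm_tags)


def recommend(liked_items, item_tags, k=2):
    """Return the k items not yet liked whose tags best match the profile."""
    profile = build_profile(liked_items, item_tags)
    candidates = [i for i in item_tags if i not in liked_items]
    ranked = sorted(candidates,
                    key=lambda i: cosine(profile, item_tags[i]),
                    reverse=True)
    return ranked[:k]


print(recommend({"artist_a"}, ITEM_TAGS))
```

Note that a user with no liked items gets an empty profile and all similarities fall to zero, which is the user cold-start situation mentioned above, while a brand-new item can be ranked as soon as it has tags.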
Demographic Filtering
This type of system assumes that the set of users can be partitioned based on their demographic profile (Krulwich, 1997; Pazzani, 1999). Demographic features such as the country or age of each user determine the class to which he belongs. Then, a set of rules decides which recommendation to make depending on that class.
This type of approach is similar to classification or regression techniques in which the input features are the demographic features and the output is the user's preferences. These methods usually do not perform very well, as users' tastes cannot be inferred solely from their demographic features, but they can add some predictive power when combined with other methods, in hybrid methods for example (cf. Section 2.4.1).
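As a rough illustration of the two steps (assigning a user to a class, then applying a rule for that class), here is a minimal sketch. The age threshold, the country codes, the rule table `RULES` and the fallback list are all hypothetical values chosen for the example.

```python
# Hypothetical rule table: demographic class -> items to recommend.
RULES = {
    ("FR", "under_30"): ["item_1", "item_2"],
    ("FR", "30_plus"): ["item_3"],
}
# Hypothetical fallback used when no rule matches the user's class.
DEFAULT_ITEMS = ["item_popular"]


def demographic_class(user):
    """Map a user's demographic features to a class (country, age band)."""
    age_band = "under_30" if user["age"] < 30 else "30_plus"
    return (user["country"], age_band)


def recommend_demographic(user):
    """Apply the rule attached to the user's demographic class."""
    return RULES.get(demographic_class(user), DEFAULT_ITEMS)


print(recommend_demographic({"country": "FR", "age": 25}))
```

Two users in the same class always receive the same recommendations here, which shows why such rules alone tend to personalize poorly.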
Knowledge-Based Recommendation
A Knowledge-Based RS (Trewin, 2000) acts in a specific context, where the user explicitly requests the kind of content he wants. This adds a constraint on which recommendations can be provided. These methods are particularly used in applications where interaction with the user is rare, like RS for cars or travel packages, since in most cases the user will only use the Recommendation System once or a few times. They aim at compensating for the lack of ratings given by the user by asking him for constraints that restrict the set of possible recommendations.
Methods used for Knowledge-Based recommendation are close to the ones used in Content-Based recommendation, the main difference being that the user's previous feedback or interactions are not as useful as in the Content-Based case. Instead, the system looks for similarity between the user's current request or needs and the description of the items.
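A minimal sketch of this idea, under the assumption that the request consists of hard constraints (a maximum price and a minimum number of seats) plus one soft preference (a fuel type). The `CARS` catalogue, its attributes and the ranking rule are hypothetical and only serve the illustration.

```python
# Hypothetical item descriptions for a car recommender.
CARS = [
    {"name": "car_a", "price": 15000, "seats": 5, "fuel": "petrol"},
    {"name": "car_b", "price": 32000, "seats": 7, "fuel": "diesel"},
    {"name": "car_c", "price": 21000, "seats": 5, "fuel": "electric"},
]


def recommend_knowledge_based(items, max_price, min_seats, preferred_fuel=None):
    """Filter items by the user's hard constraints, then rank the rest.

    Items matching the preferred fuel come first; ties are broken by price.
    No past ratings or interactions of the user are needed.
    """
    feasible = [it for it in items
                if it["price"] <= max_price and it["seats"] >= min_seats]
    return sorted(feasible,
                  key=lambda it: (it["fuel"] != preferred_fuel, it["price"]))


for car in recommend_knowledge_based(CARS, max_price=25000, min_seats=5,
                                     preferred_fuel="electric"):
    print(car["name"])
```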
Neighborhood-Based Collaborative Filtering
These approaches are part of Collaborative Filtering (CF) algorithms, and are often referred to as « Memory-based » methods, as opposed to the « Model-based » methods, among which are Matrix Factorization techniques. The basic idea behind CF, whether the approach is Memory-based or Model-based, is that users who adopted the same behavior in the past will also tend to agree in the future. The term « Collaborative » signifies that the model discovers underlying relations between users and items, and uses them to provide personalized recommendations for each user.
CF methods usually rely only on past interactions and feedback (either implicit or explicit) to build their model, which explains their popularity in the RS field. Compared to previous methods like Demographic or Content-Based Filtering, no additional information or knowledge about items is needed to perform recommendation. Another advantage is that the effectiveness of the RS increases as the user provides more feedback.
Neighborhood-Based approaches were among the first proposed in RS research. They are usually defined in two ways (Desrosiers and Karypis, 2011):
• User-based CF: the assumption here is that similar users share similar interests, and as a consequence, the rating of a user u on an item i which has not been experienced yet can be estimated from the ratings of users similar to u. These similar users are called « neighbors ». To discover neighbors, a similarity function is computed between the active user and the others (e.g. between the corresponding rows of the rating matrix).
• Item-based CF (Deshpande and Karypis, 2004): in a similar way, one can estimate the rating of a user u on the item i from the ratings given by the user on items similar to i. In this case, the « neighbors » are the items similar to i. They are found based on the similarity score computed between the current item and the other items rated by the user (e.g. between the corresponding columns of the rating matrix).
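Both variants can be sketched in a few lines. The example below makes several simplifying assumptions that are not stated in the text: the rating matrix is a list of rows in which 0 marks a missing rating, similarity is the cosine between raw rating vectors, and the prediction is the similarity-weighted average over the k nearest neighbors. The function names and the toy matrix are hypothetical.

```python
from math import sqrt


def cosine_sim(a, b):
    """Cosine similarity between two rating vectors (0 = missing rating)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_user_based(R, u, i, k=2):
    """Estimate R[u][i] from the k users most similar to u who rated item i."""
    candidates = [v for v in range(len(R)) if v != u and R[v][i] > 0]
    neighbors = sorted(((cosine_sim(R[u], R[v]), v) for v in candidates),
                       reverse=True)[:k]
    weight_sum = sum(sim for sim, _ in neighbors)
    if weight_sum <= 0:
        return None  # no usable neighbor: the rating cannot be estimated
    return sum(sim * R[v][i] for sim, v in neighbors) / weight_sum


def predict_item_based(R, u, i, k=2):
    """Item-based variant: same computation on the transposed matrix,
    so neighbors are the items rated by u that are most similar to i."""
    R_t = [list(col) for col in zip(*R)]
    return predict_user_based(R_t, i, u, k)


# Toy rating matrix: rows are users, columns are items.
R = [
    [5, 4, 0],  # user 0 has not rated item 2
    [5, 4, 4],
    [1, 2, 5],
]
print(predict_user_based(R, u=0, i=2))
print(predict_item_based(R, u=0, i=2))
```

In this toy matrix, user 1 rates the first two items exactly like user 0, so the user-based estimate leans toward user 1's rating of item 2 rather than user 2's.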
Table of contents :
1.2 Overview of the thesis
2 Recommendation Systems
2.1 Definition: The Recommendation Problem
2.1.2 Representation and Solutions
2.1.3 Goals of Recommendation Systems
2.2 Approaches to Recommendation
2.2.1 Content-Based Filtering
2.2.2 Demographic Filtering
2.2.3 Knowledge-Based Recommendation
2.2.4 Neighborhood-Based Collaborative Filtering
2.3 Collaborative Filtering: Matrix Factorization Techniques
2.3.1 Matrix Factorization and Low-Rank Approximation
2.3.2 Matrix Factorization Methods
2.4 Cold Start Issue
2.4.1 Hybrid Recommendation Systems
2.4.2 Active Learning
2.5 Incorporate Contextual Information
2.6 Evaluation of Recommendation Systems
2.6.2 Evaluation Scenarios
2.7 Learning to Rank
3 Recommendation as a Multi-Armed Bandit
3.1 Sequential Recommendation
3.1.1 Formulation of the Sequential Recommendation Scenario
3.3 Multi-Armed Bandits
3.4 Explore-exploit Recommendation System
3.5 Experimental Investigation
3.5.1 Experimental Setting and Remarks Datasets
3.5.3 Impact of Exploration
3.5.4 Impact of the Update Strategy
3.6 Concluding Remarks
4 Ranking Using (No-)Click Implicit Feedback
4.1 Sequential Recommendation of Multiple Items
4.2.1 Online Ranking in Information Retrieval
4.2.2 Recommendation with Ranking Approaches
4.2.3 Mixing Explicit and Implicit Data
4.3 Ranking Recommender System Using Click Feedback
4.3.2 Feature Engineering
4.3.3 Dual Matrix Factorization
4.4 Experimental Investigation with ERR Click Model
4.4.1 Evaluation Metrics
4.4.4 Results and Discussion
4.5 Experimentation with Other Click Models
4.5.1 The navigational click model
4.5.2 The informational click model
4.5.3 The almost random click model
4.6 Concluding Remarks
5 About Challenges in Real Recommendation Systems
5.1 Some Aspects of Real-world RS
5.1.1 Power-law Distributions
5.1.2 « Replay » Aspect
5.1.3 Large Set of Possible Recommendations
5.1.4 Stock Availability
5.1.5 Past and New Users/Items
5.1.6 The Influence of the Recommendation System
5.2 Realistic Offline Sequential Recommendation
5.2.2 Results and Discussion
Impact of Exploration
Update of the Model
5.2.3 Final Remarks
5.3 Some Lessons from a Real Case RS Challenge
5.3.1 RecSys Challenge 2014: Data and Protocol
5.3.3 Experiments and Discussion
5.4 Concluding Remarks
6.1 Thesis Contributions
A About UCB1 and Popular baselines