This week's Featured Blog Friday comes from our Data Scientist, Agnes Jóhannsdóttir. If you have any questions or comments regarding this blog, feel free to leave a comment below or send us a tweet @AGRDynamics.
Nowadays, recommender systems are used to personalise your experience on the web, telling you what to buy, whom you should be friends with or what to listen to. People’s preferences are diverse, but they usually follow patterns. People tend to like things that are similar to other things they like, and to share the tastes of other people whose behaviour closely resembles theirs. Recommender systems try to capture these patterns to predict what else people might like. We encounter recommender systems every day. For example, 35 percent of what we purchase on Amazon and 75 percent of what we watch on Netflix reportedly come from product recommendations based on machine learning algorithms.
Today we will dig deeper into different kinds of recommender systems, which are an application of machine learning. To get a better understanding of what machine learning is, you can read our previous Featured Blog Friday here.
Implicit vs. Explicit data
Recommender systems rely on several types of input. Explicit feedback is when users explicitly express an opinion about the item, for example star ratings. Explicit feedback generally provides a high-quality signal, but gathering a large amount of explicit feedback is difficult and time-consuming. Therefore, using implicit feedback as the input to recommender systems is gaining more popularity, where applicable. Using implicit data, recommenders can infer user preferences by synthesizing several instances of user behaviour. Examples of implicit feedback include transaction history, browsing history, scrolls, search patterns, view times and mouse movements. As no additional action is required from users, collecting implicit feedback is faster and often cheaper. Most implicit feedback comes in the form of positive-only data, such as whether the user bought or clicked on an item.
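To make the idea of implicit feedback more concrete, here is a minimal Python sketch of how a log of user behaviour could be turned into per-user item scores and a positive-only view. The event log, the user and item names and the event weights are all made up for illustration; in practice the weighting would depend on your own data.

```python
from collections import defaultdict

# Hypothetical event log: (user, item, event type). All names are made up.
events = [
    ("anna", "tent", "view"),
    ("anna", "tent", "purchase"),
    ("anna", "stove", "view"),
    ("bjorn", "stove", "purchase"),
    ("bjorn", "tent", "view"),
]

# Assumed weights: a purchase is treated as a stronger signal than a view.
EVENT_WEIGHTS = {"view": 1.0, "purchase": 5.0}


def build_implicit_scores(events, weights):
    """Sum the event weights per (user, item) pair into a single score."""
    scores = defaultdict(lambda: defaultdict(float))
    for user, item, event_type in events:
        if event_type not in weights:
            continue  # ignore event types we have no weight for
        scores[user][item] += weights[event_type]
    return {user: dict(items) for user, items in scores.items()}


implicit_scores = build_implicit_scores(events, EVENT_WEIGHTS)

# Positive-only view: which items each user has interacted with at all.
positive_only = {user: set(items) for user, items in implicit_scores.items()}
```

Note that nothing in this data says a user dislikes an item. A missing entry only means that no interaction was observed.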
Two of the most ubiquitous types of recommender systems are Content-Based and Collaborative Filtering (CF). Collaborative filtering produces recommendations based on knowledge of users’ attitudes towards items, that is, it uses the “wisdom of the crowd” to recommend items. In contrast, content-based recommender systems focus on the attributes of the items and give you recommendations based on the similarity between them. Models that use both user behaviour and content features are called Hybrid Recommender Systems, and they often combine collaborative filtering with content-based models. Hybrid recommender systems usually give higher accuracy than collaborative filtering or content-based models on their own, since they combine the strengths of both.
Content-based recommender system
Content-based recommender systems focus on the characteristics of the items or users in order to recommend new items with similar properties. As an example, a product profile could include a description, price, colour, brand, etc. User and item profiles usually include text data, which needs to be converted into features using feature extraction techniques. As the metadata is known in advance, recommendations are also available for new users and items for which no behaviour data has been collected yet. One major downside of content-based recommender systems is that they do not take the behaviour of other users into account.
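As a rough illustration of the content-based idea, the Python sketch below turns product descriptions into simple word-count features and ranks items by cosine similarity. The product catalogue is hypothetical, and bag-of-words is only one of many possible feature extraction techniques; a real system would likely use something richer, such as TF-IDF weighting, and include attributes like price or brand.

```python
import math
from collections import Counter

# Hypothetical product descriptions, made up for this example.
products = {
    "red_tent": "lightweight red camping tent",
    "blue_tent": "lightweight blue camping tent",
    "gas_stove": "compact gas stove for cooking",
}


def to_features(text):
    """Bag-of-words feature extraction: map each word to its count."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similar_items(item_id, products, top_n=2):
    """Rank all other items by how similar their description is."""
    target = to_features(products[item_id])
    scored = [
        (other, cosine(target, to_features(text)))
        for other, text in products.items()
        if other != item_id
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


recommendations = similar_items("red_tent", products)
```

Here the blue tent ranks first for the red tent because the two descriptions share three of their four words, while the stove shares none. No user behaviour is involved at any point, which is why this approach also works for a brand new item.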
Collaborative filtering
Collaborative filtering uses a user’s past behaviour to recommend new items. It analyses the relationships between users and the inter-dependencies among items in order to recommend a new item. The only required information is the previous interaction history (for example, sales history). The algorithm has the interesting property of being able to do feature learning on its own, and it is domain free, which means that the nature and characteristics of the items you are trying to predict do not matter. Collaborative filtering yields great results when relevant data is available. However, it has one major drawback: performance suffers dramatically when making predictions for new items or new users. This is generally called the cold-start problem. In such cases a content-based approach would fare better, as it relies only on the characteristics of an item or user when generating recommendations.
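One simple way to sketch this in Python is item-based collaborative filtering on positive-only purchase data: two items are considered similar when the same users bought both. The purchase history below is invented for illustration, and this neighbourhood approach is just one member of the collaborative filtering family, not necessarily what a production system would use.

```python
import math

# Hypothetical purchase history: user -> set of items bought.
purchases = {
    "anna": {"tent", "stove", "lantern"},
    "bjorn": {"tent", "stove"},
    "carla": {"tent", "lantern"},
    "david": {"stove"},
}


def item_users(purchases):
    """Invert the history: item -> set of users who bought it."""
    inverted = {}
    for user, items in purchases.items():
        for item in items:
            inverted.setdefault(item, set()).add(user)
    return inverted


def item_similarity(users_a, users_b):
    """Cosine similarity between two binary purchase vectors."""
    if not users_a or not users_b:
        return 0.0
    return len(users_a & users_b) / math.sqrt(len(users_a) * len(users_b))


def recommend(user, purchases, top_n=3):
    """Score each unseen item by its similarity to the user's own items."""
    owned = purchases[user]
    by_item = item_users(purchases)
    scores = {}
    for candidate, candidate_users in by_item.items():
        if candidate in owned:
            continue
        scores[candidate] = sum(
            item_similarity(candidate_users, by_item[item]) for item in owned
        )
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)[:top_n]


cf_recommendations = recommend("david", purchases)
```

For David, who has only bought a stove, the tent ranks above the lantern because more stove buyers also bought a tent. The code never looks at what the items actually are, only at who bought them. A user with an empty history would get a score of zero for every item, which is the cold-start problem in miniature.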
Hybrid recommender system
In general, hybrid recommender systems combine two or more recommendation techniques with the main goal of gaining better performance. The most popular approach is to combine collaborative filtering with content-based models, so that both user feedback and content or demographic features are used. Hybrid recommender systems usually give higher accuracy than collaborative filtering or content-based models on their own. The main reason is that hybrid models are often better at addressing the cold-start problem. When there are no observations, or only a few, for a user or an item, the content or demographic data can be used to make a prediction.
Today you have learned about the most common types of recommender systems, as well as the difference between implicit and explicit data. If you are interested in understanding how content-based or collaborative filtering algorithms work in more detail, please leave a comment and we will be happy to include it in a future Featured Blog Friday post.
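A very simple way to picture a hybrid model is a weighted blend that leans on the content-based score while there is little interaction data, and shifts towards the collaborative filtering score as observations accumulate. The Python sketch below assumes both scores are already on the same 0 to 1 scale. The function name and the `full_trust_at` threshold are hypothetical choices made for this example, and real hybrid systems often combine the two signals inside a single model instead.

```python
def hybrid_score(cf_score, content_score, n_interactions, full_trust_at=5):
    """Blend two scores, trusting collaborative filtering more as data grows.

    full_trust_at is an assumed tuning parameter: the number of observed
    interactions at which the collaborative filtering score is used alone.
    """
    if full_trust_at <= 0:
        raise ValueError("full_trust_at must be positive")
    cf_weight = min(max(n_interactions, 0), full_trust_at) / full_trust_at
    return cf_weight * cf_score + (1.0 - cf_weight) * content_score


# New user with no history: the prediction relies entirely on content.
cold_start = hybrid_score(cf_score=0.0, content_score=0.8, n_interactions=0)

# User with some history: both signals contribute.
mixed = hybrid_score(cf_score=0.6, content_score=0.8, n_interactions=2)

# User with plenty of history: collaborative filtering takes over.
warm = hybrid_score(cf_score=0.6, content_score=0.8, n_interactions=5)
```

With no observations, the blended score equals the content-based score, so a new user or item still receives a usable prediction.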