Recommendation System in Python


A recommendation system is a tool that suggests items or content to users based on their preferences and past behavior. It uses algorithms to predict what a user is likely to prefer next, so that the most relevant content can be shown to them. This article builds two simple recommendation systems in Python.

Recommendation systems are widely used in industries such as e-commerce, streaming services, and social media, where they recommend products, movies, music, books, and more. Personalized recommendations help foster customer engagement and loyalty and can also boost sales.

Types of Recommendation Systems

Content-Based Recommendation Systems

Content-based systems recommend items that are similar to those a user has previously engaged with. The algorithm analyzes data linked to each item, such as its attributes and the user's ratings, identifies the items that most closely match the user's preferences, and builds a list of suggestions tailored to that user.

Algorithm

  • Step 1 − Import necessary libraries

  • Step 2 − Load the dataset

  • Step 3 − Preprocess data

  • Step 4 − Compute the similarity matrix

  • Step 5 − For each user −

    • Select items they have interacted with

    • For each item selected in the previous sub-step −

      • Retrieve its similarity scores with all other items

      • Compute a weighted average of the similarity scores, using the user's ratings as weights

    • Sort items in descending order based on their weighted similarity scores

    • Recommend the top N items to the user

  • Step 6 − Return recommendations for all users.
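
The per-user scoring in Step 5 can be sketched on toy data. This is a minimal illustration in plain Python, not the implementation used in the example that follows: the item names, the feature vectors, and the helper names cosine and recommend_for_user are all hypothetical, and skipping items the user has already rated is an added assumption.

```python
from math import sqrt

# Hypothetical toy data: each item is described by a small feature vector.
item_features = {
    "item_a": [1.0, 0.0, 0.0],
    "item_b": [0.9, 0.1, 0.0],
    "item_c": [0.0, 1.0, 0.0],
    "item_d": [0.0, 0.9, 0.1],
}

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def recommend_for_user(user_ratings, item_features, top_n=2):
    """Rank unseen items by a rating-weighted average of their similarity
    to the items the user has rated."""
    total_weight = sum(user_ratings.values())
    scores = {}
    for candidate, features in item_features.items():
        if candidate in user_ratings:
            continue  # assumption: do not recommend items already rated
        weighted = sum(
            rating * cosine(features, item_features[rated])
            for rated, rating in user_ratings.items()
        )
        scores[candidate] = weighted / total_weight
    # Sort candidates in descending order of their weighted score.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_n]

# The user rated item_a highly (5) and item_c poorly (1).
print(recommend_for_user({"item_a": 5.0, "item_c": 1.0}, item_features))
# → ['item_b', 'item_d']
```

Because the ratings act as weights, item_b (close to the highly rated item_a) ranks above item_d (close to the poorly rated item_c).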

Example

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Load data
data = pd.read_csv('movies.csv')

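# Compute TF-IDF vectors for each movie description
# (missing descriptions become empty strings so that fit_transform does not fail on NaN)
tfidf = TfidfVectorizer(stop_words='english')
tfidf_matrix = tfidf.fit_transform(data['description'].fillna(''))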

# Compute cosine similarity between all movies
cosine_sim = cosine_similarity(tfidf_matrix, tfidf_matrix)

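# Function to get the 10 movies most similar to the input movie
def get_recommendations(title):
   # Row position of the movie (assumes the default 0..n-1 index from read_csv)
   idx = data[data['title'] == title].index[0]
   # Pair every movie position with its similarity to the input movie
   sim_scores = list(enumerate(cosine_sim[idx]))
   # Highest similarity first
   sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)
   # Skip the first entry (the movie itself) and keep the next 10
   sim_scores = sim_scores[1:11]
   movie_indices = [i[0] for i in sim_scores]
   return data.iloc[movie_indices]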

# Example usage: print the 10 movies most similar to 'The Godfather'
print(get_recommendations('The Godfather'))

We load the movie data from a local CSV file into a dataframe. We then turn the movie descriptions into a TF-IDF matrix with the fit_transform() function and compute the cosine similarity between every pair of movies.

Next, we define a function that takes a movie title as its argument and retrieves the index of that title in the dataframe. If the title is not present, the lookup raises an IndexError.

The function then creates a list of tuples, each consisting of a movie's index and its similarity score with the movie that was passed as the argument. We sort this list by score in descending order, skip the first entry (the movie itself), and keep the next ten. Finally, we obtain the matching movies by indexing the dataframe with those positions.
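
The ranking steps inside the function can be traced on a small, made-up similarity row. The titles and scores below are hypothetical and stand in for one row of the cosine similarity matrix.

```python
# Hypothetical data: similarity of "Movie A" to every title, itself included.
titles = ["Movie A", "Movie B", "Movie C", "Movie D"]
similarity_row = [1.0, 0.2, 0.7, 0.5]

# Pair each position with its score, as list(enumerate(...)) does in the function.
sim_scores = list(enumerate(similarity_row))
# [(0, 1.0), (1, 0.2), (2, 0.7), (3, 0.5)]

# Sort by the score (second element of each tuple), highest first.
sim_scores = sorted(sim_scores, key=lambda pair: pair[1], reverse=True)
# [(0, 1.0), (2, 0.7), (3, 0.5), (1, 0.2)]

# Drop the first entry (the movie itself) and keep the next two.
top_two = sim_scores[1:3]
recommended = [titles[index] for index, _ in top_two]
print(recommended)  # ['Movie C', 'Movie D']
```

Slicing off the first entry assumes the queried movie ranks first. If another movie had an identical description and a lower index, it would tie at 1.0 and sort ahead of the queried movie, so filtering out the queried index explicitly may be the safer choice.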

Output

                                title  \
783                 The Godfather   
1512          The Godfather: Part II   
1103                       Casino   
3509  Things to Do in Denver When   
1246                       Snatch   
3094             Road to Perdition   
2494                     Scarface   
1244                    Following   
2164                       Dancer   
2445        The Day of the Jackal   

Collaborative Filtering Recommendation Systems

Collaborative filtering systems, by contrast, rely on data from other users to produce recommendations. They compare the preferences and behavior of many users and then suggest items that users with similar tastes have liked. Collaborative filtering can be more accurate than a content-based system because it draws on the opinions of many users when producing recommendations.

Algorithm

  • Step 1 − Import the necessary libraries.

  • Step 2 − Load the 'ratings.csv' file, which contains the users' ratings.

  • Step 3 − Create 'user_item_matrix' to convert the rating data into a user-by-movie matrix.

  • Step 4 − Calculate the similarity between users' ratings using cosine similarity.

  • Step 5 − Identify the most similar users.

  • Step 6 − Calculate the average rating that the similar users gave each movie.

  • Step 7 − Select the target user ID.

  • Step 8 − Print the recommended movie IDs and their average ratings.
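
Steps 3 to 5 can be sketched without any libraries. The snippet below is only an illustration under stated assumptions: the users, movies, and ratings are invented, the helper names cosine and most_similar_users are hypothetical, and a missing rating is treated as 0 when computing similarity.

```python
from math import sqrt

# Hypothetical ratings: user -> {movie: rating}; absent movies are unrated.
ratings = {
    "alice": {"m1": 5, "m2": 4},
    "bob": {"m1": 5, "m2": 5, "m3": 4},
    "carol": {"m3": 5, "m4": 4},
}

def cosine(u, v):
    """Cosine similarity of two rating dicts, treating missing ratings as 0."""
    dot = sum(rating * v[movie] for movie, rating in u.items() if movie in v)
    norm_u = sqrt(sum(r * r for r in u.values()))
    norm_v = sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def most_similar_users(target, ratings, n=1):
    """Return the n users whose ratings are most similar to the target's."""
    others = [
        (cosine(ratings[target], user_ratings), user)
        for user, user_ratings in ratings.items()
        if user != target
    ]
    others.sort(reverse=True)  # highest similarity first
    return [user for _, user in others[:n]]

print(most_similar_users("alice", ratings))  # ['bob']
```

Here alice and bob share two highly rated movies, while alice and carol share none, so their similarity is 0.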

Example

import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Load data
ratings_data = pd.read_csv('ratings.csv')

# Create user-item matrix
user_item_matrix = pd.pivot_table(ratings_data, values='rating', index='userId', columns='movieId')

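# Calculate cosine similarity between users
# (unrated movies are NaN in the pivot table, so fill them with 0 for the similarity step only)
user_similarity = cosine_similarity(user_item_matrix.fillna(0))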

# Get top n similar users for each user
def get_top_similar_users(similarity_matrix, user_index, n=10):
    similar_users = similarity_matrix[user_index].argsort()[::-1]
    return similar_users[1:n+1]

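# Get recommended items for a user based on similar users
def get_recommendations(user_id, user_similarity, user_item_matrix, n=10):
   # Convert the user ID into its row position in the matrix
   user_index = user_item_matrix.index.get_loc(user_id)
   similar_users = get_top_similar_users(user_similarity, user_index, n)
   # Average the similar users' ratings per movie (NaN entries are skipped)
   recommendations = user_item_matrix.iloc[similar_users].mean(axis=0).sort_values(ascending=False).head(n)
   return recommendations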

# Example usage
user_id = 1
recommendations = get_recommendations(user_id, user_similarity, user_item_matrix)
print("Top 10 recommended movies for user", user_id)
print(recommendations)

Output

Top 10 recommended movies for user 1
movieId
1196    5.000000
50      5.000000
1210    5.000000
260     5.000000
1198    5.000000
2571    5.000000
527     5.000000
1197    5.000000
2762    5.000000
858     4.961538
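
The example above averages the similar users' ratings over every movie, including movies the target user has already rated. One possible refinement is to skip those movies. The sketch below shows the idea in plain Python; the users, movies, ratings, and the helper name recommend are all hypothetical and are not part of the example above.

```python
# Hypothetical ratings: user -> {movie: rating}; absent movies are unrated.
ratings = {
    "alice": {"m1": 5, "m2": 4},
    "bob": {"m1": 5, "m3": 4, "m4": 2},
    "carol": {"m3": 5, "m4": 3},
}

def recommend(target, neighbours, ratings, n=2):
    """Average the neighbours' ratings per movie, skipping movies the
    target user has already rated, and return the top n."""
    seen = set(ratings[target])
    totals, counts = {}, {}
    for user in neighbours:
        for movie, rating in ratings[user].items():
            if movie in seen:
                continue  # the target user already rated this movie
            totals[movie] = totals.get(movie, 0) + rating
            counts[movie] = counts.get(movie, 0) + 1
    averages = {movie: totals[movie] / counts[movie] for movie in totals}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)[:n]

print(recommend("alice", ["bob", "carol"], ratings))
# → [('m3', 4.5), ('m4', 2.5)]
```

Movie m1 is left out even though bob rated it 5, because alice has already rated it.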

Conclusion

Building a recommendation system can be a complex task for programmers, yet the result is a valuable tool. Python offers a variety of libraries, such as the pandas and scikit-learn packages used in this article, that streamline building and customizing one. As with any coding project, issues can arise during development, for example missing values in the data, so it is important to be aware of them and handle them.

Ultimately, a recommendation system can be a powerful asset, which makes it worthwhile to invest the time and effort needed to build it properly and keep it working well.

Updated on: 07-Aug-2023
