Create content_based_filtering.py (Anjaliavv51#387)

Description
Added data collection and recommendation model features. The data collection tracks user interactions (clicks, views, purchases) and stores them in a database. The recommendation model uses collaborative filtering and content-based filtering to provide personalized recommendations.

Related Issues
Closes Anjaliavv51#123 (Replace with the actual issue number if applicable)

Type of PR
[ ] Feature

Screenshots / videos (if applicable)
N/A

Checklist
[X] I have gone through the contributing guide
[X] I have updated my branch and synced it with project main branch before making this PR
[X] I have performed a self-review of my code
[X] I have tested the changes thoroughly before submitting this pull request.
[X] I have provided relevant issue numbers, screenshots, and videos after making the changes.
[X] I have commented my code, particularly in hard-to-understand areas.

Additional context
This feature enhances user experience by providing personalized recommendations and insights into user behavior, which can inform marketing strategies and inventory management.
Naveenkumar30838 · Oct 13, 2024 · 7acf400
2 parents c7249ab + 2002819
commit 7acf400
Showing 1 changed file with 28 additions and 0 deletions.
diff --git a/content_based_filtering.py b/content_based_filtering.py
@@ -0,0 +1,28 @@
+from sklearn.feature_extraction.text import TfidfVectorizer
+from sklearn.metrics.pairwise import linear_kernel
+
+# Example item descriptions
+items = [
+    {'id': 101, 'description': 'Vintage camera from the 1950s'},
+    {'id': 102, 'description': 'Classic vinyl record'},
+    {'id': 103, 'description': 'Retro gaming console'}
+]
+
+# Create TF-IDF matrix
+tfidf = TfidfVectorizer(stop_words='english')
+tfidf_matrix = tfidf.fit_transform([item['description'] for item in items])
+
+# Compute cosine similarity (TfidfVectorizer L2-normalizes rows by default, so the dot product is the cosine)
+cosine_sim = linear_kernel(tfidf_matrix, tfidf_matrix)
+
+# Return the ids of the top_n items most similar to item_id (excluding itself)
+def get_recommendations(item_id, cosine_sim=cosine_sim, top_n=3):
+    idx = next((i for i, d in enumerate(items) if d["id"] == item_id), None)
+    if idx is None:
+        raise ValueError(f"Unknown item id: {item_id}")
+    sim_scores = [(i, s) for i, s in enumerate(cosine_sim[idx]) if i != idx]
+    sim_scores.sort(key=lambda x: x[1], reverse=True)
+    return [items[i]['id'] for i, _ in sim_scores[:top_n]]
+
+# Example usage
+print(get_recommendations(101))
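
Review note: the three sample descriptions share no vocabulary once stop words are removed, so every similarity to item 101 is zero and the printed order reflects list position only. The sketch below is one possible way to make that visible; it is not part of the commit. The names `catalog` and `recommend_with_scores`, the `min_score` parameter, and the extra item 104 are hypothetical, added only so that two descriptions overlap. It also checks that `linear_kernel` agrees with `cosine_similarity` on the default (L2-normalized) TF-IDF output.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity, linear_kernel

# Hypothetical catalogue: same shape as `items` in the diff, plus one extra
# entry (104) so that two descriptions share vocabulary.
catalog = [
    {'id': 101, 'description': 'Vintage camera from the 1950s'},
    {'id': 102, 'description': 'Classic vinyl record'},
    {'id': 103, 'description': 'Retro gaming console'},
    {'id': 104, 'description': 'Vintage film camera lens'},
]

vectorizer = TfidfVectorizer(stop_words='english')
matrix = vectorizer.fit_transform([item['description'] for item in catalog])

# Rows are L2-normalized by default (norm='l2'), so the plain dot product
# computed by linear_kernel matches cosine_similarity.
similarity = linear_kernel(matrix, matrix)
assert np.allclose(similarity, cosine_similarity(matrix))


def recommend_with_scores(item_id, top_n=3, min_score=0.0):
    """Return (id, score) pairs for other items scoring above min_score."""
    ids = [item['id'] for item in catalog]
    if item_id not in ids:
        raise ValueError(f"Unknown item id: {item_id}")
    idx = ids.index(item_id)
    scored = [
        (ids[i], float(score))
        for i, score in enumerate(similarity[idx])
        if i != idx and score > min_score
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


print(recommend_with_scores(101))  # only item 104 shares terms with 101
print(recommend_with_scores(102))  # no overlapping terms, so an empty list
```

Returning the scores and dropping zero-similarity items keeps the function from presenting unrelated items as recommendations; whether an empty result is acceptable for this project is a design choice left to the author.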