Implemented Spark item to item recommenders #1809
@@ -488,3 +524,72 @@ def recommend_k_items(
            )
        else:
            raise ValueError("No cache_path specified")

    def get_topk_most_similar_users(self, test, user, top_k=10):
One question: have you checked that this version and the CPU version produce the same results?
Yes, we are using the same test cases as for the CPU version.
        # compute item frequencies
        self.item_frequencies = item_cooccurrence.filter(
            F.col("i1") == F.col("i2")
        ).select(F.col("i1").alias("item_id"), F.col("value").alias("frequency"))
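The filter above keeps only the diagonal of the co-occurrence table (rows where i1 equals i2), so each remaining value is the number of users who interacted with that item. A minimal plain-Python sketch of that idea follows; it is not the SAR+ Spark code, and the helper names `item_cooccurrence` and `item_frequencies` as well as the sample data are assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def item_cooccurrence(interactions):
    """Count, for each ordered pair (i1, i2), the users who interacted with both.

    `interactions` is an iterable of (user, item) tuples (hypothetical input
    format chosen for this sketch).
    """
    items_by_user = {}
    for user, item in interactions:
        items_by_user.setdefault(user, set()).add(item)

    counts = Counter()
    for items in items_by_user.values():
        # Every ordered pair, including (i, i), which forms the diagonal.
        counts.update(product(items, repeat=2))
    return counts


def item_frequencies(cooccurrence):
    """Keep only the diagonal entries: item -> number of distinct users."""
    return {i1: value for (i1, i2), value in cooccurrence.items() if i1 == i2}


# Hypothetical sample data: three users, three items.
interactions = [("u1", "a"), ("u1", "b"), ("u2", "a"), ("u3", "a"), ("u3", "c")]
freq = item_frequencies(item_cooccurrence(interactions))
```

With this sample data, item "a" is seen by three users and items "b" and "c" by one user each, which is what the diagonal reports.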
Question for @simonzhaoms: will the item-to-item function have any relationship to the work you are doing on the new similarities?
        if not items:
            raise ValueError("Not implemented")

        return self.item_frequencies.orderBy("frequency", ascending=False).limit(top_k)
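The return statement above sorts items by frequency in descending order and keeps the first top_k rows. A minimal plain-Python sketch of the same ranking step follows, assuming the frequencies are held in a dict; the function name `popularity_based_topk` and the sample values are hypothetical and not part of the SAR+ API.

```python
def popularity_based_topk(frequencies, top_k=10):
    """Return the top_k (item_id, frequency) pairs, most frequent first.

    `frequencies` maps item_id -> frequency (hypothetical input format).
    """
    ranked = sorted(frequencies.items(), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]


# Hypothetical sample frequencies with no ties.
sample_frequencies = {"a": 3, "b": 1, "c": 2}
top_two = popularity_based_topk(sample_frequencies, top_k=2)
```

One difference worth keeping in mind: Python's sort is stable, so tied items keep their input order here, whereas a Spark orderBy on frequency alone does not guarantee any particular order among ties.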
I noticed that the sarplus tests are triggered, but the badge on the main page is failing: https://github.com/microsoft/recommenders/tree/staging/contrib/sarplus @simonzhaoms, do you know what the problem is?
Description
Implemented get_topk_most_similar_users() and get_popularity_based_topk() for SAR+.
Related Issues
Checklist:
staging branch and not to main branch.