For this task, we are required to recommend news articles to users with a policy that learns the users' preferences. We first tried the epsilon-greedy algorithm, then the linear UCB algorithm, and finally used the hybrid UCB algorithm to pass the hard baseline. Epsilon-greedy performed so poorly that it only gave us a score of around 0.03, so we moved to linear UCB. We initially expected a large improvement from replacing the explicit split between exploration and exploitation with a confidence interval, but the score did not improve much and did not even reach the easy baseline. We therefore tried hybrid UCB, which eventually helped us pass the hard baseline.

Besides the advantages of hybrid UCB itself, we applied two tricks that differ from the hybrid UCB algorithm taught in class and that enabled us to reach the hard baseline. The first trick is the setting of the payoff parameter r. Instead of only checking whether the recommendation matches the log, we also use different values of r when it does match, which effectively acts as a learning rate for the weight vector. This trick gave us a drastic improvement. The second trick is that we did not follow the formula for alpha directly but used grid search to find its optimal value. This reduces the time needed to find the optimal parameters and therefore helps complete the task within the given time constraint.
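The two tricks can be illustrated with a minimal sketch. It is not the code from this repository: for brevity it uses the simpler disjoint linear UCB rather than hybrid UCB, replays a synthetic log generated by the hypothetical helper `make_log`, and all names and values (`r_click`, `r_noclick`, the grid of `alphas`) are assumptions chosen for illustration.

```python
import numpy as np


class LinUCBArm:
    """Per-article ridge-regression state for disjoint linear UCB."""

    def __init__(self, dim):
        self.A = np.eye(dim)    # d x d design matrix, identity acts as ridge prior
        self.b = np.zeros(dim)  # sum of payoff-weighted feature vectors

    def ucb(self, x, alpha):
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b
        # Estimated payoff plus alpha times the confidence width.
        return theta @ x + alpha * np.sqrt(x @ A_inv @ x)

    def update(self, x, r):
        self.A += np.outer(x, x)
        self.b += r * x  # a larger |r| moves the weight vector further


def recommend(arms, x, alpha):
    scores = [arm.ucb(x, alpha) for arm in arms]
    return int(np.argmax(scores))


def replay(log, dim, n_articles, alpha, r_click, r_noclick):
    """Offline replay: learn only from events where our pick matches the log.

    Returns the click-through rate over the matched events.
    """
    arms = [LinUCBArm(dim) for _ in range(n_articles)]
    matched = clicks = 0
    for x, shown, clicked in log:
        choice = recommend(arms, x, alpha)
        if choice != shown:
            continue  # the log tells us nothing about an article it did not show
        matched += 1
        clicks += clicked
        # Trick 1 (assumed form): distinct payoffs for click and no-click.
        arms[choice].update(x, r_click if clicked else r_noclick)
    return clicks / matched if matched else 0.0


def make_log(n_events, dim, n_articles, seed=0):
    """Hypothetical synthetic log of (user_features, shown_article, clicked)."""
    rng = np.random.default_rng(seed)
    true_theta = rng.normal(size=(n_articles, dim))
    log = []
    for _ in range(n_events):
        x = rng.normal(size=dim)
        x /= np.linalg.norm(x)
        shown = int(rng.integers(n_articles))  # logging policy: uniform random
        p_click = 1.0 / (1.0 + np.exp(-3.0 * true_theta[shown] @ x))
        log.append((x, shown, int(rng.random() < p_click)))
    return log


def grid_search_alpha(log, dim, n_articles, alphas, r_click, r_noclick):
    """Trick 2: pick alpha by replay score instead of the closed-form formula."""
    scores = {a: replay(log, dim, n_articles, a, r_click, r_noclick)
              for a in alphas}
    best = max(scores, key=scores.get)
    return best, scores
```

For example, `grid_search_alpha(make_log(2000, 6, 5), 6, 5, [0.05, 0.2, 1.0], r_click=10.0, r_noclick=-1.0)` returns the best-scoring alpha together with the score for every alpha tried. The payoff values in that call are placeholders, not the ones used for the submission.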
cwuu/DataMining-LearningFromLargeDataSet-Task4
About
ETH Zurich Fall 2017