Source code of Multi-Task Recommendations with Reinforcement Learning
Code for RetailRocket Dataset.
Google Drive link for processed RetailRocket data: https://drive.google.com/file/d/1THRWKttdpmcNaEc1DtKwxgYlV8RLMtV5/view?usp=sharing
- layers: stores common network structures
  - critic: critic network
  - esmm: ESMM (actor) network; other MTL models from slmodels can be used as the actor instead
  - layers: classical Embedding layers and MLP layers
- slmodels: SL baseline models
- agents: RL models
- train: training-related configuration
- env.py: offline sampling simulation environment
- RLmain.py: main RL training program
- SLmain.py: main SL training program
- dataset
  - rtrl: RetailRocket dataset, converted to MDP format: [timestamp,sessionid,itemid,pay,click], [itemid,feature1,feature2,..]; split 6:2:2
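As a rough illustration of the MDP-style layout described above, the sketch below groups `[timestamp,sessionid,itemid,pay,click]` rows by session, orders them by time, and emits one transition per step, then splits a list of session ids 6:2:2. The `Transition` fields, the `to_transitions` and `split_622` helpers, the `-1` end-of-session marker, and the choice to split by session are all assumptions made for illustration; the repository's own preprocessing may differ.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple, Sequence, Tuple


class Transition(NamedTuple):
    """One step of a session (hypothetical layout, not the repo's own)."""
    session_id: int
    item_id: int               # item at the current step
    reward: Tuple[int, int]    # (click, pay) labels, one per task
    next_item_id: int          # item at the next step, -1 at session end
    done: bool                 # True on the last step of the session


def to_transitions(rows: Iterable[Tuple[int, int, int, int, int]]) -> List[Transition]:
    """rows are (timestamp, sessionid, itemid, pay, click) tuples."""
    sessions = defaultdict(list)
    for ts, sid, iid, pay, click in rows:
        sessions[sid].append((ts, iid, pay, click))

    out: List[Transition] = []
    for sid in sorted(sessions):
        events = sorted(sessions[sid])  # order each session by timestamp
        for i, (_, iid, pay, click) in enumerate(events):
            last = i == len(events) - 1
            nxt = -1 if last else events[i + 1][1]
            out.append(Transition(sid, iid, (click, pay), nxt, last))
    return out


def split_622(ids: Sequence[int]) -> Tuple[List[int], List[int], List[int]]:
    """Split ids 6:2:2 in their given order (assumed train/val/test)."""
    ids = list(ids)
    a = len(ids) * 6 // 10
    b = len(ids) * 8 // 10
    return ids[:a], ids[a:b], ids[b:]


if __name__ == "__main__":
    demo = [(2, 1, 20, 0, 1), (1, 1, 10, 0, 0), (1, 2, 30, 1, 1)]
    for t in to_transitions(demo):
        print(t)
    print(split_622(range(10)))
```

Splitting by session id rather than by individual row keeps every session's transitions in a single partition, which is one common way to avoid leakage between splits.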
python3 SLmain.py --model_name=esmm
python3 RLmain.py
python3 SLmain.py --model_name=esmm --polish=1
test: best auc: 0.732444172986328 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 134/134 [00:07<00:00, 19.14it/s] task 0, AUC 0.7273702846096346, Log-loss 0.20675417715656488 task 1, AUC 0.7247954179346048, Log-loss 0.048957254763240504
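The log above reports AUC and log-loss separately for each task. For readers who want to sanity-check such numbers, here is a minimal, dependency-free sketch of both metrics. The function names `auc` and `log_loss` and the toy labels and scores are hypothetical; this is not the evaluation code used in the repository, and the pairwise AUC is quadratic in the number of samples, so it is only suitable for small checks.

```python
import math
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def log_loss(labels: Sequence[int], probs: Sequence[float], eps: float = 1e-15) -> float:
    """Mean binary cross-entropy, with probabilities clipped to [eps, 1 - eps]."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


if __name__ == "__main__":
    # Toy data for two tasks; which task is click or pay is an assumption here.
    tasks = {
        0: ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
        1: ([0, 1, 0, 0], [0.2, 0.6, 0.1, 0.3]),
    }
    for task_id, (y, p) in tasks.items():
        print(f"task {task_id}, AUC {auc(y, p)}, Log-loss {log_loss(y, p)}")
```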
Please cite using the BibTeX entry below if you find this code helpful for your research.
@inproceedings{liu2023multi,
title={Multi-Task Recommendations with Reinforcement Learning},
author={Liu, Ziru and Tian, Jiejie and Cai, Qingpeng and Zhao, Xiangyu and Gao, Jingtong and Liu, Shuchang and Chen, Dayou and He, Tonghao and Zheng, Dong and Jiang, Peng and others},
booktitle={Proceedings of the ACM Web Conference 2023},
pages={1273--1282},
year={2023}
}