PaddlePaddle · frankwhzhang · May 16, 2022 · Apr 23, 2022 · Apr 23, 2022 · Apr 23, 2022
diff --git a/README_CN.md b/README_CN.md
@@ -117,7 +117,6 @@ python -u tools/static_trainer.py -m models/rank/dnn/config.yaml #  静态图训
 
 <h2 align="center">Supported Models</h2>
 
-
   | Category | Model | Online Environment | Distributed CPU | Distributed GPU | Supported Version | Paper |
   |:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------:| :-----------------------------------------------------------------------: | :-----: | :-------: | :-------: |:-------: | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
   | Content Understanding | [TextCnn](models/contentunderstanding/textcnn/)([doc](https://paddlerec.readthedocs.io/en/latest/models/contentunderstanding/textcnn.html)) | [Python CPU/GPU](https://aistudio.baidu.com/aistudio/projectdetail/3238415) | ✓ | x | >=2.1.0 | [EMNLP 2014][Convolutional neural networks for sentence classification](https://www.aclweb.org/anthology/D14-1181.pdf) |
@@ -172,6 +171,7 @@ python -u tools/static_trainer.py -m models/rank/dnn/config.yaml #  静态图训
   | Rank | [DCN_V2](models/rank/dcn_v2/) | - | ✓ | ✓ | >=2.1.0 | [WWW 2021][DCN V2: Improved Deep & Cross Network and Practical Lessons for Web-scale Learning to Rank Systems](https://arxiv.org/pdf/2008.13535v2.pdf) |
   | Rank | [AITM](models/rank/aitm/) | - | ✓ | ✓ | >=2.1.0 | [KDD 2021][Modeling the Sequential Dependence among Audience Multi-step Conversions with Multi-task Learning in Targeted Display Advertising](https://arxiv.org/pdf/2105.08489v2.pdf) |
   | Rank | [DSIN](models/rank/dsin/) | - | ✓ | ✓ | >=2.1.0 | [IJCAI 2019][Deep Session Interest Network for Click-Through Rate Prediction](https://arxiv.org/pdf/1905.06482v1.pdf) |
+  | Rank | [SIGN](models/rank/sign/)([doc](https://paddlerec.readthedocs.io/en/latest/models/rank/sign.html)) | [Python CPU/GPU](https://aistudio.baidu.com/aistudio/projectdetail/3869111) | ✓ | ✓ | >=2.1.0 | [AAAI 2021][Detecting Beneficial Feature Interactions for Recommender Systems](https://arxiv.org/pdf/2008.00404v6.pdf) |
   | Multi-Task | [PLE](models/multitask/ple/)([doc](https://paddlerec.readthedocs.io/en/latest/models/multitask/ple.html)) | [Python CPU/GPU](https://aistudio.baidu.com/aistudio/projectdetail/3238938) | ✓ | ✓ | >=2.1.0 | [RecSys 2020][Progressive Layered Extraction (PLE): A Novel Multi-Task Learning (MTL) Model for Personalized Recommendations](https://dl.acm.org/doi/abs/10.1145/3383313.3412236) |
   | Multi-Task | [ESMM](models/multitask/esmm/)([doc](https://paddlerec.readthedocs.io/en/latest/models/multitask/esmm.html)) | [Python CPU/GPU](https://aistudio.baidu.com/aistudio/projectdetail/3238583) | ✓ | ✓ | >=2.1.0 | [SIGIR 2018][Entire Space Multi-Task Model: An Effective Approach for Estimating Post-Click Conversion Rate](https://arxiv.org/abs/1804.07931) |
   | Multi-Task | [MMOE](models/multitask/mmoe/)([doc](https://paddlerec.readthedocs.io/en/latest/models/multitask/mmoe.html)) | [Python CPU/GPU](https://aistudio.baidu.com/aistudio/projectdetail/3238934) | ✓ | ✓ | >=2.1.0 | [KDD 2018][Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts](https://dl.acm.org/doi/abs/10.1145/3219819.3220007) |

diff --git a/README_EN.md b/README_EN.md
@@ -162,6 +162,7 @@ python -u tools/static_trainer.py -m models/rank/dnn/config.yaml #  Training wit
   |   Rank   |                     [DCN_V2](models/rank/dcn_v2/)                     |  -  |       ✓     |     ✓     | >=2.1.0 | [WWW 2021][DCN V2: Improved Deep & Cross Network and Practical Lessons for Web-scale Learning to Rank Systems](https://arxiv.org/pdf/2008.13535v2.pdf)|
   |   Rank   |                                                                          [AITM](models/rank/aitm/)                                                                          |  -  |       ✓     |     ✓     | >=2.1.0 | [KDD 2021][Modeling the Sequential Dependence among Audience Multi-step Conversions with Multi-task Learning in Targeted Display Advertising](https://arxiv.org/pdf/2105.08489v2.pdf)  |
   |   Rank   |                  [DSIN](models/rank/dsin/)                                                                          |  -  |       ✓     |     ✓     | >=2.1.0 | [IJCAI 2019][Deep Session Interest Network for Click-Through Rate Prediction](https://arxiv.org/pdf/1905.06482v1.pdf)  |
+  |   Rank   |                     [SIGN](models/rank/sign/)([doc](https://paddlerec.readthedocs.io/en/latest/models/rank/sign.html))                     |  [Python CPU/GPU](https://aistudio.baidu.com/aistudio/projectdetail/3869111)  |       ✓     |     ✓     | >=2.1.0 | [AAAI 2021][Detecting Beneficial Feature Interactions for Recommender Systems](https://arxiv.org/pdf/2008.00404v6.pdf) |
   |      Multi-Task       |                  [PLE](models/multitask/ple/)<br>([doc](https://paddlerec.readthedocs.io/en/latest/models/multitask/ple.html))                   |  [Python CPU/GPU](https://aistudio.baidu.com/aistudio/projectdetail/3238938)  |     ✓     |     ✓     |  >=2.1.0 | [RecSys 2020][Progressive Layered Extraction (PLE): A Novel Multi-Task Learning (MTL) Model for Personalized Recommendations](https://dl.acm.org/doi/abs/10.1145/3383313.3412236)                                                              |
   |      Multi-Task       |                  [ESMM](models/multitask/esmm/)<br>([doc](https://paddlerec.readthedocs.io/en/latest/models/multitask/esmm.html))                   |  [Python CPU/GPU](https://aistudio.baidu.com/aistudio/projectdetail/3238583)  |         ✓         |     ✓     |      >=2.1.0     | [SIGIR 2018][Entire Space Multi-Task Model: An Effective Approach for Estimating Post-Click Conversion Rate](https://arxiv.org/abs/1804.07931)                                                              |
   |      Multi-Task       |                  [MMOE](models/multitask/mmoe/)<br>([doc](https://paddlerec.readthedocs.io/en/latest/models/multitask/mmoe.html))                   |  [Python CPU/GPU](https://aistudio.baidu.com/aistudio/projectdetail/3238934)  |         ✓         |     ✓     |      >=2.1.0     | [KDD 2018][Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts](https://dl.acm.org/doi/abs/10.1145/3219819.3220007)                                                       |

diff --git a/contributor.md b/contributor.md
@@ -20,5 +20,6 @@
   |                     [FLEN](models/rank/flen/)                     |  [LinJayan](https://github.com/LinJayan)  |    https://github.com/PaddlePaddle/PaddleRec/pull/685   | Paper Reproduction Challenge, Round 5 |
   |                     [MHCN](models/recall/mhcn/)                     |  [Andy1314Chen](https://github.com/Andy1314Chen)  |    https://github.com/PaddlePaddle/PaddleRec/pull/679   | Paper Reproduction Challenge, Round 5 |
   |                     [DCN_V2](models/rank/dcn_v2/)                     |  [LinJayan](https://github.com/LinJayan)  |    https://github.com/PaddlePaddle/PaddleRec/pull/677   | Paper Reproduction Challenge, Round 5 |
+  |                     [SIGN](models/rank/sign/)                     |  [BamLubi](https://github.com/BamLubi)  |    https://github.com/PaddlePaddle/PaddleRec/pull/748   | Paper Reproduction Challenge, Round 6 |
 
 </div> 
diff --git a/datasets/sign/run.sh b/datasets/sign/run.sh
@@ -0,0 +1,2 @@
+wget https://blog.cos.bamlubi.cn/Paddle-SIGN/ml-tag.zip
+unzip ml-tag.zip
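
Review note on `datasets/sign/run.sh`: the script relies on `wget` and `unzip` being installed, which may not hold on every platform listed in the model doc. As a rough sketch only, the same two steps could be written with the Python standard library. The helper names `download` and `extract` are hypothetical and are not part of PaddleRec; only the URL is taken from the script.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL copied from datasets/sign/run.sh in this diff.
DATA_URL = "https://blog.cos.bamlubi.cn/Paddle-SIGN/ml-tag.zip"


def download(url: str, target: Path) -> Path:
    """Fetch url into target unless the file already exists (hypothetical helper)."""
    if not target.exists():
        urllib.request.urlretrieve(url, str(target))
    return target


def extract(archive: Path, dest: Path) -> list:
    """Unpack archive into dest and return the member names (hypothetical helper)."""
    with zipfile.ZipFile(archive) as zf:
        bad = zf.testzip()  # name of the first corrupt member, or None
        if bad is not None:
            raise zipfile.BadZipFile(f"corrupt member: {bad}")
        zf.extractall(dest)
        return zf.namelist()
```

Calling `extract(download(DATA_URL, Path("ml-tag.zip")), Path("."))` from `datasets/sign` would mirror the shell script, with the added integrity check before extraction.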
diff --git a/doc/imgs/sign.png b/doc/imgs/sign.png
diff --git a/doc/source/models/index.rst b/doc/source/models/index.rst
@@ -76,6 +76,7 @@ PaddleRec Model Zoo
    rank/naml.md
    rank/wide_deep.md
    rank/xdeepfm.md
+   rank/sign.md
 
 
 Re-ranking

diff --git a/doc/source/models/rank/sign.md b/doc/source/models/rank/sign.md
@@ -0,0 +1,142 @@
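+# sign (Detecting Beneficial Feature Interactions for Recommender Systems)
+
+Code: [sign](https://github.com/PaddlePaddle/PaddleRec/tree/master/models/rank/sign)  
+If you find our code useful, please give the repository a star.  
+
+## Contents
+
+- [Model Overview](#model-overview)
+- [Data Preparation](#data-preparation)
+- [Environment](#environment)
+- [Quick Start](#quick-start)
+- [Model Architecture](#model-architecture)
+- [Reproducing the Results](#reproducing-the-results)
+- [Advanced Usage](#advanced-usage)
+- [FAQ](#faq)
+
+## Model Overview
+Feature crossing multiplies two or more features together to apply a nonlinear transformation to the sample space, which increases a model's nonlinear capacity and can significantly improve accuracy in recommender systems. Earlier work considers crosses between all features, but some crosses have little bearing on the recommendation result, and the noise they introduce lowers accuracy. The paper [Detecting Beneficial Feature Interactions for Recommender Systems](https://arxiv.org/pdf/2008.00404v6.pdf) therefore proposes L0-SIGN, a model that uses a graph neural network to detect beneficial feature interactions automatically.
+
+The authors model the features of each sample as a graph, associate feature interactions with the edges of that graph, and use the relational reasoning ability of GNNs to model those interactions. Edge prediction with L0 regularization limits the number of edges detected in the graph, which is how the beneficial feature interactions are selected.
+
+This model implements the SIGN model from the following paper:
+
+```text
+@inproceedings{su2021detecting,
+  title={Detecting Beneficial Feature Interactions for Recommender Systems},
+  author={Su, Yixin and Zhang, Rui and Erfani, Sarah and Xu, Zhenghua},
+  booktitle={Proceedings of the 35th AAAI Conference on Artificial Intelligence (AAAI)},
+  year={2021}
+}
+```
+
+## Data Preparation
+
+The paper uses four open datasets: `DBLP_v1`, `frappe`, `ml-tag`, and `twitter`. Here `ml-tag` is used to validate the model. Sample data for a quick run is provided in the `data` directory under the model directory; to use the full dataset, see the [Reproducing the Results](#reproducing-the-results) section below.
+The dataset targets movie tag recommendation, and each data instance represents a graph. The data format is as follows:
+
+```shell
+# movie tag  user ID  movie ID  movie ID
+0.0 24 25 26
+1.0 62 63 64
+```
+## Environment
+PaddlePaddle>=2.0
+
+pgl>=2.2.0
+
+python 2.7/3.5/3.6/3.7
+
+os : windows/linux/macos 
+
+## Quick Start
+This document provides sample data for a quick trial, and the commands can be run from any directory. From the sign model directory, the quick-start commands are as follows:
+```bash
+# Set up the environment: install pgl
+pip install pgl
+
+# Enter the model directory
+cd PaddleRec/models/rank/sign  # can be run from any directory
+# Dynamic-graph training
+python -u ../../../tools/trainer.py -m config.yaml          # run on the sample data
+python -u ../../../tools/trainer.py -m config_bigdata.yaml  # run on the full dataset
+# Dynamic-graph inference
+python -u ../../../tools/infer.py -m config.yaml            # infer on the sample data
+python -u ../../../tools/infer.py -m config_bigdata.yaml    # infer on the full dataset
+```
+
+## Model Architecture
+
+The L0-SIGN model has two modules: an L0 edge prediction module, which predicts edges through matrix factorization of the graph's adjacency matrix, and the SIGN graph classification module. The main network structure is shown in Figure 1 and corresponds one-to-one with the code in `net.py`:
+
+<p align="center">
+<img align="center" src="../../../imgs/sign.png">
+</p>
+
+## Reproducing the Results
+
+To help users get every model running quickly, sample data is provided under each model. To reproduce the results reported in the readme, follow the steps below in order.
+On the full dataset, the model reaches the following metrics:
+
+| Model | auc    | acc    | batch_size | epoch_num | Time per epoch   |
+| :---- | :----- | :----- | :--------- | :-------- | :--------------- |
+| SIGN  | 0.9418 | 0.8927 | 1024       | 40        | about 18 minutes |
+
+1. Make sure your current directory is PaddleRec/models/rank/sign.
+2. Go to the PaddleRec/datasets/sign directory and run the `run.sh` script, which downloads the full sign dataset from a server hosted in mainland China and unzips it into the specified folder.
+
+```bash
+cd ../../../datasets/sign
+bash run.sh
+```
+
+3. Install the dependencies.
+
+```shell
+# Install pgl
+pip install pgl
+```
+
+4. Switch back to the model directory and run on the full dataset.
+
+```bash
+cd - # switch back to the model directory
+# Dynamic-graph training
+python -u ../../../tools/trainer.py -m config_bigdata.yaml # run on the full dataset
+python -u ../../../tools/infer.py -m config_bigdata.yaml # infer on the full dataset
+```
+
+## Advanced Usage
+
+This model supports PaddlePaddle's Training and Inference Pipeline Certification (TIPC) information and test tools, which let users check how well training, inference, and deployment are connected for each model and run one-click tests.
+
+With this tool you can test which features are supported and whether the prediction results are aligned. The test flow is summarized as follows:
+
+1. Run `prepare.sh` to prepare the data and models needed for the test.
+2. Run the test script `test_train_inference_python.sh` to produce a log, which shows whether each configuration ran successfully.
+
+Testing a single feature takes only two commands, in the following format:
+
+```shell
+# Purpose: prepare the data
+# Format: bash + script + argument 1: config file + argument 2: mode
+# Mode [Mode] = 'lite_train_lite_infer' | 'whole_train_whole_infer' | 'whole_infer' | 'lite_train_whole_infer'
+bash test_tipc/prepare.sh configs/[model_name]/[params_file_name] [Mode]
+
+# Purpose: run the test
+# Format: bash + script + argument 1: config file + argument 2: mode
+bash test_tipc/test_train_inference_python.sh configs/[model_name]/[params_file_name] [Mode]
+```
+
+For example, to test the basic training and inference functionality in `lite_train_lite_infer` mode, run:
+
+```shell
+# Make sure the current directory is PaddleRec
+# cd PaddleRec
+# Prepare the data
+bash test_tipc/prepare.sh ./test_tipc/configs/sign/train_infer_python.txt 'lite_train_lite_infer'
+# Run the test
+bash test_tipc/test_train_inference_python.sh ./test_tipc/configs/sign/train_infer_python.txt 'lite_train_lite_infer'
+```
+
+## FAQ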
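
Review note on the data format in `sign.md`: the doc states that each `ml-tag` line is one sample and that each sample represents a graph. The sketch below shows one way to read such a line. It assumes the first field is a float label and the remaining fields are integer feature IDs, and it further assumes (this is not stated in the diff) that the candidate edges are all unordered pairs of a sample's features, from which the L0 edge prediction module would then select. The names `parse_line` and `candidate_edges` are hypothetical and do not refer to code in this PR.

```python
from itertools import combinations
from typing import List, Tuple


def parse_line(line: str) -> Tuple[float, List[int]]:
    """Split one sample line into (label, feature IDs)."""
    parts = line.split()
    return float(parts[0]), [int(p) for p in parts[1:]]


def candidate_edges(nodes: List[int]) -> List[Tuple[int, int]]:
    """All unordered feature pairs of one sample (assumed fully connected start)."""
    return list(combinations(nodes, 2))


# Sample line taken from the data format shown in sign.md.
label, nodes = parse_line("0.0 24 25 26")
edges = candidate_edges(nodes)
```

With three features per sample there are three candidate feature interactions, so the edge prediction step has very few edges to keep or drop on this dataset.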
diff --git a/doc/source/readme.md b/doc/source/readme.md
@@ -50,3 +50,4 @@
 [autofis](https://paddlerec.readthedocs.io/en/latest/models/rank/autofis.html)  
 [aitm](https://paddlerec.readthedocs.io/en/latest/models/rank/aitm.html)  
 [dsin](https://paddlerec.readthedocs.io/en/latest/models/rank/dsin.html)  
+[sign](https://paddlerec.readthedocs.io/en/latest/models/rank/sign.html)