Add FGCNN model #784

Merged
merged 15 commits into from
Jun 15, 2022

yoreG123
Contributor

Add model and datasets

PR changes

Add new model fgcnn; add dataset criteo_fgcnn; add fgcnn tipc config

@@ -0,0 +1,3 @@
wget --no-check-certificate https://paddlerec.bj.bcebos.com/datasets/fgcnn/datapro.zip
unzip -o datapro.zip
Contributor

Rename this file to run.sh.

runner:
  train_data_dir: "data/trainlite"
  train_reader_path: "reader" # importlib format
  use_gpu: True
Contributor

With the demo data, GPU should be disabled by default, and demo training needs to finish within one minute.

# global settings

runner:
  train_data_dir: "data/train"
Contributor

The full-data config needs to read its data from the datasets directory.

  use_gpu: True
  use_auc: True
  train_batch_size: 2000
  epochs: 2
Contributor

The number of epochs for full-data training is inconsistent with what the README describes.


# construct train forward phase
def train_forward(self, dy_model, metrics_list, batch_data, config):
    # dense vector
Contributor

What does this comment mean?

return self.embedding(inputs)


class FGCNNLayer(nn.Layer):
Contributor

Please add more comments to the network definition so that other users can follow the code more easily.


## Runtime environment
PaddlePaddle>=2.0

Contributor

Any additional dependencies must be listed here.

Contributor

The installation method for the dependencies should be listed as well.

sh download.sh
cd ../..
mkdir ./models/rank/fgcnn/data/train
mkdir ./models/rank/fgcnn/data/test
Contributor

There is no need to move the data under the rank directory; the full data can stay directly in the datasets directory. Create the train and test directories in the run.sh script.
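
The directory step requested above could look roughly like the following. This is a minimal sketch, not the PR's actual script: the helper name `prepare_dirs` and the `DATA_ROOT` variable are illustrative assumptions, and the sketch only creates the directories without moving or splitting any data.

```shell
#!/bin/sh
# Sketch of a directory step for run.sh (hypothetical; not taken from the PR).
# Intended to run after the existing wget/unzip lines of the download script.

# prepare_dirs ROOT: create ROOT/train and ROOT/test if they do not exist yet.
prepare_dirs() {
    mkdir -p "${1:-.}/train" "${1:-.}/test"
}

# DATA_ROOT is an assumed variable; it defaults to the current directory.
prepare_dirs "${DATA_ROOT:-.}"
```

Using `mkdir -p` keeps the step safe to re-run, whereas the plain `mkdir` calls in the README snippet fail if the directory already exists.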

Model metrics on the full dataset:
| Model | AUC | batch_size | epoch_num | Time per epoch |
| :------| :------ | :------ | :------| :------ |
| fgcnn | 0.8022 | 2000 | 10 | about 3 hours |
Contributor

The epoch count here does not match the full-data config_bigdata (the table lists 10 epochs, while the config sets epochs: 2).

Contributor

yinhaofeng left a comment

You also need to add entries to the Chinese and English homepage README tables, the contributor table in contributor.md, and the directory structure and documentation under doc/source.

frankwhzhang merged commit d10fd27 into PaddlePaddle:master on Jun 15, 2022