DIEN


Model Structure

Deep Interest Evolution Network (DIEN), proposed by Alibaba in 2018.11, is a click-through rate (CTR) prediction model for the e-commerce industry. It focuses on capturing temporal interests from the user's historical behavior sequence.

Usage

Stand-alone Training

Please prepare the dataset first.
Create a Docker image from the DockerFile.
Choose the DockerFile corresponding to DeepRec (pending) or Google TensorFlow.
```
docker build -t deeprec_model_zoo_dien_training:v1.0 .
```

Run a Docker container.
```
docker run -it deeprec_model_zoo_dien_training:v1.0 /bin/bash
```

Training.
```
cd /root/
python train.py
```
Use the --bf16 argument to enable DeepRec BF16 in the deep model.
```
python train.py --bf16
```
Use the following arguments to set up a custom configuration:
- --data_location: Full path of the train & eval data. Default is ./data.
- --output_dir: Full path to the output directory for logs and the saved model. Default is ./result.
- --steps: Number of training steps on the train dataset. Defaults to 10 epochs.
- --batch_size: Batch size for training. Default is 512.
- --timeline: Interval, in steps, at which the profile hook records a timeline; 0 disables it. Default is 0.
- --save_steps: Interval, in steps, at which checkpoints are saved; 0 disables it. Default is 0.
- --bf16: Enable the BF16 feature in DeepRec. FP32 is used by default.
- --no_eval: Do not evaluate the trained model on the eval dataset.
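
As a rough illustration of how the flags listed above fit together, here is a minimal argparse sketch. It is not taken from train.py: the parser layout, the helper name `resolve_steps`, the `num_train_samples` parameter, and the convention that `--steps 0` means "derive the step count from 10 epochs" are all assumptions made for this example.

```python
import argparse
import math


def build_parser():
    """Hypothetical parser mirroring the documented flags; train.py may differ."""
    parser = argparse.ArgumentParser(description="DIEN training (sketch)")
    parser.add_argument("--data_location", default="./data",
                        help="Full path of train & eval data")
    parser.add_argument("--output_dir", default="./result",
                        help="Full path for logs and the saved model")
    # Assumption: 0 means "not set", so the step count is derived from epochs.
    parser.add_argument("--steps", type=int, default=0,
                        help="Training steps; 0 derives them from 10 epochs")
    parser.add_argument("--batch_size", type=int, default=512)
    parser.add_argument("--timeline", type=int, default=0,
                        help="Record a timeline every N steps; 0 disables")
    parser.add_argument("--save_steps", type=int, default=0,
                        help="Save a checkpoint every N steps; 0 disables")
    parser.add_argument("--bf16", action="store_true",
                        help="Enable BF16; FP32 is used otherwise")
    parser.add_argument("--no_eval", action="store_true",
                        help="Skip evaluation on the eval dataset")
    return parser


def resolve_steps(steps, num_train_samples, batch_size, epochs=10):
    """Return the explicit step count, or enough steps to cover `epochs` passes."""
    if steps > 0:
        return steps
    return math.ceil(num_train_samples * epochs / batch_size)


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--bf16", "--batch_size", "256"])
print(args.bf16, args.batch_size, args.output_dir)
print(resolve_steps(args.steps, num_train_samples=1000, batch_size=args.batch_size))
```

With 1000 hypothetical training samples and a batch size of 256, ten epochs round up to 40 steps.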

Benchmark

Stand-alone Training

Test Environment

The benchmark is performed on the Alibaba Cloud ECS general purpose instance family with high clock speeds - hfg7.

Hardware
- CPU: Intel(R) Xeon(R) Platinum 8369HB CPU @ 3.30GHz
- vCPU(s): 8
- Socket(s): 1
- Core(s) per socket: 4
- Thread(s) per core: 2
- Memory: 32G
Software
- kernel: 4.18.0-305.3.1.el8.x86_64
- OS: CentOS 8.4.2105
- GCC: 8.4.1
- Docker: 20.10.8
- Python: 3.6.9

Performance Result

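| Model | Framework | DType | Accuracy | AUC | Global step/sec |
| --- | --- | --- | --- | --- | --- |
| DIEN | Community TensorFlow | FP32 | | | |
| | DeepRec w/ oneDNN | FP32 | | | |
| | DeepRec w/ oneDNN | FP32+BF16 | | | |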

Community TensorFlow version is v1.15.
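
For readers unfamiliar with the reported metrics, the following is a small pure-Python sketch of how accuracy and AUC can be computed from labels and predicted click probabilities. It is an illustration only: the function names and the 0.5 threshold are assumptions, and train.py may compute these values differently (for example with TensorFlow's built-in metrics).

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    correct = sum(1 for y, s in zip(labels, scores) if int(s >= threshold) == y)
    return correct / len(labels)


def auc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(P * N), which is fine
    for a small illustration but too slow for a full evaluation set.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: two clicked samples (label 1) and two non-clicked samples (label 0).
labels = [1, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.2]
print(accuracy(labels, scores))  # 0.75
print(auc(labels, scores))       # 0.75
```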

Dataset

The Books subset of the Amazon Dataset is used as the benchmark dataset.

Prepare

Put the data files into ./data/
For details on downloading the data, see Data Preparation.

Fields

cat_voc.pkl: Contains a list of book categories.
mid_voc.pkl: Contains a list of item IDs (book IDs).
uid_voc.pkl: Contains a list of user IDs.
reviews-info: Contains a list of user reviews.
Each record has the form: <user id> <item id> <rating score> <timestamp>
item-info: Contains the mapping between item IDs and categories.
Each record has the form: <item id> <categories>
local_train_splitByUser & local_test_splitByUser: The train and evaluation datasets, consisting of the user ID, item info, and the user's historical behavior.
Each record has the form: <label> <user id> <item id> <categories> <history item id list> <history item categories list>
The entries in each history list are separated by the non-printable control character '\x02'.
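
The record layout of local_train_splitByUser and local_test_splitByUser can be sketched as a small parser. This is a hedged example rather than the loader used by train.py: the `Sample` type and `parse_sample` function are hypothetical, and both separators are passed as parameters because their defaults (a tab between fields, '\x02' within the history lists) are assumptions that should be checked against the actual files.

```python
from typing import List, NamedTuple


class Sample(NamedTuple):
    label: int
    user_id: str
    item_id: str
    category: str
    hist_item_ids: List[str]
    hist_categories: List[str]


def parse_sample(line, field_sep="\t", hist_sep="\x02"):
    """Split one record into its six fields.

    Both separators are assumptions; adjust them to match the data files.
    """
    fields = line.rstrip("\n").split(field_sep)
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    label, user_id, item_id, category, hist_items, hist_cats = fields
    hist_item_ids = hist_items.split(hist_sep) if hist_items else []
    hist_categories = hist_cats.split(hist_sep) if hist_cats else []
    if len(hist_item_ids) != len(hist_categories):
        raise ValueError("history item and category lists differ in length")
    return Sample(int(label), user_id, item_id, category,
                  hist_item_ids, hist_categories)


# A made-up record: user U1 reviewed B1 and B2 before the candidate item B3.
line = "1\tU1\tB3\tFiction\tB1\x02B2\tFiction\x02History\n"
sample = parse_sample(line)
print(sample.label, sample.item_id, sample.hist_item_ids, sample.hist_categories)
```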

Processing

Reviews are regarded as behaviors, and the reviews from each user are sorted by time. Assuming user u has T behaviors, the first T-1 behaviors are used to predict whether user u will write the T-th review.
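
The split described above can be sketched as follows. This is an illustration under stated assumptions, not the repository's preprocessing script: `build_samples` is a hypothetical helper, the input tuples follow the reviews-info record layout, and only the positive sample per user is built. How negative samples (label 0) are generated is not described here and is left out.

```python
from collections import defaultdict


def build_samples(reviews):
    """Build one (user_id, history_item_ids, target_item_id) triple per user.

    `reviews` is an iterable of (user_id, item_id, rating, timestamp) tuples.
    Users with fewer than two reviews are skipped because they have no history.
    """
    by_user = defaultdict(list)
    for user_id, item_id, _rating, timestamp in reviews:
        by_user[user_id].append((timestamp, item_id))

    samples = []
    for user_id, events in by_user.items():
        events.sort()  # chronological order; ties are broken by item id
        items = [item_id for _ts, item_id in events]
        if len(items) < 2:
            continue
        # The first T-1 behaviors form the history; the T-th is the target.
        samples.append((user_id, items[:-1], items[-1]))
    return samples


# Made-up reviews, deliberately out of order for user U1.
reviews = [
    ("U1", "B2", 4.0, 200),
    ("U1", "B1", 5.0, 100),
    ("U1", "B3", 3.0, 300),
    ("U2", "B9", 5.0, 50),
]
print(build_samples(reviews))  # [('U1', ['B1', 'B2'], 'B3')]
```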

TODO LIST

Distributed training and benchmark
