forked from DeepRec-AI/DeepRec
[Embedding] Add a list of fused embedding ops. (DeepRec-AI#2)
1. Added the Python API of the fused embedding.
2. Add fill_empty_row and prune_invalid_id to fused embedding lookup
3. Add fused embedding modelzoo perf test benchmark

Co-authored-by: Randy Wang <ruotongw@nvidia.com>
Showing 46 changed files with 4,084 additions and 1,099 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
*.csv | ||
*/result/model_* | ||
record.py |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
# DLRM with GPU Fused Embedding | ||
|
||
The model structure, hyperparameters, dataset, etc. are the same as in [DLRM](../../../DLRM/README.md). Follow the instructions there to prepare, set up, and run the model. | ||
|
||
The only difference is that this model uses GPU Fused Embedding to accelerate the embedding lookup. The only code change is: | ||
|
||
```python | ||
categorical_embedding_column = tf.feature_column.embedding_column( | ||
categorical_column, dimension=16, combiner='mean', | ||
do_fusion=True) | ||
``` | ||
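To make clear what the lookup computes, here is a minimal, unfused reference sketch in plain Python of a `combiner='mean'` embedding lookup with the two options named in the commit message. It is not DeepRec's GPU kernel. The function name `reference_mean_lookup`, its parameters, and the conventions used here (an invalid id is a negative id; an empty row is filled with `default_id`, or with zeros when none is given) are assumptions for illustration only.

```python
from typing import List, Optional, Sequence


def reference_mean_lookup(
    table: Sequence[Sequence[float]],
    ids_per_row: Sequence[Sequence[int]],
    default_id: Optional[int] = None,
    prune_invalid_ids: bool = True,
) -> List[List[float]]:
    """Hypothetical reference for a mean-combined sparse embedding lookup.

    table:        embedding table, one vector per id.
    ids_per_row:  the ids looked up for each example in the batch.
    default_id:   id used for rows left empty (assumed fill_empty_row behavior).
    """
    dim = len(table[0])
    output = []
    for ids in ids_per_row:
        if prune_invalid_ids:
            # Assumption: "invalid" means a negative id.
            ids = [i for i in ids if i >= 0]
        if not ids:
            if default_id is None:
                # No default id: emit a zero vector for the empty row.
                output.append([0.0] * dim)
                continue
            ids = [default_id]
        # Sum the selected embedding vectors, then divide by the count.
        acc = [0.0] * dim
        for i in ids:
            for d in range(dim):
                acc[d] += table[i][d]
        output.append([v / len(ids) for v in acc])
    return output


table = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
batch = [[0, 2], [-1], [1]]
print(reference_mean_lookup(table, batch, default_id=0))
# [[3.0, 4.0], [1.0, 2.0], [3.0, 4.0]]
```

A fused op would perform the pruning, filling, gathering, and averaging in fewer kernel launches than a chain of separate ops; the sketch only pins down the expected result.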
|
||
## Benchmark | ||
|
||
Measured on an A100-80GB-PCIE GPU with an 8-core AMD EPYC 7232P CPU @ 3.20GHz, averaged over 5000 iterations: | ||
|
||
| Lookup | Avg Time per Iteration | | ||
| ------- | ---------------------- | | ||
| Unfused | 37.15 ms | | ||
| Fused | 31.43 ms | | ||
| Speedup | 1.18x | |
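The speedup figure follows directly from the two timings in the table. A short check, using only the numbers reported above (the variable names are illustrative):

```python
# Timings copied from the benchmark table (milliseconds per iteration).
unfused_ms = 37.15
fused_ms = 31.43

# Speedup is the ratio of the old time to the new time.
speedup = unfused_ms / fused_ms
# Equivalent view: fraction of per-iteration time saved.
saved_pct = (unfused_ms - fused_ms) / unfused_ms * 100

print(f"speedup: {speedup:.2f}x, time saved: {saved_pct:.1f}%")
# speedup: 1.18x, time saved: 15.4%
```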
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
# Dataset | ||
## Prepare dataset | ||
Put the data files **train.csv** and **eval.csv** into ./data/ | ||
|
||
Download the Kaggle Display Advertising Challenge Dataset from http://labs.criteo.com/2014/02/kaggle-display-advertising-challenge-dataset/ | ||
|
||
The evaluation dataset used for accuracy measurement is not available at the above link; it can be downloaded from https://storage.googleapis.com/dataset-uploader/criteo-kaggle/large_version/eval.csv | ||
|
||
Download the training dataset (in CSV format) from https://storage.googleapis.com/dataset-uploader/criteo-kaggle/large_version/train.csv | ||
|
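The two CSV downloads above could be scripted roughly as follows. This is a sketch using only the Python standard library; the URLs are the ones listed above, while `DATA_DIR`, `target_path`, and `download_all` are hypothetical names introduced here, not part of the repository.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve

# URLs taken from the instructions above.
URLS = [
    "https://storage.googleapis.com/dataset-uploader/criteo-kaggle/large_version/train.csv",
    "https://storage.googleapis.com/dataset-uploader/criteo-kaggle/large_version/eval.csv",
]
DATA_DIR = Path("./data")  # destination folder named in the instructions


def target_path(url: str, data_dir: Path = DATA_DIR) -> Path:
    """Map a download URL to its local file path inside data_dir."""
    return data_dir / Path(urlparse(url).path).name


def download_all(urls=URLS, data_dir: Path = DATA_DIR) -> None:
    """Download each file that is not already present in data_dir."""
    data_dir.mkdir(parents=True, exist_ok=True)
    for url in urls:
        dest = target_path(url, data_dir)
        if dest.exists():
            continue  # skip files already downloaded
        urlretrieve(url, dest)
```

Calling `download_all()` from the model directory fetches both files into ./data/; the files are large, so the call is left out of the sketch itself.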
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
# Result | ||
Checkpoint and timeline files are saved in this folder by default. |