[Embedding] Add a list of fused embedding ops. (DeepRec-AI#2)

1. Add the Python API of the fused embedding.
2. Add fill_empty_row and prune_invalid_id to the fused embedding lookup.
3. Add a fused embedding modelzoo perf test benchmark.

Co-authored-by: Randy Wang <ruotongw@nvidia.com>
maciekpac · Dec 27, 2021 · 441c972
1 parent e7fc3d0
commit 441c972
Showing 46 changed files with 4,084 additions and 1,099 deletions.
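
The commit message names two lookup options, fill_empty_row and prune_invalid_id, without describing them. The sketch below is a pure-Python illustration of what such options commonly mean for a mean-combined sparse lookup, assuming semantics similar to TensorFlow's `tf.nn.safe_embedding_lookup_sparse` (negative IDs are dropped, empty rows fall back to a default ID or a zero vector). The function name `lookup_mean_sketch` and its `default_id` parameter are hypothetical and not taken from this commit; the fused GPU kernels may behave differently.

```python
from typing import List, Optional, Sequence


def lookup_mean_sketch(
    table: Sequence[Sequence[float]],
    ids_per_row: Sequence[Sequence[int]],
    default_id: Optional[int] = None,
    prune_invalid_id: bool = True,
    fill_empty_row: bool = True,
) -> List[List[float]]:
    """Hypothetical reference for a mean-combiner lookup (not the DeepRec op)."""
    dim = len(table[0])
    output = []
    for ids in ids_per_row:
        ids = list(ids)
        if prune_invalid_id:
            # Assumption: an "invalid" ID is a negative one.
            # Without pruning, Python would treat -1 as "last row".
            ids = [i for i in ids if i >= 0]
        if not ids and fill_empty_row and default_id is not None:
            # Assumption: an empty row looks up default_id instead.
            ids = [default_id]
        if not ids:
            # Nothing left to look up: emit a zero vector.
            output.append([0.0] * dim)
            continue
        # Mean combiner: average the selected embedding rows per dimension.
        output.append(
            [sum(table[i][d] for i in ids) / len(ids) for d in range(dim)]
        )
    return output


table = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
rows = [[0, 2], [-1], [1]]
result = lookup_mean_sketch(table, rows, default_id=0)
print(result)
```

Here the second row contains only an invalid ID, so it is pruned to empty and then filled from `default_id`.
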
diff --git a/cibuild/gpu-ut.sh b/cibuild/gpu-ut.sh
@@ -165,5 +165,6 @@ export TF_BUILD_BAZEL_TARGET="$TF_BUILD_BAZEL_TARGET "\
 "-//tensorflow/python/tools/api/generator:output_init_files_test "\
 "-//tensorflow/python/tpu:datasets_test "\
 "-//tensorflow/python/training/tracking:util_xla_test_gpu "\
+"-//tensorflow/core/kernels:fused_embedding_ops_test_gpu"
 
 bazel test -c opt --config=cuda --verbose_failures --run_under=//tensorflow/tools/ci_build/gpu_build:parallel_gpu_execute  --test_timeout="300,450,1200,3600" --local_test_jobs=2  -- $TF_BUILD_BAZEL_TARGET
diff --git a/modelzoo/features/GPUFusedEmbedding/.gitignore b/modelzoo/features/GPUFusedEmbedding/.gitignore
@@ -0,0 +1,3 @@
+*.csv
+*/result/model_*
+record.py
diff --git a/modelzoo/features/GPUFusedEmbedding/DLRM/README.md b/modelzoo/features/GPUFusedEmbedding/DLRM/README.md
@@ -0,0 +1,21 @@
+# DLRM with GPU Fused Embedding
+
+The model structure, hyperparameters, dataset, etc. are the same as in [DLRM](../../../DLRM/README.md). Please follow the instructions there to prepare, set up, and run the model.
+
+The only difference is that this model uses GPU Fused Embedding to accelerate the lookup process. The only code change is:
+
+```python
+categorical_embedding_column = tf.feature_column.embedding_column(
+    categorical_column, dimension=16, combiner='mean',
+    do_fusion=True)
+```
+
+## Benchmark
+
+Measured on an A100-80GB-PCIE GPU with an 8-core AMD EPYC 7232P CPU @ 3.20GHz, averaged over 5000 iterations. The performance gain:
+
+|         | Avg Time per Iteration |
+| ------- | ---------------------- |
+| Unfused | 37.15 ms               |
+| Fused   | 31.43 ms               |
+| SpeedUp | 1.18x                  |
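
The speedup row in the benchmark table above follows directly from the two timings. As a quick check, the snippet below recomputes it and also expresses the gain as time saved per iteration; the variable names are illustrative only.

```python
# Timings copied from the benchmark table (milliseconds per iteration).
unfused_ms = 37.15
fused_ms = 31.43

# Speedup is the ratio of the slower time to the faster time.
speedup = unfused_ms / fused_ms

# The same gain expressed as absolute and relative time saved.
saved_ms = unfused_ms - fused_ms
reduction_pct = 100.0 * saved_ms / unfused_ms

print(f"speedup: {speedup:.2f}x")  # speedup: 1.18x
print(f"time saved per iteration: {saved_ms:.2f} ms ({reduction_pct:.1f}%)")
```

In other words, a 1.18x speedup corresponds to roughly 5.72 ms, or about 15.4%, less time per iteration.
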
diff --git a/modelzoo/features/GPUFusedEmbedding/DLRM/data/README.md b/modelzoo/features/GPUFusedEmbedding/DLRM/data/README.md
@@ -0,0 +1,10 @@
+# Dataset
+## Prepare dataset
+Put the data files **train.csv** and **eval.csv** into ./data/
+
+Download the Kaggle Display Advertising Challenge Dataset from http://labs.criteo.com/2014/02/kaggle-display-advertising-challenge-dataset/
+
+The evaluation dataset for accuracy measurement is not available at the above link; it can be downloaded from https://storage.googleapis.com/dataset-uploader/criteo-kaggle/large_version/eval.csv
+
+Download the training dataset (in CSV format) from https://storage.googleapis.com/dataset-uploader/criteo-kaggle/large_version/train.csv
+
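
The dataset steps in the README above could be scripted roughly as follows. This is a minimal sketch using only the Python standard library and the two storage.googleapis.com URLs given in the README; the helper names `plan_downloads` and `download_all` are hypothetical, and the script assumes the URLs are still reachable.

```python
import os
import urllib.request

# Base URL taken from the README's train.csv and eval.csv links.
BASE_URL = (
    "https://storage.googleapis.com/dataset-uploader/"
    "criteo-kaggle/large_version"
)
FILES = ("train.csv", "eval.csv")


def plan_downloads(data_dir="./data"):
    """Return (url, destination path) pairs without touching the network."""
    return [
        (f"{BASE_URL}/{name}", os.path.join(data_dir, name))
        for name in FILES
    ]


def download_all(data_dir="./data"):
    """Fetch each file into data_dir, skipping files that already exist."""
    os.makedirs(data_dir, exist_ok=True)
    for url, dest in plan_downloads(data_dir):
        if not os.path.exists(dest):
            urllib.request.urlretrieve(url, dest)


for url, dest in plan_downloads():
    print(f"{url} -> {dest}")
```

Calling `download_all()` from the model directory would then place both files under ./data/; it is not invoked here so that the sketch runs without network access.
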
diff --git a/modelzoo/features/GPUFusedEmbedding/DLRM/result/README.md b/modelzoo/features/GPUFusedEmbedding/DLRM/result/README.md
@@ -0,0 +1,2 @@
+# Result
+Checkpoint and timeline files are saved in this folder by default.