diff --git a/README.md b/README.md
index 927afe31480..53cca5c5c83 100644
--- a/README.md
+++ b/README.md
@@ -95,13 +95,13 @@ $ pip3 install /tmp/tensorflow_pkg/tensorflow-1.15.5+${version}-cp38-cp38m-linux
 #### Image for CPU

 ```
-alideeprec/deeprec-release:deeprec2304-cpu-py38-ubuntu20.04
+alideeprec/deeprec-release:deeprec2306-cpu-py38-ubuntu20.04
 ```

 #### Image for GPU CUDA11.6

 ```
-alideeprec/deeprec-release:deeprec2304-gpu-py38-cu116-ubuntu20.04
+alideeprec/deeprec-release:deeprec2306-gpu-py38-cu116-ubuntu20.04
 ```

 ***
diff --git a/RELEASE.md b/RELEASE.md
index d41d9e569ad..43e03bc2b49 100644
--- a/RELEASE.md
+++ b/RELEASE.md
@@ -1,3 +1,87 @@
+# Release r1.15.5-deeprec2306
+
+## **Major Features and Improvements**
+
+### **Embedding**
+
+- Support StaticGPUHashMap to optimize EmbeddingVariable in inference.
+- Update logic of GroupEmbedding in feature_column API.
+- Refine APIs for forward-backward optimization.
+- Move insertions of new features into the backward process when using multi-tier storage.
+- Move insertion of new features into the backward ops.
+- Modify calculation logic of embedding lookup sparse combiner.
+- Add memory and performance tests of EmbeddingVariable.
+
+### **Graph & Grappler Optimization**
+
+- Support IteratorGetNext for SmartStage as a starting node for searching.
+- Reimplement PrefetchRunner in C++.
+
+### **Runtime Optimization**
+
+- Dispatch expensive ops via multiple threads in threadpool.
+- Enable multi-stream in session_group by default.
+- Support for loading saved_model with device information when use p and multi_stream.
+- Make ARENA_ARRAY_SIZE configurable.
+- Optimize EV allocator performance.
+- Integrate HybridBackend in collective training mode.
+
+### **Ops & Hardware Acceleration**
+
+- Disable MatMul fused with LeakyRelu when MKL is disabled.
+
+### **Serving**
+
+- Clear virtual_device configurations before loading a new checkpoint.
+
+### **Environment & Build**
+
+- Update docker images in user documents.
+- Update DEFAULT_CUDA_VERSION and DEFAULT_CUDNN_VERSION in configure.py.
+- Move thirdparties from WORKSPACE to workspace.bzl.
+- Update urls corresponding to colm, ragel, aliyun-oss-sdk and uuid.
+- Update default TF_CUDA_COMPUTE_CAPABILITIES to 7.0,7.5,8.0,8.6.
+- Update SparseOperationKit to v23.5.01 and docker file.
+
+### **BugFix**
+
+- Fix issue of missing params while constructing the ngScope.
+- Fix memory leak to avoid OOM.
+- Fix shape validation in API shared_embedding_columns.
+- Fix the device placement bug of stage_subgraph_on_cpu in distributed.
+- Fix hang issue when using both SOK and SmartStage simultaneously.
+- Fix bug: init global_step before saving variables.
+- Fix bug: reserve input nodes, clear saver devices on demand.
+- Fix memory leak when a graph node is invalid.
+
+### **ModelZoo**
+
+- Add examples and docs to demonstrate Collective Training.
+- Update documents and config files for modelzoo benchmark.
+- Update modelzoo README.
+
+### **Tool & Documents**
+
+- Update cases of configuring TF_CUDA_COMPUTE_CAPABILITIES for H100.
+- Update COMMITTERS.md.
+- Update device placement documents.
+- Update document for SmartStage.
+- Update session_group documents.
+- Update the download link of the library that Processor depends on.
+- Update sok to 1.20.
+
+More details of features: [https://deeprec.readthedocs.io/zh/latest/](https://deeprec.readthedocs.io/zh/latest/)
+
+## **Release Images**
+
+### **CPU Image**
+
+`alideeprec/deeprec-release:deeprec2306-cpu-py38-ubuntu20.04`
+
+### **GPU Image**
+
+`alideeprec/deeprec-release:deeprec2306-gpu-py38-cu116-ubuntu20.04`
+
 # Release r1.15.5-deeprec2304

 ## **Major Features and Improvements**
diff --git a/docs/docs_en/DeepRec-Compile-And-Install.md b/docs/docs_en/DeepRec-Compile-And-Install.md
index 0a170177353..83ba4854b9f 100644
--- a/docs/docs_en/DeepRec-Compile-And-Install.md
+++ b/docs/docs_en/DeepRec-Compile-And-Install.md
@@ -111,7 +111,7 @@ pip3 install /tmp/tensorflow_pkg/tensorflow-1.15.5+${version}-cp38-cp38m-linux_x
 x86_64:

 ```
-alideeprec/deeprec-release:deeprec2304-cpu-py38-ubuntu20.04
+alideeprec/deeprec-release:deeprec2306-cpu-py38-ubuntu20.04
 ```

 arm64:
@@ -122,5 +122,5 @@ alideeprec/deeprec-release:deeprec2302-cpu-py38-ubuntu22.04-arm64
 **GPU Image with CUDA 11.6**

 ```
-alideeprec/deeprec-release:deeprec2304-gpu-py38-cu116-ubuntu20.04
+alideeprec/deeprec-release:deeprec2306-gpu-py38-cu116-ubuntu20.04
 ```
diff --git a/docs/docs_en/Estimator-Compile-And-Install.md b/docs/docs_en/Estimator-Compile-And-Install.md
index cdc04044875..73b6a36f318 100644
--- a/docs/docs_en/Estimator-Compile-And-Install.md
+++ b/docs/docs_en/Estimator-Compile-And-Install.md
@@ -40,7 +40,7 @@ DeepRec provide new distributed protocols such as grpc++ and star_server, which

 Source Code: [https://github.com/DeepRec-AI/estimator](https://github.com/DeepRec-AI/estimator)

-Develop Branch:master, Latest Release Branch: deeprec2304
+Develop Branch:master, Latest Release Branch: deeprec2306

 ## Estimator Build

diff --git a/docs/docs_en/TFServing-Compile-And-Install.md b/docs/docs_en/TFServing-Compile-And-Install.md
index 8ced3628673..346a848ca74 100644
--- a/docs/docs_en/TFServing-Compile-And-Install.md
+++ b/docs/docs_en/TFServing-Compile-And-Install.md
@@ -39,7 +39,7 @@ We provide optimized TFServing which could highly improve performance in inferen

 Source Code: [https://github.com/DeepRec-AI/serving](https://github.com/DeepRec-AI/serving)

-Develop Branch: master, Latest Release Branch: deeprec2304
+Develop Branch: master, Latest Release Branch: deeprec2306

 ## TFServing Build

diff --git a/docs/docs_zh/DeepRec-Compile-And-Install.md b/docs/docs_zh/DeepRec-Compile-And-Install.md
index 20df07aa252..08d249f8eeb 100644
--- a/docs/docs_zh/DeepRec-Compile-And-Install.md
+++ b/docs/docs_zh/DeepRec-Compile-And-Install.md
@@ -108,7 +108,7 @@ pip3 install /tmp/tensorflow_pkg/tensorflow-1.15.5+${version}-cp38-cp38m-linux_x
 x86_64:

 ```
-alideeprec/deeprec-release:deeprec2304-cpu-py38-ubuntu20.04
+alideeprec/deeprec-release:deeprec2306-cpu-py38-ubuntu20.04
 ```

 arm64:
@@ -119,7 +119,7 @@ alideeprec/deeprec-release:deeprec2302-cpu-py38-ubuntu22.04-arm64
 **GPU CUDA11.6镜像**

 ```
-alideeprec/deeprec-release:deeprec2304-gpu-py38-cu116-ubuntu20.04
+alideeprec/deeprec-release:deeprec2306-gpu-py38-cu116-ubuntu20.04
 ```

 ## DeepRec Processor编译打包
diff --git a/docs/docs_zh/Estimator-Compile-And-Install.md b/docs/docs_zh/Estimator-Compile-And-Install.md
index 332b96e6086..e5455aae91a 100644
--- a/docs/docs_zh/Estimator-Compile-And-Install.md
+++ b/docs/docs_zh/Estimator-Compile-And-Install.md
@@ -40,7 +40,7 @@

 代码库:[https://github.com/DeepRec-AI/estimator](https://github.com/DeepRec-AI/estimator)

-开发分支:master,最新Release分支:deeprec2304
+开发分支:master,最新Release分支:deeprec2306

 ## Estimator编译

diff --git a/docs/docs_zh/TFServing-Compile-And-Install.md b/docs/docs_zh/TFServing-Compile-And-Install.md
index 27bfc864e4e..0c76400e6c6 100644
--- a/docs/docs_zh/TFServing-Compile-And-Install.md
+++ b/docs/docs_zh/TFServing-Compile-And-Install.md
@@ -39,7 +39,7 @@

 代码库:[https://github.com/DeepRec-AI/serving](https://github.com/DeepRec-AI/serving)

-开发分支:master,最新Release分支:deeprec2304
+开发分支:master,最新Release分支:deeprec2306

 ## TFServing编译&打包