Babak/upgrade triton to v2.44.0 #5 (Merged)
babakbehzad merged 654 commits into verkada:babak/upgrade-triton-to-v2.44.0 from triton-inference-server:main on Apr 5, 2024.
Conversation
* Add test for Python BLS model loading API * Fix up
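For context, a minimal sketch of the kind of Python (BLS) model such a test would exercise, assuming the load_model / is_model_ready / unload_model helpers documented for the Python backend; the dependency model name is a placeholder:

    import numpy as np
    import triton_python_backend_utils as pb_utils  # provided inside the Python backend


    class TritonPythonModel:
        def execute(self, requests):
            responses = []
            for _ in requests:
                # Load a dependency model by name, check readiness, unload it again.
                pb_utils.load_model(model_name="dependency_model")  # placeholder name
                ready = pb_utils.is_model_ready(model_name="dependency_model")
                pb_utils.unload_model(model_name="dependency_model")
                out = pb_utils.Tensor("READY", np.array([ready], dtype=np.bool_))
                responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
            return responses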
* Adding nested spans to OTel tracing + support for ensemble models
* Move multi-GPU dlpack test to a separate L0 test * Fix copyright * Fix up
* Upgrade OV to 2023.0.0 * Upgrade OV model gen script to 2023.0.0
* Add test to check the output memory type for onnx models * Remove unused import * Address comment
* Add testing for implicit state for PyTorch backend * Add testing for libtorch string implicit models * Fix CodeQL * Mention that libtorch backend supports implicit state * Fix CodeQL * Review edits * Fix output tests for PyTorch backend
* Add test for uncompressed conda execution environments
* Fix expected instance name * Copyright year
* Fix name of client wheel to be looked for * Fix up
* Add pre-commit
  * Fix typos, exec/shebang, formatting
  * Remove clang-format
  * Update CONTRIBUTING.md to include pre-commit
  * Update spacing in CONTRIBUTING
  * Fix CONTRIBUTING pre-commit link
  * Link to pre-commit install directions
  * Wording
  * Restore clang-format
  * Fix YAML spacing
  * Exclude templates folder for check-yaml
  * Remove unused vars
  * Normalize spacing
  * Remove unused variable
  * Normalize config indentation
  * Update .clang-format to enforce max line length of 80
  * Update copyrights
  * Update copyrights
  * Run workflows on every PR
  * Fix copyright year
  * Fix grammar
  * Entrypoint.d files are not executable
  * Run pre-commit hooks
  * Mark not executable
  * Run pre-commit hooks
  * Remove unused variable
  * Run pre-commit hooks after rebase
  * Update copyrights
  * Fix README.md typo (decoupled)
  * Run pre-commit hooks
  * Grammar fix
  * Redundant word
  * Revert Dockerfile changes
  * Executable shebang revert
  * Make model.py files non-executable
  * Passin is the proper flag
  * Run pre-commit hooks on init_args/model.py
  * Fix typo in init_args/model.py
  * Make copyrights one line

  Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
* Add test for sequence model instance update
  * Add gap for file timestamp update
  * Update test for non-blocking sequence update
  * Update documentation
  * Remove mention of the increase-instance-count case
  * Add more documentation for scheduler update test
  * Update test for non-blocking batcher removal
  * Add polling due to async scheduler destruction
  * Use _ as private
  * Fix typo
  * Add docs on instance count decrease
  * Fix typo
  * Separate direct and oldest into different test cases
  * Separate nested tests in a loop into multiple test cases
  * Refactor scheduler update test
  * Improve doc on handling future test failures
  * Address pre-commit
  * Add best-effort reset of model state after a single test case failure
  * Remove the reset-model method, making it harder to chain multiple test cases as one
  * Remove description of model state clean-up
* Update README and versions for 2.36.0 / 23.07 * Update Dockerfile.win10.min * Fix formatting issue * Fix formatting issue * Fix whitespaces * Fix whitespaces * Fix whitespaces
* Reduce instance count to 1 for python bls model loading test * Add comment when calling unload
* Update README and versions for 2.43.0 / 24.02 * Update Dockerfile to reduce image size * Update path in patch file for model generation * Update README.md post-24.02
* patching git repository parameterization from production branch 1 * Fix go package directory name * pre-commit fixes * pre-commit fixes --------- Co-authored-by: kyle <kmcgill@kmcgill-ubuntu.nvidia.com>
* Enhance bound check for shm offset * Add test for enhance bound check for shm offset * Fix off by 1 on max offset * Improve comments * Improve comment and offset * Separate logic between computation and validation
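As a rough Python illustration of such a bound check (not the server's actual C++ code), with computation separated from validation and the end bound compared so that a region ending exactly at the block's last byte is still accepted:

    def validate_shm_region(offset, byte_size, shm_size):
        """Reject a shared-memory region [offset, offset + byte_size)
        unless it lies entirely within a block of shm_size bytes."""
        if offset < 0 or byte_size < 0:
            raise ValueError("offset and byte_size must be non-negative")
        end = offset + byte_size  # computation kept separate...
        if end > shm_size:        # ...from validation; '>' (not '>=') avoids an off-by-1
            raise ValueError(f"region [{offset}, {end}) exceeds shm size {shm_size}")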
* Allow non-decoupled model to send response and FINAL flag separately (#6017) * Update copyright * Defer sending error until FINAL flag is seen to avoid invalid reference * Move timestamp capture location * Delay time-point of response complete timestamp in gRPC and SageMaker endpoints * Move location of RESPONSE_COMPLETE timestamp capture to better align with its meaning
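The send-then-finalize pattern in question mirrors the Python backend's decoupled response-sender API; a hedged sketch with a placeholder output tensor:

    import numpy as np
    import triton_python_backend_utils as pb_utils


    class TritonPythonModel:
        def execute(self, requests):
            for request in requests:
                sender = request.get_response_sender()
                out = pb_utils.Tensor("OUTPUT0", np.zeros(1, dtype=np.float32))
                # Send the data-bearing response first, without the FINAL flag...
                sender.send(pb_utils.InferenceResponse(output_tensors=[out]))
                # ...then signal completion separately.
                sender.send(flags=pb_utils.TRITONSERVER_RESPONSE_COMPLETE_FINAL)
            return None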
* Added a test case checking for optional/required input parameters in a request and the appropriate response from the server. Includes three simple models covering combinations of required and optional input parameters.
* Add flag to enable compilation of OpenAI support in PA
* Test Correlation Id string support for BLS
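A sketch of what that test plausibly exercises from inside a Python model, assuming the BLS InferenceRequest keyword arguments documented for the Python backend; the downstream model name and ID value are placeholders:

    import numpy as np
    import triton_python_backend_utils as pb_utils


    def bls_call_with_string_correlation_id():
        request = pb_utils.InferenceRequest(
            model_name="sequence_model",  # placeholder downstream model
            requested_output_names=["OUTPUT0"],
            inputs=[pb_utils.Tensor("INPUT0", np.zeros(1, dtype=np.float32))],
            correlation_id="session-abc123",  # a string, not just an integer
        )
        response = request.exec()
        if response.has_error():
            raise pb_utils.TritonModelException(response.error().message())
        return pb_utils.get_output_tensor_by_name(response, "OUTPUT0")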
* Add AsyncIO HTTP compression test * Improve command line option handling
* Added TRITONSERVER_InferenceTraceSetContext logic
* Add documentation for mapping between Triton Errors and HTTP status codes (#6992) * Formatting * Update README.md
* Update README and versions for 2.44.0 / 24.03 (#6971)
* Mchornyi 24.03 (#6972)
  * Current location is dropped in 12.4
  * Update Dockerfile.win10.min
* Change to triton_sample_folder (#6973)
* Specify path for PyTorch model extension library (#7025)
* Update README.md 2.44.0 / 24.03 (#7032)
* Update README.md post-24.03

  Co-authored-by: kyle <kmcgill@kmcgill-ubuntu.nvidia.com>
  Co-authored-by: Misha Chornyi <99709299+mc-nv@users.noreply.github.com>
  Co-authored-by: Kyle McGill <101670481+nv-kmcgill53@users.noreply.github.com>
* Fix OTel version * Fix version in CPU metrics * Update metrics.md * Update trace.md
Check warning (Code scanning / CodeQL) on this hunk:

    # },
    "use_edit_page_button": False,
    "use_issues_button": True,
    "use_repository_button": True,

Duplicate key in dict literal (warning, documentation): dictionary key 'use_repository_button' is subsequently overwritten.
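The pattern behind the warning, reconstructed as a hypothetical example (the duplicate entry itself sits outside the hunk shown above):

    # In a Python dict literal, a repeated key silently keeps only its last value.
    theme_options = {
        "use_issues_button": True,
        "use_repository_button": True,
        "use_repository_button": False,  # overwrites the line above
    }
    print(theme_options["use_repository_button"])  # -> False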
Check warning (Code scanning / CodeQL) on this hunk:

    class Config(dict):

`__eq__` not overridden when adding attributes (warning): the class 'Config' does not override '__eq__', but adds the new attribute 's3_regex'.
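Why CodeQL flags this, as a self-contained illustration (the s3_regex value here is hypothetical):

    import re


    class Config(dict):
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            # New attribute, but dict.__eq__ is inherited unchanged.
            self.s3_regex = re.compile(r"s3://([^/]+)/(.+)")


    a, b = Config(x=1), Config(x=1)
    b.s3_regex = re.compile(r"another-pattern")
    print(a == b)  # True: dict equality ignores the added attribute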
babakbehzad merged commit 4018497 into verkada:babak/upgrade-triton-to-v2.44.0 on Apr 5, 2024. 5 checks passed.