Suraj/update triton main #1

Merged
merged 581 commits into from
Dec 15, 2023

Conversation

suraj-vathsa
No description provided.

oandreeva-nv and others added 30 commits April 27, 2023 10:52
…riton-inference-server#5696)

* Modify timeout test in L0_sequence_batcher to use portable backend

* Use identity backend that is built by default on Windows
…e-server#5716)

* Use better value in timeout test L0_sequence_batcher

* Format
…#5719)

* Check TRT err msg more granularly

* Clarify source of error messages

* Consolidate tests for message parts
…rence-server#5727)

* updating with pinned versions for python dependencies

* updated with pinned sphinx and nbclient versions
…ence-server#5729)

* Add testing for batcher init failure, add wait for status check

* Formatting

* Change search string
Add FasterTransformer test that uses 1 GPU.
* Don't use mem probe in Jetson

* Clarify failure messages in L0_backend_python

* Update copyright

* Add JIRA ref, fix _test_jetson
* Add testing for python custom metrics API

* Add custom metrics example to the test

* Fix for CodeQL report

* Fix test name

* Address comment

* Add logger and change the enum usage
* Add HTTP client plugin test

* Add testing for HTTP asyncio

* Add async plugin support

* Fix qa container for L0_grpc

* Add testing for grpc client plugin

* Remove unused imports

* Fix up

* Fix L0_grpc models QA folder

* Update the test based on review feedback

* Remove unused import

* Add testing for .plugin method
* Add --metrics-address, add tests to L0_socket, re-order CLI options for consistency

* Use non-localhost address
…ence-server#5739)

* Add HTTP basic auth test

* Add testing for gRPC basic auth

* Fix up

* Remove unused imports
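
For context on what the basic auth tests above exercise, here is a minimal sketch of how an HTTP Basic `Authorization` header is built. It uses only the Python standard library; the helper name `basic_auth_header` is hypothetical and is not taken from the Triton client code.

```python
import base64


def basic_auth_header(username, password):
    """Build an HTTP Basic Authorization header (hypothetical helper)."""
    # Basic auth is the base64 encoding of "username:password".
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


headers = basic_auth_header("user", "pass")
```

A client plugin would typically attach such a header to every outgoing request, so that individual calls do not need to pass credentials explicitly.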
* Add model instance name update test

* Add gap for timestamp to update

* Add some tests with dynamic batching

* Extend supported test on rate limit off

* Continue test if off mode failed
(1) reduce MAX_ALLOWED_ALLOC to be more strict for bounded tests, and generous for unbounded tests.
(2) allow unstable measurement from PA.
(3) improve logging for future triage
* Add note on --metrics-address

* Copyright
…riton ..." (triton-inference-server#5658)

UnboundLocalError: local variable 'meta_dict' referenced before assignment

The above error appears when listing models in the Triton model repository.
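
A minimal sketch of how this class of error typically arises and is avoided. The function names and the loop are assumptions for illustration; only the variable name `meta_dict` comes from the error message, and the actual plugin code is not shown here.

```python
def describe_model_buggy(entries):
    # meta_dict is only bound inside the loop body.
    for entry in entries:
        meta_dict = {"name": entry}
    # With an empty `entries`, the name was never assigned.
    return meta_dict


def describe_model(entries):
    # Binding the name before the loop guarantees it always exists.
    meta_dict = {}
    for entry in entries:
        meta_dict = {"name": entry}
    return meta_dict


try:
    describe_model_buggy([])
except UnboundLocalError as exc:
    print(f"buggy version failed: {exc}")

print(describe_model([]))
```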
* Adding test for new sequence mode

* Update option name

* Clean up testing spacing and new lines
…RL (triton-inference-server#5686)

* MLFlow Triton Plugin: Add support for s3 prefix and custom endpoint URL

Signed-off-by: Xiaodong Ye <yeahdongcn@gmail.com>

* Update the function order of config.py and use os.path.join to replace filtering a list of strings then joining

Signed-off-by: Xiaodong Ye <yeahdongcn@gmail.com>

* Update onnx flavor to support s3 prefix and custom endpoint URL

Signed-off-by: Xiaodong Ye <yeahdongcn@gmail.com>

* Fix two typos in MLFlow Triton plugin README.md

Signed-off-by: Xiaodong Ye <yeahdongcn@gmail.com>

* Address review comments (replace => strip)

Signed-off-by: Xiaodong Ye <yeahdongcn@gmail.com>

* Address review comments (init regex only for s3)

Signed-off-by: Xiaodong Ye <yeahdongcn@gmail.com>

* Remove unused local variable: slash_locations

Signed-off-by: Xiaodong Ye <yeahdongcn@gmail.com>

---------

Signed-off-by: Xiaodong Ye <yeahdongcn@gmail.com>
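
The change from "filtering a list of strings then joining" to a path join can be sketched as follows. This is an illustration under assumptions: the function names and sample segments are hypothetical, and `posixpath.join` is used so the result is the same on every platform (`os.path.join` is the same function on POSIX systems).

```python
import posixpath


def join_by_filtering(parts):
    # Earlier style: drop empty segments, then join with "/".
    return "/".join(part for part in parts if part)


def join_with_path_join(parts):
    # Path join skips empty leading and middle segments on its own.
    return posixpath.join(*parts)


segments = ["", "models", "", "resnet"]
filtered = join_by_filtering(segments)
joined = join_with_path_join(segments)
```

One difference worth keeping in mind: a trailing empty segment makes the path join append a trailing slash, while the filtering version drops it.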
oandreeva-nv and others added 24 commits November 21, 2023 11:17
…triton-inference-server#6620)

* Extend request objects lifetime

* Remove explicit TRITONSERVER_InferenceRequestDelete

* Format fix

* Include the inference_request_ initialization to cover RequestNew

---------

Co-authored-by: Neelay Shah <neelays@nvidia.com>
…ver#6638)

This fixes the issue where the Python client raises
`AttributeError: 'NoneType' object has no attribute 'enum_types_by_name'`
errors after the Python version is updated.
* Update README and versions for 2.40.0 / 23.11 (triton-inference-server#6544)

* Removing path construction to use SymLink alternatives

* Update version for PyTorch

* Update windows Dockerfile configuration

* Update triton version to 23.11

* Update README and versions for 2.40.0 / 23.11

* Fix typo

* Adding 'ldconfig' to configure dynamic linking in container (triton-inference-server#6602)

* Point to tekit_backend (triton-inference-server#6616)

* Point to tekit_backend

* Update version

* Revert tekit changes (triton-inference-server#6640)

---------

Co-authored-by: Kris Hung <krish@nvidia.com>
* New testing to confirm large request timeout values can be passed and retrieved within Python BLS models.
…rence-server#6663)

* Add test for optional internal tensor within an ensemble

* Fix up
* Set CMake version to 3.27.7

* Set CMake version to 3.27.7

* Fix double slash typo
* Unify iGPU test build with x86 ARM

* adding TRITON_IGPU_BUILD to core build definition; adding logic to skip caffe2plan test if TRITON_IGPU_BUILD=1

* re-organizing some copies in Dockerfile.QA to fix igpu devel build

* Pre-commit fix

---------

Co-authored-by: kyle <kmcgill@kmcgill-ubuntu.nvidia.com>
…er#6705)

* adding default value for TRITON_IGPU_BUILD=OFF

* fix newline

---------

Co-authored-by: kyle <kmcgill@kmcgill-ubuntu.nvidia.com>
…-server#6686)

* Add test case for decoupled model raising exception

* Remove unused import

* Address comment
triton-inference-server#6639)

* Add ability to configure GRPC max connection age and max connection age grace
* Allow passing GRPC connection age args when they are set from the command line
----------
Co-authored-by: Katherine Yang <80359429+jbkyang-nvi@users.noreply.github.com>
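
As a rough illustration of "pass the connection age args only when they are set", the sketch below builds gRPC channel arguments from optional values. The function name and parameters are hypothetical and this is not Triton's implementation (the server is written in C++); the two option keys are the standard gRPC core channel arguments for maximum connection age and its grace period.

```python
def connection_age_options(max_age_ms=None, max_age_grace_ms=None):
    """Return gRPC channel arguments for the values that were provided."""
    options = []
    # Only emit an option when the caller set it, so gRPC defaults apply otherwise.
    if max_age_ms is not None:
        options.append(("grpc.max_connection_age_ms", int(max_age_ms)))
    if max_age_grace_ms is not None:
        options.append(("grpc.max_connection_age_grace_ms", int(max_age_grace_ms)))
    return options


server_options = connection_age_options(max_age_ms=30000)
```

In the Python gRPC library, a list like this could be supplied as the `options` argument when creating a server.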
# },
"use_edit_page_button": False,
"use_issues_button": True,
"use_repository_button": True,

Check warning

Code scanning / CodeQL

Duplicate key in dict literal Warning documentation

Dictionary key 'use_repository_button' is subsequently overwritten.
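
A short sketch of why this warning matters: when a dict literal repeats a key, Python keeps only the last value and raises no error. The dictionary below is hypothetical; only the key names come from the flagged snippet.

```python
# Hypothetical options dict that repeats one key.
theme_options = {
    "use_repository_button": False,
    "use_issues_button": True,
    "use_repository_button": True,  # silently replaces the earlier entry
}

print(len(theme_options))
print(theme_options["use_repository_button"])
```

Removing the earlier entry makes the intended value explicit and clears the warning.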
# Test combinations of BLS and decoupled API in Python backend.
model_name = "decoupled_bls_stream"
in_values = [4, 2, 0, 1]
shape = [1]

Check notice

Code scanning / CodeQL

Unused local variable Note

Variable shape is not used.

class Config(dict):

Check warning

Code scanning / CodeQL

`__eq__` not overridden when adding attributes Warning

The class 'Config' does not override '__eq__', but adds the new attribute s3_regex.
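
One way to address this kind of warning is to include the added attribute in equality. The class below is a hypothetical stand-in, not the plugin's actual code: only the names `Config` and `s3_regex` come from the warning, and the constructor signature is an assumption.

```python
class Config(dict):
    """Hypothetical dict subclass that carries one extra attribute."""

    def __init__(self, *args, s3_regex=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.s3_regex = s3_regex

    def __eq__(self, other):
        if not isinstance(other, Config):
            return NotImplemented
        # Compare the dict contents and the extra attribute.
        return dict.__eq__(self, other) and self.s3_regex == other.s3_regex

    def __ne__(self, other):
        # dict defines its own __ne__, so it must be overridden as well.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result


a = Config({"k": 1}, s3_regex="^s3://")
b = Config({"k": 1}, s3_regex=None)
print(a == b)
```

Without the override, `a == b` would compare only the dictionary contents and ignore `s3_regex`.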
@suraj-vathsa suraj-vathsa merged commit 7b98b8b into main Dec 15, 2023
3 checks passed
babakbehzad added a commit that referenced this pull request Dec 19, 2023
suraj-vathsa pushed a commit that referenced this pull request Dec 19, 2023