Babak/upgrade triton to v2.44.0 #5

Merged
This pull request is big! We’re only showing the most recent 250 commits.

Commits on Jun 29, 2023

  1. Add test for Python BLS model loading API (#5980)

    * Add test for Python BLS model loading API
    
    * Fix up
    krishung5 authored Jun 29, 2023
    438ee53

Commits on Jun 30, 2023

  1. fff1595
  2. Fix LD_LIBRARY_PATH for PyTorch backend

    Tabrizian authored and mc-nv committed Jun 30, 2023
    e47fbca
  3. Return updated df in add_cpu_libs

    dyastremsky authored and mc-nv committed Jun 30, 2023
    00c0fd1
  4. Remove unneeded df param

    dyastremsky authored and mc-nv committed Jun 30, 2023
    3e6ef8d
  5. b95366e

Commits on Jul 5, 2023

  1. 15dbea1

Commits on Jul 6, 2023

  1. 20d6bb2

Commits on Jul 7, 2023

  1. Enabling nested spans for trace mode OpenTelemetry (#5928)

    * Adding nested spans to OTel tracing + support of ensemble models
    oandreeva-nv authored Jul 7, 2023
    78d9d82
  2. Move multi-GPU dlpack test to a separate L0 test (#6001)

    * Move multi-GPU dlpack test to a separate L0 test
    
    * Fix copyright
    
    * Fix up
    Tabrizian authored Jul 7, 2023
    4d864a1
  3. OpenVINO 2023.0.0 (#6031)

    * Upgrade OV to 2023.0.0
    
    * Upgrade OV model gen script to 2023.0.0
    kthui authored Jul 7, 2023
    fd96f23
  4. Add test to check the output memory type for onnx models (#6033)

    * Add test to check the output memory type for onnx models
    
    * Remove unused import
    
    * Address comment
    krishung5 authored Jul 7, 2023
    0049763
  5. Add testing for implicit state for PyTorch backend (#6016)

    * Add testing for implicit state for PyTorch backend
    
    * Add testing for libtorch string implicit models
    
    * Fix CodeQL
    
    * Mention that libtorch backend supports implicit state
    
    * Fix CodeQL
    
    * Review edits
    
    * Fix output tests for PyTorch backend
    Tabrizian authored Jul 7, 2023
    0edd8a7
  6. Allow uncompressed conda execution environments (#6005)

    Add test for uncompressed conda execution environments
    jbkyang-nvi authored Jul 7, 2023
    bfe467c

Commits on Jul 10, 2023

  1. f431477

Commits on Jul 11, 2023

  1. 288c1df

Commits on Jul 12, 2023

  1. f17f348
  2. ee8d048
  3. a8f122d

Commits on Jul 17, 2023

  1. 1e805ae

Commits on Jul 18, 2023

  1. Fix L0_backend_python expected instance name (#6073)

    * Fix expected instance name
    
    * Copyright year
    kthui authored Jul 18, 2023
    899fd2d

Commits on Jul 19, 2023

  1. Fix L0_sdk: update the search name for the client wheel (#6074)

    * Fix name of client wheel to be looked for
    
    * Fix up
    GuanLuo authored Jul 19, 2023
    00fee98

Commits on Jul 20, 2023

  1. Add GitHub action to format and lint code (#6022)

    * Add pre-commit
    
    * Fix typos, exec/shebang, formatting
    
    * Remove clang-format
    
    * Update contributing md to include pre-commit
    
    * Update spacing in CONTRIBUTING
    
    * Fix contributing pre-commit link
    
    * Link to pre-commit install directions
    
    * Wording
    
    * Restore clang-format
    
    * Fix yaml spacing
    
    * Exclude templates folder for check-yaml
    
    * Remove unused vars
    
    * Normalize spacing
    
    * Remove unused variable
    
    * Normalize config indentation
    
    * Update .clang-format to enforce max line length of 80
    
    * Update copyrights
    
    * Update copyrights
    
    * Run workflows on every PR
    
    * Fix copyright year
    
    * Fix grammar
    
    * Entrypoint.d files are not executable
    
    * Run pre-commit hooks
    
    * Mark not executable
    
    * Run pre-commit hooks
    
    * Remove unused variable
    
    * Run pre-commit hooks after rebase
    
    * Update copyrights
    
    * Fix README.md typo (decoupled)
    
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    
    * Run pre-commit hooks
    
    * Grammar fix
    
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    
    * Redundant word
    
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    
    * Revert docker file changes
    
    * Executable shebang revert
    
    * Make model.py files non-executable
    
    * Passin is proper flag
    
    * Run pre-commit hooks on init_args/model.py
    
    * Fix typo in init_args/model.py
    
    * Make copyrights one line
    
    ---------
    
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    dyastremsky and rmccorm4 authored Jul 20, 2023
    2847211

Commits on Jul 21, 2023

  1. 9bc9ad6

Commits on Jul 24, 2023

  1. Add test for sequence model instance update (#5831)

    * Add test for sequence model instance update
    
    * Add gap for file timestamp update
    
    * Update test for non-blocking sequence update
    
    * Update documentation
    
    * Remove mentioning increase instance count case
    
    * Add more documentation for scheduler update test
    
    * Update test for non-blocking batcher removal
    
    * Add polling due to async scheduler destruction
    
    * Use _ as private
    
    * Fix typo
    
    * Add docs on instance count decrease
    
    * Fix typo
    
    * Separate direct and oldest to different test cases
    
    * Separate nested tests in a loop into multiple test cases
    
    * Refactor scheduler update test
    
    * Improve doc on handling future test failures
    
    * Address pre-commit
    
    * Add best effort to reset model state after a single test case failure
    
    * Remove reset model method to make harder for chaining multiple test cases as one
    
    * Remove description on model state clean up
    kthui authored Jul 24, 2023
    0f84995
  2. 3db04cc

Commits on Jul 25, 2023

  1. 2b121fd

Commits on Jul 28, 2023

  1. Update post-23.07 release (#6103)

    * Update README and versions for 2.36.0 / 23.07
    
    * Update Dockerfile.win10.min
    
    * Fix formatting issue
    
    * fix formatting issue
    
    * Fix whitespaces
    
    * Fix whitespaces
    
    * Fix whitespaces
    mc-nv authored Jul 28, 2023
    682cc22
  2. 14437dc

Commits on Aug 1, 2023

  1. Reduce instance count to 1 for python bls model loading test (#6130)

    * Reduce instance count to 1 for python bls model loading test
    
    * Add comment when calling unload
    krishung5 authored Aug 1, 2023
    da22fd7

Commits on Aug 4, 2023

  1. Fix queue test to expect exact number of failures (#6133)

    * Fix queue test to expect exact number of failures
    
    * Increase the execution time to more accurately capture requests
    tanmayv25 authored Aug 4, 2023
    cfe5f24
  2. Add CPU & GPU metrics in Grafana dashboard.json for K8s on prem deployment (fix #6047) (#6100)
    
    Signed-off-by: Xiaodong Ye <yeahdongcn@gmail.com>
    yeahdongcn authored Aug 4, 2023
    36d80fe

Commits on Aug 7, 2023

  1. Adding the support tracing of child models invoked from a BLS model (#6063)
    
    * Adding tests for bls
    
    * Added fixme, cleaned previous commit
    
    * Removed unused imports
    
    * Fixing commit tree:
    Refactor code, so that OTel tracer provider is initialized only once
    Added resource cmd option, testing
    Added docs
    
    * Clean up
    
    * Update docs/user_guide/trace.md
    
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    
    * Revision
    
    * Update doc
    
    * Clean up
    
    * Added ostream exporter to OpenTelemetry for testing purposes; refactored trace tests
    
    * Added opentelemetry trace collector set up to tests; refactored otel exporter tests to use OTel collector instead of netcat
    
    * Revising according to comments
    
    * Added comment regarding 'parent_span_id'
    
    * Added permalink
    
    * Adjusted test
    
    ---------
    
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    oandreeva-nv and rmccorm4 authored Aug 7, 2023
    c3a2f6b

Commits on Aug 8, 2023

  1. Test python environments 3.8-3.11 (#6109)

    Add tests for python 3.8-3.11 for L0_python_backends
    jbkyang-nvi authored Aug 8, 2023
    d964cd2
  2. Improve L0_backend_python debugging (#6157)

    * Improve L0_backend_python debugging
    
    * Use utils function for artifacts collection
    krishung5 authored Aug 8, 2023
    09409cf
  3. 8a8b691
  4. 82d4110

Commits on Aug 9, 2023

  1. 0529476
  2. Update docs with NVAIE messaging (#6162)

    Update docs with NVAIE messaging
    dzier authored Aug 9, 2023
    263dec2

Commits on Aug 10, 2023

  1. b5fb819
  2. 2265bb6
  3. Remove a test case that sanity checks input value of --shape CLI flag (#6140)
    
    * Remove test checking for --shape option
    
    * Remove the entire test
    nv-hwoo authored Aug 10, 2023
    4a8b716

Commits on Aug 14, 2023

  1. Add test when unload/load requests for same model is received at the same time (#6150)
    
    * Add test when unload/load requests for same model received the same time
    
    * Add test_same_model_overlapping_load_unload
    
    * Use a load/unload stress test instead
    
    * Pre-merge test name update
    
    * Address pre-commit error
    
    * Revert "Address pre-commit error"
    
    This reverts commit 781cab1.
    
    * Record number of occurrence of each exception
    kthui authored Aug 14, 2023
    ff1398c
  2. f953f08

Commits on Aug 15, 2023

  1. Add end-to-end CI test for decoupled model support (#6131) (#6184)

    * Add end-to-end CI test for decoupled model support
    
    * Address feedback
    matthewkotila authored Aug 15, 2023
    d0599da
  2. 20d8c1c
  3. added debugging guide (#5924)

    * added debugging guide
    
    * Run pre-commit
    
    ---------
    
    Co-authored-by: David Yastremsky <dyastremsky@nvidia.com>
    tanayvarshney and dyastremsky authored Aug 15, 2023
    59ccf5b
  4. 0f478f3

Commits on Aug 16, 2023

  1. 81794dd
  2. Fix docs test (#6192)

    matthewkotila authored Aug 16, 2023
    8a823b2

Commits on Aug 17, 2023

  1. Add utility functions for array manipulation (#6203)

    * Add utility functions for outlier removal
    
    * Fix functions
    
    * Add newline to end of file
    dyastremsky authored Aug 17, 2023
    b02e223
  2. Add gc collect to make sure gpu tensor is deallocated (#6205)

    * Testing: add gc collect to make sure gpu tensor is deallocated
    
    * Address comment
    krishung5 authored Aug 17, 2023
    328614e

Commits on Aug 18, 2023

  1. 7fd8516

Commits on Aug 22, 2023

  1. 47db080

Commits on Aug 23, 2023

  1. Trace Model Name Validation (#6199)

    * Initial commit
    
    * Cleanup using new standard formatting
    
    * QA test restructuring
    
    * Add newline to the end of test.sh
    
    * HTTP/gRPC protocol changed to pivot on ready status & error status. Log file name changed in qa test.
    
    * Fixing unhandled error memory leak
    
    * Handle index function memory leak fix
    fpetrini15 authored Aug 23, 2023
    1296c40
  2. fbc1f0e
  3. 6497518

Commits on Aug 28, 2023

  1. 639f758

Commits on Aug 30, 2023

  1. Update post-23.08 release (#6234)

    * CUDA 12.1 > 12.2
    
    * DLIS-5208: onnxruntime+windows - stop treat warnings on compile as errors
    
    * Revert "DLIS-5208: onnxruntime+windows - stop treat warnings on compile as errors"
    
    This reverts commit 0cecbb7.
    
    * Update Dockerfile.win10.min
    
    * Update Dockerfile.win10.min
    
    * Update README and versions for 23.08 branch
    
    * Update Dockerfile.win10
    
    * Fix the versions in docs
    
    * Add the note about stabilization of the branch
    
    * Update docs with NVAIE messaging (#6162) (#6167)
    
    Update docs with NVAIE messaging
    
    Co-authored-by: David Zier <42390249+dzier@users.noreply.github.com>
    
    * Resolve merge conflict
    
    ---------
    
    Co-authored-by: tanmayv25 <tanmay2592@gmail.com>
    Co-authored-by: David Zier <42390249+dzier@users.noreply.github.com>
    3 people authored Aug 30, 2023
    7fdff54
  2. 13db22e
  3. Adding safe string to number conversions (#6173)

    * Added catch for out of range error for trace setting update
    
    * Added wrapper to safe parse options
    
    * Added option names to errors
    
    * Adjustments
    
    * Quick fix
    
    * Fixing option name for Windows
    
    * Removed repetitive code
    
    * Adjust getopt_long for Windows to use longindex
    
    * Moved try catch into ParseOption
    
    * Removed unused input
    
    * Improved names
    
    * Refactoring and clean up
    
    * Fixed Windows
    
    * Refactored getopt_long for Windows
    
    * Refactored trace test, pinned otel's collector version to avoid problems with go requirements
    oandreeva-nv authored Aug 30, 2023
    b2117ab

Commits on Sep 2, 2023

  1. Test Python execute() to return Triton error code (#6228)

    * Add test for Python execute error code
    
    * Add all supported error codes into test
    
    * Move ErrorCode into TritonError
    
    * Expose ErrorCode internal in TritonError
    kthui authored Sep 2, 2023
    7a4ce10
  2. Add docs on IPv6 (#6262)

    rmccorm4 authored Sep 2, 2023
    7532b76

Commits on Sep 5, 2023

  1. Add test for TensorRT version-compatible model support (#6255)

    * Add tensorrt version-compatibility test
    
    * Generate one version-compatible model
    
    * Fix copyright year
    
    * Remove unnecessary variable
    
    * Remove unnecessary line
    
    * Generate TRT version-compatible model
    
    * Add sample inference to TRT version-compatible test
    
    * Clean up utils and model gen for new plan model
    
    * Fix startswith capitalization
    
    * Remove unused imports
    
    * Remove unused imports
    
    * Add log check
    dyastremsky authored Sep 5, 2023
    39ebf82

Commits on Sep 6, 2023

  1. 6a66d60
  2. Add testing for retrieving shape and datatype in backend API (#6231)

    Add testing for retrieving output shape and datatype info from backend API
    fpetrini15 authored Sep 6, 2023
    6930d00

Commits on Sep 8, 2023

  1. 9fe213a

Commits on Sep 11, 2023

  1. e0dc4aa
  2. 41c689e
  3. Exclude extra TRT version-compatible models from tests (#6294)

    * Exclude compatible models from tests.
    
    * Force model removal, in case it does not exist
    
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    
    ---------
    
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    dyastremsky and rmccorm4 authored Sep 11, 2023
    199de7b

Commits on Sep 12, 2023

  1. Adding installation of docker and docker-buildx (#6299)

    * Adding installation of docker and docker-buildx
    
    * remove whitespace
    mc-nv authored Sep 12, 2023
    68e116a

Commits on Sep 13, 2023

  1. Use targetmodel from header as model name in SageMaker (#6147)

    * Use targetmodel from header as model name in SageMaker
    
    * Update naming for model hash
    nikhil-sk authored Sep 13, 2023
    7624490

Commits on Sep 14, 2023

  1. 6e0a9fb

Commits on Sep 18, 2023

  1. Fix typo (#6318)

    krishung5 authored Sep 18, 2023
    3de9808
  2. Update the request re-use example (#6283)

    * Update the request re-use example
    
    * Review edit
    
    * Review comment
    Tabrizian authored Sep 18, 2023
    4a20c28

Commits on Sep 19, 2023

  1. 6527c73
  2. Add Python binding build. Add L0_python_api to test Python binding (#6319)
    
    * Add L0_python_api to test Python binding
    
    * Install Python API in CI image
    
    * Fix QA build
    GuanLuo authored Sep 19, 2023
    40c2c45
  3. bbbcad7

Commits on Sep 20, 2023

  1. Tests and docs for ability to specify subdirectory to download for LocalizePath (#6308)
    
    * Added custom localization tests for s3 and azure, added docs
    oandreeva-nv authored Sep 20, 2023
    5dd9398

Commits on Sep 21, 2023

  1. 2ccc3ce

Commits on Sep 25, 2023

  1. Refactor model generation scripts (#6336)

    * Refactor model generation scripts
    
    * Fix codeql
    
    * Fix relative path import
    
    * Fix package structure
    
    * Copy the gen_common file
    
    * Add missing uint8
    
    * Remove duplicate import
    Tabrizian authored Sep 25, 2023
    e0f70aa

Commits on Sep 26, 2023

  1. Add testing for scalar I/O in ORT backend (#6343)

    * Add testing for scalar I/O in ORT backend
    
    * Review edit
    
    * ci
    Tabrizian authored Sep 26, 2023
    1f4af6d

Commits on Sep 29, 2023

  1. Update post-23.09 release (#6367)

    * Update README and versions for 23.09 branch (#6280)
    
    * Update `Dockerfile` and `build.py`  (#6281)
    
    * Update configuration for Windows Dockerfile (#6256)
    
    * Adding installation of docker and docker-buildx
    
    * Enable '--expt-relaxed-constexpr' flag for custom ops models
    
    * Update Dockerfile version
    
    * Disable unit tests for Jetson
    
    * Update condition (#6285)
    
    * removing Whitespaces (#6293)
    
    * removing Whitespaces
    
    * removing whitespaces
    mc-nv authored Sep 29, 2023
    f4d4278

Commits on Oct 3, 2023

  1. 4b4f569

Commits on Oct 4, 2023

  1. Adding client-side request cancellation support and testing (#6383)

    * Add L0_request_cancellation (#6252)
    
    * Add L0_request_cancellation
    
    * Remove unittest test
    
    * Add cancellation to gRPC server error handling
    
    * Fix up
    
    * Use identity model
    
    * Add tests for gRPC client-side cancellation (#6278)
    
    * Add tests for gRPC client-side cancellation
    
    * Fix CodeQL issues
    
    * Formatting
    
    * Update qa/L0_client_cancellation/client_cancellation_test.py
    
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    
    * Move to L0_request_cancellation
    
    * Address review comments
    
    * Removing request cancellation support from asyncio version
    
    * Format
    
    * Update copyright
    
    * Remove tests
    
    * Handle cancellation notification in gRPC server (#6298)
    
    * Handle cancellation notification in gRPC server
    
    * Fix the request ptr initialization
    
    * Update src/grpc/infer_handler.h
    
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    
    * Address review comment
    
    * Fix logs
    
    * Fix request complete callback by removing reference to state
    
    * Improve documentation
    
    ---------
    
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    
    ---------
    
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    
    * Fixes on the gRPC frontend to handle AsyncNotifyWhenDone() API (#6345)
    
    * Fix segmentation fault in gRPC frontend
    
    * Finalize all states upon completion
    
    * Fixes all state cleanups
    
    * Handle completed states when cancellation notification is received
    
    * Add more documentation steps
    
    * Retrieve dormant states to minimize the memory footprint for long streams
    
    * Update src/grpc/grpc_utils.h
    
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    
    * Use a boolean state instead of raw pointer
    
    ---------
    
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    
    * Add L0_grpc_state_cleanup test (#6353)
    
    * Add L0_grpc_state_cleanup test
    
    * Add model file in QA container
    
    * Fix spelling
    
    * Add remaining subtests
    
    * Add failing subtests
    
    * Format fixes
    
    * Fix model repo
    
    * Fix QA docker file
    
    * Remove checks for the error message when shutting down server
    
    * Fix spelling
    
    * Address review comments
    
    * Add schedulers request cancellation tests (#6309)
    
    * Add schedulers request cancellation tests
    
    * Merge gRPC client test
    
    * Reduce testing time and covers cancelling other requests as a consequence of request cancellation
    
    * Add streaming request cancellation test
    
    ---------
    
    Co-authored-by: Iman Tabrizian <iman.tabrizian@gmail.com>
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    Co-authored-by: Jacky <18255193+kthui@users.noreply.github.com>
    4 people authored Oct 4, 2023
    e97ad74
  2. b95736c

Commits on Oct 6, 2023

  1. Add basic generate endpoints for LLM tasks (#6366)

    * PoC of parsing request prompt and converting to Triton infer request
    
    * Remove extra trace
    
    * Add generate endpoint
    
    * Enable streaming version
    
    * Fix bug
    
    * Fix up
    
    * Add basic testing. Cherry pick from #6369
    
    * format
    
    * Address comment. Fix build
    
    * Minor cleanup
    
    * cleanup syntax
    
    * Wrap error in SSE format
    
    * Fix up
    
    * Restrict number of response on non-streaming generate
    
    * Address comment on implementation.
    
    * Re-enable trace on generate endpoint
    
    * Add more comprehensive llm endpoint tests (#6377)
    
    * Add security policy (#6376)
    
    * Start adding some more comprehensive tests
    
    * Fix test case
    
    * Add response error testing
    
    * Complete test placeholder
    
    * Address comment
    
    * Address comments
    
    * Fix code check
    
    ---------
    
    Co-authored-by: dyastremsky <58150256+dyastremsky@users.noreply.github.com>
    Co-authored-by: GuanLuo <gluo@nvidia.com>
    
    * Address comment
    
    * Address comment
    
    * Address comment
    
    * Fix typo
    
    ---------
    
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    Co-authored-by: dyastremsky <58150256+dyastremsky@users.noreply.github.com>
    3 people authored Oct 6, 2023
    bdf227c
  2. Add Python backend request cancellation test (#6364)

    * Add cancelled response status test
    
    * Add Python backend request cancellation test
    
    * Add Python backend decoupled request cancellation test
    
    * Simplified response if cancelled
    
    * Test response_sender.send() after closed
    
    * Rollback test response_sender.send() after closed
    
    * Rollback non-decoupled any response on cancel
    kthui authored Oct 6, 2023
    2bf543b

Commits on Oct 7, 2023

  1. Add TRT-LLM backend build to Triton (#6365) (#6392)

    * Add TRT-LLM backend build to Triton (#6365)
    
    * Add trtllm backend to build
    
    * Temporarily adding version map for 23.07
    
    * Fix build issue
    
    * Update comment
    
    * Comment out python binding changes
    
    * Add post build
    
    * Update trtllm backend naming
    
    * Update TRTLLM base image
    
    * Fix cmake arch
    
    * Revert temp changes for python binding PR
    
    * Address comment
    
    * Move import to the top (#6395)
    
    * Move import to the top
    
    * pre commit format
    krishung5 authored Oct 7, 2023
    a33f257

Commits on Oct 8, 2023

  1. 2ca380e
  2. 60f14ac

Commits on Oct 9, 2023

  1. b796451
  2. 135341b

Commits on Oct 11, 2023

  1. Fix L0_sdk (#6387)

    tanmayv25 authored Oct 11, 2023
    b1c876d
  2. Add documentation on request cancellation (#6403)

    * Add documentation on request cancellation
    
    * Include python backend
    
    * Update docs/user_guide/request_cancellation.md
    
    Co-authored-by: Iman Tabrizian <iman.tabrizian@gmail.com>
    
    * Update docs/user_guide/request_cancellation.md
    
    Co-authored-by: Neelay Shah <neelays@nvidia.com>
    
    * Update docs/README.md
    
    Co-authored-by: Neelay Shah <neelays@nvidia.com>
    
    * Update docs/user_guide/request_cancellation.md
    
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    
    * Remove inflight term from the main documentation
    
    * Address review comments
    
    * Fix
    
    * Update docs/user_guide/request_cancellation.md
    
    Co-authored-by: Jacky <18255193+kthui@users.noreply.github.com>
    
    * Fix
    
    ---------
    
    Co-authored-by: Iman Tabrizian <iman.tabrizian@gmail.com>
    Co-authored-by: Neelay Shah <neelays@nvidia.com>
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    Co-authored-by: Jacky <18255193+kthui@users.noreply.github.com>
    5 people authored Oct 11, 2023
    9c707e3
  3. 85487a1

Commits on Oct 12, 2023

  1. Document generate HTTP endpoint (#6412)

    * Document generate HTTP endpoint
    
    * Address comment
    
    * Fix up
    
    * format
    
    * Address comment
    GuanLuo authored Oct 12, 2023
    786f48f
  2. 4929c33
  3. Fix missing library in L0_data_compression (#6424)

    * Fix missing library in L0_data_compression
    
    * Fix up
    GuanLuo authored Oct 12, 2023
    098de8f

Commits on Oct 13, 2023

  1. Add Javacpp-presets repo location as env variable in Java tests (#6385)

    Simplify testing when upstream (javacpp-presets) build changes. Related to triton-inference-server/client#409
    jbkyang-nvi authored Oct 13, 2023
    6ae3c34
  2. TRT-LLM backend build changes (#6406)

    * Update url
    
    * Debugging
    
    * Debugging
    
    * Update url
    
    * Fix build for TRT-LLM backend
    
    * Remove TRTLLM TRT and CUDA versions
    
    * Fix up unused var
    
    * Fix up dir name
    
    * Fix cmake patch
    
    * Remove previous TRT version
    
    * Install required packages for example models
    
    * Remove packages that are only needed for testing
    krishung5 authored Oct 13, 2023
    e817431
  3. Add gRPC AsyncIO request cancellation tests (#6408)

    * Fix gRPC test failure and refactor
    
    * Add gRPC AsyncIO cancellation tests
    
    * Better check if a request is cancelled
    
    * Use f-string
    kthui authored Oct 13, 2023
    95d2ee7

Commits on Oct 16, 2023

  1. 62db791
  2. Fixing vllm build (#6433)

    * Fixing torch version for vllm
    oandreeva-nv authored Oct 16, 2023
    e11ae14

Commits on Oct 17, 2023

  1. Switch Jetson model TensorRT models generation to container (#6378)

    * Switch Jetson model TensorRT models generation to container
    
    * Adding missed file
    
    * Fix typo
    
    * Fix typos
    
    * Remove extra spaces
    
    * Fix typo
    mc-nv authored Oct 17, 2023
    c8e4dd0

Commits on Oct 18, 2023

  1. 960afe7
  2. 328d519

Commits on Oct 19, 2023

  1. ccdb26b
  2. Fix notify state destruction and inflight states tracking (#6451)

    * Ensure notify_state_ gets properly destructed
    
    * Fix inflight state tracking to properly erase states
    
    * Prevent the notify_state from being erased
    
    * Wrap notify_state_ object within unique_ptr
    tanmayv25 authored Oct 19, 2023
    e19cfe7

Commits on Oct 20, 2023

  1. Update TRT-LLM backend url (#6455)

    * TRTLLM backend post release
    
    * TRTLLM backend post release
    
    * Update submodule url for permission issue
    
    * Update submodule url
    
    * Fix up
    
    * Not using postbuild function to work around submodule url permission issue
    krishung5 authored Oct 20, 2023
    c162205
  2. Added docs on python based backends (#6429)

    Co-authored-by: Neelay Shah <neelays@nvidia.com>
    oandreeva-nv and nnshah1 authored Oct 20, 2023
    b0bff04

Commits on Oct 24, 2023

  1. L0_model_config Fix (#6472)

    * Minor fix for L0_model_config
    fpetrini15 authored Oct 24, 2023
    35c54fd
  2. 2b1d267

Commits on Oct 25, 2023

  1. Test Python BLS with different sizes of CUDA memory pool (#6276)

    * Test with different sizes of CUDA memory pool
    
    * Check the server log for error message
    
    * Improve debugging
    
    * Fix syntax
    krishung5 authored Oct 25, 2023
    b5c2e38

Commits on Oct 26, 2023

  1. Add documentation for K8s-onprem StartupProbe (#5257)

    Co-authored-by: dyastremsky <58150256+dyastremsky@users.noreply.github.com>
    Co-authored-by: Ryan McCormick <mccormick.codes@gmail.com>
    3 people authored Oct 26, 2023
    3dfa18f

Commits on Oct 27, 2023

  1. Update main post-23.10 release (#6484)

    * Update README and versions for 23.10 branch (#6399)
    
    * Cherry-picking vLLM backend changes (#6404)
    
    * Update build.py to build vLLM backend (#6394)
    
    * Add Python backend when vLLM backend built (#6397)
    
    ---------
    
    Co-authored-by: dyastremsky <58150256+dyastremsky@users.noreply.github.com>
    
    * Add documentation on request cancellation (#6403) (#6407)
    
    * Add documentation on request cancellation
    
    * Include python backend
    
    * Update docs/user_guide/request_cancellation.md
    
    * Update docs/user_guide/request_cancellation.md
    
    * Update docs/README.md
    
    * Update docs/user_guide/request_cancellation.md
    
    * Remove inflight term from the main documentation
    
    * Address review comments
    
    * Fix
    
    * Update docs/user_guide/request_cancellation.md
    
    * Fix
    
    ---------
    
    Co-authored-by: Iman Tabrizian <iman.tabrizian@gmail.com>
    Co-authored-by: Neelay Shah <neelays@nvidia.com>
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    Co-authored-by: Jacky <18255193+kthui@users.noreply.github.com>
    
    * Fixes in request cancellation doc (#6409) (#6410)
    
    * TRT-LLM backend build changes (#6406) (#6430)
    
    * Update url
    
    * Debugging
    
    * Debugging
    
    * Update url
    
    * Fix build for TRT-LLM backend
    
    * Remove TRTLLM TRT and CUDA versions
    
    * Fix up unused var
    
    * Fix up dir name
    
    * Fix cmake patch
    
    * Remove previous TRT version
    
    * Install required packages for example models
    
    * Remove packages that are only needed for testing
    
    * Fixing vllm build (#6433) (#6437)
    
    * Fixing torch version for vllm
    
    Co-authored-by: Olga Andreeva <124622579+oandreeva-nv@users.noreply.github.com>
    
    * Update TRT-LLM backend url (#6455) (#6460)
    
    * TRTLLM backend post release
    
    * TRTLLM backend post release
    
    * Update submodule url for permission issue
    
    * Update submodule url
    
    * Fix up
    
    * Not using postbuild function to work around submodule url permission issue
    
    * remove redundant lines
    
    * Revert "remove redundant lines"
    
    This reverts commit 86be7ad.
    
    * restore missed lines
    
    * Update build.py
    
    Co-authored-by: Olga Andreeva <124622579+oandreeva-nv@users.noreply.github.com>
    
    * Update build.py
    
    Co-authored-by: Olga Andreeva <124622579+oandreeva-nv@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Tanmay Verma <tanmay2592@gmail.com>
    Co-authored-by: dyastremsky <58150256+dyastremsky@users.noreply.github.com>
    Co-authored-by: Iman Tabrizian <iman.tabrizian@gmail.com>
    Co-authored-by: Neelay Shah <neelays@nvidia.com>
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    Co-authored-by: Jacky <18255193+kthui@users.noreply.github.com>
    Co-authored-by: Kris Hung <krish@nvidia.com>
    Co-authored-by: Katherine Yang <80359429+jbkyang-nvi@users.noreply.github.com>
    Co-authored-by: Olga Andreeva <124622579+oandreeva-nv@users.noreply.github.com>
    10 people authored Oct 27, 2023
    2ba0762

Commits on Oct 28, 2023

  1. ed7b0fd

Commits on Oct 30, 2023

  1. Improve L0_backend_python test stability (ensemble / gpu_tensor_lifecycle) (#6490)
    
    * Test torch allocator gpu memory usage directly rather than global gpu memory for more consistency
    rmccorm4 authored Oct 30, 2023
    638fcf9

Commits on Nov 1, 2023

  1. Add L0_generative_sequence test (#6475)

    * Add testing backend and test
    
    * Add test to build / CI. Minor fix on L0_http
    
    * Format. Update backend documentation
    
    * Fix up
    
    * Address comment
    
    * Add negative testing
    
    * Fix up
    GuanLuo authored Nov 1, 2023
    f49493f
  2. 048d5ac

Commits on Nov 2, 2023

  1. d33eaa5
  2. Use post build function for TRT-LLM backend (#6476)

    * Use postbuild function
    
    * Remove updating submodule url
    krishung5 authored Nov 2, 2023
    b0d4612
  3. Enhanced python_backend autocomplete (#6504)

    * Added testing for python_backend autocomplete: optional input and model_transaction_policy
    oandreeva-nv authored Nov 2, 2023
    4985ab9
  4. Parse reuse-grpc-port and reuse-http-port as booleans (#6511)

    Co-authored-by: Francesco Petrini <francescogpetrini@gmail.com>
    dyastremsky and fpetrini15 authored Nov 2, 2023
    8336873

Commits on Nov 3, 2023

  1. Fixing L0_io (#6510)

    * Fixing L0_io
    oandreeva-nv authored Nov 3, 2023
    f15ceec
  2. Add Python-based backends CI (#6466)

    * Bumped vllm version
    
    * Add python-based backends testing
    
    * Add python-based backends CI
    
    * Fix errors
    
    * Add vllm backend
    
    * Fix pre-commit
    
    * Modify test.sh
    
    * Remove vllm_opt qa model
    
    * Remove vLLM backend tests
    
    * Resolve review comments
    
    * Fix pre-commit errors
    
    * Update qa/L0_backend_python/python_based_backends/python_based_backends_test.py
    
    Co-authored-by: Tanmay Verma <tanmay2592@gmail.com>
    
    * Remove collect_artifacts_from_subdir function call
    
    ---------
    
    Co-authored-by: oandreeva-nv <oandreeva@nvidia.com>
    Co-authored-by: Tanmay Verma <tanmay2592@gmail.com>
    3 people authored Nov 3, 2023
    c1b334f
  3. Enabling option to restrict access to HTTP APIs based on header value pairs (similar to gRPC)

    nnshah1 authored Nov 3, 2023
    4b481a6

Commits on Nov 6, 2023

  1. 9da5135

Commits on Nov 7, 2023

  1. 53b2fac
  2. Test file override outside of model directory (#6516)

    * Add boost-filesystem
    rmccorm4 authored Nov 7, 2023
    d893ca1
  3. 0f1f924

Commits on Nov 8, 2023

  1. 0664c5a
  2. 5fca835
  3. Enhance testing for pending request count (#6532)

    * Enhance testing for pending request count
    
    * Improve the documentation
    
    * Add more documentation
    tanmayv25 authored Nov 8, 2023
    05dae41

Commits on Nov 9, 2023

  1. Add testing for Python backend request rescheduling (#6509)

    * Add testing
    
    * Fix up
    
    * Enhance testing
    
    * Fix up
    
    * Revert test changes
    
    * Add grpc endpoint test
    
    * Remove unused import
    
    * Remove unused import
    
    * Update qa/L0_backend_python/request_rescheduling/grpc_endpoint_test.py
    
    Co-authored-by: Iman Tabrizian <iman.tabrizian@gmail.com>
    
    * Update qa/python_models/bls_request_rescheduling/model.py
    
    Co-authored-by: Iman Tabrizian <iman.tabrizian@gmail.com>
    
    ---------
    
    Co-authored-by: Iman Tabrizian <iman.tabrizian@gmail.com>
    krishung5 and Tabrizian authored Nov 9, 2023
    380d739
  2. c295fbf

Commits on Nov 10, 2023

  1. secure deployment considerations guide (#6533)

    * draft document
    
    * updates
    
    * updates
    
    * updated
    
    * updates
    
    * updates
    
    * updates
    
    * updates
    
    * updates
    
    * updates
    
    * updates
    
    * updates
    
    * updates
    
    * updates
    
    * updates
    
    * updates
    
    * updates
    
    * updates
    
    * updates
    
    * updates
    
    * updates
    
    * updates
    
    * updates
    
    * updates
    
    * update
    
    * updates
    
    * updates
    
    * Update docs/customization_guide/deploy.md
    
    Co-authored-by: Kyle McGill <101670481+nv-kmcgill53@users.noreply.github.com>
    
    * Update docs/customization_guide/deploy.md
    
    Co-authored-by: Kyle McGill <101670481+nv-kmcgill53@users.noreply.github.com>
    
    * fixing typos
    
    * updated with clearer warnings
    
    * updates to readme and toc
    
    ---------
    
    Co-authored-by: Kyle McGill <101670481+nv-kmcgill53@users.noreply.github.com>
    nnshah1 and nv-kmcgill53 authored Nov 10, 2023
    53685d3
  2. Fix typo and change the command line order (#6557)

    * Fix typo and change the command line order
    
    * Improve visual experience. Add 'clang' package
    mc-nv authored Nov 10, 2023
    a59f3cf

Commits on Nov 13, 2023

  1. 7982a53

Commits on Nov 15, 2023

  1. a5452b4
  2. Add testing for implicit state enhancements (#6524)

    * Add testing for single buffer
    
    * Add testing for implicit state with buffer growth
    
    * Improve testing
    
    * Fix up
    
    * Add CUDA virtual address size flag
    
    * Add missing test files
    
    * Parameter rename
    
    * Test fixes
    
    * Only build implicit state backend for GPU=ON
    Tabrizian authored Nov 15, 2023
    e5cbe4f
  3. Fix copyright (#6584)

    Tabrizian authored Nov 15, 2023
    ee68c57

Commits on Nov 16, 2023

  1. 441d930
  2. 859ab91
  3. Fix L0_sagemaker (#6587)

    rmccorm4 authored Nov 16, 2023
    4102905
  4. 1c20826
  5. fb5bc9f

Commits on Nov 19, 2023

  1. Change name generative -> iterative (#6601)

    * name changes
    
    * updated names
    nnshah1 authored Nov 19, 2023
    e7bee37

Commits on Nov 20, 2023

  1. Add documentation on generative sequence (#6595)

    * Add documentation on generative sequence
    
    * Address comment
    
    * Reflect the "iterative" change
    GuanLuo authored Nov 20, 2023
    daceccf
  2. a5b7001
  3. Restricted HTTP API documentation

    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    nnshah1 and rmccorm4 authored Nov 20, 2023
    2f24070

Commits on Nov 21, 2023

  1. ba44923
  2. Support for http request cancellation. Includes fix for seg fault in generate_stream endpoint.

    nnshah1 authored Nov 21, 2023
    b876a90
  3. 9647526
  4. 18ee5ac

Commits on Nov 22, 2023

  1. 92214f7
  2. Update README.md (#6627)

    dzier authored Nov 22, 2023
    738996f
  3. Extend request objects lifetime and fixes possible segmentation fault (#6620)
    
    * Extend request objects lifetime
    
    * Remove explicit TRITONSERVER_InferenceRequestDelete
    
    * Format fix
    
    * Include the inference_request_ initialization to cover RequestNew
    
    ---------
    
    Co-authored-by: Neelay Shah <neelays@nvidia.com>
    tanmayv25 and nnshah1 authored Nov 22, 2023
    4b34a48

Commits on Nov 29, 2023

  1. Update protobuf after python update for testing (#6638)

    This fixes the issue where the Python client raises
    `AttributeError: 'NoneType' object has no attribute 'enum_types_by_name'`
    errors after the Python version is updated.
    jbkyang-nvi authored Nov 29, 2023
    b96ae5f

Commits on Dec 4, 2023

  1. Update post-23.11 release (#6653)

    * Update README and versions for 2.40.0 / 23.11 (#6544)
    
    * Removing path construction to use SymLink alternatives
    
    * Update version for PyTorch
    
    * Update windows Dockerfile configuration
    
    * Update triton version to 23.11
    
    * Update README and versions for 2.40.0 / 23.11
    
    * Fix typo
    
    * Adding 'ldconfig' to configure dynamic linking in container (#6602)
    
    * Point to tekit_backend (#6616)
    
    * Point to tekit_backend
    
    * Update version
    
    * Revert tekit changes (#6640)
    
    ---------
    
    Co-authored-by: Kris Hung <krish@nvidia.com>
    mc-nv and krishung5 authored Dec 4, 2023
    b9f69b5
  2. PYBE Timeout Tests (#6483)

    * New testing to confirm large request timeout values can be passed and retrieved within Python BLS models.
    fpetrini15 authored Dec 4, 2023
    b44ee7c

Commits on Dec 5, 2023

  1. 4ac7f37

Commits on Dec 6, 2023

  1. 817428a

Commits on Dec 8, 2023

  1. Add test for optional internal tensor within an ensemble (#6663)

    * Add test for optional internal tensor within an ensemble
    
    * Fix up
    GuanLuo authored Dec 8, 2023
    8afdad2

Commits on Dec 10, 2023

  1. Set CMake version to 3.27.7 (#6675)

    * Set CMake version to 3.27.7
    
    * Set CMake version to 3.27.7
    
    * Fix double slash typo
    mc-nv authored Dec 10, 2023
    a34770b

Commits on Dec 11, 2023

  1. restore typo (#6680)

    mc-nv authored Dec 11, 2023
    cbe58e7

Commits on Dec 12, 2023

  1. f5717c6
  2. e6c300d

Commits on Dec 13, 2023

  1. Mlflow Plugin Fix (#6685)

    * Mlflow plugin fix
    fpetrini15 authored Dec 13, 2023
    f2cd999
  2. 8165ca7
  3. Fix iGPU CMakeFile tags (#6695)

    * Unify iGPU test build with x86 ARM
    
    * adding TRITON_IGPU_BUILD to core build definition; adding logic to skip caffe2plan test if TRITON_IGPU_BUILD=1
    
    * re-organizing some copies in Dockerfile.QA to fix igpu devel build
    
    * Pre-commit fix
    
    ---------
    
    Co-authored-by: kyle <kmcgill@kmcgill-ubuntu.nvidia.com>
    kthui and kyle authored Dec 13, 2023
    0d2e236

Commits on Dec 14, 2023

  1. adding default value for TRITON_IGPU_BUILD=OFF (#6705)

    * adding default value for TRITON_IGPU_BUILD=OFF
    
    * fix newline
    
    ---------
    
    Co-authored-by: kyle <kmcgill@kmcgill-ubuntu.nvidia.com>
    nv-kmcgill53 and kyle authored Dec 14, 2023
    9c56e19

Commits on Dec 15, 2023

  1. Add test case for decoupled model raising exception (#6686)

    * Add test case for decoupled model raising exception
    
    * Remove unused import
    
    * Address comment
    krishung5 authored Dec 15, 2023
    d6bd668
  2. 13dd37e
  3. vLLM Benchmarking Test (#6631)

    * vLLM Benchmarking Test
    pskiran1 authored Dec 15, 2023
    2df7b25
  4. Allow configuring GRPC max connection age and max connection age grace (#6639)
    
    * Add ability to configure GRPC max connection age and max connection age grace
    * Allow passing GRPC connection age args when they are set from command
    ----------
    Co-authored-by: Katherine Yang <80359429+jbkyang-nvi@users.noreply.github.com>
    siweili11 authored Dec 15, 2023
    23ba7a0

Commits on Dec 19, 2023

  1. Modify HTTP frontend to return error code reflecting Triton error. Add test (#6713)
    
    * Modify HTTP frontend to return error code reflecting Triton error
    
    * Add test for dedicated HTTP error. Relax existing test on HTTP code
    
    * Address comment. Fix copyright
    GuanLuo authored Dec 19, 2023
    7b7ef51

Commits on Dec 26, 2023

  1. Update main branch post-23.12 release (#6726)

    * Update README and versions for 23.12 branch
    
    * Bring back the README (#6671)
    
    * Bring back the README
    
    * main -> r23.12
    
    * Remove L0_libtorch_nvfuser (#6674)
    
    * iGPU build refactor (#6684)
    
    * Fix iGPU CMakeFile tags (#6695) (#6698)
    
    * Unify iGPU test build with x86 ARM
    
    * adding TRITON_IGPU_BUILD to core build definition; adding logic to skip caffe2plan test if TRITON_IGPU_BUILD=1
    
    * re-organizing some copies in Dockerfile.QA to fix igpu devel build
    
    * Pre-commit fix
    
    ---------
    
    Co-authored-by: kyle <kmcgill@kmcgill-ubuntu.nvidia.com>
    
    * Update windows Dockerfile versions (#6672)
    
    Changing version to the latest one
    
    Co-authored-by: Misha Chornyi <mchornyi@nvidia.com>
    
    * Remove README banner (#6719)
    
    * Update README
    
    ---------
    
    Co-authored-by: tanmayv25 <tanmay2592@gmail.com>
    Co-authored-by: Jacky <18255193+kthui@users.noreply.github.com>
    Co-authored-by: kyle <kmcgill@kmcgill-ubuntu.nvidia.com>
    4 people authored Dec 26, 2023
    121194b

Commits on Dec 28, 2023

  1. trtllm CI fixes (#6687)

    * testing approach with pre-built image
    
    * Build TensorRT-LLM
    
    * Disable Triton Build
    
    * Remove file
    
    * Update config
    
    * Change PATH variables
    
    * Update path
    
    * Update configuration for CMake
    
    * Getting back TRITON_BUILD flag
    
    * Revert missing files creation
    
    * Update configuration for the PyTorch installation
    
    * Update configuration for docker
    
    * Change the location
    
    * Update configuration
    
    * update config
    
    * Set CMake version to 3.27.7
    
    * Fix double slash typo
    
    * remove unused strings
    
    * restore typo (#6680)
    
    * remove old line
    
    * fix line indentation
    
    * Update LD_LIBRARY_PATH for TensorRT-LLM
    
    * Adding TRT llm changes
    
    * remove TRT-LLM container from the argument list
    
    * Update indentation
    mc-nv authored Dec 28, 2023
    334d3df

Commits on Jan 2, 2024

  1. 639106e

Commits on Jan 3, 2024

  1. Update RE2 package location (#6750)

    * Update RE2 package location
    
    * Use only 1 parallel thread for build
    
    * Revert "Use only 1 parallel thread for build"
    
    This reverts commit 93eab3a.
    Tabrizian authored Jan 3, 2024
    854bc42
  2. Add testing for zero tensors in PyTorch backend (#6760)

    * Add testing for zero tensors in PyTorch backend
    
    * Fix up
    
    * Review edit
    Tabrizian authored Jan 3, 2024
    b42a11a

Commits on Jan 4, 2024

  1. Fix L0_lifecycle on insufficient hardware concurrency (#6762)

    * Do not fail test on insufficient hardware concurrency
    
    * Track instead of fail test if cannot replicate load while async unload
    
    * Add some TODOs for the sub-test
    kthui authored Jan 4, 2024
    9e3bde3

Commits on Jan 5, 2024

  1. Simplify cmake install command (#6725)

    * Simplify cmake install command
    
    * Fix up
    
    * Review comment
    Tabrizian authored Jan 5, 2024
    b922b84
  2. Add cmdline option to set model load retry. Add test (#6764)

    * Add cmdline option to set model load retry. Add test
    
    * Fix copyright
    
    * Minor change on testing model
    
    * Remove unused import
    GuanLuo authored Jan 5, 2024
    690fe1a

Commits on Jan 8, 2024

  1. Increase timeout (#6774)

    - Extend L0_storage_S3 test timeout
    fpetrini15 authored Jan 8, 2024
    634e5ea
  2. bfb6ff0

Commits on Jan 11, 2024

  1. 0299e63
  2. Bring back Python backend based PyTorch backend (#6518)

    * Patch L0_model_config with runtime
    
    * Add L0_pytorch_python_runtime
    
    * Update expected runtime field
    
    * Add test for escaping runtime
    
    * Add comments on unit test imports
    
    * Add invalid runtime test
    
    * User to build PyTorch env
    
    * Update copyright
    kthui authored Jan 11, 2024
    4ffec9f
  3. 0e278c1

Commits on Jan 12, 2024

  1. Add tests/docs for Pinned Memory metrics (#6754)

    * Test case
    
    * Update metrics.md
    
    * Fix alert
    
    * Add copyright
    
    * Update test
    
    * Improve pinned_memory_metrics_test.py
    
    * Update qa/L0_metrics/pinned_memory_metrics_test.py
    
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    
    * Update pinned_memory_metrics_test.py
    
    ---------
    
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    pskiran1 and rmccorm4 authored Jan 12, 2024
    e58fa4c
  2. b6a6dce

Commits on Jan 17, 2024

  1. Support for Context Propagation for OTel trace mode (#6785)

    Added support for OTel context propagation
    
    ---------
    
    Co-authored-by: Markus Hennerbichler <markush@speechmatics.com>
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    3 people authored Jan 17, 2024
    af2c54e
  2. Use current time when overwriting model configuration. (#6727)

    This validates the change made to ../core wrt how model configuration mtime is handled.
    whoisj authored Jan 17, 2024
    7d9f6cd

Commits on Jan 18, 2024

  1. 6779fc5
  2. e74a353

Commits on Jan 19, 2024

  1. a031177
  2. Improve L0_backend_python on shm reliability (#6803)

    * Run all cases with shm probe
    
    * Warmup test and then run multiple iterations
    
    * Log free shared memory on enter/exit of probe
    
    * Add shm probe to all tests
    
    * Add debug_str to shm_util
    
    * Refactor ensemble_io test, modify probe to check for growth rather than inequality
    
    * Improve stability of bls_tensor_lifecycle gpu memory tests
    
    * Add more visibility into failing model/case in python_unittest helper
    
    * [FIXME] Skip probe on certain subtests for now
    
    * [FIXME] Remove shm probe from test_restart on unhealthy stub
    
    * Start clean server run for each bls test case
    
    * Don't exit early on failure so logs can be properly collected
    
    * Restore bls test logic
    
    * Fix shm size compare
    
    * Print region name that leaked
    
    * Remove special handling on unittest
    
    * Remove debug str
    
    * Add enter and exit delay to shm leak probe
    
    ---------
    
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    kthui and rmccorm4 authored Jan 19, 2024
    37cae0c

Commits on Jan 22, 2024

  1. d1c8507

Commits on Jan 23, 2024

  1. 2b24225

Commits on Jan 25, 2024

  1. 8fc7b10
  2. Update trace summary script (#6758)

    * Update trace_summary script
    
    * Remove GRPC_WAITREAD and Overhead
    pskiran1 authored Jan 25, 2024
    30d64af
  3. Add gsutil upload retry helper function (#6817)

    * Add gsutil cp retry helper function
    
    * Add max retry to GCS upload
    
    * Use simple sequential upload
    kthui authored Jan 25, 2024
    4bc15c9

Commits on Jan 27, 2024

  1. b0e7e50

Commits on Jan 29, 2024

  1. Handle 0 dimension output for generate endpoint (#6833)

    * Handle empty output
    
    * Add test case for 0 dimension output
    
    * Fix up number of tests
    krishung5 authored Jan 29, 2024
    f0d788b
  2. tensorrt-llm benchmarking test (#6771)

    * tensorrt-llm benchmarking test
    pskiran1 authored Jan 29, 2024
    62c3a76

Commits on Jan 30, 2024

  1. Update README.md and versions post-24.01 (#6847)

    * Update README and versions for 2.42.0 / 24.01 (#6789)
    
    * Update versions
    
    * Update README and versions for 2.42.0 / 24.01
    
    * Fix documentation generation (#6801)
    
    * Set version of sphinx to 5.0
    
    * Set version 5.0.0
    
    * Update README.md and versions post 24.01
    mc-nv authored Jan 30, 2024
    82c5a77
  2. Use libmamba solver for L0_backend_python env test. Fix pytest not found (#6834)
    
    * Update miniconda version
    
    * Install pytest for different py version
    
    * Install pytest
    krishung5 authored Jan 30, 2024
    dfe9dde

Commits on Jan 31, 2024

  1. Add test for shutdown while loading model (#6837)

    * Add test for shutdown while loading
    
    * Fix intermittent failure on test_model_config_overwrite
    kthui authored Jan 31, 2024
    f345bbb

Commits on Feb 1, 2024

  1. Adding OpenTelemetry Batch Span Processor (#6842)

    Adding OpenTelemetry Batch Span Processor
    ---------
    
    Co-authored-by: Theo Clark <theoclark101@gmail.com>
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    3 people authored Feb 1, 2024
    8f98789
  2. Support Double-Type Inference Request/Response Parameters (#6755)

    * Support Double-Type Infer/Response Parameters
    fpetrini15 authored Feb 1, 2024
    9860f73

Commits on Feb 7, 2024

  1. 2623c7f

Commits on Feb 8, 2024

  1. Python Backend Windows Support (#6830)

    * Base Python Backend Support for Windows
    fpetrini15 authored Feb 8, 2024
    bfbb24c

Commits on Feb 9, 2024

  1. 3dfb7b1

Commits on Feb 13, 2024

  1. 4a398b6

Commits on Feb 14, 2024

  1. Add unit test reports to L0_dlpack_multi_gpu and L0_warmup (#6873)

    * Add unit test reports to L0_dlpack_multi_gpu
    
    * Add unit test reports to L0_warmup
    krishung5 authored Feb 14, 2024
    c048a9f
  2. 44cfb62
  3. fbe03b3

Commits on Feb 16, 2024

  1. 15b7f25

Commits on Feb 17, 2024

  1. Add response statistics (#6869)

    * Add response statistics
    
    * Add L0_response_statistics
    
    * Enable http vs grpc statistics comparison
    
    * Add docs for response statistics protocol
    
    * Add more comments for response statistics test
    
    * Remove model name from config
    
    * Improve docs wordings
    
    * [Continue] Improve docs wordings
    
    * [Continue] Add more comments for response statistics test
    
    * [Continue 2] Improve docs wordings
    
    * Fix typo
    
    * Remove mentioning decoupled from docs
    
    * [Continue 3] Improve docs wordings
    
    * [Continue 4] Improve docs wordings
    
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    
    ---------
    
    Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
    kthui and rmccorm4 authored Feb 17, 2024
    d6ab1ef

Commits on Feb 22, 2024

  1. Fix busyop test for L0_memory_growth (#6900)

    * Switch to Python model for busyop test
    
    * Clean up
    
    * Address comment
    
    * Remove unused import
    krishung5 authored Feb 22, 2024
    5732163

Commits on Feb 23, 2024

  1. Add cancellation into response statistics (#6904)

    * Add cancellation into response statistics
    
    * Add test for response statistics cancel
    
    * Remove debugging print
    
    * Use is None comparison
    
    * Fix docs
    
    * Use default args None
    
    * Refactor RegisterModelStatistics()
    kthui authored Feb 23, 2024
    48cf6b7

Commits on Feb 24, 2024

  1. d0dd553

Commits on Feb 27, 2024

  1. Match forward headers case insensitively. (#6889)

    * Modify "header_forward_pattern" to match headers case-insensitively. Add unit tests.
    
    * fix indentation
    
    * fix pre-commit errors
    
    * Update doc
    
    * Update copyright
    
    * Add test case for "(?-i)", which disables regex case-insensitive mode.
    
    * fix pre-commit
    
    * Name each test. Remove support of disabling --http-header-forward-pattern case-insensitive mode on http python client.
    
    * Update .md file.
    
    * fix typo
    
    * Reformat args.
    
    * Fix pre-commit
    
    * Fix test name issue.
    
    * Fix pre-commit.
    
    * Update md file and copyright.
    yinggeh authored Feb 27, 2024
    5630efe

Commits on Mar 1, 2024

  1. ac5ba42

Commits on Mar 2, 2024

  1. ad25365
  2. Update README and versions for 2.43.0 / 24.02 (#6886)

    * Update README and versions for 2.43.0 / 24.02
    
    * Update Dockerfile to reduce image size.
    
    * Update path in patch file for model generation
    
    Update README.md post-24.02
    mc-nv committed Mar 2, 2024
    3bc6863
  3. 5ba53d8

Commits on Mar 4, 2024

  1. 46f87ff

Commits on Mar 6, 2024

  1. Parameterize git repository (#6934)

    * patching git repository parameterization from production branch 1
    
    * Fix go package directory name
    
    * pre-commit fixes
    
    * pre-commit fixes
    
    ---------
    
    Co-authored-by: kyle <kmcgill@kmcgill-ubuntu.nvidia.com>
    nv-kmcgill53 and kyle authored Mar 6, 2024
    2255663

Commits on Mar 8, 2024

  1. Enhance bound check for shm offset (#6914)

    * Enhance bound check for shm offset
    
    * Add test for enhance bound check for shm offset
    
    * Fix off by 1 on max offset
    
    * Improve comments
    
    * Improve comment and offset
    
    * Separate logic between computation and validation
    kthui authored Mar 8, 2024
    60071e1
  2. Allow non-decoupled model to send response and FINAL flag separately (#6017)
    
    * Allow non-decoupled model to send response and FINAL flag separately
    
    * Update copyright
    
    * Defer sending error until FINAL flag is seen to avoid invalid reference
    
    * Move timestamp capture location
    
    * Delay time-point of response complete timestamp in GRPC and SageMaker endpoint
    
    * Move location of RESPONSE_COMPLETE timestamp capture to better align with the meaning.
    GuanLuo authored Mar 8, 2024
    0a8dbaf
  3. b603024

Commits on Mar 9, 2024

  1. Test improved input validation errors (#6933)

    Added a test case to check for optional/required input params in a request and appropriate response from server.
    Includes addition of 3 simple models with a combination of required/optional input params
    indrajit96 authored Mar 9, 2024
    d89e700

Commits on Mar 11, 2024

  1. Update Dockerfile.sdk with OpenAI support (#6941)

    Add flag to enable compile of OpenAI support in PA
    tgerdesnv authored Mar 11, 2024
    a6ce4f6
  2. Test Correlation Id string support for BLS (#6963)

    * Test Correlation Id string support for BLS
    pskiran1 authored Mar 11, 2024
    79a09f2
  3. 4aba07d

Commits on Mar 13, 2024

  1. Add AsyncIO HTTP compression test (#6975)

    * Add AsyncIO HTTP compression test
    
    * Improve command line option handling
    kthui authored Mar 13, 2024
    55b13f6
  2. Install genai-pa into SDK container (#6942)

    * Update Dockerfile to install genai
    
    * Change the installation script
    
    * install both build and hatch
    
    * Update name
    
    ---------
    
    Co-authored-by: Elias Bermudez <dbermudez@nvidia.com>
    mc-nv and debermudez authored Mar 13, 2024
    30d086d

Commits on Mar 15, 2024

  1. de473d2
  2. Exposing trace context to python backend (#6985)

    * Added TRITONSERVER_InferenceTraceSetContext logic
    oandreeva-nv authored Mar 15, 2024
    06b73f3

Commits on Mar 19, 2024

  1. Add documentation for mapping between Triton Errors and HTTP status codes (#6992)
    
    * Add documentation for mapping between Triton Errors and HTTP status codes
    
    * formatting
    
    * Update README.md
    Tabrizian authored Mar 19, 2024
    6972963

Commits on Mar 21, 2024

  1. 8fadf21
  2. a168d51

Commits on Mar 25, 2024

  1. 4949fa2
  2. fix sphinx warnings (#7030)

    yinggeh authored Mar 25, 2024
    8c3156e

Commits on Mar 27, 2024

  1. 99240f9

Commits on Apr 1, 2024

  1. Update 'main' post-24.03 (#7051)

    * Update README and versions for 2.44.0 / 24.03 (#6971)
    
    * Update README and versions for 2.44.0 / 24.03
    
    * Mchornyi 24.03 (#6972)
    
    * Current location is dropped in 12.4
    
    * Update Dockerfile.win10.min
    
    * Change to triton_sample_folder (#6973)
    
    ---------
    
    Co-authored-by: kyle <kmcgill@kmcgill-ubuntu.nvidia.com>
    Co-authored-by: Misha Chornyi <99709299+mc-nv@users.noreply.github.com>
    
    * Specify path for PyTorch model extension library (#7025)
    
    * Update README.md 2.44.0 / 24.03 (#7032)
    
    * Update README.md post-24.03
    
    ---------
    
    Co-authored-by: Kyle McGill <101670481+nv-kmcgill53@users.noreply.github.com>
    Co-authored-by: kyle <kmcgill@kmcgill-ubuntu.nvidia.com>
    3 people authored Apr 1, 2024
    6cab4bb

Commits on Apr 4, 2024

  1. Fix incorrect version updates (#7073)

    * Fix Otel version
    
    * Fix version in CPU metrics
    
    * Update metrics.md
    
    * Update trace.md
    Tabrizian authored Apr 4, 2024
    3f83727
  2. f57de7f