Add inference request cancellation APIs #249

Merged: 12 commits merged into request-cancellation on Sep 7, 2023

Conversation

Tabrizian (Member) commented Aug 30, 2023

Review comment on src/infer_request.h (outdated, resolved)
Tabrizian changed the base branch from main to request-cancellation August 31, 2023 04:55
Tabrizian marked this pull request as ready for review August 31, 2023 04:55
nnshah1 (Contributor) commented Aug 31, 2023

Can we add an error code for cancellation next to TRITONSERVER_ERROR_ALREADY_EXISTS, so that it will be the expected error code in responses for cancelled requests? Either TRITONSERVER_ERROR_REQUEST_CANCELLED or TRITONSERVER_ERROR_CANCELLED.

Tabrizian (Member, Author) commented

@nnshah1 sorry, I missed that. Added cancellation status.
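
For illustration only, here is a minimal self-contained sketch of the idea discussed above: a shared cancellation flag that the caller sets and the executing side polls, with a dedicated "cancelled" status reported back. Every name in it (ErrorCode, CancelState, Request, RunInference) is a hypothetical stand-in chosen for this sketch; none of it is the API added by this PR.

```cpp
#include <atomic>
#include <memory>

// Hypothetical status codes. kCancelled plays the role of the dedicated
// cancellation code requested in the review comment above.
enum class ErrorCode { kOk, kAlreadyExists, kCancelled };

// Hypothetical shared state: a single flag owned jointly by the request
// handle and whatever component produces the responses.
struct CancelState {
  std::atomic<bool> cancelled{false};
};

// Hypothetical request handle exposing cancel / is-cancelled operations.
class Request {
 public:
  Request() : state_(std::make_shared<CancelState>()) {}

  void Cancel() { state_->cancelled.store(true); }
  bool IsCancelled() const { return state_->cancelled.load(); }
  std::shared_ptr<CancelState> State() const { return state_; }

 private:
  std::shared_ptr<CancelState> state_;
};

// Hypothetical execution loop: checks the flag before each step and stops
// early with the cancellation status if the flag has been set.
ErrorCode RunInference(
    const std::shared_ptr<CancelState>& state, int steps, int* completed)
{
  *completed = 0;
  for (int i = 0; i < steps; ++i) {
    if (state->cancelled.load()) {
      return ErrorCode::kCancelled;
    }
    ++*completed;
  }
  return ErrorCode::kOk;
}
```

In this sketch cancellation is cooperative: setting the flag does not interrupt work already in progress, it only takes effect the next time the executing side checks it.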

Tabrizian changed the base branch from request-cancellation to main September 6, 2023 15:48
Tabrizian changed the base branch from main to request-cancellation September 6, 2023 15:48
Review comment on src/infer_request.h (outdated, resolved)
Tabrizian merged commit 6196b75 into request-cancellation Sep 7, 2023
1 check passed
Tabrizian added a commit that referenced this pull request Sep 13, 2023
* Fix state transitions for re-running requests (#251)

* Add backend/server APIs

* Implement the cancellation APIs

* Only store the state in response factory

* Add unit testing for request cancellation

* Add test

* Add cancellation status

* Add testing for cancelling a request after release

* Handle request re-use

* Enable request reuse test

* Add staged changes

* Add temporary fix for the request state bug

---------

Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
Tabrizian added a commit that referenced this pull request Sep 13, 2023
* Add inference request cancellation APIs (#249)

* Fix state transitions for re-running requests (#251)

* Add backend/server APIs

* Implement the cancellation APIs

* Only store the state in response factory

* Add unit testing for request cancellation

* Add test

* Add cancellation status

* Add testing for cancelling a request after release

* Handle request re-use

* Enable request reuse test

* Add staged changes

* Add temporary fix for the request state bug

---------

Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>

* Fix request re-use when cancelling a request

* Review edit

* Fix warmup request

* Fix null requests

* Review edit

---------

Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
tanmayv25 added a commit that referenced this pull request Oct 4, 2023
* Add inference request cancellation APIs (#249)

* Fix state transitions for re-running requests (#251)

* Add backend/server APIs

* Implement the cancellation APIs

* Only store the state in response factory

* Add unit testing for request cancellation

* Add test

* Add cancellation status

* Add testing for cancelling a request after release

* Handle request re-use

* Enable request reuse test

* Add staged changes

* Add temporary fix for the request state bug

---------

Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>

* Fix request re-use when cancelling a request (#253)

* Add inference request cancellation APIs (#249)

* Fix state transitions for re-running requests (#251)

* Add backend/server APIs

* Implement the cancellation APIs

* Only store the state in response factory

* Add unit testing for request cancellation

* Add test

* Add cancellation status

* Add testing for cancelling a request after release

* Handle request re-use

* Enable request reuse test

* Add staged changes

* Add temporary fix for the request state bug

---------

Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>

* Fix request re-use when cancelling a request

* Review edit

* Fix warmup request

* Fix null requests

* Review edit

---------

Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>

* Dynamic batch scheduler request cancellation (#257)

* Add adapter for IsCancelled()

* Move cancel functions outside TRITON_ENABLE_STATS

* Add request cancellation to dynamic batch scheduler

* Refactor cancelled requests to use rejected requests routine

* Remove shared pointer wrapper for rejected and cancelled requests

* Ensemble scheduler request cancellation (#263)

* Add adapter for IsCancelled()

* Move cancel functions outside TRITON_ENABLE_STATS

* Add request cancellation to ensemble scheduler

* Sequence batch scheduler request cancellation (#260)

* Add request cancellation to sequence batch scheduler

* Add request cancellation to in-flight sequences

* Refactor on how a request is cancelled

* Fix issue when request is timeout dummy

* Always mark request cancelled before cancelling

* Mark immutable static status as const

---------

Co-authored-by: Iman Tabrizian <iman.tabrizian@gmail.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
Co-authored-by: Jacky <18255193+kthui@users.noreply.github.com>
Tabrizian deleted the imant-cancellation-api branch February 13, 2024 15:51
3 participants