Closed
Description
It's mentioned many times that profiles are powerful, but… I don't understand how they work.
- How can I disable "swap mode" so that no model is unloaded automatically (and let `ttl` do its work)?
- What happens if "swap mode" is on, the active model is currently processing a request from one user, and another user requests a different model?
// Can't test this case now for technical reasons.