Right now we have duplicated work between our lib/ and the Python examples. We want to standardize on lib/ and remove some of the bespoke code we have written in Python.
This means removing the vLLM-specific preprocessing and tokenization and switching to the pre-processor in lib/. It also means using the model deployment cards (MDC) so we don't need to specify worker args in multiple locations; these args should be read from the MDC.
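
As a rough illustration of the second point, here is a minimal sketch of a worker deriving its args from a single card instead of repeating them per example. Everything in it is an assumption made for illustration: the `ModelDeploymentCard` class, its field names, the JSON layout, and the arg keys are hypothetical and do not reflect the actual MDC schema in lib/.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelDeploymentCard:
    """Hypothetical stand-in for an MDC; field names are illustrative only."""

    model_name: str
    tokenizer_path: str
    context_length: int


def load_mdc(raw: str) -> ModelDeploymentCard:
    """Parse a card from a JSON string (assumed layout, not the real schema)."""
    data = json.loads(raw)
    return ModelDeploymentCard(
        model_name=data["model_name"],
        tokenizer_path=data["tokenizer_path"],
        context_length=int(data["context_length"]),
    )


def worker_args_from_mdc(mdc: ModelDeploymentCard) -> dict:
    """Derive worker args from the card so they are defined in one place.

    The keys below are placeholders for whatever the worker actually expects.
    """
    return {
        "model": mdc.model_name,
        "tokenizer": mdc.tokenizer_path,
        "max_model_len": mdc.context_length,
    }


# Example card contents; the values are made up for the sketch.
EXAMPLE_MDC_JSON = json.dumps(
    {
        "model_name": "example-model",
        "tokenizer_path": "/models/example-model/tokenizer.json",
        "context_length": 4096,
    }
)

if __name__ == "__main__":
    card = load_mdc(EXAMPLE_MDC_JSON)
    print(worker_args_from_mdc(card))
```

With this shape, each Python example would call something like `worker_args_from_mdc` rather than hard-coding its own copy of the model, tokenizer, and context-length settings.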