For streaming style chat requests inject loading messages / state into the chat stream so there’s no “dead air” when loading.
It may cause issues that llama-swap will have to be more content aware as it will have to suppress things like headers into that the upstream may be sending.
See: #326 (reply in thread)