
Using web-hosted model for inference #44

Open

Description

Currently, the NousResearch/Llama-2-7b-chat-hf model appears to run locally on my machine, which can take quite a while for long prompts. I'd like to use more AI-optimized hardware to speed this up.

Is it possible to use a web-hosted version of the model, or a different web-hosted model entirely?
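
For context, if the project is loading the model through Hugging Face transformers, one option might be to call a hosted inference endpoint instead of running the weights locally. Below is a minimal sketch using the `huggingface_hub` `InferenceClient`; the model id, token, and prompt are placeholders, and I'm not sure how this repo currently wires up generation, so treat it as an illustration rather than a drop-in change.

```python
# Sketch: querying a hosted endpoint instead of loading the model locally.
# Assumes a Hugging Face API token and that the target model is available
# through the hosted Inference API (model id and prompt are placeholders).
from huggingface_hub import InferenceClient

client = InferenceClient(
    model="NousResearch/Llama-2-7b-chat-hf",  # or any other hosted model id
    token="hf_xxx",  # personal access token (placeholder)
)

prompt = "Explain what this repository does in two sentences."

# text_generation sends the prompt to the remote endpoint; nothing runs locally.
output = client.text_generation(prompt, max_new_tokens=256, temperature=0.7)
print(output)
```

Whether this fits would depend on how the repo constructs its pipeline, but swapping the local `transformers` call for a remote client like this is one way to offload inference to AI-optimized hardware.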

Metadata

Labels

question: Further information is requested
