Specifically: https://huggingface.co/kaiokendev/superhot-30b-8k-no-rlhf-test. I've confirmed that I can load the model in vLLM and successfully generate completions, but the context length is still only 2048.
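For what it's worth, vLLM derives its maximum context length from `max_position_embeddings` in the model's `config.json`, and the SuperHOT checkpoint still advertises 2048 there, so the 8k window never takes effect unless RoPE linear interpolation (scaling factor 4, i.e. 2048 × 4 = 8192) is also applied. A minimal sketch of patching the downloaded config — assuming a Hugging Face style `config.json` and that the installed vLLM honors the standard `rope_scaling` field; the path in the usage comment is illustrative, not the actual cache location:

```python
import json
from pathlib import Path

def patch_config(config_path: str, max_len: int = 8192, factor: float = 4.0) -> dict:
    """Raise the advertised context length and add linear RoPE scaling.

    Assumption: the serving stack (e.g. vLLM) reads both
    `max_position_embeddings` and the standard `rope_scaling` dict
    from config.json when building the model.
    """
    path = Path(config_path).expanduser()
    config = json.loads(path.read_text())
    # vLLM takes its context limit from this field
    config["max_position_embeddings"] = max_len
    # Linear "position interpolation" scaling, as SuperHOT was trained with
    config["rope_scaling"] = {"type": "linear", "factor": factor}
    path.write_text(json.dumps(config, indent=2))
    return config

# Usage (path is hypothetical — point it at the downloaded snapshot):
# patch_config("~/models/superhot-30b-8k-no-rlhf-test/config.json")
```

If your vLLM version is recent enough, the same override may also be possible without editing the file, via engine arguments such as `max_model_len` (and, in newer releases, a rope-scaling override), which avoids mutating the cached snapshot.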