
Commit 29a12cf

cherry-pick speculative decoding related PR #133 and #135 (#136)
* docs: move Constrained_Decoding and Function_Calling to Feature_Guide | rm AI_Agents_Guide folder (#135)
* docs: Add EAGLE/SpS Speculative Decoding support with vLLM (#133)
1 parent f6fd598 commit 29a12cf

File tree

20 files changed: +356 -79 lines changed


AI_Agents_Guide/README.md

Lines changed: 0 additions & 62 deletions
This file was deleted.
File renamed without changes.
File renamed without changes.

Feature_Guide/Speculative_Decoding/README.md

Lines changed: 3 additions & 1 deletion
@@ -54,4 +54,6 @@ may prove simpler than generating a summary for an article. [Spec-Bench](https:/
 shows the performance of different speculative decoding approaches on different tasks.
 
 ## Speculative Decoding with Triton Inference Server
-Follow [here](TRT-LLM/README.md) to learn how Triton Inference Server supports speculative decoding with [TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM).
+Triton Inference Server supports speculative decoding on different types of Triton backends. See what a Triton backend is [here](https://github.com/triton-inference-server/backend).
+- Follow [here](TRT-LLM/README.md) to learn how Triton Inference Server supports speculative decoding with [TensorRT-LLM Backend](https://github.com/triton-inference-server/tensorrtllm_backend).
+- Follow [here](vLLM/README.md) to learn how Triton Inference Server supports speculative decoding with [vLLM Backend](https://github.com/triton-inference-server/vllm_backend).
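For context on what the new vLLM guide wires up: the Triton vLLM backend takes its vLLM engine arguments, including the ones that enable speculative decoding, from a `model.json` file inside the model repository. The sketch below writes a hypothetical `model.json`; it assumes a vLLM version that exposes the `speculative_model` and `num_speculative_tokens` engine arguments, and the draft-model id is a placeholder. The commit's vLLM/README.md is the authoritative reference for the exact schema.

```python
# Hypothetical model.json contents for the Triton vLLM backend, enabling
# draft-model (SpS-style) speculative decoding. Field names follow vLLM's
# engine arguments (speculative_model / num_speculative_tokens); whether
# your vLLM version accepts them should be checked against vLLM/README.md.
import json

engine_args = {
    "model": "meta-llama/Meta-Llama-3-8B-Instruct",  # target (base) model
    "speculative_model": "<draft-model-id>",         # placeholder: smaller draft model
    "num_speculative_tokens": 5,                     # draft tokens proposed per step
    "gpu_memory_utilization": 0.5,
    "disable_log_requests": True,
}

# The backend reads this file from the model repository, e.g.
# model_repository/vllm_model/1/model.json.
with open("model.json", "w") as f:
    json.dump(engine_args, f, indent=4)
```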
