update llm bkc url (#2372)
jingxu10 authored Dec 15, 2023
1 parent bf6145a commit 8570219
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions cpu/2.1.100+cpu/tutorials/llm.html
@@ -182,7 +182,7 @@ <h2>Optimized Models<a class="headerlink" href="#optimized-models" title="Permal
<p>* For Falcon models from remote hub, we need to modify the config.json to use the modeling_falcon.py in transformers. Therefore, in the following scripts, we need to pass an extra configuration file like “–config-file=model_config/tiiuae_falcon-40b_config.json”. This is optional for FP32/BF16 but needed for quantizations.</p>
<p>** For GPT-NEOX/FALCON/OPT models, the accuracy recipes of static quantization INT8 are not ready, thus, they will be skipped in our coverage.</p>
<p><em>Note</em>: The above verified models (including other models in the same model family, like “codellama/CodeLlama-7b-hf” from LLAMA family) are well optimized with all approaches like indirect access KV cache, fused ROPE, and prepacked TPP Linear (fp32/bf16). For other LLM families, we are working in progress to cover those optimizations, which will expand the model list above.</p>
-<p>Check <a class="reference external" href="https://github.com/intel/intel-extension-for-pytorch/tree/main/tutorials/../../examples/cpu/inference/python/llm">LLM best known practice</a> for instructions to install/setup environment and example scripts..</p>
+<p>Check <a class="reference external" href="https://github.com/intel/intel-extension-for-pytorch/tree/v2.1.100%2Bcpu/examples/cpu/inference/python/llm">LLM best known practice</a> for instructions to install/setup environment and example scripts..</p>
</section>
<section id="demos">
<h2>Demos<a class="headerlink" href="#demos" title="Permalink to this heading"></a></h2>
@@ -321,4 +321,4 @@ <h3>Distributed Inference<a class="headerlink" href="#distributed-inference" tit
</script>

</body>
-</html>
+</html>
