diff --git a/README.md b/README.md
index b6b20a2..987baf7 100644
--- a/README.md
+++ b/README.md
@@ -11,7 +11,7 @@
 ___
 - When we host open-source LLMs locally on-premise or in the cloud, the dedicated compute capacity becomes a key issue. While GPU instances may seem the obvious choice, the costs can easily skyrocket beyond budget.
 - In this project, we will discover how to run quantized versions of open-source LLMs on local CPU inference for document question-and-answer (Q&A).
 
-![Alt text](assets/document_qa_flowchart.png)
+![Alt text](assets/diagram_flow.png)
 ___
 ## Quickstart
diff --git a/assets/diagram_flow.png b/assets/diagram_flow.png
new file mode 100644
index 0000000..ffaca98
Binary files /dev/null and b/assets/diagram_flow.png differ
diff --git a/assets/document_qa_flowchart.png b/assets/document_qa_flowchart.png
deleted file mode 100644
index 06bc5d7..0000000
Binary files a/assets/document_qa_flowchart.png and /dev/null differ
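The README context in the hunk above mentions running quantized open-source LLMs on local CPU inference for document Q&A. As a minimal illustrative sketch only (not part of this change, and not necessarily this repository's stack), the snippet below loads a locally downloaded quantized checkpoint with the `ctransformers` library and generates a completion on CPU; the model path and generation settings are placeholder assumptions.

```python
# Minimal sketch: CPU-only inference with a quantized open-source LLM.
# Assumption: a GGML/GGUF quantized checkpoint has already been downloaded
# to a local "models/" directory; the exact filename below is a placeholder.
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "models/llama-2-7b-chat.ggmlv3.q8_0.bin",  # placeholder path to a quantized model
    model_type="llama",
    max_new_tokens=256,
    temperature=0.01,
)

# Run a single prompt; in a full document-Q&A flow this call would be wrapped
# in a retrieval step that supplies the relevant document passages as context.
print(llm("Summarize the key points of the uploaded document."))
```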