diff --git a/README.md b/README.md
index b6b20a2..987baf7 100644
--- a/README.md
+++ b/README.md
@@ -11,7 +11,7 @@
 ___
 - When we host open-source LLMs locally on-premise or in the cloud, the dedicated compute capacity becomes a key issue. While GPU instances may seem the obvious choice, the costs can easily skyrocket beyond budget.
 - In this project, we will discover how to run quantized versions of open-source LLMs on local CPU inference for document question-and-answer (Q&A).
 
-![Alt text](assets/document_qa_flowchart.png)
+![Alt text](assets/diagram_flow.png)
 ___
 ## Quickstart
diff --git a/assets/diagram_flow.png b/assets/diagram_flow.png
new file mode 100644
index 0000000..ffaca98
Binary files /dev/null and b/assets/diagram_flow.png differ
diff --git a/assets/document_qa_flowchart.png b/assets/document_qa_flowchart.png
deleted file mode 100644
index 06bc5d7..0000000
Binary files a/assets/document_qa_flowchart.png and /dev/null differ
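The README context in the hunk above mentions running quantized open-source LLMs on local CPU inference for document Q&A. As a minimal illustrative sketch only (not part of this change, and not necessarily this repository's stack), the snippet below loads a locally downloaded quantized checkpoint with the `ctransformers` library and generates a completion on CPU; the model path and generation settings are placeholder assumptions.

```python
# Minimal sketch: CPU-only inference with a quantized open-source LLM.
# Assumption: a GGML/GGUF quantized checkpoint has already been downloaded
# to a local "models/" directory; the exact filename below is a placeholder.
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "models/llama-2-7b-chat.ggmlv3.q8_0.bin",  # placeholder path to a quantized model
    model_type="llama",
    max_new_tokens=256,
    temperature=0.01,
)

# Run a single prompt; in a full document-Q&A flow this call would be wrapped
# in a retrieval step that supplies the relevant document passages as context.
print(llm("Summarize the key points of the uploaded document."))
```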