As a note to myself, a possible intuition for understanding GPU memory hierarchy (and the performance penalty for data transfer between various layers) is to think of it like a manufacturing logistics problem:
1. CPU (host) to GPU (device) is like travelling overnight between two cities. The CPU city is the "headquarters", and contains a mega-sized warehouse of parts (think football-field scale), also known as 'Host memory'.
2. Each GPU is like a different city, containing its own warehouse outside the city, also known as 'Global Memory'. This warehouse stockpiles whatever it needs from the headquarters city (CPU).
3. Each SM/Core/Tile is a factory located in different areas of the city. Each factory contains a small warehouse (shed) for stockpiling whatever inventory it needs, also known as 'Shared Memory'.
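The three levels above map directly onto the memory spaces in a CUDA program. A minimal sketch (the kernel, names like `scale_parts` and `TILE`, and sizes are my own illustrative choices, not from any particular codebase):

```cuda
#include <cuda_runtime.h>
#include <cstdio>

#define TILE 256

// Each block (factory) stages its inventory in shared memory (the shed)
// before working on it, instead of re-fetching from global memory (the
// city warehouse) for every access.
__global__ void scale_parts(const float* in, float* out, int n) {
    __shared__ float shed[TILE];           // per-SM "shed" (shared memory)
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        shed[threadIdx.x] = in[i];         // warehouse -> shed (global -> shared)
        __syncthreads();
        out[i] = shed[threadIdx.x] * 2.0f; // work happens next to the factory
    }
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    float *h = (float*)malloc(bytes), *d_in, *d_out;
    for (int i = 0; i < n; ++i) h[i] = 1.0f;

    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);

    // The expensive "overnight trip": headquarters (host memory) to the
    // GPU city's warehouse (global memory), over PCIe/NVLink.
    cudaMemcpy(d_in, h, bytes, cudaMemcpyHostToDevice);
    scale_parts<<<(n + TILE - 1) / TILE, TILE>>>(d_in, d_out, n);
    cudaMemcpy(h, d_out, bytes, cudaMemcpyDeviceToHost);

    printf("h[0] = %f\n", h[0]);
    cudaFree(d_in);
    cudaFree(d_out);
    free(h);
    return 0;
}
```

The analogy's cost intuition shows up in the API: each `cudaMemcpy` is an inter-city haul you want to do rarely and in bulk, while the `__shared__` staging inside the kernel is a short walk from the factory floor.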