# 🚀 SimpleChatWithDownload

Experience the power of local inference! This app runs a Large Language Model (LLM) entirely on your machine, so no internet connection or external API calls are needed at inference time. By leveraging the GPU (on Mac) or the CPU (on Windows) for computation, you get a secure, self-contained AI experience tailored to your hardware. 🎉

**SimpleChatWithDownload** is a sample project from the **llama-cpp-delphi** bindings. It provides a streamlined way to interact with a local LLM in a sleek chat interface, featuring automatic model downloads. Whether you’re using Mac Silicon for blazing-fast GPU inference or Windows for slower CPU-only inference, this sample is a great way to get started! 🎉

## 🌟 Features

- **Interactive Chat Window**: Start chatting with your local LLM in seconds!
- **Automatic Model Downloads**: Download models like **Llama-2**, **Llama-3**, and **Mistral Lite** effortlessly. 🚀
  - Models are cloned via Git into your system’s default download folder.
- **Platform Support**:
  - 🖥️ **Mac Silicon**: GPU (Metal) and CPU inference supported.
  - 💻 **Windows**: CPU inference only; feel free to extend the sample and test CUDA.
  - ⚡ GPU inference is recommended on Mac to avoid slow CPU performance.
- **Pre-Bundled Llama.cpp Libraries**: No extra setup! All required libraries are included in the `lib` folder for easy deployment.
- **Customizable Settings**:
  - Choose your model.
  - Switch between GPU and CPU inference on Mac.
  - Enable/disable the seed setting to control response variability.

## 🛠️ Getting Started

### Note

You must have Git installed on your machine to clone model repositories.

### Prerequisites

1. Ensure you have the **llama-cpp-delphi** project ready. If not, grab it from the repository.
2. A **Delphi IDE** installation.
3. For Mac deployment, make sure **PAServer** is running on your Mac.

### Steps to Run

1. **Build llama-cpp-delphi**:
   - Open the llama-cpp-delphi project in the Delphi IDE.
   - Build it for **Windows** and **Mac Silicon**.

2. **Open and Build the Sample**:
   - Open the `SimpleChatWithDownload` sample in the Delphi IDE.
   - Build it for your target platform:
     - **Mac Silicon**: Recommended for GPU inference.
     - **Windows**: CPU inference only.

3. **Deploy to Mac**:
   - Connect to your Mac using **PAServer**.
   - Deploy the app to your Mac. 🎉

4. **Run the App**:
   - The app launches with a "Settings" menu where you can:
     - Select your model (Llama-2, Llama-3, Mistral Lite).
     - Choose GPU or CPU inference (Mac only).
     - Enable/disable seed randomness.

### Download and Use Models

- Click the **hamburger menu** to start downloading the selected model (a manual Git-clone equivalent is sketched after this section).
- Supported models:
  - **Llama-2**: ~7 GB (7B.Q4_K_M).
  - **Llama-3**: ~5 GB (8B.Q4_K_M).
  - **Mistral Lite**: ~7 GB (7B.Q4_K_M).
- 🔧 You can also use any GGUF model supported by Llama.cpp.
- 💡 Feel free to test **DeepSeek** locally for additional insights and functionality!

- After the model download is complete, the chat window will activate.

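As noted in the Features list, the automatic download is essentially a Git clone into your system’s default download folder (which is why Git must be installed). If you ever want to reproduce that step by hand on Windows, here is a minimal, hypothetical Delphi sketch; the repository URL and target folder name are placeholders, not the ones the app actually uses.

```pascal
program CloneModelSketch;
// Hypothetical sketch (Windows only): manually cloning a GGUF model repository
// into the default download folder. The app performs this for you via the
// hamburger-menu download; the URL below is a placeholder.

{$APPTYPE CONSOLE}

uses
  Winapi.Windows, Winapi.ShellAPI, System.SysUtils;

const
  RepoUrl = 'https://huggingface.co/<some-gguf-model-repo>'; // placeholder

var
  Downloads: string;
begin
  // Mirror the sample's use of the default download folder.
  Downloads := GetEnvironmentVariable('USERPROFILE') + '\Downloads';
  // Run "git clone" in a console window; Git must be installed and on PATH.
  ShellExecute(0, 'open', 'cmd.exe',
    PChar('/K git clone ' + RepoUrl + ' "' + Downloads + '\my-model"'),
    nil, SW_SHOWNORMAL);
end.
```
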
## 💡 Usage Tips

- **Start Chatting**:
  - Type your message in the chat box and press **Enter** or click the **Play** button.
  - Use the **Stop** button to pause responses.

- **Customize Inference**:
  - Mac users: Switch between GPU (fast) and CPU (fallback) modes via the "Settings" menu.
  - Windows users: For better performance, explore the CUDA builds in the llama-cpp-delphi "Releases" section. 💪

- **Seed Option**:
  - Enable the seed setting to randomize the sampling seed, so repeated questions don't always get an identical response (see the sketch right after this list).

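To see why a fresh seed changes the answers you get, here is a tiny, self-contained Delphi sketch that uses the standard RNG as a stand-in for the LLM sampler. It is only an analogy for the seed option, not the sample’s actual code.

```pascal
program SeedOptionSketch;
// Analogy only: a fixed seed reproduces the same "response", while a fresh
// random seed varies it. Delphi's RNG stands in for the LLM sampler here.

{$APPTYPE CONSOLE}

procedure Generate(Seed: Integer);
var
  I: Integer;
begin
  RandSeed := Seed;           // analogous to fixing the sampling seed
  Write('seed ', Seed, ':');
  for I := 1 to 5 do
    Write(' ', Random(100));  // stand-in for sampled tokens
  Writeln;
end;

begin
  // Seed option disabled: the same seed yields the same output every run.
  Generate(42);
  Generate(42);

  // Seed option enabled: a fresh random seed yields varied output.
  Randomize;
  Generate(Random(MaxInt));
end.
```
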
## 📁 Libraries

All required libraries are bundled in the `lib` folder in the sample’s root directory:
- **Mac**: Deployment is pre-configured. Deploy via PAServer, and you’re good to go!
- **Windows**: The app automatically loads the libraries from the `lib` folder (a small troubleshooting sketch follows below).

For additional builds (e.g., CUDA versions), visit the llama-cpp-delphi "Releases" section.

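If the Windows build complains about a missing binary, a quick check like the one below can confirm that the bundled library is where the app expects it. This is only a hedged troubleshooting sketch: the file name `llama.dll` is an assumption, so adjust it to match whatever ships in your `lib` folder.

```pascal
program CheckBundledLib;
// Troubleshooting sketch (Windows): try to load the bundled llama.cpp binary
// from the sample's lib folder. The file name "llama.dll" is an assumption;
// llama-cpp-delphi performs the real loading for you at runtime.

{$APPTYPE CONSOLE}

uses
  Winapi.Windows, System.SysUtils;

var
  LibFile: string;
  Handle: HMODULE;
begin
  LibFile := ExtractFilePath(ParamStr(0)) + 'lib\llama.dll'; // assumed name/location
  Handle := LoadLibrary(PChar(LibFile));
  if Handle = 0 then
    Writeln('Failed to load ', LibFile, ' (error code ', GetLastError, ')')
  else
  begin
    Writeln('Loaded ', LibFile);
    FreeLibrary(Handle);
  end;
end.
```
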
## 🌟 Final Notes

Enjoy chatting with cutting-edge LLMs in your own app! If you run into any issues or have feedback, feel free to contribute or reach out. Happy coding! 🚀
