Welcome to the Fall 2025 ACM Workshop on AI! Most people use AI tools like ChatGPT or Gemini, but those tools are built on top of a basic LLM. A basic LLM can only respond using what it learned during training — it can’t look up new information or remember past conversations. In this workshop, we’ll improve the basic Gemini LLM by adding two key features: web retrieval for up-to-date answers and FAISS memory so it can remember previous chats.
This repository contains three programs:
- Basic Gemini LLM API – A simple LLM interface.
- Web + Gemini LLM – Adds web scraping to provide up-to-date info.
- Web + Memory + Gemini LLM – Adds a FAISS memory system to remember prior interactions.
First, clone the repo so you have all the files locally on your computer:
```bash
git clone https://github.com/Arvin385/acm-ai-workshop.git

# Then enter the directory
cd acm-ai-workshop
```
Before running any code, you need a Gemini API key:
- Go to Google AI Studio.
- Create a project and generate a Gemini API key.
- Store the key as an environment variable:
Linux/macOS:
```bash
export GEMINI_API_KEY="your_api_key_here"
```
Windows (use PowerShell, and make sure to keep the quotes around the API key):
```powershell
$env:GEMINI_API_KEY="your_api_key_here"
```
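The scripts presumably read this key from the environment. If you want to confirm it is visible to Python before the workshop starts, a quick hypothetical check (not part of the repo) looks like this:

```python
import os

# Fails fast if GEMINI_API_KEY was not exported in this shell
if not os.environ.get("GEMINI_API_KEY"):
    raise RuntimeError("GEMINI_API_KEY is not set in this environment")
print("Gemini API key found.")
```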
We recommend using a virtual environment to manage dependencies. (Note: we recommend Python 3.12 for this workshop, as it has better compatibility.)
```bash
# If Python 3.12 is your default, run:
python -m venv acm-ai-venv

# RECOMMENDED (Windows py launcher; use python3.12 on Linux/macOS):
py -3.12 -m venv acm-ai-venv

# Activate it:
source acm-ai-venv/bin/activate   # On Linux/macOS
acm-ai-venv\Scripts\activate      # On Windows
```
On Windows, if you encounter an execution policy error, you may need to run the following in PowerShell:
```powershell
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
```
If you haven't already, you need to install Python. To check whether you have it, run:
```bash
python --version
```
If you get a version number, you are set. If you get an error about Python not being found, install it (from the Python website on Windows, or via Homebrew on macOS). Again, we recommend installing Python 3.12 for this workshop.
Install the dependencies:
```bash
pip install -r requirements.txt
```
Run the simplest LLM API:
```bash
python basic_LLM.py
```
- Type a query and see Gemini respond.
Observations:
- The model may give out-of-date information.
- It forgets context beyond the immediate prompt.
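For reference, basic_LLM.py boils down to a loop like the one below. This is a minimal sketch assuming the google-generativeai package and the gemini-1.5-flash model name; the actual file in the repo may differ in details:

```python
import os
import google.generativeai as genai

# Configure the client from the environment variable set earlier
genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")

while True:
    query = input("You: ")
    if query.lower() in {"quit", "exit"}:
        break
    # The model sees only this single prompt: no web access, no memory
    response = model.generate_content(query)
    print("Gemini:", response.text)
```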
Run:
```bash
python web_LLM.py
```
- Fetches timely information from the web.
- Customizable parameters:

```python
KEYWORDS = ["stock", "news", "weather", "when", "time", "crypto"]
```
- Add or remove keywords to control when web scraping happens; if no keyword matches, web scraping is bypassed (see the gating sketch after this list).

```python
def fetch_web_info(query, max_links=3, max_paragraphs=5):
```
- Adjust `max_links` or `max_paragraphs` to balance response quality vs. speed.
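The keyword gate itself can be a one-liner. A sketch of the idea (illustrative; the repo's exact check may differ):

```python
KEYWORDS = ["stock", "news", "weather", "when", "time", "crypto"]

def needs_web_search(query):
    # Scrape only when the query mentions a timely topic;
    # otherwise the prompt goes to Gemini unchanged.
    return any(keyword in query.lower() for keyword in KEYWORDS)

print(needs_web_search("What's the weather in Austin?"))  # True
print(needs_web_search("Explain recursion to me."))       # False
```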
Run:
```bash
python web_and_mem_LLM.py
```
- Stores past interactions in a FAISS vector store for persistent memory.
- Customizable parameters:

```python
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
```
- `chunk_size`: larger → fewer docs, faster retrieval, less granular memory.
- `chunk_overlap`: helps maintain context across chunks.

```python
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
```
- Switch to a larger model for better semantic understanding, or a smaller one for faster responses on weaker machines.

```python
relevant_docs = faiss.similarity_search(user_input, k=3)
```
- Increase `k` to consider more memory context.
- Decrease `k` for faster responses and simpler prompts.

```python
llm_prompt = prompt_with_web
if retrieved_text:
    llm_prompt += f"\n\nConsider this past info:\n{retrieved_text}"
```
- Customize this to prioritize web info vs. memory, or to format prompts differently. A sketch of how these pieces fit together follows this list.
Some experiments to try:
- Add a keyword: try `"sports"` in `KEYWORDS` and ask Gemini for scores.
- Increase scraping depth: set `max_links=5` and `max_paragraphs=10` and observe the effect.
- Adjust memory granularity: change `chunk_size` to 200 and `chunk_overlap` to 50, then ask about past interactions.
- Switch embeddings: try `"all-MiniLM-L12-v2"` for more semantic accuracy.
- Gemini requires an internet connection for API calls.
- FAISS memory is stored locally, but no LLM needs to run on your machine.
- Adjust keywords, chunk size, and `k` to balance speed vs. context richness.
- For web retrieval, it is more accurate to first send the user's query to an LLM and let it decide whether a web search is required at all. This would replace the keyword comparison, trading time efficiency for content accuracy; see the sketch below.
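A sketch of that routing idea, reusing the `model` object from the basic example above; the YES/NO probe prompt is an illustration, not repo code:

```python
def llm_needs_web_search(query, model):
    # Ask Gemini itself whether the query needs fresh data:
    # more accurate than keyword matching, but costs an extra API call.
    probe = (
        "Answer with exactly YES or NO: does the following question "
        f"require up-to-date information from the web?\n\n{query}"
    )
    return model.generate_content(probe).text.strip().upper().startswith("YES")
```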
By the end of this workshop, students will be able to:
- Query a Gemini LLM.
- Fetch real-time web information.
- Build a persistent memory with FAISS.
- Customize prompts, embeddings, memory, and web scraping for personalized AI assistants.