This project implements a pluggable AI Agent backend built with TypeScript (Node.js) that supports:
- Conversation memory per session
- RAG (Retrieval-Augmented Generation) from local markdown docs
- Plugin system (Weather + Math evaluation)
- LLM response generation using Google Gemini
- Endpoint: `POST /agent/message`, which handles user queries with:
  - Session-based memory
  - Context retrieved from documents
  - Optional plugin outputs (weather, math)
- Vector Search for RAG: uses cosine similarity with Gemini embeddings (see the sketch after this list)
- Plugins:
  - Weather (OpenWeather API, with a mock fallback)
  - Math expression evaluation
- Prompt Engineering: the system prompt includes:
  - Agent instructions
  - The last 2 messages as memory
  - The top 3 relevant context chunks
  - Plugin outputs
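To make the RAG retrieval step concrete, here is a minimal sketch of cosine-similarity search over Gemini embeddings, assuming document chunks are embedded once at startup. `Chunk` and `topChunks` are illustrative names, not the repo's actual exports:

```typescript
import { GoogleGenerativeAI } from "@google/generative-ai";
import similarity from "compute-cosine-similarity";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
const embedder = genAI.getGenerativeModel({ model: "embedding-001" });

// Illustrative shape: chunks of the markdown docs, embedded at startup.
interface Chunk {
  text: string;
  embedding: number[];
}

// Embed the query, score every chunk, and return the top-k chunk texts.
async function topChunks(query: string, chunks: Chunk[], k = 3): Promise<string[]> {
  const { embedding } = await embedder.embedContent(query);
  return chunks
    .map((c) => ({ c, score: similarity(embedding.values, c.embedding) ?? -1 }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map(({ c }) => c.text);
}
```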
Tech stack:

- Language: TypeScript
- Framework: Express
- LLM: Google Gemini (via `@google/generative-ai`)
- Embeddings: Gemini `embedding-001`
- Vector Similarity: `compute-cosine-similarity`
- Plugins: Axios for the weather API
- Deployment: Render
Project structure:

```
ai-agent-server/
│
├── src/
│   ├── index.ts            # Server entry point
│   ├── routes/agent.ts     # Agent endpoint
│   ├── services/
│   │   ├── llm.ts          # Gemini LLM integration
│   │   ├── rag.ts          # RAG system with embeddings
│   │   └── plugins.ts      # Plugin execution (weather, math)
│   └── memory.ts           # Session-based memory
│
├── documents/              # Markdown docs for RAG
├── .env                    # Environment variables
├── package.json
├── tsconfig.json
└── README.md
```
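As a rough illustration of what `src/memory.ts` could contain (the real module may differ), a per-session in-memory store only needs two operations: append a message and recall the last few:

```typescript
// Illustrative sketch of src/memory.ts: per-session, in-memory history.
export interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

const sessions = new Map<string, ChatMessage[]>();

// Append a message to a session's history.
export function remember(sessionId: string, msg: ChatMessage): void {
  const history = sessions.get(sessionId) ?? [];
  history.push(msg);
  sessions.set(sessionId, history);
}

// Recall the last n messages (the agent uses the last 2 as context).
export function lastMessages(sessionId: string, n = 2): ChatMessage[] {
  return (sessions.get(sessionId) ?? []).slice(-n);
}
```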
Create a `.env` file in the project root:

```
PORT=8080
GEMINI_API_KEY=your_gemini_api_key
OPENWEATHER_API_KEY=your_weather_api_key   # optional
```
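The server can read these values at startup; a minimal sketch, assuming the common `dotenv` package (the repo may load its configuration differently):

```typescript
import "dotenv/config"; // loads .env into process.env

const PORT = Number(process.env.PORT ?? 8080);

const GEMINI_API_KEY = process.env.GEMINI_API_KEY;
if (!GEMINI_API_KEY) throw new Error("GEMINI_API_KEY is required");

// Optional: without it, the weather plugin falls back to mock data.
const OPENWEATHER_API_KEY = process.env.OPENWEATHER_API_KEY;
```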
Install dependencies and run the dev server:

```bash
npm install
npm run dev
```

Build and run in production:

```bash
npm run build
npm start
```
Request Body:

```json
{
  "message": "search weather in Bangalore",
  "session_id": "123"
}
```

Response:

```json
{
  "response": "The current weather in Bangalore is 28°C and sunny."
}
```
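A hedged sketch of how `routes/agent.ts` might wire this contract into Express, assuming `express.json()` middleware is registered in `index.ts`; `handleAgentMessage` is a stub standing in for the full memory + RAG + plugin + LLM pipeline described below:

```typescript
import { Router, type Request, type Response } from "express";

const router = Router();

// Stub standing in for the full pipeline (memory + RAG + plugins + LLM).
async function handleAgentMessage(message: string, sessionId: string): Promise<string> {
  return `echo: ${message} (session ${sessionId})`;
}

router.post("/agent/message", async (req: Request, res: Response) => {
  const { message, session_id } = req.body as { message?: string; session_id?: string };
  if (!message || !session_id) {
    res.status(400).json({ error: "message and session_id are required" });
    return;
  }
  res.json({ response: await handleAgentMessage(message, session_id) });
});

export default router;
```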
```bash
curl -X POST https://ai-agent-server-y85l.onrender.com/agent/message \
  -H "Content-Type: application/json" \
  -d '{"message":"What is markdown?","session_id":"123"}'
```
Agent flow:

- Receive user input at `POST /agent/message`
- Store the message in session memory
- Retrieve the last 2 messages for context
- Perform RAG: embed the query → compute similarity against document chunks → take the top 3
- Detect and execute plugins
- Build the prompt: instructions + memory + RAG context + plugin outputs (see the sketch below)
- Call the Gemini LLM to generate a response
- Return the response and update memory
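One plausible way to assemble that prompt; the actual template in `services/llm.ts` may differ, and the section labels are illustrative:

```typescript
interface PromptParts {
  memory: { role: string; content: string }[]; // last 2 messages
  contextChunks: string[];                     // top 3 RAG chunks
  pluginOutputs: string[];                     // weather/math results, if any
}

// Concatenate instructions, memory, retrieved context, and plugin outputs.
function buildPrompt(userMessage: string, parts: PromptParts): string {
  return [
    "You are a helpful AI agent. Answer using the context and plugin data below.",
    "Recent conversation:",
    ...parts.memory.map((m) => `${m.role}: ${m.content}`),
    "Retrieved context:",
    ...parts.contextChunks,
    "Plugin outputs:",
    ...parts.pluginOutputs,
    "User message:",
    userMessage,
  ].join("\n\n");
}
```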
Plugin details:

- Weather:
  - Uses the OpenWeather API if a key is configured
  - Falls back to mock data otherwise
- Math Evaluator:
  - Safely evaluates arithmetic expressions with `Function()` (see the sketch below)
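A sketch of both plugins under the behaviour described above. The OpenWeather endpoint and response fields are the real API; the mock payload and the validation regex are illustrative choices:

```typescript
import axios from "axios";

// Weather: real API call when a key exists, mock data otherwise.
async function weatherPlugin(city: string): Promise<string> {
  const key = process.env.OPENWEATHER_API_KEY;
  if (!key) return `Mock weather for ${city}: 28°C and sunny.`;
  const { data } = await axios.get("https://api.openweathermap.org/data/2.5/weather", {
    params: { q: city, appid: key, units: "metric" },
  });
  return `${data.weather[0].description}, ${data.main.temp}°C in ${city}.`;
}

// Math: allow only plain arithmetic characters before passing to Function().
function mathPlugin(expression: string): string {
  if (!/^[\d+\-*\/().\s]+$/.test(expression)) throw new Error("unsupported expression");
  return String(Function(`"use strict"; return (${expression});`)());
}
```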
Deployment:

- Use Render or Railway:
  - Set `NODE_ENV=production`
  - Add `GEMINI_API_KEY` to the environment variables
  - Start command: `npm run build && npm start`
Markdown documents included in `documents/` for RAG:

- daext-blogging-with-markdown-complete-guide.md
- john-apostol-custom-markdown-blog.md
- just-files-nextjs-blog-with-react-markdown.md
- webex-boosting-ai-performance-llm-friendly-markdown.md
- wikipedia-lightweight-markup-language.md
Author: Vividh Kanaujia