An intelligent task management agent deployed on Google Cloud Run with a WhatsApp interface, powered by LangGraph and GPT-4o-mini. It features a Plan-Execute architecture for complex multi-step requests, natural language date parsing, multi-user support, and Google Calendar integration. Built to demonstrate production-ready Agentic AI engineering skills.
Live Service: https://ai-task-agent-kbimuakj2a-uc.a.run.app
Try the WhatsApp bot now!
- Text: +1 (415) 523-8886
- Send: `join [your-sandbox-code]` (get the code from the Twilio console)
- Try: `remind me to buy milk tomorrow at 2pm`
Service Status: Live on Google Cloud Run (us-central1)
- Plan-Execute Architecture - Agent breaks down complex requests into multi-step plans (NEW!)
- Production Deployment - Fully deployed on Google Cloud Run with HTTPS endpoints
- WhatsApp Interface - Natural conversational UI via Twilio WhatsApp API
- Advanced Agent Patterns - ReAct loop with planning, reflection, and state management
- Cloud-Native Storage - SQLite databases synced to Cloud Storage
- Multi-User Support - Isolated task lists per user with phone number hashing
- Smart Date Parsing - "tomorrow at 2pm", "next Friday", "in 3 hours"
- Production Security - Webhook signature verification, rate limiting (10 msg/min)
- Observability - LangSmith tracing for debugging and monitoring
    +-------------+      +----------------------+      +------------------+
    |  WhatsApp   |----->|      Cloud Run       |----->|  Cloud Storage   |
    |  (Twilio)   |<-----| FastAPI + LangGraph  |      |   (SQLite DBs)   |
    +-------------+      +----------------------+      +------------------+
                                    |
                                    v
                             +--------------+
                             | GPT-4o-mini  |
                             |   + Tools    |
                             +--------------+
                                    |
                           +--------+--------+
                           v                 v
                     +-----------+      +----------+
                     |   Redis   |      |  Google  |
                     | (Limits)  |      | Calendar |
                     +-----------+      +----------+
Data Flow:
- User sends WhatsApp message → Twilio webhook
- Cloud Run receives POST → verifies signature → sends ACK
- LangGraph agent processes message → calls tools
- Tools interact with database/calendar
- Response sent back via Twilio Messages API
- Databases synced to Cloud Storage on shutdown
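The signature check in the second step can be sketched with the standard library alone. This is a minimal illustration of Twilio's documented scheme (HMAC-SHA1 over the webhook URL plus the sorted POST parameters, base64-encoded), not the project's actual handler; the function names and the sample values are hypothetical.

```python
import base64
import hashlib
import hmac


def compute_twilio_signature(auth_token: str, url: str, params: dict) -> str:
    """Compute the value Twilio sends in the X-Twilio-Signature header."""
    # Twilio signs the full webhook URL followed by every POST parameter
    # (name immediately followed by value), sorted by parameter name.
    payload = url + "".join(key + params[key] for key in sorted(params))
    digest = hmac.new(
        auth_token.encode("utf-8"), payload.encode("utf-8"), hashlib.sha1
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_twilio_request(auth_token: str, url: str, params: dict, signature: str) -> bool:
    """Compare the expected and received signatures in constant time."""
    expected = compute_twilio_signature(auth_token, url, params)
    return hmac.compare_digest(expected, signature)


# Hypothetical values for illustration only.
sample_url = "https://example.com/whatsapp/webhook"
sample_params = {"From": "whatsapp:+14155238886", "Body": "show my tasks"}
sample_signature = compute_twilio_signature("test_token", sample_url, sample_params)
```

In a real deployment the official Twilio helper library offers a request validator that does the same job; the hand-rolled version above only shows what is being verified.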
| Component | Technology | Purpose |
|---|---|---|
| Agent Framework | LangGraph | State management, tool orchestration, checkpointing |
| LLM | GPT-4o-mini | Natural language understanding, tool selection |
| Backend | FastAPI | Async webhook endpoints, background processing |
| Database | SQLite + Cloud Storage | Task persistence, conversation memory |
| Messaging | Twilio WhatsApp API | User interface, webhook integration |
| Deployment | Google Cloud Run | Serverless container hosting, auto-scaling |
| Rate Limiting | Redis Cloud | 10 messages/min per user |
| Observability | LangSmith | Agent tracing, debugging, performance monitoring |
| CI/CD | GitHub Actions | Automated testing (planned) |
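The 10 messages/min limit in the table can be illustrated with a sliding-window counter. The sketch below keeps timestamps in process memory as a stand-in for Redis, so it is an assumption about the approach rather than the deployed implementation; the class and method names are hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowRateLimiter:
    """Allow at most max_messages per user within window_seconds."""

    def __init__(self, max_messages: int = 10, window_seconds: float = 60.0):
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        # In-memory stand-in for Redis: user_id -> timestamps of accepted messages.
        self._events = defaultdict(deque)

    def allow(self, user_id: str, now: Optional[float] = None) -> bool:
        """Return True and record the message if the user is under the limit."""
        now = time.monotonic() if now is None else now
        events = self._events[user_id]
        # Drop timestamps that have fallen out of the window.
        while events and now - events[0] >= self.window_seconds:
            events.popleft()
        if len(events) >= self.max_messages:
            return False
        events.append(now)
        return True
```

Passing `now` explicitly keeps the limiter easy to test without sleeping. A Redis-backed version would need the same window logic but shared state across Cloud Run instances.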
git clone https://github.com/boemer00/my-agent.git
cd my-agent
pip install -r requirements.txt
Create .env file:
# Required
OPENAI_API_KEY=your_openai_key_here
# Optional - Observability
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY=your_langsmith_key_here
LANGCHAIN_PROJECT=my-todo-agent
# Optional - WhatsApp (for local webhook testing)
TWILIO_ACCOUNT_SID=ACxxxxx
TWILIO_AUTH_TOKEN=your_auth_token
TWILIO_WHATSAPP_NUMBER=whatsapp:+14155238886
CLI mode (terminal interface):
python app.py
API mode (WhatsApp webhook):
uvicorn api.main:app --reload --port 8080
Expose local webhook (for Twilio testing):
ngrok http 8080
# Update Twilio webhook to: https://your-ngrok-url.ngrok.io/whatsapp/webhook
User: remind me to buy kombucha tomorrow at 2pm
Agent: Reminder set: 'buy kombucha' for Thursday, October 31, 2024 at 02:00 PM
User: show my tasks
Agent: Your tasks:
1. buy kombucha (Due: tomorrow at 2pm)
User: mark 1 as done
Agent: Marked task #1 as done: 'buy kombucha'
User: organize my tasks for this week
Agent: [Internal] Creating plan...
Plan:
1. List all current tasks
2. Check which tasks have due dates
3. Prioritize tasks by deadline
4. Suggest a schedule for the week
Agent: Let me help organize your week. First, let me see what you have...
[Executes: list_tasks()]
I found 5 tasks:
- Project report (Due: Nov 4, 2pm) URGENT
- Review PRs (Due: Nov 4, 4pm)
- Buy groceries (Due: Nov 5, 2pm)
- Call dentist (no deadline)
- Email team (no deadline)
Agent: [Reflection: Step 1 complete → Moving to step 2]
Agent: Now let me prioritize by urgency...
Here's your organized week:
**Monday (Nov 4):**
- 2pm: Finish project report
- 4pm: Review pull requests
**Tuesday (Nov 5):**
- 2pm: Buy groceries
- Evening: Call dentist, email team
You have 2 urgent tasks today! Would you like me to set reminders?
Key Difference: Complex requests trigger the Plan-Execute pattern, where the agent creates a multi-step plan and systematically works through it with reflection after each step.
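The execute-then-reflect loop described above can be sketched in plain Python. The state fields mirror the ones listed for `agent/state.py` (messages, user_id, plan, plan_step); the `run_plan` function and the `execute_step` callback are hypothetical and stand in for the LangGraph nodes, so treat this as an illustration of the pattern rather than the project's code.

```python
from typing import Callable, List, TypedDict


class AgentState(TypedDict):
    messages: List[str]
    user_id: str
    plan: List[str]
    plan_step: int


def run_plan(state: AgentState, execute_step: Callable[[str], str]) -> AgentState:
    """Work through the plan one step at a time, reflecting after each step."""
    while state["plan_step"] < len(state["plan"]):
        step = state["plan"][state["plan_step"]]
        # Execute: in the real agent this is where tools would be called.
        state["messages"].append(execute_step(step))
        # Reflect: the step is complete, so advance to the next one.
        state["plan_step"] += 1
    # Clear the plan once every step has been completed.
    state["plan"] = []
    state["plan_step"] = 0
    return state


example_state: AgentState = {
    "messages": [],
    "user_id": "abc123",
    "plan": ["List all current tasks", "Prioritize tasks by deadline"],
    "plan_step": 0,
}
finished = run_plan(example_state, lambda step: f"done: {step}")
```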
Test Coverage: 121 tests | 70% coverage | <4s runtime
# All tests
pytest
# With coverage
pytest --cov
# Specific categories
pytest tests/test_agent_flows.py # Integration tests
pytest tests/test_tools.py # Tool unit tests
pytest tests/test_database.py # Repository tests
pytest tests/test_date_parser.py # Date parsing tests
tests/
├── conftest.py # Shared fixtures, test configuration
├── test_agent_flows.py # End-to-end agent tests (8 tests)
├── test_database.py # Database/Repository tests (14 tests)
├── test_date_parser.py # Date utility tests (13 tests)
└── test_tools.py # Tool function tests (15 tests)
Key Testing Patterns:
- Mocked external APIs (Google Calendar, OpenAI) for fast tests
- Isolated test databases (in-memory SQLite)
- Time-freezing for predictable date parsing tests
- Pytest fixtures for setup/teardown
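As an illustration of the time-freezing pattern, the sketch below passes a fixed `now` into a tiny parser so the expected result never depends on the wall clock. The `parse_relative_date` function is a hypothetical, deliberately minimal stand-in and not the project's `utils/date_parser.py`; it only handles two of the phrasings mentioned earlier.

```python
import re
from datetime import datetime, timedelta
from typing import Optional


def parse_relative_date(text: str, now: datetime) -> Optional[datetime]:
    """Parse 'in N hours' and 'tomorrow at Hpm/Ham' relative to a given now."""
    text = text.lower().strip()
    match = re.fullmatch(r"in (\d+) hours?", text)
    if match:
        return now + timedelta(hours=int(match.group(1)))
    match = re.fullmatch(r"tomorrow at (\d{1,2})(am|pm)", text)
    if match:
        # 12am maps to hour 0 and 12pm to hour 12.
        hour = int(match.group(1)) % 12 + (12 if match.group(2) == "pm" else 0)
        tomorrow = now + timedelta(days=1)
        return tomorrow.replace(hour=hour, minute=0, second=0, microsecond=0)
    return None  # Unsupported phrasing in this sketch.


def test_tomorrow_at_2pm_with_frozen_time():
    frozen_now = datetime(2024, 10, 30, 9, 15)
    assert parse_relative_date("tomorrow at 2pm", frozen_now) == datetime(2024, 10, 31, 14, 0)
```

Injecting `now` as a parameter is one way to freeze time; patching the clock with a library is another, and the source does not say which one the test suite uses.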
Current State: Google Calendar sync works locally with a single account.
Goal: Each WhatsApp user syncs with their own Google Calendar.
Right now, all users would share one Google Calendar (privacy issue). Production needs per-user OAuth where each person authorizes their own calendar.
1. OAuth Flow Integration (~2 hours)
- Add `/auth/google` endpoint to initiate user authorization
- Generate unique authorization URLs per user
- Handle OAuth callback and token exchange
- Send authorization link via WhatsApp on first reminder
2. Token Storage (~1 hour)
- Store user tokens in Cloud Storage: `gs://bucket/user_tokens/{user_id}_token.json`
- Implement token refresh logic with expiry handling
- Graceful degradation if user hasn't authorized
3. Secret Management (~1 hour)
- Move `credentials.json` to Google Secret Manager
- Configure Cloud Run to access secrets
- Remove credentials from container image
4. Calendar Service Updates (~2 hours)
- Modify `get_calendar_service(user_id)` to load user-specific tokens
- Update all calendar functions to accept `user_id`
- Add error handling for missing/expired tokens
5. UX Flow (~1 hour)
User: "remind me to call mom tomorrow"
Agent: "To sync with your Google Calendar, please authorize:
https://ai-task-agent-xxx.run.app/auth/google?user_id=abc123"
[User clicks, authorizes]
Agent: "Calendar connected! Creating reminder..."
Timeline: 6-8 hours of focused development
Benefits: True multi-tenant support, production-ready OAuth, showcase architectural evolution
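Generating a per-user authorization URL (step 1 of the plan) could look roughly like this. The endpoint and query parameters follow Google's OAuth 2.0 authorization code flow; the function name, client ID, redirect URI, and the choice of the `calendar.events` scope are assumptions for illustration. The `state` value should be an opaque token mapped to the user on the server rather than a raw identifier.

```python
from urllib.parse import urlencode

GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"
# Assumed scope: read/write access to calendar events only.
CALENDAR_SCOPE = "https://www.googleapis.com/auth/calendar.events"


def build_authorization_url(client_id: str, redirect_uri: str, state: str) -> str:
    """Build the URL a user opens to grant calendar access."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",    # authorization code flow
        "scope": CALENDAR_SCOPE,
        "access_type": "offline",   # ask for a refresh token
        "prompt": "consent",        # force the consent screen so a refresh token is issued
        "state": state,             # ties the callback back to the WhatsApp user
    }
    return f"{GOOGLE_AUTH_ENDPOINT}?{urlencode(params)}"


# Hypothetical values for illustration only.
auth_url = build_authorization_url(
    client_id="example-client-id.apps.googleusercontent.com",
    redirect_uri="https://example.com/auth/google/callback",
    state="opaque-state-abc123",
)
```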
my-agent/
├── agent/
│   ├── graph.py # LangGraph workflow with Plan-Execute pattern
│   ├── nodes.py # Agent, planner, reflection, tools nodes
│   ├── state.py # State schema (messages, user_id, plan, plan_step)
│   └── prompts.py # System prompts
├── api/
│   ├── main.py # FastAPI app entry point
│   ├── routes/
│   │   ├── whatsapp.py # Webhook endpoints
│   │   └── health.py # Health check
│   ├── services/
│   │   └── message_handler.py # Async message processing
│   └── schemas/
│       └── whatsapp.py # Pydantic models
├── database/
│   ├── models.py # SQLite schema
│   ├── repository.py # Data access layer
│   └── cloud_storage.py # GCS sync utilities
├── tools/
│   ├── tasks.py # Task CRUD tools
│   ├── google_calendar.py # Calendar integration
│   └── __init__.py
├── utils/
│   └── date_parser.py # Natural language date parsing
├── config/
│   └── settings.py # Environment config
├── tests/ # 121 tests, 70% coverage (includes planning tests)
├── docs/ # Setup guides
├── app.py # CLI entry point
├── deploy.sh # Cloud Run deployment script
├── Dockerfile # Multi-stage build
└── requirements.txt # Dependencies
Agent Architecture & Advanced Patterns
- "Explain your Plan-Execute implementation" → Complex requests trigger the planner node → LLM creates a numbered plan → agent executes step by step → reflection node tracks progress → repeats until the plan is complete. Simple requests bypass planning for efficiency.
- "Why Plan-Execute over simple ReAct?" → Handles multi-step goals (e.g., "organize my week"), improves task decomposition, shows structured thinking. Demonstrates understanding of advanced agentic patterns beyond basic tool calling.
- "How does reflection work?" → After each tool execution, the reflection node checks: (1) Did we complete the current step? (2) Move to the next step or finish? (3) Clear the plan when done. Keeps the agent focused on structured goals.
- "Show me the agent flow" → START → should_plan() router → [planner OR agent] → agent → tools → should_reflect() router → [reflection OR agent] → loop until END. Conditional routing based on request complexity and plan state.
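The two routers named in the flow can be sketched as pure functions that return the name of the next node. How the project actually judges "request complexity" is not stated here, so the keyword heuristic below is purely an assumption, as are the node names used as return values.

```python
# Hypothetical heuristic: these keywords are an assumption, not the project's rule.
PLANNING_KEYWORDS = ("organize", "prioritize", "plan my", "schedule my")


def should_plan(state: dict) -> str:
    """Route complex requests to the planner and everything else to the agent."""
    last_message = state["messages"][-1].lower()
    has_active_plan = bool(state.get("plan"))
    if not has_active_plan and any(word in last_message for word in PLANNING_KEYWORDS):
        return "planner"
    return "agent"


def should_reflect(state: dict) -> str:
    """After tools run, reflect only while a plan is in progress."""
    return "reflection" if state.get("plan") else "agent"
```

In LangGraph, functions of this shape are typically passed to `add_conditional_edges`, which maps the returned string to the next node.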
Architecture & Design
- "Why LangGraph over pure LLM calls?" → State persistence, checkpointing for conversation memory, built-in tool calling, conditional routing, Plan-Execute pattern support
- "Explain the ReAct pattern" → Reasoning (LLM thinks) → Acting (execute tools) → Observation (tool results) → repeat until done. Enhanced with planning for complex requests.
- "How does Cloud Run handle statelessness?" → Databases synced to Cloud Storage on startup/shutdown, ephemeral containers, checkpointer maintains conversation state
Production Considerations
- "How do you handle Cloud Run cold starts?" → First message gets a "Working on it" acknowledgment within 100ms, then the full response after agent processing
- "What's your security model?" → Webhook signature verification (HMAC-SHA1), rate limiting (10/min), API key in env vars, phone number hashing
- "How would you scale this?" → Horizontal scaling (Cloud Run auto-scales), database connection pooling, async processing, queue for high load
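Phone number hashing, mentioned in the security model above, might be done with a keyed hash so that raw numbers never appear in task storage. The source does not say which algorithm is used; HMAC-SHA256 with a server-side secret is an assumption here, and `hash_phone_number` is a hypothetical name.

```python
import hashlib
import hmac


def hash_phone_number(phone_number: str, secret_key: bytes) -> str:
    """Derive a stable user ID from a phone number without storing the number."""
    # Normalize so "whatsapp:+14155238886" and "+1 (415) 523-8886" map to the same ID.
    digits = "".join(ch for ch in phone_number if ch.isdigit())
    normalized = "+" + digits
    # A keyed hash resists brute-forcing the small space of valid phone numbers,
    # as long as the key stays secret.
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()
```

An unkeyed SHA-256 would also give stable IDs, but phone numbers have so little entropy that the hashes could be reversed by enumeration.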
OAuth & Calendar Integration
- "Why not implement per-user OAuth yet?" → MVP prioritization: focused on core agent + deployment first. Calendar works locally for demos. Phase 2 adds multi-tenant OAuth.
- "Explain OAuth 2.0 flow" → Authorization code flow: redirect to Google → user consents → callback with code → exchange for tokens → store refresh token
- "How do you handle token expiry?" → Refresh tokens are used to obtain new access tokens when they expire, with graceful degradation if the refresh fails
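The expiry handling described in the last answer can be sketched as a small helper. The token layout (`access_token`, `refresh_token`, `expires_at`) and the `refresh` callback are hypothetical; in practice the Google auth client libraries handle the refresh request themselves, so this only illustrates the control flow and the graceful-degradation path.

```python
from datetime import datetime, timedelta
from typing import Callable, Optional

# Refresh slightly early so a token does not expire mid-request.
EXPIRY_SKEW = timedelta(minutes=5)


def get_valid_access_token(
    token: dict,
    refresh: Callable[[str], dict],
    now: datetime,
) -> Optional[str]:
    """Return a usable access token, refreshing it if needed, or None on failure."""
    if now < token["expires_at"] - EXPIRY_SKEW:
        return token["access_token"]
    try:
        # refresh() is assumed to return a new access_token and expires_at.
        refreshed = refresh(token["refresh_token"])
    except Exception:
        # Graceful degradation: the caller can skip calendar sync
        # and still store the task locally.
        return None
    token.update(refreshed)
    return token["access_token"]
```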
Technical Decisions
- "Why SQLite instead of PostgreSQL?" → Simple MVP, <10K users, Cloud Storage sync works well, easy migration path to Cloud SQL later
- "Why Twilio sandbox vs WhatsApp Business API?" → Faster iteration (5 min setup vs 2 week approval), free for demo, production would use Business API
- "How do you test agent behavior?" → Mock LLM responses for deterministic tests, integration tests with real LangGraph, LangSmith for production tracing
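Mocking the LLM for deterministic tests can be as simple as the sketch below. The `choose_tool` function and the response shape (`{"tool": ...}`) are hypothetical simplifications; only `unittest.mock.Mock` is a real API here. The point is that the test controls exactly what the "model" returns.

```python
from unittest.mock import Mock


def choose_tool(llm, user_message: str) -> str:
    """Ask the LLM which tool to call and return the tool's name."""
    response = llm.invoke(user_message)
    return response["tool"]


def test_choose_tool_is_deterministic():
    # The fake LLM always answers the same way, so the test never
    # depends on network access or model randomness.
    fake_llm = Mock()
    fake_llm.invoke.return_value = {"tool": "list_tasks"}

    assert choose_tool(fake_llm, "show my tasks") == "list_tasks"
    fake_llm.invoke.assert_called_once_with("show my tasks")
```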
- Google Calendar Setup - OAuth 2.0 configuration guide
- Deployment Guide - Step-by-step Cloud Run deployment (if exists)
- Monitoring Guide - LangSmith setup and best practices
Built by Renato Boemer as a portfolio project to demonstrate AI engineering skills.
- GitHub: @boemer00
- LinkedIn: Renato Boemer
Technologies: LangGraph, LangChain, FastAPI, Google Cloud Run, Twilio, OpenAI
Questions? Check the LangGraph docs or open an issue!