Production-ready AI agent using LangGraph and OpenAI for intelligent task management with WhatsApp interface, Google Calendar sync, and automated Cloud Run deployment

boemer00/todo-work-agent

AI Task Agent - Production Deployment on Google Cloud Run

An intelligent task management agent deployed on Google Cloud Run with a WhatsApp interface, powered by LangGraph and GPT-4o-mini. It features a Plan-Execute architecture for complex multi-step requests, natural language date parsing, multi-user support, and Google Calendar integration. Built to demonstrate production-ready agentic AI engineering skills.

Live Service: https://ai-task-agent-kbimuakj2a-uc.a.run.app


🌐 Live Demo

Try the WhatsApp bot now!

  1. Text: +1 (415) 523-8886
  2. Send: join [your-sandbox-code] (get code from Twilio console)
  3. Try: remind me to buy milk tomorrow at 2pm

Service Status: ✅ Live on Google Cloud Run (us-central1)


⚡ Key Features

  • 🧠 Plan-Execute Architecture - Agent breaks down complex requests into multi-step plans (NEW!)
  • 🚀 Production Deployment - Fully deployed on Google Cloud Run with HTTPS endpoints
  • 💬 WhatsApp Interface - Natural conversational UI via Twilio WhatsApp API
  • 🔄 Advanced Agent Patterns - ReAct loop with planning, reflection, and state management
  • 🗄️ Cloud-Native Storage - SQLite databases synced to Cloud Storage
  • 🌍 Multi-User Support - Isolated task lists per user with phone number hashing
  • ⏰ Smart Date Parsing - "tomorrow at 2pm", "next Friday", "in 3 hours"
  • 🔒 Production Security - Webhook signature verification, rate limiting (10 msg/min)
  • 📊 Observability - LangSmith tracing for debugging and monitoring
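
The multi-user bullet mentions phone number hashing without saying how it is done. A minimal sketch of one way to derive an isolated per-user ID, assuming a keyed SHA-256 hash; the function name `hash_phone_number`, the `salt` parameter, and the 16-character truncation are illustrative assumptions, not taken from the repository:

```python
import hashlib
import hmac


def hash_phone_number(phone: str, salt: bytes) -> str:
    """Derive a stable, non-reversible user ID from a WhatsApp sender string.

    Hypothetical helper: the real project may hash differently.
    """
    # Twilio prefixes WhatsApp senders with "whatsapp:"; strip it and any
    # spaces so the same number always maps to the same ID.
    normalized = phone.removeprefix("whatsapp:").replace(" ", "")
    # Keyed hash (HMAC-SHA256): without the salt, IDs cannot be recomputed
    # from a list of known phone numbers.
    digest = hmac.new(salt, normalized.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]
```

Storing only this ID alongside each task keeps raw phone numbers out of the SQLite files that are synced to Cloud Storage.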

🏗️ Architecture

┌─────────────┐      ┌──────────────────────┐      ┌──────────────────┐
│  WhatsApp   │─────▶│    Cloud Run         │─────▶│ Cloud Storage    │
│  (Twilio)   │◀─────│  FastAPI + LangGraph │      │ (SQLite DBs)     │
└─────────────┘      └──────────────────────┘      └──────────────────┘
                              │
                              ▼
                       ┌──────────────┐
                       │  GPT-4o-mini │
                       │   + Tools    │
                       └──────────────┘
                              │
                     ┌────────┴────────┐
                     ▼                 ▼
              ┌───────────┐     ┌──────────┐
              │   Redis   │     │  Google  │
              │ (Limits)  │     │ Calendar │
              └───────────┘     └──────────┘

Data Flow:

  1. User sends WhatsApp message → Twilio webhook
  2. Cloud Run receives POST → verifies signature → sends ACK
  3. LangGraph agent processes message → calls tools
  4. Tools interact with database/calendar
  5. Response sent back via Twilio Messages API
  6. Databases synced to Cloud Storage on shutdown
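
Step 2 of the data flow verifies the webhook signature. Twilio signs each request by appending the POST parameters, sorted by name, to the full URL, computing an HMAC-SHA1 with the account's auth token, and Base64-encoding the result into the `X-Twilio-Signature` header. The sketch below reimplements that check with the standard library for illustration; the function names are hypothetical, and a real service would more likely rely on `RequestValidator` from the `twilio` package:

```python
import base64
import hashlib
import hmac


def compute_twilio_signature(auth_token: str, url: str, params: dict[str, str]) -> str:
    """Recompute the signature Twilio would send for this request."""
    # Concatenate the URL with every parameter name and value,
    # ordered by parameter name, with no separators.
    payload = url + "".join(key + params[key] for key in sorted(params))
    mac = hmac.new(auth_token.encode("utf-8"), payload.encode("utf-8"), hashlib.sha1)
    return base64.b64encode(mac.digest()).decode("ascii")


def is_valid_request(auth_token: str, url: str, params: dict[str, str],
                     header_signature: str) -> bool:
    """Compare the expected and received signatures in constant time."""
    expected = compute_twilio_signature(auth_token, url, params)
    return hmac.compare_digest(expected.encode("utf-8"),
                               header_signature.encode("utf-8"))
```

The URL must match exactly what Twilio called, including scheme and path, which matters behind a proxy such as Cloud Run or ngrok.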

🛠️ Tech Stack

| Component | Technology | Purpose |
| --- | --- | --- |
| Agent Framework | LangGraph | State management, tool orchestration, checkpointing |
| LLM | GPT-4o-mini | Natural language understanding, tool selection |
| Backend | FastAPI | Async webhook endpoints, background processing |
| Database | SQLite + Cloud Storage | Task persistence, conversation memory |
| Messaging | Twilio WhatsApp API | User interface, webhook integration |
| Deployment | Google Cloud Run | Serverless container hosting, auto-scaling |
| Rate Limiting | Redis Cloud | 10 messages/min per user |
| Observability | LangSmith | Agent tracing, debugging, performance monitoring |
| CI/CD | GitHub Actions | Automated testing (planned) |
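
The rate limit of 10 messages per minute per user is backed by Redis in the deployed service. To illustrate the idea without a Redis dependency, here is an in-memory sliding-window sketch; the class name and its parameters are assumptions, and a multi-instance deployment would need shared storage rather than a Python dict:

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most `limit` messages per user within `window_seconds`."""

    def __init__(self, limit: int = 10, window_seconds: float = 60.0):
        self.limit = limit
        self.window = window_seconds
        self._hits = defaultdict(deque)  # user_id -> timestamps of accepted messages

    def allow(self, user_id: str, now: Optional[float] = None) -> bool:
        # `now` is injectable so the limiter can be tested without sleeping.
        now = time.monotonic() if now is None else now
        hits = self._hits[user_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

Rejected messages are not recorded, so a user who keeps sending while blocked is not penalized beyond the original window.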

🚀 Quick Start (Local Development)

1. Clone and Install

git clone https://github.com/boemer00/my-agent.git
cd my-agent
pip install -r requirements.txt

2. Set Up Environment Variables

Create .env file:

# Required
OPENAI_API_KEY=your_openai_key_here

# Optional - Observability
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY=your_langsmith_key_here
LANGCHAIN_PROJECT=my-todo-agent

# Optional - WhatsApp (for local webhook testing)
TWILIO_ACCOUNT_SID=ACxxxxx
TWILIO_AUTH_TOKEN=your_auth_token
TWILIO_WHATSAPP_NUMBER=whatsapp:+14155238886

3. Run Locally

CLI mode (terminal interface):

python app.py

API mode (WhatsApp webhook):

uvicorn api.main:app --reload --port 8080

Expose local webhook (for Twilio testing):

ngrok http 8080
# Update Twilio webhook to: https://your-ngrok-url.ngrok.io/whatsapp/webhook

💬 Example Interactions

Simple Request (Direct Execution)

User: remind me to buy kombucha tomorrow at 2pm
Agent: ✓ Reminder set: 'buy kombucha' for Thursday, October 31, 2024 at 02:00 PM

User: show my tasks
Agent: Your tasks:
       1. buy kombucha (Due: tomorrow at 2pm)

User: mark 1 as done
Agent: ✓ Marked task #1 as done: 'buy kombucha'

Complex Request (Plan-Execute Pattern) 🆕

User: organize my tasks for this week

Agent: [Internal] Creating plan...
       📋 Plan:
       1. List all current tasks
       2. Check which tasks have due dates
       3. Prioritize tasks by deadline
       4. Suggest a schedule for the week

Agent: Let me help organize your week. First, let me see what you have...
       [Executes: list_tasks()]

       I found 5 tasks:
       - Project report (Due: Nov 4, 2pm) 🔴 URGENT
       - Review PRs (Due: Nov 4, 4pm)
       - Buy groceries (Due: Nov 5, 2pm)
       - Call dentist (no deadline)
       - Email team (no deadline)

Agent: [Reflection: Step 1 complete → Moving to step 2]

Agent: Now let me prioritize by urgency...

       Here's your organized week:

       **Monday (Nov 4):**
       - 2pm: Finish project report ⚡
       - 4pm: Review pull requests

       **Tuesday (Nov 5):**
       - 2pm: Buy groceries
       - Evening: Call dentist, email team

       You have 2 urgent tasks today! Would you like me to set reminders?

Key Difference: Complex requests trigger the Plan-Execute pattern, where the agent creates a multi-step plan and systematically works through it with reflection after each step.
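
The plan-then-reflect loop described above can be sketched in plain Python. The field names `plan` and `plan_step` match the state schema listed in the project structure; `PlanState`, `parse_plan`, and `reflect` are hypothetical names, and the assumption that the LLM returns one numbered step per line is mine rather than the README's:

```python
import re
from dataclasses import dataclass, field


@dataclass
class PlanState:
    plan: list[str] = field(default_factory=list)  # remaining plan, empty when idle
    plan_step: int = 0                             # index of the step being executed


def parse_plan(llm_text: str) -> list[str]:
    """Extract steps from numbered lines such as '1. List all current tasks'."""
    steps = []
    for line in llm_text.splitlines():
        match = re.match(r"\s*\d+[.)]\s+(.+)", line)
        if match:
            steps.append(match.group(1).strip())
    return steps


def reflect(state: PlanState) -> PlanState:
    """Advance to the next step, clearing the plan once the last step is done."""
    next_step = state.plan_step + 1
    if next_step >= len(state.plan):
        return PlanState()  # plan complete: reset to the idle state
    return PlanState(plan=state.plan, plan_step=next_step)
```

Returning a fresh state instead of mutating the old one mirrors how graph nodes usually return state updates, though the actual node code may differ.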


🧪 Testing

Test Coverage: 121 tests | 70% coverage | <4s runtime

Run Tests

# All tests
pytest

# With coverage
pytest --cov

# Specific categories
pytest tests/test_agent_flows.py    # Integration tests
pytest tests/test_tools.py          # Tool unit tests
pytest tests/test_database.py       # Repository tests
pytest tests/test_date_parser.py    # Date parsing tests

Test Structure

tests/
├── conftest.py           # Shared fixtures, test configuration
├── test_agent_flows.py   # End-to-end agent tests (8 tests)
├── test_database.py      # Database/Repository tests (14 tests)
├── test_date_parser.py   # Date utility tests (13 tests)
└── test_tools.py         # Tool function tests (15 tests)

Key Testing Patterns:

  • ✅ Mocked external APIs (Google Calendar, OpenAI) for fast tests
  • ✅ Isolated test databases (in-memory SQLite)
  • ✅ Time-freezing for predictable date parsing tests
  • ✅ Pytest fixtures for setup/teardown
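
Relative phrases like "tomorrow at 2pm" only have a fixed answer if the current time is pinned, which is why the date parsing tests freeze time. The sketch below shows the same idea by passing `now` explicitly. It is a deliberately small, hypothetical parser (`parse_when` covers only two phrase shapes) and not the project's `date_parser.py`, which may delegate to a library:

```python
import re
from datetime import datetime, timedelta
from typing import Optional


def parse_when(text: str, now: datetime) -> Optional[datetime]:
    """Resolve 'in N hours/minutes' or 'tomorrow at H[:MM]am/pm' relative to `now`."""
    text = text.lower()

    # "in 3 hours", "in 45 minutes"
    relative = re.search(r"\bin (\d+) (hour|minute)s?\b", text)
    if relative:
        amount = int(relative.group(1))
        unit = relative.group(2) + "s"  # timedelta expects "hours" / "minutes"
        return now + timedelta(**{unit: amount})

    # "tomorrow at 2pm", "tomorrow at 9:30am"
    tomorrow = re.search(r"\btomorrow at (\d{1,2})(?::(\d{2}))?\s*(am|pm)\b", text)
    if tomorrow:
        # 12am maps to hour 0 and 12pm to hour 12.
        hour = int(tomorrow.group(1)) % 12 + (12 if tomorrow.group(3) == "pm" else 0)
        minute = int(tomorrow.group(2) or 0)
        day = now + timedelta(days=1)
        return day.replace(hour=hour, minute=minute, second=0, microsecond=0)

    return None  # phrase not recognised by this sketch
```

A test then fixes `now` to a known instant and asserts on the exact result, with no dependence on the wall clock.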

🗺️ Roadmap: Phase 2 - Per-User OAuth

Current State: Google Calendar sync works locally with a single account

Goal: Each WhatsApp user syncs with their own Google Calendar

Why This Matters

Right now, all users would share one Google Calendar (privacy issue). Production needs per-user OAuth where each person authorizes their own calendar.

Implementation Plan

1. OAuth Flow Integration (~2 hours)

  • Add /auth/google endpoint to initiate user authorization
  • Generate unique authorization URLs per user
  • Handle OAuth callback and token exchange
  • Send authorization link via WhatsApp on first reminder
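
Generating a unique authorization URL per user comes down to building a query string for Google's OAuth 2.0 authorization endpoint and carrying an unguessable `state` value that maps back to the user. A minimal sketch using only the standard library; the helper names, the in-memory `pending` dict, and the choice of the `calendar.events` scope are assumptions for illustration:

```python
import secrets
from urllib.parse import urlencode

GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"
CALENDAR_SCOPE = "https://www.googleapis.com/auth/calendar.events"  # assumed scope


def new_state_token(pending: dict[str, str], user_id: str) -> str:
    """Create a random state value and remember which user it belongs to."""
    state = secrets.token_urlsafe(16)
    pending[state] = user_id  # looked up again in the OAuth callback
    return state


def build_authorization_url(client_id: str, redirect_uri: str, state: str) -> str:
    """Build the consent-screen URL for the authorization code flow."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",
        "scope": CALENDAR_SCOPE,
        "access_type": "offline",  # ask for a refresh token
        "prompt": "consent",
        "state": state,
    }
    return f"{GOOGLE_AUTH_ENDPOINT}?{urlencode(params)}"
```

Using a random `state` rather than the raw user ID in the link means the callback can reject requests it did not initiate.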

2. Token Storage (~1 hour)

  • Store user tokens in Cloud Storage: gs://bucket/user_tokens/{user_id}_token.json
  • Implement token refresh logic with expiry handling
  • Graceful degradation if user hasn't authorized
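
The three token-storage bullets can be read as one decision: no token means skip calendar sync, a token near expiry means refresh first, and otherwise proceed. A sketch of that decision, assuming the stored JSON carries an ISO 8601 `expiry` field; the field name, the five-minute leeway, and all function names are hypothetical:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def token_blob_path(user_id: str) -> str:
    """Object name inside the bucket, following the layout given above."""
    return f"user_tokens/{user_id}_token.json"


def needs_refresh(token: dict, now: datetime,
                  leeway: timedelta = timedelta(minutes=5)) -> bool:
    """True when the access token has expired or is about to."""
    expiry = datetime.fromisoformat(token["expiry"])  # assumed ISO 8601 with offset
    return now >= expiry - leeway


def calendar_sync_mode(token: Optional[dict], now: datetime) -> str:
    """Decide how a reminder should interact with Google Calendar."""
    if token is None:
        return "skip"     # graceful degradation: save the task without syncing
    if needs_refresh(token, now):
        return "refresh"  # exchange the refresh token before calling the API
    return "ready"
```

Refreshing slightly early avoids a request failing because the token expired in flight.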

3. Secret Management (~1 hour)

  • Move credentials.json to Google Secret Manager
  • Configure Cloud Run to access secrets
  • Remove credentials from container image

4. Calendar Service Updates (~2 hours)

  • Modify get_calendar_service(user_id) to load user-specific tokens
  • Update all calendar functions to accept user_id
  • Add error handling for missing/expired tokens

5. UX Flow (~1 hour)

User: "remind me to call mom tomorrow"
Agent: "To sync with your Google Calendar, please authorize:
       https://ai-task-agent-xxx.run.app/auth/google?user_id=abc123"
[User clicks, authorizes]
Agent: "✅ Calendar connected! Creating reminder..."

Timeline: 6-8 hours of focused development

Benefits: True multi-tenant support, production-ready OAuth, showcase architectural evolution


📁 Project Structure

my-agent/
├── agent/
│   ├── graph.py              # LangGraph workflow with Plan-Execute pattern
│   ├── nodes.py              # Agent, planner, reflection, tools nodes
│   ├── state.py              # State schema (messages, user_id, plan, plan_step)
│   └── prompts.py            # System prompts
├── api/
│   ├── main.py               # FastAPI app entry point
│   ├── routes/
│   │   ├── whatsapp.py       # Webhook endpoints
│   │   └── health.py         # Health check
│   ├── services/
│   │   └── message_handler.py # Async message processing
│   └── schemas/
│       └── whatsapp.py       # Pydantic models
├── database/
│   ├── models.py             # SQLite schema
│   ├── repository.py         # Data access layer
│   └── cloud_storage.py      # GCS sync utilities
├── tools/
│   ├── tasks.py              # Task CRUD tools
│   ├── google_calendar.py    # Calendar integration
│   └── __init__.py
├── utils/
│   └── date_parser.py        # Natural language date parsing
├── config/
│   └── settings.py           # Environment config
├── tests/                    # 121 tests, 70% coverage (includes planning tests)
├── docs/                     # Setup guides
├── app.py                    # CLI entry point
├── deploy.sh                 # Cloud Run deployment script
├── Dockerfile                # Multi-stage build
└── requirements.txt          # Dependencies

🎤 Interview Talking Points

Agent Architecture & Advanced Patterns 🆕

  • "Explain your Plan-Execute implementation" → Complex requests trigger planner node → LLM creates numbered plan → agent executes step-by-step → reflection node tracks progress → repeats until plan complete. Simple requests bypass planning for efficiency.
  • "Why Plan-Execute over simple ReAct?" → Handles multi-step goals (e.g., "organize my week"), improves task decomposition, shows structured thinking. Demonstrates understanding of advanced agentic patterns beyond basic tool calling.
  • "How does reflection work?" → After each tool execution, reflection node checks: (1) Did we complete current step? (2) Move to next step or finish? (3) Clear plan when done. Keeps agent focused on structured goals.
  • "Show me the agent flow" → START → should_plan() router → [planner OR agent] → agent → tools → should_reflect() router → [reflection OR agent] → loop until END. Conditional routing based on request complexity and plan state.
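
The two routers named in the flow, `should_plan()` and `should_reflect()`, are conditional-edge functions: each inspects the state and returns the name of the next node. The sketch below guesses at their shape. The keyword heuristic for "complex" requests and the plain-string messages are assumptions, since the README does not say how complexity is detected:

```python
# Hypothetical trigger words; the real router may ask the LLM instead.
COMPLEX_KEYWORDS = ("organize", "prioritize", "schedule", "plan")


def should_plan(state: dict) -> str:
    """Route a request to the planner when it looks multi-step, else to the agent."""
    if state.get("plan"):
        return "agent"  # a plan is already in progress, keep executing it
    latest = state["messages"][-1].lower()
    if any(keyword in latest for keyword in COMPLEX_KEYWORDS):
        return "planner"
    return "agent"


def should_reflect(state: dict) -> str:
    """After tools run, reflect only when there is an active plan to track."""
    return "reflection" if state.get("plan") else "agent"
```

In LangGraph, functions of this kind are typically registered with `StateGraph.add_conditional_edges`, which maps each returned string to a node.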

Architecture & Design

  • "Why LangGraph over pure LLM calls?" → State persistence, checkpointing for conversation memory, built-in tool calling, conditional routing, Plan-Execute pattern support
  • "Explain the ReAct pattern" → Reasoning (LLM thinks) → Acting (execute tools) → Observation (tool results) → repeat until done. Enhanced with planning for complex requests.
  • "How does Cloud Run handle statelessness?" → Databases synced to Cloud Storage on startup/shutdown, ephemeral containers, checkpointer maintains conversation state

Production Considerations

  • "How do you handle Cloud Run cold starts?" → First message gets "Working on it" acknowledgment within 100ms, then full response after agent processing
  • "What's your security model?" → Webhook signature verification (HMAC-SHA1), rate limiting (10/min), API key in env vars, phone number hashing
  • "How would you scale this?" → Horizontal scaling (Cloud Run auto-scales), database connection pooling, async processing, queue for high load

OAuth & Calendar Integration

  • "Why not implement per-user OAuth yet?" → MVP prioritization - focused on core agent + deployment first. Calendar works locally for demos. Phase 2 adds multi-tenant OAuth.
  • "Explain OAuth 2.0 flow" → Authorization code flow: redirect to Google → user consents → callback with code → exchange for tokens → store refresh token
  • "How do you handle token expiry?" → Refresh tokens automatically refresh access tokens when expired, graceful degradation if refresh fails

Technical Decisions

  • "Why SQLite instead of PostgreSQL?" → Simple MVP, <10K users, Cloud Storage sync works well, easy migration path to Cloud SQL later
  • "Why Twilio sandbox vs WhatsApp Business API?" → Faster iteration (5 min setup vs 2 week approval), free for demo, production would use Business API
  • "How do you test agent behavior?" → Mock LLM responses for deterministic tests, integration tests with real LangGraph, LangSmith for production tracing

📖 Additional Documentation


🙏 Credits

Built by Renato Boemer as a portfolio project to demonstrate AI engineering skills.

Technologies: LangGraph, LangChain, FastAPI, Google Cloud Run, Twilio, OpenAI


Questions? Check the LangGraph docs or open an issue!
