@amanfrinati
Collaborator

This pull request updates the chatbot's infrastructure and local development environment, focusing on monitoring, observability, and dependency upgrades. The most notable changes are a new monitoring module, the upgrade of Langfuse to v3, and the integration of ClickHouse into the ECS stack. The Docker Compose configuration also gains new services and streamlined scripts.

List of Changes

  • Added a new Terraform monitoring module to apps/infrastructure, including CloudWatch log groups for langfuse_worker and clickhouse, and integrated it into the main stack.
  • Added ClickHouse as a service in both the ECS environment and Docker Compose, with health checks and persistent storage.
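
For reviewers, here is a minimal sketch of what a ClickHouse service with a health check and persistent storage can look like in Docker Compose. It is illustrative only and not copied from this PR: the volume name `clickhouse_data`, the credentials, and the health check timings are assumptions, and the actual `docker/compose.yaml` may differ.

```yaml
# Illustrative sketch only; names and values below are assumptions, not the PR's actual config.
services:
  clickhouse:
    image: clickhouse/clickhouse-server
    environment:
      CLICKHOUSE_USER: clickhouse          # hypothetical credentials; keep real ones in an .env file
      CLICKHOUSE_PASSWORD: clickhouse
    volumes:
      - clickhouse_data:/var/lib/clickhouse   # named volume so data survives container restarts
    healthcheck:
      # ClickHouse answers "Ok." on /ping of its HTTP interface (default port 8123)
      test: wget --no-verbose --tries=1 --spider http://localhost:8123/ping || exit 1
      interval: 5s
      timeout: 5s
      retries: 10

volumes:
  clickhouse_data:
```

Services that depend on ClickHouse can then wait for it with `depends_on` and `condition: service_healthy`.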

Motivation and Context

These changes are required to integrate Langfuse v3.

How Has This Been Tested?

Screenshots (if appropriate):

Types of changes

  • Chore (nothing changes from a user perspective)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

  • My change requires a change to the documentation.

amanfrinati and others added 24 commits October 6, 2025 18:19
Adds AWS Lambda Runtime Interface Emulator (RIE) to the development Dockerfile for local testing purposes. Updates README to include instructions for creating and filling in the Google service account JSON file.
Added `platform` specifications and restructured services, including new `langfuse-web`, `langfuse-worker`, `clickhouse`, and `minio` setups. Removed unused `networks` references and added health checks for critical services. These changes modernize and enhance the infrastructure to support new functionality and improve service reliability.
Corrected a misspelling in the README subtitle for better clarity. Ensures professionalism and avoids confusion for users following setup instructions.
Removed obsolete variables from .env.example and updated relevant configurations in docker/compose.yaml to ensure consistency. Adjusted ports and Langfuse initialization parameters to align with the updated environment setup.
Moved sensitive and configurable environment variables to `.env` files for improved readability and security. Updated comments in `docker/compose.yaml` to suggest consolidating environment variables into a dedicated `.env.langfuse` file.
# Conflicts:
#	apps/chatbot/docker/compose.yaml
…icies

Introduced a new monitoring module defining resources for ECS tasks, services, ECR repositories, IAM roles, and related security groups. Includes lifecycle policies for ECR to retain 5 recent images and the deployment of Clickhouse with Fargate and EFS integration. Updated main and variables to integrate the module.
# Conflicts:
#	apps/chatbot/.env.example
#	apps/chatbot/docker/compose.yaml
Deleted an obsolete file containing initialization logs for Terraform backend. This file was unnecessary and contributed no functional value to the codebase, improving clarity and reducing repository clutter.
…o-v3' into CAI-302-upgrade-local-langfuse-to-v3

# Conflicts:
#	apps/chatbot/.env.example
#	apps/chatbot/docker/compose.yaml
…opa/developer-portal into CAI-302-upgrade-local-langfuse-to-v3
Removed the `-p chatbot` flag from Docker Compose commands as it was unnecessary for the current setup. This improves maintainability and reduces potential confusion in script execution. No functional changes were introduced.
Uncommented and restored ECS cluster and EFS security group configuration to enable functionality. Switched ClickHouse image source to Docker Hub and added AWS region and VPC variables for deployment consistency. Deprecated ECR references for ClickHouse and adjusted related logic accordingly.
Introduced a new monitoring module to manage infrastructure components. Configured CloudWatch log groups, Secrets Manager integration, and service discovery for ClickHouse in ECS. Prepared resources for secure credentials and a DNS namespace for service health checks.
Commented out unused security groups in `security_group.tf` to simplify resource management. Updated ECS environment variables to use local variables for better consistency and maintainability. Removed outdated instructions from `README.md` for improved clarity.
@amanfrinati amanfrinati self-assigned this Oct 28, 2025
@changeset-bot

changeset-bot bot commented Oct 28, 2025

🦋 Changeset detected

Latest commit: f5840bf

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
infrastructure Patch


amanfrinati and others added 5 commits October 28, 2025 17:28
Moved Langfuse-related environment variables from multiple files into a dedicated `.env.langfuse` for improved organization and maintenance. Updated `.gitignore`, `.env.example`, and Docker compose configurations to reflect the change. This centralization simplifies environment management and enhances clarity across the project.
Reintroduces previously commented-out IAM policy definitions and role policy attachments for the ECS task execution and task roles in the Langfuse module. This restores the permissions needed to access AWS services such as Secrets Manager, S3, ECR, and Elastic File System (EFS).
# Conflicts:
#	apps/chatbot/src/modules/chatbot.py
@github-actions
Contributor

github-actions bot commented Oct 30, 2025

Jira Pull Request Link

This pull request refers to the following Jira issue: CAI-644

@github-actions
Contributor

This PR exceeds the recommended size of 800 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size.

4 participants