Skip to content

Conversation

@rootflo-hardik
Copy link
Contributor

@rootflo-hardik rootflo-hardik commented Dec 31, 2025

  • removed other bucket env variables and references

Summary by CodeRabbit

  • Documentation

    • Simplified Docker setup to describe a single unified application storage bucket.
  • Chores

    • Replaced many per-service/per-cloud bucket environment variables with a single application storage bucket across cloud configs and compose files.
  • Refactor

    • Standardized runtime configuration to use a consolidated storage section (application bucket) throughout the codebase.
  • Tests

    • Updated test configs and fixtures to reference the unified application storage bucket.

✏️ Tip: You can customize this high-level summary in your review settings.

- removed other bucket env variables and references
@coderabbitai
Copy link

coderabbitai bot commented Dec 31, 2025

📝 Walkthrough

Walkthrough

Consolidates per-cloud and per-feature bucket settings into a single storage.application_bucket (APPLICATION_BUCKET), updating docs, docker-compose, config files, service initializations, and tests to use the unified bucket key.

Changes

Cohort / File(s) Summary
Documentation & Compose
DOCKER_SETUP.md, docker-compose.sample.yml
Replaced many bucket env vars with single APPLICATION_BUCKET; updated bucket creation steps and related descriptions.
App & Job Configs
wavefront/server/apps/floware/floware/config.ini, wavefront/server/apps/inference_app/inference_app/config.ini, wavefront/server/background_jobs/workflow_job/workflow_job/config.ini
Removed provider-specific bucket keys; added [storage] with application_bucket=${APPLICATION_BUCKET}; removed now-unused bucket sections/keys.
Service Containers
wavefront/server/modules/agents_module/.../agents_container.py, wavefront/server/modules/gold_module/.../gold_container.py, wavefront/server/modules/insights_module/.../insights_container.py, wavefront/server/modules/plugins_module/.../plugins_container.py, wavefront/server/modules/voice_agents_module/.../voice_agents_container.py
Switched service constructor bucket_name args to config.storage.application_bucket instead of per-provider/per-feature config entries.
Controllers & Services
wavefront/server/modules/inference_module/.../inference_controller.py, wavefront/server/modules/knowledge_base_module/.../knowledge_base_document_controller.py, wavefront/server/modules/auth_module/.../outlook_service.py
Removed cloud-provider branching for bucket selection; always read bucket from config['storage']['application_bucket'].
Floware service & tests
wavefront/server/apps/floware/floware/services/config_service.py, wavefront/server/apps/floware/tests/conftest.py
Added storage handling and validation; use application_bucket for storing/generating presigned URLs; tests updated for storage.application_bucket.
Tests / Fixtures
wavefront/server/modules/inference_module/tests/conftest.py, wavefront/server/modules/knowledge_base_module/tests/conftest.py, wavefront/server/modules/user_management_module/tests/conftest.py
Test fixtures updated to provide storage.application_bucket; removed per-cloud asset bucket entries.

Sequence Diagram(s)

(omitted)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • vizsatiz

Poem

🐰 I hopped through configs, tidy and quick,
One bucket to rule them — no more pick-by-pick.
From docs to services, I stitched them in glue,
APPLICATION_BUCKET now holds the whole view. 🪣✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: consolidating multiple bucket environment variables into a single APPLICATION_BUCKET across the entire codebase.
✨ Finishing touches
  • 📝 Generate docstrings

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4252ed9 and 174c9a2.

📒 Files selected for processing (1)
  • wavefront/server/apps/floware/floware/services/config_service.py
🔇 Additional comments (2)
wavefront/server/apps/floware/floware/services/config_service.py (2)

39-72: LGTM! Correct usage of unified bucket configuration.

The method correctly uses config_credentials['application_bucket'] for file storage operations, properly replacing the previous bucket reference with the new unified key.


74-90: LGTM! Correct usage of unified bucket configuration.

The method correctly uses config_credentials['application_bucket'] for presigned URL generation, properly replacing the previous bucket reference with the new unified key.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
wavefront/server/modules/knowledge_base_module/knowledge_base_module/controllers/knowledge_base_document_controller.py (1)

304-312: Critical: Inconsistent bucket reference for signed URLs.

While the upload operation at Line 126 correctly uses config['storage']['application_bucket'], this section still uses provider-specific bucket logic (config['gcp']['gcp_asset_storage_bucket'] or config['aws']['aws_asset_storage_bucket']) for generating signed URLs. This inconsistency will cause the signed URL to point to a different bucket than where the file was uploaded, resulting in broken URLs and 404 errors.

🔎 Proposed fix to use unified bucket configuration
     if signed_url:
-        provider = config['cloud_config']['cloud_provider']
-        bucket = (
-            config['gcp']['gcp_asset_storage_bucket']
-            if provider.lower() == 'gcp'
-            else config['aws']['aws_asset_storage_bucket']
-        )
+        bucket = config['storage']['application_bucket']
         presigned_url = cloude_storage_manager.generate_presigned_url(
             bucket, existing_document.file_path, 'GET'
         )
🧹 Nitpick comments (2)
docker-compose.sample.yml (1)

172-190: Clarify cloud provider configuration to prevent confusion.

Both AWS and GCP sections define APPLICATION_BUCKET (lines 178 and 190) within the same environment block. While users should only configure one cloud provider, having duplicate environment variable definitions may cause confusion about which value is active (the last one will override).

Consider one of these approaches:

  1. Add clear comments indicating users should remove or comment out the unused provider's section
  2. Document that CLOUD_PROVIDER determines which bucket is used, but the environment variable duplication is harmless
💡 Suggested clarification with comments
       # AWS Configuration (if CLOUD_PROVIDER=aws)
+      # NOTE: Configure only the section matching your CLOUD_PROVIDER setting
       - AWS_ACCESS_KEY_ID=<YOUR_AWS_ACCESS_KEY>
       - AWS_SECRET_ACCESS_KEY=<YOUR_AWS_SECRET_KEY>
       - AWS_REGION=<YOUR_AWS_REGION>
       - AWS_KMS_ARN=<YOUR_AWS_KMS_ARN>
       - AWS_QUEUE_URL=<YOUR_AWS_QUEUE_URL>
       - APPLICATION_BUCKET=<YOUR_APPLICATION_BUCKET>
 
       # GCP Configuration (if CLOUD_PROVIDER=gcp)
+      # NOTE: If using GCP, the APPLICATION_BUCKET below will override the AWS one above
       - GCP_PROJECT_ID=<YOUR_GCP_PROJECT_ID>
       - GCP_LOCATION=<YOUR_GCP_LOCATION>
       - GOOGLE_APPLICATION_CREDENTIALS=/app/credentials/gcp-service-account.json
       - GCP_KMS_KEY_RING=<YOUR_KEY_RING>
       - GCP_KMS_CRYPTO_KEY=<YOUR_CRYPTO_KEY>
       - GCP_KMS_CRYPTO_KEY_VERSION=<YOUR_GCP_KMS_CRYPTO_KEY_VERSION>
       - GCP_GOLD_TOPIC_ID=<YOUR_GOLD_TOPIC>
       - GCP_EMAIL_TOPIC_ID=<YOUR_EMAIL_TOPIC>
       - WORKFLOW_WORKER_TOPIC=<YOUR_WORKFLOW_TOPIC>
       - APPLICATION_BUCKET=<YOUR_APPLICATION_BUCKET>
wavefront/server/modules/insights_module/insights_module/insights_container.py (1)

62-68: Consider data organization strategy within the unified bucket.

Both transcript_bucket_name and audio_bucket_name now point to config.storage.application_bucket. While consolidating to a single bucket simplifies configuration, consider implementing a clear folder/prefix structure within the bucket to maintain logical separation (e.g., application-bucket/transcripts/ and application-bucket/audio/).

Benefits of maintaining logical separation:

  • Granular IAM policies using bucket prefixes
  • Different S3 lifecycle rules per data type
  • Easier cost tracking and analysis
  • Simplified data discovery and management
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b82d5ae and 599ce20.

📒 Files selected for processing (12)
  • DOCKER_SETUP.md
  • docker-compose.sample.yml
  • wavefront/server/apps/floware/floware/config.ini
  • wavefront/server/apps/inference_app/inference_app/config.ini
  • wavefront/server/background_jobs/workflow_job/workflow_job/config.ini
  • wavefront/server/modules/agents_module/agents_module/agents_container.py
  • wavefront/server/modules/gold_module/gold_module/gold_container.py
  • wavefront/server/modules/inference_module/inference_module/controllers/inference_controller.py
  • wavefront/server/modules/insights_module/insights_module/insights_container.py
  • wavefront/server/modules/knowledge_base_module/knowledge_base_module/controllers/knowledge_base_document_controller.py
  • wavefront/server/modules/plugins_module/plugins_module/plugins_container.py
  • wavefront/server/modules/voice_agents_module/voice_agents_module/voice_agents_container.py
🧰 Additional context used
🧬 Code graph analysis (1)
wavefront/server/modules/knowledge_base_module/knowledge_base_module/controllers/knowledge_base_document_controller.py (1)
wavefront/client/server.cjs (1)
  • config (24-28)
🔇 Additional comments (13)
wavefront/server/apps/inference_app/inference_app/config.ini (1)

1-2: LGTM! Configuration consolidation looks clean.

The new unified [storage] section with application_bucket simplifies the configuration by removing provider-specific bucket references. This aligns well with the PR's objective to consolidate bucket environment variables.

wavefront/server/background_jobs/workflow_job/workflow_job/config.ini (1)

10-11: LGTM! Consistent configuration consolidation.

The new [storage] section effectively replaces multiple provider-specific bucket configurations, maintaining consistency with the project-wide bucket consolidation.

wavefront/server/modules/voice_agents_module/voice_agents_module/voice_agents_container.py (1)

89-89: LGTM! Correctly updated to use unified bucket configuration.

The change from config.voice_agents.voice_agent_bucket to config.storage.application_bucket aligns with the project-wide bucket consolidation.

wavefront/server/modules/inference_module/inference_module/controllers/inference_controller.py (1)

39-39: LGTM! Simplified bucket selection logic.

The change from provider-based conditional logic to a unified config['storage']['application_bucket'] simplifies the code while maintaining functionality. This assumes APPLICATION_BUCKET is properly configured for the active cloud provider.

wavefront/server/modules/gold_module/gold_module/gold_container.py (1)

19-19: LGTM! Image services now use unified bucket configuration.

Both AWSImageService and GCPImageService now reference config.storage.application_bucket, ensuring consistent storage configuration across cloud providers.

Also applies to: 26-26

wavefront/server/modules/knowledge_base_module/knowledge_base_module/controllers/knowledge_base_document_controller.py (1)

126-126: LGTM! Upload operation uses unified bucket.

The document upload correctly uses config['storage']['application_bucket'] for storing files.

wavefront/server/modules/plugins_module/plugins_module/plugins_container.py (1)

50-50: LGTM! Plugin services updated to use unified bucket.

Both DynamicQueryService and MessageProcessorService now correctly reference config.storage.application_bucket, maintaining consistency with the project-wide bucket consolidation.

Also applies to: 57-57

docker-compose.sample.yml (1)

407-417: Configuration follows consistent pattern with floware service.

The inference_app service uses the same dual-definition pattern for APPLICATION_BUCKET as the floware service. While this creates the same potential for confusion noted in the floware section review, the consistency is good.

wavefront/server/modules/insights_module/insights_module/insights_container.py (1)

69-75: GCP configuration consistent with AWS approach.

The GCP services follow the same unified bucket pattern as AWS. The operational advice regarding data organization within the bucket (mentioned in the AWS section review) applies equally here.

wavefront/server/modules/agents_module/agents_module/agents_container.py (1)

37-72: Unified bucket configuration applied consistently.

All three services (AgentCrudService, WorkflowCrudService, WorkflowInferenceService) now correctly reference config.storage.application_bucket. The implementation is clean and aligns with the PR's consolidation objective.

As with the insights module, consider organizing agent YAMLs, workflow definitions, and inference data into distinct prefixes within the shared bucket for better data management.

DOCKER_SETUP.md (3)

156-164: Documentation clearly describes unified bucket usage.

The updated description for APPLICATION_BUCKET accurately reflects that it now stores all application data types. The enumeration (agents, transcripts, audio, models, etc.) helps users understand the bucket's scope.


166-177: GCP documentation consistent with AWS.

The GCP APPLICATION_BUCKET description matches the AWS description, providing consistency across cloud providers. Users will clearly understand the bucket serves the same unified purpose on both platforms.


377-414: Bucket creation instructions accurately updated.

The bucket creation sections for both AWS (lines 377-380) and GCP (lines 411-414) have been correctly simplified to create a single APPLICATION_BUCKET. Section headers changed from plural "Buckets" to singular "Bucket" appropriately reflects the new architecture.

Comment on lines +104 to +105
[storage]
application_bucket=${APPLICATION_BUCKET}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify config.ini changes and bucket reference updates

echo "=== Checking config.ini lines 104-105 ==="
cat -n wavefront/server/apps/floware/floware/config.ini | sed -n '100,110p'

echo ""
echo "=== Searching for removed bucket configuration references ==="
echo ""

# Search for old bucket references
echo "agent_yaml_bucket:"
rg -n "agent_yaml_bucket" --type py 2>/dev/null | head -20

echo ""
echo "voice_agent_bucket:"
rg -n "voice_agent_bucket" --type py 2>/dev/null | head -20

echo ""
echo "reference_images_bucket:"
rg -n "reference_images_bucket" --type py 2>/dev/null | head -20

echo ""
echo "aws_asset_storage_bucket:"
rg -n "aws_asset_storage_bucket" --type py 2>/dev/null | head -20

echo ""
echo "gcp_asset_storage_bucket:"
rg -n "gcp_asset_storage_bucket" --type py 2>/dev/null | head -20

echo ""
echo "model_storage_bucket:"
rg -n "model_storage_bucket" --type py 2>/dev/null | head -20

echo ""
echo "transcript_bucket_name:"
rg -n "transcript_bucket_name" --type py 2>/dev/null | head -20

echo ""
echo "audio_bucket_name:"
rg -n "audio_bucket_name" --type py 2>/dev/null | head -20

echo ""
echo "=== Checking application_bucket usage ==="
rg -n "application_bucket" --type py 2>/dev/null | head -30

Repository: rootflo/wavefront

Length of output: 9624


Update bucket configuration references in multiple modules to use new [storage] section.

The config.ini changes are correct, but verification reveals that several code locations still reference the old bucket configuration paths that have been removed:

  • aws_asset_storage_bucket: knowledge_base_document_controller.py:308, outlook_service.py:258
  • gcp_asset_storage_bucket: knowledge_base_document_controller.py:306, outlook_service.py:256, floware/config_service.py:25/47/74
  • model_storage_bucket: inference_app/inference_controller.py:67-70

These references must be updated to use config['storage']['application_bucket'] (or config.storage.application_bucket). Some modules like insights, voice_agents, and plugins have been properly updated, but knowledge_base, auth, and inference_app modules have not.

Additionally, test configuration files contain old bucket references that should be cleaned up for consistency.

🤖 Prompt for AI Agents
In wavefront/server/apps/floware/floware/config.ini around lines 104-105 the
bucket was moved into the [storage] section
(application_bucket=${APPLICATION_BUCKET}); update all code locations still
reading the old config keys to reference the new path: replace uses of
aws_asset_storage_bucket, gcp_asset_storage_bucket, and model_storage_bucket
with config['storage']['application_bucket'] (or
config.storage.application_bucket) — specifically change
knowledge_base_document_controller.py lines ~306-308, outlook_service.py lines
~256-258, floware/config_service.py lines ~25/47/74, and
inference_app/inference_controller.py lines ~67-70 to read from
config['storage']['application_bucket']; also remove or update old bucket keys
in test config files so tests use the new [storage].application_bucket key for
consistency.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
wavefront/server/apps/floware/tests/conftest.py (1)

179-184: LGTM! Configuration consolidation correctly applied.

The storage configuration has been properly updated to use the unified application_bucket. Note that this test uses 'test-bucket' while other test files use 'test_bucket' (underscore). While both are valid for testing, consider standardizing for consistency.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 599ce20 and 4252ed9.

📒 Files selected for processing (8)
  • wavefront/server/apps/floware/floware/services/config_service.py
  • wavefront/server/apps/floware/tests/conftest.py
  • wavefront/server/apps/inference_app/inference_app/controllers/inference_controller.py
  • wavefront/server/modules/auth_module/auth_module/services/outlook_service.py
  • wavefront/server/modules/inference_module/tests/conftest.py
  • wavefront/server/modules/knowledge_base_module/knowledge_base_module/controllers/knowledge_base_document_controller.py
  • wavefront/server/modules/knowledge_base_module/tests/conftest.py
  • wavefront/server/modules/user_management_module/tests/conftest.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • wavefront/server/modules/knowledge_base_module/knowledge_base_module/controllers/knowledge_base_document_controller.py
🧰 Additional context used
🧬 Code graph analysis (1)
wavefront/server/modules/auth_module/auth_module/services/outlook_service.py (1)
wavefront/client/server.cjs (1)
  • config (24-28)
🔇 Additional comments (7)
wavefront/server/apps/floware/floware/services/config_service.py (2)

54-54: LGTM!

The change correctly uses application_bucket from the credentials dict, which is populated via the merge in _get_gcp_credentials.


81-81: LGTM!

The change correctly uses application_bucket from the credentials dict, which is populated via the merge in _get_gcp_credentials.

wavefront/server/modules/user_management_module/tests/conftest.py (1)

167-169: LGTM! Configuration consolidation correctly applied.

The addition of storage.application_bucket to the test configuration aligns with the PR objective of consolidating bucket references into a single unified source.

wavefront/server/modules/inference_module/tests/conftest.py (1)

148-148: LGTM! Test configuration properly updated.

The test configuration correctly uses the unified storage.application_bucket structure, aligning with the PR's consolidation objective.

wavefront/server/modules/knowledge_base_module/tests/conftest.py (1)

144-144: LGTM! Test configurations consistently updated.

Both test configuration dictionaries have been properly updated to use the unified storage.application_bucket structure. The consistent use of 'test_bucket' across both configurations is appropriate.

Also applies to: 218-220

wavefront/server/modules/auth_module/auth_module/services/outlook_service.py (1)

255-255: Good consolidation - configuration is complete.

The direct lookup of config['storage']['application_bucket'] removes provider-specific branching logic, aligning with the PR objective. The [storage] section with application_bucket is defined in the main floware/config.ini and the pattern is consistently used across multiple modules (knowledge_base, voice_agents, plugins, insights, inference, gold, agents). No old provider-specific bucket configurations remain in the codebase.

wavefront/server/apps/inference_app/inference_app/controllers/inference_controller.py (1)

66-66: Good simplification - configuration structure verified.

The direct lookup of config['storage']['application_bucket'] correctly simplifies the code by removing cloud provider branching. The configuration file has been properly updated with the storage.application_bucket key, and no remnants of old provider-specific bucket configurations remain in the codebase. Exception handling (lines 65-150) catches any potential KeyError from missing config keys.

Comment on lines 22 to 35
storage_config = self.config.get('storage')
if not isinstance(config_credentials, dict):
raise HTTPException(status_code=500, detail='GCP configuration is missing')
if not config_credentials.get(
'gcp_asset_storage_bucket'
) or not config_credentials.get('config_file_name'):
if not isinstance(storage_config, dict):
raise HTTPException(
status_code=500, detail='Storage configuration is missing'
)
if not storage_config.get('application_bucket') or not config_credentials.get(
'config_file_name'
):
raise HTTPException(status_code=500, detail='Incomplete GCP configuration')
# Merge storage config into credentials dict for backward compatibility
config_credentials['application_bucket'] = storage_config['application_bucket']
return config_credentials
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the full file to understand the context
fd -t f "config_service.py" -x wc -l {}

Repository: rootflo/wavefront

Length of output: 570


🏁 Script executed:

fd -t f "config_service.py" -x cat -n {}

Repository: rootflo/wavefront

Length of output: 42738


🏁 Script executed:

# Search for old bucket variable references across the codebase
rg "gcp_asset_storage_bucket|GCP_ASSET_STORAGE_BUCKET" -n -C2

Repository: rootflo/wavefront

Length of output: 43


🏁 Script executed:

# Check for other bucket-related configuration references
rg "application_bucket|APPLICATION_BUCKET" -n --type=py -C2

Repository: rootflo/wavefront

Length of output: 15331


🏁 Script executed:

# Look for any config usage patterns that might indicate breaking changes
rg "\.get\('.*bucket" -n --type=py -C1

Repository: rootflo/wavefront

Length of output: 396


Update error message and clarify internal implementation detail.

The refactoring correctly validates the new storage configuration and retrieves application_bucket from it. Two clarifications are needed:

  1. Error message (line 32) needs to reflect both validations: The error states "Incomplete GCP configuration" but the method now validates both gcp.config_file_name and storage.application_bucket. Update to: "Incomplete GCP or storage configuration".

  2. Misleading comment on line 33: The phrase "for backward compatibility" is incorrect. This merge is an internal implementation detail that allows the method to return a flat dictionary containing both credentials and bucket. The codebase migration to application_bucket is complete—no old references remain. Either clarify the comment to describe what the merge actually does (e.g., "to consolidate GCP credentials and storage bucket for callers"), or remove it.

🤖 Prompt for AI Agents
In wavefront/server/apps/floware/floware/services/config_service.py around lines
22 to 35, update the HTTPException detail message that currently reads
"Incomplete GCP configuration" to "Incomplete GCP or storage configuration" to
reflect that both gcp.config_file_name and storage.application_bucket are
validated, and replace the misleading comment "for backward compatibility" with
a concise description of the merge purpose such as "consolidate GCP credentials
and storage bucket into a single dict for callers" (or remove the comment
entirely).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants