
Database Seeding Enhancements #154

Open
ghost wants to merge 2 commits into poseidon/dev from poseidon/feature/seed_scan_data

ghost commented Aug 26, 2025

Database Seeding Enhancements

Overview

Enhanced the database seeding functionality to create comprehensive sample data including organizations, projects, cameras, models, and scan data with proper relationships and GCP bucket integration.

Key Changes

1. Enhanced seed_database() Function

  • Modified: mindtrace/apps/mindtrace/apps/poseidon/poseidon/backend/database/seed.py
  • Change: Updated the main seeding function to include sample data creation after basic organization/user setup
  • Added: Automatic creation of project, cameras, model, model deployment, and scan data
  • Result: Single command now creates complete test environment

2. Sample Data Integration

  • Added: get_sample_data() function with hardcoded sample scan data
  • Features:
    • 2 sample scans with different statuses (Success/Defective, Success/Healthy)
    • 24 images across 21 cameras (names in the cam1-cam22 range)
    • 25 weld classifications with various severities and results
    • Proper datetime parsing and data mapping
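
The datetime parsing and data mapping step might look roughly like the sketch below. It is not taken from the PR: the record layout, the field names, and the helpers parse_timestamp and map_scan are all assumptions made for illustration, and it assumes the hardcoded timestamps are ISO 8601 strings.

```python
from datetime import datetime, timezone

# Hypothetical raw record; field names and values are illustrative assumptions,
# not the schema returned by the PR's get_sample_data().
RAW_SCAN = {
    "serial_number": "SAMPLE-0001",
    "status": "Success",
    "result": "Defective",
    "created_at": "2025-08-26T14:14:00Z",
}


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 string into a timezone-aware UTC datetime."""
    # Older Python versions reject a trailing "Z", so normalize it first.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Treat naive timestamps as UTC (an assumption of this sketch).
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def map_scan(raw: dict) -> dict:
    """Map a raw sample record onto the fields a Scan document might need."""
    return {
        "serial_number": raw["serial_number"],
        "status": raw["status"],
        "result": raw["result"],
        "created_at": parse_timestamp(raw["created_at"]),
    }


mapped = map_scan(RAW_SCAN)
```

Normalizing every timestamp to UTC at this boundary keeps the seeded records comparable regardless of how the sample strings were written.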

3. GCP Bucket Integration

  • Added: bucket_name field to ScanImage model
  • Implemented: Proper GCP bucket path structure: {org_id}/{project_id}/test-seed/{scan_id}/{image_name}
  • Updated: get_file_url() method to handle bucket URLs correctly
  • Result: Images are properly configured for GCP storage

4. Comprehensive Model Creation

  • Project: "Sample Inspection Project" with proper organization linking
  • Cameras: 21 cameras (names in the cam1-cam22 range) with unique names and serial numbers
  • Model: "Sample Inspection Model" with validation status and deployment readiness
  • Model Deployment: Active deployment with health monitoring

5. Data Relationships

  • Established: Proper Beanie Link relationships between all models
  • Implemented: Scan → ScanImage → ScanClassification hierarchy
  • Added: Organization and project associations for all entities
  • Result: Fully connected data model ready for application use
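
To make the Scan → ScanImage → ScanClassification hierarchy concrete, here is a minimal sketch that uses plain dataclasses as a stand-in for the Beanie documents. In Beanie the references would be Link[...] fields on Document subclasses; the field names and the count_children helper below are assumptions for illustration, not the PR's models.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ScanClassification:
    name: str        # e.g. "Healthy", "Defective", "Burr"
    severity: float


@dataclass
class ScanImage:
    file_name: str
    camera: str
    classifications: List[ScanClassification] = field(default_factory=list)


@dataclass
class Scan:
    serial_number: str
    result: str
    images: List[ScanImage] = field(default_factory=list)


def count_children(scan: Scan) -> Tuple[int, int]:
    """Return (number of images, number of classifications) under a scan."""
    n_images = len(scan.images)
    n_classifications = sum(len(img.classifications) for img in scan.images)
    return n_images, n_classifications


# Shaped like the healthy sample scan: 3 images with 1 classification each.
healthy_scan = Scan(
    serial_number="SAMPLE-0002",  # hypothetical identifier
    result="Healthy",
    images=[
        ScanImage(
            file_name=f"cam{i}.jpg",
            camera=f"cam{i}",
            classifications=[ScanClassification("Healthy", 0.0)],
        )
        for i in (1, 2, 3)
    ],
)
```

A count like this is a quick way to check that every classification ended up attached to an image after seeding.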

Technical Details

Sample Data Structure

# 2 scans with different characteristics
- Scan 1: Defective result with 24 images and 25 classifications
- Scan 2: Healthy result with 3 images and 3 classifications

# Camera coverage: cam1-cam22 (21 unique cameras)
# Classification types: Healthy, Defective, Burr
# Severity levels: 0.0 to 4.4

GCP Path Structure

gs://paz-test-bucket/{organization_id}/{project_id}/test-seed/{scan_id}/{image_name}
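
A minimal sketch of how such a path could be assembled, assuming the bucket name is stored on the image record. The helpers build_blob_path and to_gs_uri, the scan ID, and the image name are hypothetical; the PR's actual get_file_url() may behave differently.

```python
from typing import Optional


def build_blob_path(org_id: str, project_id: str, scan_id: str, image_name: str) -> str:
    """Build the object path inside the bucket, following the PR's layout."""
    return f"{org_id}/{project_id}/test-seed/{scan_id}/{image_name}"


def to_gs_uri(bucket_name: Optional[str], blob_path: str) -> str:
    """Return a gs:// URI when a bucket is set, otherwise the bare path."""
    # Falling back to the bare path is an assumption of this sketch.
    if bucket_name:
        return f"gs://{bucket_name}/{blob_path}"
    return blob_path


path = build_blob_path(
    org_id="687a0a02729546165b913abf",      # organization ID from the seed output
    project_id="68adbc33cbf27736ecaaf7ee",  # project ID from the seed output
    scan_id="scan-001",                     # hypothetical scan ID
    image_name="cam1.jpg",                  # hypothetical image name
)
uri = to_gs_uri("paz-test-bucket", path)
```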

Usage

# Run complete seeding (organization + sample data)
cd mindtrace/apps/mindtrace/apps/poseidon
python seed_db.py

Output

  • Organization: mindtrace (ID: 687a0a02729546165b913abf)
  • Project: Sample Inspection Project (ID: 68adbc33cbf27736ecaaf7ee)
  • Cameras: 21 cameras created
  • Scans: 2 scans with 24 images and 25 classifications
  • Models: 1 model with deployment

Benefits

  1. Complete Test Environment: Single command creates full application state
  2. GCP Integration Ready: Images configured for cloud storage
  3. Realistic Data: Sample data mirrors production scenarios
  4. Relationship Integrity: All models properly linked
  5. Safe Operations: Idempotent seeding (won't create duplicates)
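
Idempotent seeding is commonly implemented as a get-or-create check before each insert. The sketch below illustrates the idea with an in-memory dict standing in for the database; get_or_create, seed_cameras, and the serial number format are assumptions, not code from this PR.

```python
from typing import Dict, Tuple


def get_or_create(store: Dict[str, dict], key: str, defaults: dict) -> Tuple[dict, bool]:
    """Return (record, created). Insert only when the key is not present yet."""
    existing = store.get(key)
    if existing is not None:
        return existing, False
    record = {"name": key, **defaults}
    store[key] = record
    return record, True


def seed_cameras(store: Dict[str, dict], count: int = 3) -> int:
    """Seed `count` cameras and return how many were newly created."""
    created = 0
    for i in range(1, count + 1):
        _, was_created = get_or_create(
            store, f"cam{i}", {"serial_number": f"SN-{i:03d}"}
        )
        if was_created:
            created += 1
    return created


store: Dict[str, dict] = {}
first_run = seed_cameras(store)   # creates every camera
second_run = seed_cameras(store)  # finds them all, creates nothing
```

Keying the lookup on a stable natural identifier (here the camera name) is what makes a second run a no-op.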

This enhancement provides a robust foundation for testing and development with realistic data that includes all necessary relationships and cloud storage integration.

ghost requested review from Yasserelhaddar, canelbirlik and uzairali19 August 26, 2025 14:14
ghost self-assigned this Aug 26, 2025