@VargaJoe commented on Sep 18, 2025

SenseNet Index Rebuilder Console Application

Overview

Adds a standalone console application that rebuilds the search index with real-time progress tracking and structured Serilog logging.

What's Added

New Console Tool: src/Tools/SnIndexRebuilder/

  • Real-time progress tracking with node counts and completion percentage
  • Dual ETA estimation (average and conservative scenarios)
  • Structured logging with Serilog (console and file output)
  • Two rebuild modes: standard clean rebuild vs complete reset

Command Line Interface:

  • dotnet run - Default clean rebuild (recommended)
  • dotnet run --clear-activities - Complete reset with IndexingActivities pre-clearing
  • dotnet run --help - Show help documentation
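
To illustrate how the two switches map to the two rebuild modes, here is a minimal, language-neutral sketch in Python. It is not the PR's Program.cs: the function names `parse_args` and `select_mode` and the mode labels are hypothetical, and only the `--clear-activities` and `--help` switches come from the description above.

```python
import argparse


def parse_args(argv):
    """Parse the documented switches; argparse supplies --help on its own."""
    parser = argparse.ArgumentParser(
        prog="SnIndexRebuilder",
        description="Rebuild the search index (illustrative sketch).",
    )
    parser.add_argument(
        "--clear-activities",
        action="store_true",
        help="complete reset: pre-clear IndexingActivities before rebuilding",
    )
    return parser.parse_args(argv)


def select_mode(argv):
    """Return a (hypothetical) label for the rebuild mode the arguments imply."""
    args = parse_args(argv)
    return "complete-reset" if args.clear_activities else "clean-rebuild"
```

With no arguments the sketch selects the default clean rebuild; passing `--clear-activities` selects the complete reset.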

Files Added:

  • src/Tools/SnIndexRebuilder/Program.cs (413 lines) - Main application
  • src/Tools/SnIndexRebuilder/SnIndexRebuilder.csproj - Project file
  • src/Tools/SnIndexRebuilder/appsettings.json - Configuration
  • docs/index-rebuilder-console.md (308 lines) - Documentation

Benefits

  • Visibility: Live progress instead of silent background processing
  • Planning: Average and conservative time estimates for maintenance scheduling
  • Troubleshooting: Structured logging for monitoring and diagnostics
  • Flexibility: Multiple rebuild approaches for different scenarios

Provides an alternative to the existing index rebuilding methods that gives operators visible progress, time estimates, and a choice of rebuild mode.

- Implements standalone SenseNet index rebuilder console application
- Uses the IsOuterSearchEngineEnabled special working mode to prevent old indexing activities from being processed
- Clears legacy indexing activities from database before rebuild
- Successfully rebuilds index from scratch using ClearAndPopulateAllAsync
- Provides progress monitoring and error handling
- Tested successfully: indexed 62,685 nodes in ~28 minutes
- No core SenseNet modifications required - uses existing infrastructure only
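
For maintenance planning, the reported test run can be turned into a rough throughput figure. The sketch below is plain arithmetic in Python with hypothetical helper names. It assumes the "~28 minutes" figure is exact and that indexing time scales linearly with node count, neither of which the PR states.

```python
def nodes_per_second(nodes, minutes):
    """Average throughput for a finished run."""
    return nodes / (minutes * 60)


def estimated_minutes(nodes, rate_per_second):
    """Projected duration for a repository of the given size at a fixed rate."""
    return nodes / rate_per_second / 60


# Figures from the reported test run: 62,685 nodes in about 28 minutes.
rate = nodes_per_second(62_685, 28)          # roughly 37 nodes per second
projection = estimated_minutes(100_000, rate)  # hypothetical 100,000-node repository
```

Actual throughput depends on hardware, content types, and database load, so a projection like this is only a starting point.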

Features:
- Service registration using AddSenseNet pattern from integration tests
- Proper repository startup with indexing disabled during initialization
- Automatic cleanup of old IndexingActivities table entries
- Clean index rebuild from current database state
- Comprehensive logging and progress tracking

This approach solves the issue of old indexing activities being processed during
normal repository startup, enabling truly clean index rebuilds.
…ive progress tracking

Features:
- Add dual ETA display showing both average and worst-case time estimates
- Implement IndexingProgressTracker class with advanced progress monitoring
- Add comprehensive CLI argument parsing (--clear-activities, --help)
- Add Serilog integration for dual console+file logging
- Implement two rebuild approaches:
  1. Clean rebuild without clearing activities (default)
  2. Complete clean rebuild with activities table clearing (--clear-activities)
- Resolve the phantom activities issue with TRUNCATE + DBCC CHECKIDENT
- Add index directory clearing to remove cached LastActivityId
- Enhance error handling and user feedback
- Show total node counts and completion times in the progress display

Technical improvements:
- Real-time progress updates every 100 nodes or 5 seconds
- Worst-case scenario tracking using maximum time per node
- Convergent ETA estimation as process stabilizes
- Structured logging for troubleshooting and monitoring
- Complete SQL identity seed management
- Comprehensive help documentation
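
The dual ETA idea described above (an average estimate plus a worst-case estimate based on the maximum time per node, reported every 100 nodes or 5 seconds) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the PR's IndexingProgressTracker: the class name `ProgressTracker`, its methods, and the injectable `clock` parameter are all hypothetical, and the actual C# implementation may compute its estimates differently.

```python
import time


class ProgressTracker:
    """Illustrative dual-ETA tracker (hypothetical; not the PR's class)."""

    def __init__(self, total, clock=time.monotonic,
                 every_nodes=100, every_seconds=5.0):
        self.total = total
        self.clock = clock                  # injectable so the sketch is testable
        self.every_nodes = every_nodes
        self.every_seconds = every_seconds
        self.done = 0
        self.start = clock()
        self.last_tick = self.start
        self.max_per_node = 0.0             # slowest single node seen so far
        self.last_report_done = 0
        self.last_report_time = self.start

    def node_indexed(self):
        """Record that one more node has been indexed."""
        now = self.clock()
        self.max_per_node = max(self.max_per_node, now - self.last_tick)
        self.last_tick = now
        self.done += 1

    def should_report(self):
        """True when 100 nodes or 5 seconds have passed since the last report."""
        now = self.clock()
        due = (self.done - self.last_report_done >= self.every_nodes
               or now - self.last_report_time >= self.every_seconds)
        if due:
            self.last_report_done = self.done
            self.last_report_time = now
        return due

    def percent(self):
        """Completion percentage."""
        return 100.0 * self.done / self.total if self.total else 100.0

    def etas(self):
        """Return (average_eta, worst_case_eta) in seconds, or None if no data."""
        if self.done == 0:
            return None
        remaining = self.total - self.done
        avg_per_node = (self.last_tick - self.start) / self.done
        return remaining * avg_per_node, remaining * self.max_per_node
```

Because the worst-case figure multiplies the remaining count by the slowest node observed, it never falls below the average estimate, and both shrink toward zero as the remaining count does.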
@VargaJoe changed the title from "feat: Enhanced SenseNet Index Rebuilder Console Application with Dual ETA Estimation and Comprehensive Progress Tracking" to "feat: Add SenseNet Index Rebuilder Console Application" on Sep 18, 2025
@VargaJoe changed the title from "feat: Add SenseNet Index Rebuilder Console Application" to "SenseNet Index Rebuilder Console Application" on Sep 18, 2025
- Add System.Collections.Generic for Queue<double>
- Add System.Linq for Average() extension method
- Complete conservative ETA estimation implementation
…build paths

- Extract common functionality into helper methods:
  - ClearIndexingActivitiesAsync() for database cleanup
  - ClearIndexDirectoryAsync() for file system cleanup
  - PerformIndexRebuildAsync() for shared rebuild logic
- Consolidate duplicate progress tracking, error handling, and populator setup
- Reduce code duplication from ~150 lines to ~20 lines of shared logic
- Maintain identical functionality for both --clear-activities and default modes
- Fix Task ambiguity by using fully qualified System.Threading.Tasks.Task
- Refactored Program.cs to eliminate 143 lines of duplicate code
- Extracted helper methods: ClearIndexingActivitiesAsync, ClearIndexDirectoryAsync, PerformIndexRebuildAsync
- Fixed --clear-activities mode by letting ClearAndPopulateAllAsync handle indexing engine startup internally
- Resolved Lucene29 compatibility issue with explicit indexing engine startup
- Updated comprehensive documentation with latest behavioral findings
- Confirmed identity reset behavior: --clear-activities resets to ID 1, default mode preserves counter
- Both modes now work correctly with streamlined codebase
@VargaJoe marked this pull request as ready for review on September 19, 2025, 09:03