@Shubham-Khichi

  • Reorganize main sections for better visibility:

    • Processing Status with inline stats
    • Start Exploration full width
    • Discovered Pages full width
    • Extracted Content full width
    • MCP Server and Stored Files side by side
  • Update ProcessingBlock component:

    • Convert stats from grid to horizontal layout
    • Reduce padding and font sizes for compact view
    • Maintain hover and animation effects
    • Keep progress bar and status indicator

Recursive Depth & Lazy Loading:

  • Add max_depth parameter for configurable crawling depth
  • Add depth selector (1-5) in URL input component
  • Implement recursive page discovery with cycle prevention
  • Add wait_for_images and scroll_delay for lazy loading
  • Change wait_until to 'domcontentloaded' for faster initial load
  • Increase page timeout to 120s for complex pages
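
As a rough illustration of how the crawl options listed above might be grouped, here is a minimal Python sketch. `CrawlSettings` is a hypothetical container, not a class from this PR. The field names `max_depth`, `wait_for_images`, `scroll_delay`, and `wait_until` come from the description; the defaults, the `page_timeout_ms` name, and the unit of `scroll_delay` (seconds) are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrawlSettings:
    """Hypothetical settings container; not the PR's actual code."""

    max_depth: int = 1                    # depth selector value, 1-5
    wait_for_images: bool = True          # wait for lazy-loaded images (assumed default)
    scroll_delay: float = 0.5             # assumed unit: seconds between scroll steps
    wait_until: str = "domcontentloaded"  # value named in the PR description
    page_timeout_ms: int = 120_000        # 120 s, expressed in milliseconds

    def __post_init__(self) -> None:
        # Reject values outside the 1-5 range offered by the depth selector.
        if not 1 <= self.max_depth <= 5:
            raise ValueError(f"max_depth must be between 1 and 5, got {self.max_depth}")
        if self.scroll_delay < 0:
            raise ValueError("scroll_delay must be non-negative")


settings = CrawlSettings(max_depth=3)
```

Validating `max_depth` in one place keeps the UI selector and the crawler in agreement about the allowed range.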

Internal Links Management:

  • Add internal links column to discovered pages
  • Implement expand/collapse functionality for internal links
  • Add automatic selection of internal links with primary page (superseded by the individual URL selection fix below, which removes it)
  • Show internal links count for each discovered page
  • Style internal links with indentation and visual hierarchy

Depth Control Guide:

  1. Level 1 (Quick Overview): Crawls only the page you enter. Best for single-page docs.

  2. Level 2 (Section Level): Crawls the main page plus the pages it links to directly (e.g., if the main page links to 'Getting Started' and 'API Reference', both are crawled).

  3. Level 3 (Sub-section Level): Follows links one level further into each section (e.g., individual API endpoint pages linked from 'API Reference').

  4. Level 4 (Detailed Level): Follows links one more level, reaching detailed pages and examples. Warning: this can discover many pages.

  5. Level 5 (Complete Crawl): Maximum selectable depth; crawls everything reachable within five levels. Warning: this can take noticeably longer and find hundreds of pages.
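
The level semantics above, together with cycle prevention, can be sketched as a depth-limited breadth-first traversal. This is a minimal Python sketch under stated assumptions, not the PR's implementation: `discover_pages`, `get_links`, and the toy `SITE` map are hypothetical names, and link extraction is stubbed with a dictionary lookup rather than a real browser.

```python
from collections import deque
from typing import Callable, Iterable

# Toy link graph standing in for a real site (hypothetical data).
SITE = {
    "/docs": ["/docs/getting-started", "/docs/api"],
    "/docs/getting-started": ["/docs"],         # links back to the start: a cycle
    "/docs/api": ["/docs/api/users", "/docs"],
    "/docs/api/users": [],
}


def toy_links(url: str) -> list[str]:
    """Stub for link extraction; a real crawler would load the page."""
    return SITE.get(url, [])


def discover_pages(
    start_url: str,
    max_depth: int,
    get_links: Callable[[str], Iterable[str]],
) -> list[str]:
    """Return pages reachable within max_depth levels (level 1 = start page only)."""
    if max_depth < 1:
        raise ValueError("max_depth must be at least 1")
    visited = {start_url}            # cycle prevention: each URL is queued once
    order = [start_url]
    queue = deque([(start_url, 1)])
    while queue:
        url, level = queue.popleft()
        if level >= max_depth:
            continue                 # pages at the deepest level are not expanded
        for link in get_links(url):
            if link not in visited:
                visited.add(link)
                order.append(link)
                queue.append((link, level + 1))
    return order


level_1 = discover_pages("/docs", 1, toy_links)
level_2 = discover_pages("/docs", 2, toy_links)
level_3 = discover_pages("/docs", 3, toy_links)
```

On the toy graph, `level_1` contains only `/docs`, `level_2` adds the two section pages, and `level_3` adds `/docs/api/users`. The links back to `/docs` are ignored because the start page is already in `visited`.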

Smart Features:

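  • Automatically handles lazy-loaded content via wait_for_images and scroll_delay
  • Faster initial page loads by waiting only for 'domcontentloaded'
  • Better timeout handling for complex pages (page timeout raised to 120s)
  • Prevents duplicate crawling of the same URLs through cycle prevention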

Bug Fixes:

  1. Fix Select All functionality:

    • Now properly toggles between selecting and unselecting all primary URLs
    • Only affects primary URLs, not internal links
    • Updates checkbox state correctly
  2. Fix individual URL selection:

    • Remove automatic selection of internal links
    • Each URL (primary or internal) can be selected independently
    • Maintain independent selection state for each URL
  3. Update selection counter:

    • Only count primary URLs in total count
    • Show accurate selection status in header
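
The three selection rules above can be modeled as a small state object. The real logic presumably lives in the UI component; this Python sketch only illustrates the rules, and `SelectionState` and its method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SelectionState:
    """Hypothetical model of the selection rules; not the PR's UI code."""

    primary_urls: list[str]
    internal_links: dict[str, list[str]] = field(default_factory=dict)
    selected: set[str] = field(default_factory=set)

    def toggle(self, url: str) -> None:
        """Select or unselect one URL without touching any other URL."""
        if url in self.selected:
            self.selected.remove(url)
        else:
            self.selected.add(url)

    def all_primary_selected(self) -> bool:
        return bool(self.primary_urls) and all(
            url in self.selected for url in self.primary_urls
        )

    def toggle_select_all(self) -> None:
        """Toggle every primary URL; internal links are left as they are."""
        if self.all_primary_selected():
            self.selected.difference_update(self.primary_urls)
        else:
            self.selected.update(self.primary_urls)

    def counter(self) -> tuple[int, int]:
        """Return (selected primary URLs, total primary URLs)."""
        chosen = sum(1 for url in self.primary_urls if url in self.selected)
        return chosen, len(self.primary_urls)


state = SelectionState(
    primary_urls=["/a", "/b"],
    internal_links={"/a": ["/a/1"]},
)
state.toggle("/a/1")            # selecting an internal link alone
after_internal = state.counter()
state.toggle_select_all()       # selects both primary URLs
after_select_all = state.counter()
state.toggle_select_all()       # unselects both primary URLs again
after_unselect_all = state.counter()
```

Keeping a single `selected` set for all URLs makes each selection independent, while the counter and Select All consult only `primary_urls`.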

These improvements give users precise control over crawling depth and content selection, making documentation extraction more efficient and manageable.

bug fix for header and footer
@Shubham-Khichi added the documentation and enhancement labels on Feb 11, 2025
@Shubham-Khichi self-assigned this on Feb 11, 2025
@Shubham-Khichi merged commit aa6e8b4 into main on Feb 11, 2025
2 checks passed
@Shubham-Khichi deleted the Major-Updates branch on February 11, 2025 23:39
