[refactor] Semantic Function Clustering Analysis - Code Organization Opportunities #5728

@github-actions

Overview

This report presents findings from a semantic analysis of the Go codebase, focusing on function organization, naming patterns, and opportunities for improved modularity. The analysis examined 264 non-test Go files across the pkg/ directory and cataloged more than 3,000 functions to identify refactoring opportunities.

Key Findings:

  • Several large files (>1000 lines) with mixed responsibilities
  • Inconsistent function naming patterns across packages
  • Validation and compilation functions scattered across multiple files
  • Opportunities to consolidate duplicate patterns and improve code organization
  • Clear semantic clusters that suggest natural module boundaries

Full Analysis Report

Repository Statistics

  • Total non-test Go files analyzed: 264
  • Total functions cataloged: 3,000+
  • Packages analyzed: pkg/workflow (155 files), pkg/cli (82 files), pkg/parser (13 files), plus utilities
  • Analysis method: Semantic clustering by function naming patterns and file organization

Critical Findings by Package

1. pkg/workflow - Largest Package (155 files, 2,601 functions)

Issue 1.1: Oversized Core Files with Mixed Responsibilities

Several files exceed 1,000 lines and contain multiple unrelated function categories:

compiler.go (line count not recorded; implied to be large)

  • Contains: validation, building, extraction, parsing, generation functions
  • Recommendation: Split into:
    • compiler_validation.go (validation functions)
    • compiler_extract.go (extraction functions)
    • Keep only core compilation orchestration in compiler.go

compiler_jobs.go (multiple build functions)

  • Functions: buildActivationJob, buildMainJob, buildPreActivationJob, buildConclusionJob, buildSafeOutputJob, plus the buildCreateOutput*Job functions (16 per Issue 1.3)
  • Issue: Job building split between this file and compiler.go
  • Recommendation: Consolidate ALL build*Job functions here; move any that remain in compiler.go into this file

compiler_yaml.go (27 functions)

  • Functions: generateYAML, generateMainJobSteps, generatePrompt, generatePostSteps, plus 15+ generateUpload* and generate*PromptStep functions
  • Issue: Mixed YAML generation with prompt generation and upload logic
  • Recommendation: Split into:
    • compiler_yaml_generation.go (main YAML generation)
    • compiler_yaml_steps.go (step generation)
    • compiler_yaml_prompts.go (prompt generation)

compiler_parse.go (50+ parse/extract functions)

  • Functions: ParseWorkflowFile, parseWorkflowMarkdownContentWithToolsString, plus 30+ extract* functions and 15+ parseSafeOutput*Config functions
  • Issue: Massive concentration of parsing and extraction logic
  • Recommendation: Split into:
    • compiler_parse.go (main parsing entry points)
    • compiler_parse_config.go (config parsing - all parse*Config functions)
    • compiler_extract.go (all extract* functions)

permissions.go (37 functions)

  • Functions: Permission management, merging, rendering, validation
  • Issue: Single file handles too many permission-related concerns
  • Recommendation: Split into:
    • permissions.go (core types and basic methods)
    • permissions_merge.go (merging logic)
    • permissions_helpers.go (utility functions)

js.go (40 functions)

  • Functions: JavaScript bundling, parsing, and formatting
  • Issue: All JavaScript operations in one file
  • Recommendation: Split into:
    • js_bundler.go (bundling logic)
    • js_parser.go (parsing logic)
    • js_formatting.go (formatting logic)

scripts.go (33 functions) + script_registry.go (10 functions)

  • Issue: Overlapping responsibilities between two files
  • Recommendation: Clarify separation OR merge into single scripts package

Issue 1.2: Validation Functions Scattered Across Files

Validation logic appears in multiple files instead of dedicated validation modules:

Current State:

  • agent_validation.go - agent-specific validation
  • engine_validation.go - engine validation
  • docker_validation.go - Docker validation
  • bundler_validation.go - bundler validation
  • npm_validation.go - NPM validation
  • pip_validation.go - pip validation
  • strict_mode_validation.go - strict mode validation
  • step_order_validation.go - step order validation
  • Plus validate* functions in: compiler.go, expression_validation.go, template_validation.go, schema_validation.go, mcp_config_validation.go, permissions_validator.go, runtime_validation.go, repository_features_validation.go

Recommendation:
Create pkg/workflow/validation/ subdirectory:

pkg/workflow/validation/
  ├── agent.go (from agent_validation.go)
  ├── docker.go (from docker_validation.go)
  ├── engine.go (from engine_validation.go)
  ├── expression.go (from expression_validation.go)
  ├── permissions.go (from permissions_validator.go)
  ├── runtime.go (from runtime_validation.go)
  ├── schema.go (from schema_validation.go)
  ├── strict_mode.go (from strict_mode_validation.go)
  └── template.go (from template_validation.go)

Issue 1.3: Inconsistent Function Naming for Job Building

Pattern Identified:

  • build*Job functions (correct pattern)
  • buildCreateOutput*Job functions (redundant "build" + "create")

Examples:

  • buildCreateOutputAddCommentJob
  • buildCreateOutputAgentTaskJob
  • buildCreateOutputCloseDiscussionJob
  • ... (16 total)

Issue: The "Create" segment is redundant: each of these functions builds a job that creates an output

Recommendation: Rename to buildOutputAddCommentJob, buildOutputAgentTaskJob, etc. to remove redundancy

Issue 1.4: Safe Outputs System Needs Modularization

Safe output files (8 files) are scattered in the main workflow directory:

Current files:

  • safe_outputs.go
  • safe_output_builder.go
  • safe_output_config.go
  • safe_output_validation_config.go
  • safe_outputs_app.go
  • safe_outputs_env_helpers.go
  • safe_inputs.go
  • safe_jobs.go

Recommendation: Create subdirectory:

pkg/workflow/safeoutputs/
  ├── outputs.go (from safe_outputs.go)
  ├── builder.go (from safe_output_builder.go)
  ├── config.go (from safe_output_config.go)
  ├── validation.go (from safe_output_validation_config.go)
  ├── app.go (from safe_outputs_app.go)
  ├── env_helpers.go (from safe_outputs_env_helpers.go)
  ├── inputs.go (from safe_inputs.go)
  └── jobs.go (from safe_jobs.go)

Issue 1.5: Engine Files Should Be in Subdirectory

Engine-related files (11 files) are mixed with other workflow files:

Current files:

  • agentic_engine.go (30 functions)
  • claude_engine.go (25 functions)
  • codex_engine.go
  • copilot_engine.go
  • custom_engine.go
  • engine.go
  • engine_helpers.go
  • engine_validation.go
  • engine_output.go
  • engine_firewall_support.go
  • engine_network_hooks.go

Recommendation: Create subdirectory:

pkg/workflow/engines/
  ├── base.go (from agentic_engine.go + engine.go)
  ├── claude.go
  ├── codex.go
  ├── copilot.go
  ├── custom.go
  ├── helpers.go
  ├── output.go
  ├── validation.go
  └── network.go

Issue 1.6: GitHub Integration Files Should Be Grouped

GitHub create/update/close operations (18+ files) are scattered across the main workflow directory:

Current files:

  • create_issue.go
  • create_pull_request.go
  • create_discussion.go
  • create_code_scanning_alert.go
  • create_pr_review_comment.go
  • create_agent_task.go
  • update_issue.go
  • update_pull_request.go
  • update_release.go
  • update_project.go
  • close_issue.go
  • close_pull_request.go
  • close_discussion.go
  • close_entity_helpers.go
  • add_comment.go
  • add_labels.go
  • add_reviewer.go
  • assign_milestone.go, assign_to_agent.go, assign_to_user.go, link_sub_issue.go, push_to_pull_request_branch.go, publish_assets.go, notify_comment.go

Recommendation: Create subdirectory:

pkg/workflow/github/
  ├── create/
  │   ├── issue.go
  │   ├── pull_request.go
  │   ├── discussion.go
  │   └── ...
  ├── update/
  │   ├── issue.go
  │   ├── pull_request.go
  │   └── ...
  ├── close/
  │   ├── issue.go
  │   ├── pull_request.go
  │   └── ...
  └── helpers/
      ├── comments.go
      ├── labels.go
      └── ...

Issue 1.7: Expression Handling Well-Organized (Good Example)

The expression handling is a positive example of good organization:

  • expression_parser.go (parsing)
  • expression_builder.go (building)
  • expression_extraction.go (extraction)
  • expression_nodes.go (AST nodes)
  • expression_validation.go (validation)

No action needed - this is the organizational pattern other areas should follow!

Issue 1.8: Potential Function Duplicates - Token Handling

Multiple similar token-adding functions with pattern-based variations:

Functions identified:

  • addSafeOutputGitHubToken
  • addSafeOutputGitHubTokenForConfig
  • addSafeOutputAgentGitHubTokenForConfig
  • addSafeOutputCopilotGitHubTokenForConfig

Issue: These likely follow similar patterns with minor variations

Recommendation: Consolidate using a strategy pattern or configuration-based approach to reduce duplication
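A configuration-based consolidation might look like the minimal sketch below. Only the four function names above come from the analysis; their real signatures and the secret names they emit are not shown in this report, so `tokenSource`, `tokenOptions`, `resolveToken`, `addGitHubToken`, and the placeholder secret expressions are all hypothetical.

```go
package main

import "fmt"

// tokenSource is a hypothetical enum selecting which fallback token applies.
type tokenSource int

const (
	defaultToken tokenSource = iota
	agentToken
	copilotToken
)

// tokenOptions is a hypothetical options struct that replaces one function
// per variant.
type tokenOptions struct {
	Source   tokenSource
	Override string // token expression set in config; empty means use the fallback
}

// fallbackExpr holds placeholder expressions; the real secret names are not
// given in the report.
var fallbackExpr = map[tokenSource]string{
	defaultToken: "${{ secrets.EXAMPLE_DEFAULT_TOKEN }}",
	agentToken:   "${{ secrets.EXAMPLE_AGENT_TOKEN }}",
	copilotToken: "${{ secrets.EXAMPLE_COPILOT_TOKEN }}",
}

// resolveToken prefers an explicit override and otherwise uses the fallback
// for the requested source.
func resolveToken(opts tokenOptions) string {
	if opts.Override != "" {
		return opts.Override
	}
	return fallbackExpr[opts.Source]
}

// addGitHubToken appends a single github-token line to the step lines.
func addGitHubToken(steps []string, opts tokenOptions) []string {
	return append(steps, "github-token: "+resolveToken(opts))
}

func main() {
	steps := addGitHubToken(nil, tokenOptions{Source: copilotToken})
	steps = addGitHubToken(steps, tokenOptions{Source: agentToken, Override: "${{ secrets.MY_PAT }}"})
	for _, s := range steps {
		fmt.Println(s)
	}
}
```

With this shape, the existing four functions could become thin wrappers that differ only in the `tokenOptions` value they pass, which keeps call sites unchanged during the migration.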

Issue 1.9: MCP Rendering Duplication Across Engines

Each engine file has its own RenderMCPConfig method:

Duplication in:

  • claude_engine.go: RenderMCPConfig
  • codex_engine.go: RenderMCPConfig
  • copilot_engine.go: RenderMCPConfig
  • custom_engine.go: RenderMCPConfig

Plus dedicated mcp_renderer.go file with additional rendering logic

Issue: Significant code duplication across engine types

Recommendation: Use template method pattern - extract common MCP rendering to base, allow engines to override specific portions
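Go has no inheritance, so a template method is usually expressed as a shared function that drives a small hook interface. The sketch below illustrates that shape only: `mcpHooks`, `renderMCPConfig`, `lineEngine`, and `sectionEngine` are hypothetical names, and the output formats are invented for the example rather than taken from any real engine.

```go
package main

import (
	"fmt"
	"strings"
)

// mcpHooks is a hypothetical interface: each engine supplies only the parts
// of the output that differ.
type mcpHooks interface {
	Header() string
	Server(name string) string
	Footer() string
}

// renderMCPConfig is the shared "template method": it fixes the order
// (header, one entry per server, footer) and delegates the details.
func renderMCPConfig(h mcpHooks, servers []string) string {
	var b strings.Builder
	b.WriteString(h.Header())
	for _, name := range servers {
		b.WriteString(h.Server(name))
	}
	b.WriteString(h.Footer())
	return b.String()
}

// lineEngine renders servers as an indented list (illustrative format).
type lineEngine struct{}

func (lineEngine) Header() string            { return "servers:\n" }
func (lineEngine) Server(name string) string { return "  - " + name + "\n" }
func (lineEngine) Footer() string            { return "" }

// sectionEngine renders servers as bracketed sections (illustrative format).
type sectionEngine struct{}

func (sectionEngine) Header() string            { return "# begin\n" }
func (sectionEngine) Server(name string) string { return "[" + name + "]\n" }
func (sectionEngine) Footer() string            { return "# end\n" }

func main() {
	servers := []string{"github", "playwright"}
	fmt.Print(renderMCPConfig(lineEngine{}, servers))
	fmt.Print(renderMCPConfig(sectionEngine{}, servers))
}
```

Each engine's RenderMCPConfig would then shrink to a call into the shared function, with only the hook methods remaining engine-specific.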

Issue 1.10: Package Collection Functions Follow Same Pattern

Three functions with identical structure for different package managers:

Functions:

  • collectGoDependencies
  • collectNpmDependencies
  • collectPipDependencies

Issue: Duplicate logic for collecting packages from different sources

Recommendation: Create generic collectPackages function with type parameter or interface-based approach
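As a rough sketch of the shared function, assuming the three collectors differ only in how they recognize a package name on a command line: this version uses a function-typed hook rather than a type parameter, and `extractor`, `afterPrefix`, and the install-command prefixes are hypothetical stand-ins for whatever the real collectors match.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// extractor is a hypothetical per-package-manager hook: it returns the
// package named on a command line, or "" if the line does not match.
type extractor func(line string) string

// collectPackages applies one extractor to every line and returns the
// unique package names in sorted order.
func collectPackages(lines []string, extract extractor) []string {
	seen := make(map[string]struct{})
	for _, line := range lines {
		if pkg := extract(line); pkg != "" {
			seen[pkg] = struct{}{}
		}
	}
	out := make([]string, 0, len(seen))
	for pkg := range seen {
		out = append(out, pkg)
	}
	sort.Strings(out)
	return out
}

// afterPrefix builds an extractor that returns the first field following
// the given command prefix.
func afterPrefix(prefix string) extractor {
	return func(line string) string {
		line = strings.TrimSpace(line)
		if !strings.HasPrefix(line, prefix) {
			return ""
		}
		fields := strings.Fields(strings.TrimPrefix(line, prefix))
		if len(fields) == 0 {
			return ""
		}
		return fields[0]
	}
}

func main() {
	lines := []string{"npm install typescript", "pip install requests", "npm install eslint"}
	fmt.Println(collectPackages(lines, afterPrefix("npm install ")))
	fmt.Println(collectPackages(lines, afterPrefix("pip install ")))
}
```

The existing collectGoDependencies, collectNpmDependencies, and collectPipDependencies could remain as one-line wrappers that pass their own extractor.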


2. pkg/cli - Second Largest Package (82 files, 500+ functions)

Issue 2.1: Command Files with Mixed Responsibilities

Several command files exceed 900 lines and mix concerns:

add_command.go (~905 lines)

  • Functions: AddWorkflows, addWorkflowsNormal, addWorkflowsWithPR, addWorkflowWithTracking, expandWildcardWorkflows, plus compilation functions: compileWorkflow, compileWorkflowWithRefresh, compileWorkflowWithTracking, compileWorkflowWithTrackingAndRefresh
  • Issue: Contains workflow addition, PR creation, file tracking, AND compilation logic
  • Recommendation: Move compile* functions to compile_command.go or separate compilation utilities file. Keep add_command.go focused only on adding workflows.

compile_command.go (~1149 lines)

  • Functions: CompileWorkflows, CompileWorkflowWithValidation, watchAndCompileWorkflows, compileAllWorkflowFiles, compileModifiedFiles, compileSingleFile, validateCompileConfig, printCompilationSummary
  • Issue: Mix of compilation, file watching, validation, and statistics
  • Recommendation: Split into:
    • compile_command.go (command handler and main compilation)
    • compile_watch.go (file watching logic)
    • compile_validation.go (validation logic)

audit_report.go (~1233 lines)

  • Functions: 12+ render* functions (renderJSON, renderConsole, renderOverview, renderMetrics, renderJobsTable, renderToolUsageTable, etc.), plus 4+ generate* functions (generateFindings, generateRecommendations, etc.)
  • Issue: Mix of data building, rendering in multiple formats, and analysis generation
  • Recommendation: Split into:
    • audit_report.go (main report coordination)
    • audit_report_render.go (all rendering functions)
    • audit_report_analysis.go (all generate* analysis functions)

logs.go (~1593 lines)

  • Functions: NewLogsCommand, DownloadWorkflowLogs, downloadRunArtifactsConcurrent, listWorkflowRunsWithPagination, loadRunSummary, saveRunSummary, displayLogsOverview, plus many more
  • Issue: Command handling, artifact download, run listing, display logic all mixed
  • Recommendation: Split into:
    • logs_command.go (command handler)
    • logs_download.go (download logic - may already exist)
    • logs_display.go (display functions)
    • logs_storage.go (save/load functions)

trial_command.go (~1005 lines)

  • Functions: NewTrialCommand, RunWorkflowTrials, getCurrentGitHubUsername, showTrialConfirmation, triggerWorkflowRun, parseIssueSpec, saveTrialResult, copyTrialResultsToHostRepo, plus git operations
  • Issue: Mix of trial execution, git operations, PR management
  • Recommendation: Extract git operations to git.go, keep trial_command.go focused on trial orchestration

Issue 2.2: Validation Functions Scattered

Validation appears in multiple files:

Current state:

  • compile_command.go: validateCompileConfig
  • mcp_validation.go: validateServerSecrets, validateMCPServerConfiguration
  • docker_images.go: ValidateMCPServerDockerAvailability
  • actionlint.go: validation via actionlint
  • packages.go: implied package validation

Recommendation: Create pkg/cli/validation/ subdirectory or consolidate validation functions

Issue 2.3: Inconsistent Function Naming - Case Conventions

Mix of exported and unexported functions for similar operations:

Examples:

  • AddWorkflows (exported) vs addWorkflowsNormal vs addWorkflowsWithPR (unexported)
  • CompileWorkflows (exported) vs compileAllWorkflowFiles vs compileSingleFile (unexported)

Issue: Go uses capitalization to control export, so near-identical names that differ mainly in case make the public API surface unclear

Recommendation: Standardize - exported functions for public API, unexported for internal helpers with consistent naming

Issue 2.4: Ensure Pattern Overuse

15+ ensure* functions across CLI files:

Functions:

  • ensureGitAttributes (git.go)
  • ensureMCPConfig (mcp_config_file.go)
  • ensureActionlintConfig (actionlint.go)
  • ensureCopilotInstructions (copilot_setup.go)
  • ensureAgenticWorkflowAgent (copilot_setup.go)
  • ensureSharedAgenticWorkflowAgent (copilot_setup.go)
  • ensureDebugAgenticWorkflowAgent (copilot_setup.go)
  • ensureFileMatchesTemplate (copilot_setup.go)
  • ensureAgentFromTemplate (copilot_setup.go)

Issue: Pattern suggests missing abstraction for initialization/setup logic

Recommendation: Consider extracting common "ensure" pattern into shared utility or setup manager
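One possible shape for the shared utility, assuming most of the ensure* functions reduce to "make sure this file exists with this content": `ensureFileContent` and `runDemo` are hypothetical names, and the real functions may do more than this (for example, merging into existing files).

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// ensureFileContent is a hypothetical shared helper: it writes want to path
// only when the file is missing or differs, and reports whether it changed
// anything.
func ensureFileContent(path string, want []byte) (bool, error) {
	got, err := os.ReadFile(path)
	if err == nil && bytes.Equal(got, want) {
		return false, nil // already up to date
	}
	if err != nil && !errors.Is(err, os.ErrNotExist) {
		return false, fmt.Errorf("read %s: %w", path, err)
	}
	if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
		return false, fmt.Errorf("create directory for %s: %w", path, err)
	}
	if err := os.WriteFile(path, want, 0o644); err != nil {
		return false, fmt.Errorf("write %s: %w", path, err)
	}
	return true, nil
}

// runDemo calls the helper twice on the same path in a temporary directory
// and returns whether each call changed the file.
func runDemo() (first, second bool, err error) {
	dir, err := os.MkdirTemp("", "ensure-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "nested", "config.yaml")
	want := []byte("key: value\n")
	if first, err = ensureFileContent(path, want); err != nil {
		return false, false, err
	}
	second, err = ensureFileContent(path, want)
	return first, second, err
}

func main() {
	first, second, err := runDemo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("first call changed file:", first)
	fmt.Println("second call changed file:", second)
}
```

Returning a "changed" flag keeps the helper idempotent while still letting each caller decide whether to print a message or stage the file.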

Issue 2.5: Parse Functions Distributed

Parse functions scattered across many files:

Files with parse* functions:

  • spec.go: parseWorkflowSpec, parseRepoSpec
  • access_log.go: parseSquidAccessLog, parseSquidLogLine
  • redacted_domains.go: parseRedactedDomainsLog
  • trial_command.go: parseIssueSpec
  • audit_report.go: parseDurationString
  • actionlint.go: parseAndDisplayActionlintOutput

Recommendation: Consider creating pkg/cli/parsing/ package for shared parsing utilities

Issue 2.6: MCP Command Files - Good Organization Example

MCP-related files show good modular organization:

  • mcp.go (main command)
  • mcp_add.go (add subcommand)
  • mcp_inspect.go (inspect subcommand)
  • mcp_list.go (list subcommand)
  • mcp_server.go (server management)
  • mcp_gateway.go (gateway functions)
  • mcp_registry.go (registry management)
  • mcp_validation.go (validation)
  • mcp_config_file.go (config file operations)
  • mcp_tool_table.go (display utilities)

No action needed - this is a good example of subcommand organization!

Issue 2.7: Git Operations Leaking into Command Files

Git operations appear in files other than git.go:

Examples:

  • add_command.go: addWorkflowsWithPR contains git logic for creating PRs
  • trial_command.go: Contains git operations for trial workflow

Recommendation: Keep ALL git operations delegated to git.go for single source of git interaction


3. pkg/parser - Well-Organized Small Package (13 files, 140+ functions)

Issue 3.1: Two Oversized Files Need Splitting

frontmatter.go (1284 lines, 30+ functions)

  • Functions: ParseImportDirective, 4x ProcessImports* variants, 4x ProcessIncludes* variants, 4x ExpandIncludes* variants, 12x extract*FromContent, 4x merge functions
  • Issue: Single file handles imports, includes, extraction, expansion, and merging
  • Recommendation: Split into:
    • frontmatter_imports.go (all ProcessImports* and import handling)
    • frontmatter_includes.go (all ProcessIncludes* and ExpandIncludes*)
    • frontmatter_extract.go (all extract*FromContent functions)
    • frontmatter_merge.go (all merge* functions)
    • frontmatter.go (core types and ParseImportDirective)

schema.go (1157 lines, 30+ functions)

  • Functions: 8x Validate* functions, 4x generate* functions, FindClosestMatches, FindDeprecatedFieldsInFrontmatter, plus compilation, navigation, suggestion generation
  • Issue: Handles validation, suggestion generation, schema compilation, and utilities
  • Recommendation: Split into:
    • schema_validate.go (all Validate* functions)
    • schema_suggestions.go (generate* functions and FindClosestMatches)
    • schema_compile.go (schema compilation)
    • schema.go (core types and main functions)

Issue 3.2: Extract Functions Split Across Files

Extract functions appear in 4 different files:

Distribution:

  • frontmatter.go: 12 extract*FromContent functions
  • frontmatter_content.go: 7 Extract* functions (ExtractFrontmatterFromContent, ExtractMarkdownSection, etc.)
  • json_path_locator.go: 2 extract functions
  • schema.go: 3 extract functions

Issue: Related extraction logic scattered

Recommendation: Consider consolidating extract* functions or clearly document the separation rationale (content vs field vs path vs schema)

Issue 3.3: "process" vs "expand" vs "extract" Terminology Overlap

Unclear distinction between similar operations:

Functions:

  • ProcessIncludes (recursive expansion)
  • ExpandIncludes (also recursive expansion)
  • extractToolsFromContent (pulling data from string)

Issue: ProcessIncludes and ExpandIncludes seem to do similar things

Recommendation: Document or consolidate - suggest:

  • parse* = YAML/JSON parsing to structs
  • extract* = pulling values from maps/strings
  • process* = transformation/business logic
  • expand* = recursive expansion specifically

Issue 3.4: Generic Utilities in Specific Files

schema.go contains generic utilities:

  • LevenshteinDistance (string algorithm)
  • removeDuplicates (slice utility)
  • min (math utility)

Issue: These are generic utilities unrelated to schema validation

Recommendation: Extract to pkg/util/ or similar package for reuse
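A sketch of what two of the extracted helpers could look like once they no longer live in schema.go. The bodies are written from scratch for illustration; the existing implementations are not shown in this report, so details such as rune-based comparison and order-preserving deduplication are assumptions.

```go
package main

import "fmt"

// RemoveDuplicates returns the distinct elements of in, keeping the order in
// which each element first appears.
func RemoveDuplicates[T comparable](in []T) []T {
	seen := make(map[T]struct{}, len(in))
	out := make([]T, 0, len(in))
	for _, v := range in {
		if _, ok := seen[v]; ok {
			continue
		}
		seen[v] = struct{}{}
		out = append(out, v)
	}
	return out
}

// LevenshteinDistance returns the edit distance between a and b, computed
// over runes with a two-row dynamic-programming table.
func LevenshteinDistance(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		curr := make([]int, len(rb)+1)
		curr[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			best := prev[j-1] + cost // substitution (or match)
			if d := prev[j] + 1; d < best {
				best = d // deletion
			}
			if d := curr[j-1] + 1; d < best {
				best = d // insertion
			}
			curr[j] = best
		}
		prev = curr
	}
	return prev[len(rb)]
}

func main() {
	fmt.Println(RemoveDuplicates([]string{"on", "tools", "on"}))
	fmt.Println(LevenshteinDistance("kitten", "sitting"))
}
```

Since Go 1.21, `min` is a built-in function, so that helper may be removable rather than moved, depending on the module's minimum Go version.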

Issue 3.5: Validate Function Proliferation

8 validate functions with minor variations:

Functions:

  • ValidateMainWorkflowFrontmatterWithSchema
  • ValidateMainWorkflowFrontmatterWithSchemaAndLocation
  • ValidateIncludedFileFrontmatterWithSchema
  • ValidateIncludedFileFrontmatterWithSchemaAndLocation
  • ValidateMCPConfigWithSchema
  • validateWithSchema (internal)
  • validateWithSchemaAndLocation (internal)
  • validateEngineSpecificRules, validateCommandTriggerConflicts, validatePathComponents

Issue: Many variants for validation with slight differences

Recommendation: Consider using option pattern or builder pattern to reduce function count while maintaining flexibility
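The functional-options form of this idea could be sketched as follows. `ValidateFrontmatter`, `WithLocation`, `WithSchema`, and the "frontmatter is empty" rule are hypothetical; the stand-in check only marks where real schema validation would run.

```go
package main

import (
	"errors"
	"fmt"
)

// validateConfig collects the knobs that currently distinguish the
// *WithSchema and *WithSchemaAndLocation variants (hypothetical fields).
type validateConfig struct {
	schema   string // which schema to apply, e.g. "main" or "included"
	filePath string // when set, errors are prefixed with the file location
}

// ValidateOption mutates the configuration before validation runs.
type ValidateOption func(*validateConfig)

// WithSchema selects the schema to validate against.
func WithSchema(name string) ValidateOption {
	return func(c *validateConfig) { c.schema = name }
}

// WithLocation makes errors include the originating file path.
func WithLocation(path string) ValidateOption {
	return func(c *validateConfig) { c.filePath = path }
}

// ValidateFrontmatter is a single entry point replacing several variants.
func ValidateFrontmatter(fm map[string]any, opts ...ValidateOption) error {
	cfg := validateConfig{schema: "main"}
	for _, opt := range opts {
		opt(&cfg)
	}
	// Stand-in rule: real code would validate fm against cfg.schema here.
	if len(fm) == 0 {
		err := errors.New("frontmatter is empty")
		if cfg.filePath != "" {
			return fmt.Errorf("%s: %w", cfg.filePath, err)
		}
		return err
	}
	return nil
}

func main() {
	fmt.Println(ValidateFrontmatter(map[string]any{}))
	fmt.Println(ValidateFrontmatter(map[string]any{}, WithSchema("included"), WithLocation("shared/tools.md")))
	fmt.Println(ValidateFrontmatter(map[string]any{"engine": "claude"}))
}
```

The exported Validate* functions could be kept as wrappers over the single entry point so that existing callers compile unchanged.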


4. Utility Packages - Generally Well-Organized

Packages analyzed:

  • pkg/console (5 files): banner.go, console.go, format.go, render.go, spinner.go - good organization
  • pkg/constants (1 file): constants.go - appropriate
  • pkg/gateway (1 file): gateway.go (578 lines) - could be split if complex
  • pkg/gitutil (1 file): gitutil.go - appropriate for git utilities
  • pkg/logger (2 files): logger.go, error_formatting.go - good separation
  • pkg/styles (1 file): theme.go - appropriate
  • pkg/testutil (1 file): tempdir.go - appropriate
  • pkg/timeutil (1 file): format.go - appropriate
  • pkg/tty (1 file): tty.go - appropriate

No major issues identified in utility packages - they follow good organizational practices!


Semantic Function Clustering Analysis

Build Functions (50+ functions in pkg/workflow)

Pattern: build* - Construct GitHub workflow jobs and steps

Well-organized files:

  • compiler_jobs.go: buildActivationJob, buildMainJob, buildPreActivationJob, buildConclusionJob
  • compiler_safe_outputs.go: 16x buildCreateOutput*Job functions

Issue: Some build functions remain in compiler.go instead of compiler_jobs.go

Cluster recommendation: Consolidate ALL build* functions in compiler_jobs.go


Parse Functions (50+ functions across packages)

Pattern: parse* - Parse configuration from various sources

Current distribution:

  • pkg/workflow/compiler_parse.go: 30+ parse/extract functions
  • pkg/cli: parseWorkflowSpec, parseRepoSpec, parseSquidAccessLog, parseIssueSpec, etc.
  • pkg/parser: ParseImportDirective, ParseGitHubURL, ParseRunURL, ParseMCPConfig

Recommendation: Parsing is appropriately distributed by domain - no consolidation needed


Extract Functions (70+ functions across packages)

Pattern: extract* - Extract data from configurations

Current distribution:

  • pkg/workflow/compiler_parse.go: 30+ extract* functions
  • pkg/parser/frontmatter.go: 12 extract*FromContent functions
  • pkg/parser/frontmatter_content.go: 7 Extract* functions

Issue: High concentration in compiler_parse.go (30+ functions)

Recommendation: Split compiler_parse.go as noted in Issue 1.1


Validate Functions (80+ functions across packages)

Pattern: validate* - Validation logic

Current distribution:

  • pkg/workflow: 35+ validate functions across 10+ files
  • pkg/parser: 8 validate functions in schema.go
  • pkg/cli: validation in multiple files

Issue: Validation scattered across many files in each package

Recommendation: Consolidate into validation/ subdirectories as noted in Issues 1.2 and 2.2


Generate Functions (60+ functions in pkg/workflow)

Pattern: generate* - Generate YAML, prompts, scripts dynamically

Current distribution:

  • compiler_yaml.go: 27 generate* functions
  • scripts.go: implied generation
  • Other files: scattered generate functions

Issue: Some generate* functions in compiler_yaml.go are for prompts, not YAML

Recommendation: Split compiler_yaml.go into compiler_yaml_generation.go, compiler_yaml_steps.go, and compiler_yaml_prompts.go, as noted in Issue 1.1


Merge Functions (15+ functions in pkg/workflow)

Pattern: merge* - Merge configurations from imports

Files:

  • compiler.go: MergeMCPServers, MergeNetworkPermissions, MergeSafeOutputs, MergeSecretMasking, MergeTools, plus 5+ internal merge functions
  • permissions.go: Merge method
  • comment.go: MergeEventsForYAML

Recommendation: Well-clustered - consider moving them all into compiler.go or a separate merge utilities file


Render Functions (40+ functions in pkg/workflow + pkg/cli)

Pattern: render* - Render configuration to YAML/strings

Current distribution:

  • pkg/workflow: expression_nodes.go (15+ Render methods), mcp_renderer.go, plus rendering in multiple engine files
  • pkg/cli/audit_report.go: 12+ render* functions

Issue: audit_report.go has too many render functions (Issue 2.1)

Recommendation: Extract audit_report_render.go as noted


Collect Functions (40+ functions in pkg/workflow)

Pattern: collect* - Collect data from workflow configurations

Files:

  • compiler.go and related: collectGoDependencies, collectNpmDependencies, collectPipDependencies, collectDockerImages, collectHTTPMCPHeaderSecrets, etc.

Issue: Duplicate patterns for package collection (Issue 1.10)

Recommendation: Consolidate collect*Dependencies functions


Ensure Functions (15+ functions in pkg/cli)

Pattern: ensure* - Initialization and guarantee operations

Issue: 15+ ensure functions suggest missing abstraction (Issue 2.4)

Recommendation: Extract common ensure pattern


Naming Convention Recommendations

Based on analysis, suggest standardizing function naming:

  1. build* - Build jobs/steps (output: Job or []string)
  2. parse* - Parse YAML/JSON to structs (input: string/map, output: struct)
  3. extract* - Extract values from maps (input: map, output: simple types)
  4. generate* - Generate content dynamically (output: string/[]string)
  5. validate* - Validation logic (output: error)
  6. create* - Create new instances (constructors)
  7. merge* - Merge configurations (input: 2+ configs, output: merged)
  8. render* - Render to YAML/string (output: string)
  9. collect* - Collect items from sources (output: slice)
  10. add* - Add items to collections (mutating operations)
  11. convert* - Convert between formats (input: A, output: B)
  12. format* - Format data for output (output: formatted string)
  13. get* - Get embedded scripts/constants (output: string)
  14. New* - Constructors (output: new instance)
  15. is*/has* - Boolean checks (output: bool)
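To make the input/output shapes in the list above concrete, here is a small sketch of four of the prefixes applied to one toy type. The `config` struct and all four functions are hypothetical and exist only to illustrate the signatures each prefix implies.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"sort"
)

// config is a toy type used only for this illustration.
type config struct {
	Name  string         `json:"name"`
	Tools map[string]any `json:"tools"`
}

// parse*: raw input in, struct out.
func parseConfig(raw string) (config, error) {
	var c config
	if err := json.Unmarshal([]byte(raw), &c); err != nil {
		return config{}, fmt.Errorf("parse config: %w", err)
	}
	return c, nil
}

// extract*: map in, simple values out (sorted for stable output).
func extractToolNames(tools map[string]any) []string {
	names := make([]string, 0, len(tools))
	for name := range tools {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// validate*: returns only an error.
func validateConfig(c config) error {
	if c.Name == "" {
		return errors.New("name is required")
	}
	return nil
}

// render*: returns a string.
func renderConfig(c config) string {
	return fmt.Sprintf("name: %s\n", c.Name)
}

func main() {
	c, err := parseConfig(`{"name":"demo","tools":{"github":{},"bash":{}}}`)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(extractToolNames(c.Tools))
	fmt.Println(validateConfig(c))
	fmt.Print(renderConfig(c))
}
```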

Implementation Priority

Priority 1: High-Impact Refactoring (Weeks 1-2)

Focus on splitting oversized files to improve immediate maintainability:

  1. Split pkg/workflow/compiler_parse.go (Issue 1.1)

    • Create compiler_parse_config.go for all parse*Config functions
    • Create compiler_extract.go for all extract* functions
    • Impact: Reduces single-file complexity from 50+ functions
  2. Split pkg/workflow/compiler_yaml.go (Issue 1.1)

    • Create compiler_yaml_generation.go, compiler_yaml_steps.go, compiler_yaml_prompts.go
    • Impact: Clearer separation of YAML, step, and prompt generation
  3. Split pkg/cli/add_command.go (Issue 2.1)

    • Move compile* functions to compile_command.go
    • Impact: Single responsibility per command file
  4. Split pkg/cli/audit_report.go (Issue 2.1)

    • Create audit_report_render.go for all render* functions
    • Create audit_report_analysis.go for all generate* functions
    • Impact: Modularizes 1233-line file
  5. Split pkg/parser/frontmatter.go (Issue 3.1)

    • Create frontmatter_imports.go, frontmatter_includes.go, frontmatter_extract.go, frontmatter_merge.go
    • Impact: Breaks down 1284-line file into logical modules
  6. Split pkg/parser/schema.go (Issue 3.1)

    • Create schema_validate.go, schema_suggestions.go, schema_compile.go
    • Impact: Reduces 1157-line file complexity

Priority 2: Structural Refactoring (Weeks 3-4)

Create subdirectories for better organization:

  1. Create pkg/workflow/validation/ (Issue 1.2)

    • Move all validation files into subdirectory
    • Impact: Clear validation module boundary
  2. Create pkg/workflow/safeoutputs/ (Issue 1.4)

    • Move all safe output files into subdirectory
    • Impact: Modularizes safe outputs system
  3. Create pkg/workflow/engines/ (Issue 1.5)

    • Move all engine files into subdirectory
    • Impact: Isolates engine implementations
  4. Create pkg/workflow/github/ (Issue 1.6)

    • Organize create/update/close operations into subdirectory with subfolders
    • Impact: Groups related GitHub integration code

Priority 3: Code Quality Improvements (Weeks 5-6)

Reduce duplication and improve consistency:

  1. Consolidate Token Handling Functions (Issue 1.8)

    • Refactor add*GitHubToken functions using strategy pattern
    • Impact: Reduces duplication
  2. Consolidate MCP Rendering (Issue 1.9)

    • Extract common MCP rendering logic
    • Use template method pattern for engine-specific overrides
    • Impact: Reduces duplication across engines
  3. Consolidate Package Collection (Issue 1.10)

    • Create generic collectPackages function
    • Impact: Reduces duplication in collect*Dependencies
  4. Standardize Function Naming (Issue 2.3)

    • Apply consistent capitalization rules
    • Impact: Clearer API surface
  5. Extract Ensure Pattern (Issue 2.4)

    • Create common ensure/setup abstraction
    • Impact: Reduces repetitive initialization code

Priority 4: Documentation & Long-term (Ongoing)

  1. Document Function Naming Conventions

    • Add CONTRIBUTING.md section on function naming patterns
    • Impact: Maintains consistency for future development
  2. Extract Generic Utilities (Issue 3.4)

    • Move LevenshteinDistance, removeDuplicates, min to pkg/util/
    • Impact: Enables reuse across packages

Benefits of Proposed Refactoring

  1. Improved Maintainability: Smaller files with single responsibilities are easier to understand and modify
  2. Better Test Coverage: Modular code is easier to test in isolation
  3. Clearer Architecture: Subdirectories create natural module boundaries
  4. Reduced Duplication: Consolidating similar patterns reduces code size and maintenance burden
  5. Easier Onboarding: New contributors can understand code organization more quickly
  6. Consistent Patterns: Standardized naming and organization reduces cognitive load

Non-Issues (Good Examples to Maintain)

These areas demonstrate good organization and should serve as examples:

  1. Expression Handling (pkg/workflow): expression_parser.go, expression_builder.go, expression_extraction.go, expression_nodes.go, expression_validation.go - clear separation of concerns

  2. MCP Command Organization (pkg/cli): mcp.go, mcp_add.go, mcp_inspect.go, mcp_list.go, etc. - good subcommand structure

  3. Utility Packages: pkg/console, pkg/logger, pkg/gitutil, pkg/testutil - appropriate single-responsibility modules

  4. Function Naming Consistency: Most parse*, extract*, generate*, validate* functions follow consistent patterns


Analysis Metadata

  • Analysis Date: 2025-12-07
  • Repository: githubnext/gh-aw
  • Commit: f89757f
  • Analysis Method: Semantic code exploration and function clustering
  • Files Analyzed: 264 non-test Go files
  • Functions Cataloged: 3,000+
  • Tool Used: Claude Code + explore agents

Next Steps

  1. Review this analysis and prioritize refactoring tasks
  2. Create detailed implementation plans for Priority 1 items
  3. Establish refactoring guidelines to prevent regression
  4. Consider incremental implementation - break large files first, then structural changes
  5. Update documentation with new organizational patterns

This analysis provides a roadmap for improving code organization while maintaining functionality. The suggested changes are designed to be implemented incrementally without breaking existing functionality.

AI generated by Semantic Function Refactoring
