Agentic Workflow Lock File Statistics - December 4, 2024 #5484
Closed
Replies: 1 comment
This discussion was automatically closed because it was created by an agentic workflow more than 3 days ago.
This comprehensive analysis examines all 104 .lock.yml files in the repository, revealing insights into workflow structure, trigger patterns, safe outputs, permissions, and configuration preferences across the gh-aw agentic workflow ecosystem.
Key Highlights:
Full Statistical Report
Executive Summary
File Size Distribution
Size Statistics:
The majority of lock files (61.5%) fall in the 300-400 KB range, indicating a consistent level of complexity across workflows.
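A size distribution like the one above could be computed roughly as follows. This is a minimal sketch, not the report's actual script: the 100 KB bucket width, the 1 KB = 1024 bytes convention, and the function names are assumptions made for illustration.

```python
from collections import Counter
from pathlib import Path

BUCKET_KB = 100  # assumed bucket width, matching the 100 KB ranges quoted in the report


def size_bucket(size_bytes: int, width_kb: int = BUCKET_KB) -> str:
    """Return a label such as '300-400 KB' for a file size given in bytes."""
    lower = (size_bytes // 1024) // width_kb * width_kb
    return f"{lower}-{lower + width_kb} KB"


def size_distribution(directory: str) -> dict[str, float]:
    """Percentage of *.lock.yml files in each size bucket (empty dict if none found)."""
    sizes = [p.stat().st_size for p in Path(directory).glob("*.lock.yml")]
    if not sizes:
        return {}
    counts = Counter(size_bucket(s) for s in sizes)
    return {bucket: round(100 * n / len(sizes), 1) for bucket, n in counts.items()}
```

Calling `size_distribution(".github/workflows")` from the repository root would return a mapping from bucket label to percentage of lock files.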
Engine Distribution
Key Finding: Copilot is the most widely adopted engine, suggesting strong integration with GitHub's AI capabilities.
Trigger Analysis
Most Popular Triggers
Common Trigger Combinations
Insight: The dominance of workflow_dispatch indicates workflows are designed for flexibility, allowing both automated and manual execution.
Schedule Patterns
Top 10 most common cron schedules:
0 9 * * *
0 0,6,12,18 * * *
0 8 * * *
0 10 * * 1-5
17 3 * * *
0 9 * * 1-5
0 9 * * 1
0 6 * * 0
0 15 * * 1
Pattern: Most scheduled workflows run daily in the morning (UTC timezone), with some high-frequency monitoring every 6 hours and weekly reports on Mondays.
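The grouping into daily, weekday, and weekly schedules can be sketched with a small classifier over the five cron fields. The category names and the `classify_cron` helper are hypothetical and only cover the simple expression shapes listed above; they are not taken from the report's scripts.

```python
from collections import Counter


def classify_cron(expr: str) -> str:
    """Roughly classify a 5-field cron expression by how often it fires."""
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 cron fields, got {len(fields)}: {expr!r}")
    minute, hour, day_of_month, month, day_of_week = fields
    if day_of_month != "*" or month != "*":
        return "other"
    if day_of_week == "*":
        # A list, step, or wildcard in the hour field means several runs per day.
        if "," in hour or "/" in hour or hour == "*":
            return "several times a day"
        return "daily"
    if day_of_week == "1-5":
        return "weekdays"
    if day_of_week.isdigit():
        return "weekly"
    return "other"


schedules = [
    "0 9 * * *", "0 0,6,12,18 * * *", "0 8 * * *", "0 10 * * 1-5", "17 3 * * *",
    "0 9 * * 1-5", "0 9 * * 1", "0 6 * * 0", "0 15 * * 1",
]
summary = Counter(classify_cron(s) for s in schedules)
```

Here `summary` counts how many of the listed schedules fall into each category.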
Safe Outputs Analysis
Safe Output Types Distribution
Total workflows with safe outputs: 88 (84.6% of all workflows)
Example Workflows by Safe Output Type
create-discussion (reporting and analysis):
create-issue (actionable findings):
add-comment (feedback and suggestions):
create-pull-request (automated improvements):
Safe Output Limits Configuration
Pattern: Most workflows use max=1 to create a single, consolidated output rather than multiple separate items. This prevents spam and keeps the repository clean.
Discussion Categories
Observation: "audits" is the most popular category (46% of discussion-creating workflows), indicating strong focus on monitoring and analysis.
Close-Older Pattern
close-older-discussions: true
close-older-issues: true
This pattern keeps the discussion list clean by automatically closing superseded reports, ensuring only the latest analysis is visible.
Structural Characteristics
Job and Step Complexity
Distribution:
Typical Lock File Structure:
A standard .lock.yml file in this repository has:
Timeout Configuration
Average Timeout: 11 minutes
Pattern: The most common setting is a 10-minute timeout (277 uses), with a secondary group using 20 minutes for more complex operations.
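Usage counts like these can be gathered by scanning the lock file text for `timeout-minutes` entries. The sketch below is an assumption about how such a tally might be done, not the report's actual script; it uses a plain regular expression rather than a YAML parser, so it counts every integer-valued occurrence, whether at job or step level.

```python
import re
from collections import Counter

# Matches lines such as "    timeout-minutes: 10" (integer literals only).
TIMEOUT_RE = re.compile(r"^[ \t]*timeout-minutes:[ \t]*(\d+)[ \t]*$", re.MULTILINE)


def timeout_counts(yaml_text: str) -> Counter:
    """Count how often each timeout-minutes value appears in a workflow file's text."""
    return Counter(int(value) for value in TIMEOUT_RE.findall(yaml_text))


sample = """\
jobs:
  agent:
    timeout-minutes: 20
    steps:
      - run: echo setup
        timeout-minutes: 10
  report:
    timeout-minutes: 10
"""
counts = timeout_counts(sample)
```

Summing the per-file counters across all lock files would give repository-wide totals.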
Permission Patterns
Most Common Permissions
Permission Scope Analysis
Security Posture: The overwhelming majority (84.6%) of workflows use read-only base permissions, only requesting write access for specific resources (issues, pull-requests) through safe-output mechanisms. This demonstrates strong security practices.
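A check for "read-only base permissions" can be expressed as a small predicate over an already-parsed `permissions` value. This is a hedged sketch: the function name is hypothetical, and it assumes the standard GitHub Actions access levels (`read`, `write`, `none`) and the `read-all` / `write-all` shorthands.

```python
from typing import Mapping, Union


def is_read_only(permissions: Union[str, Mapping[str, str]]) -> bool:
    """Return True if a workflow's permissions grant no write access."""
    if isinstance(permissions, str):
        # Shorthand form: only "read-all" counts as read-only here.
        return permissions == "read-all"
    # Mapping form: every scope must be "read" or "none".
    return all(level in ("read", "none") for level in permissions.values())
```

For example, `is_read_only({"contents": "read", "issues": "write"})` returns `False`, while `is_read_only({"contents": "read"})` returns `True`.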
Tool & Configuration Patterns
Tool Usage
Observation: Bash is the most widely used tool (59.6%), while cache-memory (40.4%) indicates significant use of persistent state for tracking trends and history.
Strict Mode Usage
Definition: Strict mode enforces stricter validation and security checks in workflow execution.
Pattern: About one-third of workflows use strict mode, typically for security-sensitive operations like malicious code scanning, file access, and token management.
Concurrency Controls
Finding: Very few workflows implement concurrency controls, suggesting most workflows are designed to run independently without conflicts.
Common Imports
Top 10 shared imports:
Key Finding: shared/reporting.md is imported by 38.5% of workflows, indicating strong standardization around reporting formats. This promotes consistency across analyses.
Workflow Categories
By Naming Patterns
By Purpose (inferred from names and safe outputs)
Monitoring & Auditing (30+ workflows): Continuous monitoring of repository health, security, and quality
Code Review & PR Analysis (15+ workflows): Automated code review, PR feedback, and analysis
Documentation (10+ workflows): Documentation generation, updates, and validation
Issue Management (10+ workflows): Issue triage, classification, and assignment
Testing & Smoke Tests (8+ workflows): Automated testing and health checks
Metrics & Analytics (10+ workflows): Data collection, analysis, and visualization
Maintenance & Cleanup (5+ workflows): Repository maintenance and cleanup tasks
Interesting Findings
Copilot Dominance: 40.4% of workflows use the Copilot engine, significantly more than Claude (25%) or Codex (7.7%). This suggests strong integration with GitHub's native AI capabilities and possibly better performance or cost characteristics for this use case.
Manual Override Pattern: 77.9% of workflows support workflow_dispatch, even when scheduled. This design pattern enables developers to manually trigger workflows for debugging, testing, or ad-hoc analysis without waiting for scheduled runs.
Discussion-First Reporting: 39 workflows (44.3% of those with safe outputs) use create-discussion as their primary output mechanism, with most (89.7%) automatically closing older discussions. This creates a "single source of truth" pattern where the latest analysis is always the most visible.
Security-Conscious Design: 84.6% of workflows use read-only base permissions, only granting write access through controlled safe-output mechanisms. This demonstrates defense-in-depth security practices.
Standardized Reporting: The shared/reporting.md import is used by 40 workflows (38.5%), indicating strong standardization around reporting formats and structures across the repository.
Timeout Standardization: 277 workflow steps use the 10-minute timeout, with 104 using 20 minutes. This bimodal distribution suggests two classes of operations: quick checks and deeper analysis.
Morning UTC Scheduling: Most scheduled workflows run in the morning UTC hours (8-10 AM), which falls within European working hours and overnight for US time zones.
Size Consistency: 87.5% of lock files fall in the 200-400 KB range, indicating consistent complexity across workflows despite diverse purposes.
Strict Mode Adoption: Only 35.6% of workflows use strict mode, suggesting it's reserved for security-sensitive operations rather than being a default setting.
Low Concurrency Needs: Only 2 workflows (1.9%) implement concurrency controls, indicating workflows are designed to be independent and non-conflicting.
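The percentages quoted in these findings follow from the raw counts stated in the report (104 lock files, 88 with safe outputs). A quick arithmetic cross-check, using a hypothetical `pct` helper that rounds to one decimal place:

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


TOTAL_WORKFLOWS = 104    # all .lock.yml files analyzed
WITH_SAFE_OUTPUTS = 88   # workflows that declare safe outputs

safe_output_share = pct(WITH_SAFE_OUTPUTS, TOTAL_WORKFLOWS)  # share of all workflows
discussion_share = pct(39, WITH_SAFE_OUTPUTS)                # create-discussion users
reporting_import_share = pct(40, TOTAL_WORKFLOWS)            # shared/reporting.md imports
concurrency_share = pct(2, TOTAL_WORKFLOWS)                  # concurrency controls
```

These evaluate to 84.6, 44.3, 38.5, and 1.9 respectively, matching the figures in the text.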
Recommendations
Based on this statistical analysis, here are recommendations for improving agentic workflow practices:
For New Workflow Authors
For Repository Maintainers
For Platform Development
Methodology
Data Collection
.lock.yml files in .github/workflows/ directory
Analysis Scripts
Analysis performed using multiple specialized scripts:
analyze_lockfiles.sh - Primary data extraction
extract_detailed.sh - Engine, tool, and configuration extraction
safe_output_analysis.sh - Safe output pattern analysis
final_stats.sh - Schedule patterns and trigger combinations
Data Quality
Limitations
Cache Persistence
Analysis scripts and results saved to /tmp/gh-aw/cache-memory/ for:
Generated by Lockfile Statistics Analysis Agent on 2024-12-04T00:00:00Z