[repository-quality] Repository Quality Improvement Report - Large File Refactoring & Maintainability (2026-05-20) #33561
🎯 Repository Quality Improvement Report - Large File Refactoring & Maintainability
Analysis Date: 2026-05-20
Focus Area: Large File Refactoring & Maintainability
Strategy Type: Custom
Custom Area: Yes — tailored to gh-aw's specific challenge of maintaining several 1000+ line source files
Executive Summary
This analysis reveals that gh-aw has 29 source files exceeding 800 lines, with the largest at 1,191 lines. While the codebase demonstrates excellent test coverage (2:1 test-to-source ratio) and low technical debt (only 17 TODO/FIXME comments), the large file sizes create maintainability challenges:
- Cognitive Load: Files like frontmatter_extraction_yaml.go (1,191 lines, 155 if-statements) and import_field_extractor.go (1,172 lines, 164 if-statements, 22 for-loops) require significant mental overhead to understand and modify.
- Inconsistent Test Coverage: While audit.go has a roughly 1:1 test-to-source line ratio, frontmatter_extraction_yaml.go has a ratio of only 4% (53 test lines vs 1,191 source lines), suggesting complexity barriers to testing.
- Validation Guideline Violations: Nine validation files exceed the documented 300-line hard limit (from scratchpad/validation-refactoring.md), with safe_outputs_validation_config.go at 452 lines and permissions_validation.go at 388 lines.
- Hidden Modularity Opportunities: The codebase shows natural clustering (106 compiler-related files, 34 audit-related files, 82 safe-output files), but individual files within these clusters remain monolithic.
This report identifies five high-impact refactoring tasks that will reduce cognitive complexity, improve testability, and align with documented architecture guidelines.
Full Analysis Report
Focus Area: Large File Refactoring & Maintainability
Current State Assessment
Metrics Collected:
Findings
Strengths
Areas for Improvement
Monolithic frontmatter extraction (Priority: High)
- frontmatter_extraction_yaml.go: 1,191 lines with 14 functions averaging 85 lines each
- Includes a 600-line function (commentOutProcessedFieldsInOnSection)
Oversized import processing (Priority: High)
- import_field_extractor.go: 1,172 lines with high cyclomatic complexity (164 if-statements, 22 for-loops, 6 switch-statements)
Validation guideline violations (Priority: High)
- Nine files exceed the 300-line hard limit documented in scratchpad/validation-refactoring.md
- safe_outputs_validation_config.go: 452 lines (about 150% of the limit)
- permissions_validation.go: 388 lines (about 129% of the limit)
- engine_validation.go: 429 lines (143% of the limit)
Large CLI commands (Priority: Medium)
- forecast.go: 1,142 lines with 28 functions (too many responsibilities)
- audit.go: 1,132 lines with a 452-line AuditWorkflowRun function
Compiler component sprawl (Priority: Medium)
- Compiler-related files sit in pkg/workflow without internal package structure
- compiler_yaml_main_job.go: 1,089 lines (30 for-loops, 68 if-statements)
Detailed Analysis
File Size Distribution
Complexity Indicators
High Complexity Files (complexity = if + for + switch statements):
- frontmatter_extraction_yaml.go: 171 control flow statements, 600-line function
- import_field_extractor.go: 192 control flow statements, 124-line function
- audit.go: 140 control flow statements, 452-line function
- compiler_yaml_main_job.go: 98 control flow statements, 32.9% comment ratio (good)
- forecast.go: 121 control flow statements, 28 functions (high function count)
Test Coverage Gaps
Files with test-to-source ratio < 0.5:
- frontmatter_extraction_yaml.go: 0.04 (critical)
- forecast.go: 0.10 (needs improvement)
Root cause: Complex monolithic files are harder to test in isolation.
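The complexity figures above count if, for, and switch statements per file. The report does not say how those counts were produced; one way to reproduce that kind of metric is to walk the Go AST. The following is a minimal sketch under assumptions: counting range loops as for-loops and type switches as switches is a choice made here, and countControlFlow and the sample source are illustrative names, not part of gh-aw.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// countControlFlow parses Go source text and returns the number of
// if, for (including range), and switch (including type switch) statements.
// Assumption: this mirrors the report's "if + for + switch" metric.
func countControlFlow(src string) (int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "input.go", src, 0)
	if err != nil {
		return 0, err
	}
	count := 0
	ast.Inspect(file, func(n ast.Node) bool {
		switch n.(type) {
		case *ast.IfStmt, *ast.ForStmt, *ast.RangeStmt,
			*ast.SwitchStmt, *ast.TypeSwitchStmt:
			count++
		}
		return true // keep descending so nested statements are counted
	})
	return count, nil
}

// sample is a hypothetical input with one range loop, one if, and one switch.
const sample = `package demo

func classify(xs []int) int {
	total := 0
	for _, x := range xs {
		if x < 0 {
			continue
		}
		switch {
		case x > 10:
			total += 2
		default:
			total++
		}
	}
	return total
}
`

func main() {
	n, err := countControlFlow(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("control flow statements:", n)
}
```

An AST walk counts each else-if as its own if-statement and ignores keywords inside comments and strings, so its totals can differ slightly from a text search over the same file.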
Validation Architecture Compliance
From scratchpad/validation-refactoring.md:
Current violations: 9 files exceed the hard limit, with the worst offenders at 452, 429, and 388 lines.
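To make the size of each violation concrete, the overage against the 300-line hard limit can be computed directly. This is a small sketch: percentOfLimit and percentOver are illustrative helper names, and the three line counts are the ones quoted in this report.

```go
package main

import "fmt"

// percentOfLimit returns the file size as a percentage of the limit.
func percentOfLimit(lines, limit int) float64 {
	return float64(lines) / float64(limit) * 100
}

// percentOver returns how far the file exceeds the limit, as a percentage
// of the limit. Files at or under the limit yield 0.
func percentOver(lines, limit int) float64 {
	if lines <= limit {
		return 0
	}
	return float64(lines-limit) / float64(limit) * 100
}

func main() {
	const hardLimit = 300 // hard limit cited from scratchpad/validation-refactoring.md

	files := []struct {
		name  string
		lines int
	}{
		{"safe_outputs_validation_config.go", 452},
		{"engine_validation.go", 429},
		{"permissions_validation.go", 388},
	}

	for _, f := range files {
		fmt.Printf("%-36s %4d lines  %.0f%% of limit  %.0f%% over\n",
			f.name, f.lines,
			percentOfLimit(f.lines, hardLimit),
			percentOver(f.lines, hardLimit))
	}
}
```

Read this way, a 452-line file is roughly one and a half times the limit, which is about 51% over it.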
🤖 Tasks for Copilot Agent
NOTE TO PLANNER AGENT: Split the following tasks into individual work items.
Improvement Tasks
Task 1: Split frontmatter_extraction_yaml.go into Focused Modules
Priority: High
Estimated Effort: Large
Focus Area: Large File Refactoring & Maintainability
Description: Refactor pkg/workflow/frontmatter_extraction_yaml.go (1,191 lines) into smaller, testable modules. The file currently mixes YAML comment manipulation, condition extraction, and GitHub-specific field handling. The 600-line commentOutProcessedFieldsInOnSection function is a major complexity hotspot.
Acceptance Criteria:
- Create pkg/workflow/frontmatter_yaml_comments.go (~300 lines)
- Create pkg/workflow/frontmatter_conditions.go (~200 lines)
- Create pkg/workflow/frontmatter_github_fields.go (~150 lines)
- Tests pass (make test-unit to verify)
- File naming follows frontmatter_{subdomain}.go
Code Region: pkg/workflow/frontmatter_extraction_yaml.go
Task 2: Refactor Validation Files to Comply with 300-Line Guideline
Priority: High
Estimated Effort: Large
Focus Area: Large File Refactoring & Maintainability
Description: Nine validation files violate the documented 300-line hard limit from scratchpad/validation-refactoring.md. The worst offenders are safe_outputs_validation_config.go (452 lines, about 150% of the limit), engine_validation.go (429 lines, 143% of the limit), and permissions_validation.go (388 lines, about 129% of the limit). Refactor these files following the documented decision tree and naming conventions.
Acceptance Criteria:
- File naming follows {domain}_{subdomain}_validation.go
- Update scratchpad/validation-refactoring.md with new file references
- Tests pass (go test -v ./pkg/workflow/*validation*)
Code Region: pkg/workflow/safe_outputs_validation_config.go, pkg/workflow/engine_validation.go, pkg/workflow/permissions_validation.go, pkg/workflow/agent_validation.go, pkg/workflow/model_alias_validation.go, pkg/workflow/network_firewall_validation.go, pkg/workflow/repository_features_validation.go, pkg/workflow/safe_outputs_max_validation.go, pkg/cli/run_workflow_validation.go
Task 3: Extract Audit Command Rendering Logic into Separate Package
Priority: Medium
Estimated Effort: Medium
Focus Area: Large File Refactoring & Maintainability
Description: The audit command has 34 related files totaling over 15,000 lines, with core files like audit.go (1,132 lines) and audit_diff.go (981 lines) mixing command logic, business logic, and rendering. The 452-line AuditWorkflowRun function in audit.go is a complexity hotspot. Extract rendering concerns into a focused sub-package.
Acceptance Criteria:
- Create pkg/cli/audit/render package for all rendering logic
- Move audit_report_render*.go files (7 files) to render package
- Move audit_diff_render.go to render package
- Reduce audit.go from 1,132 to <600 lines
- Split AuditWorkflowRun (452 lines) into 5-7 smaller functions
- Tests pass (go test -v ./pkg/cli/audit*)
Code Region: pkg/cli/audit.go, pkg/cli/audit_diff.go, pkg/cli/audit_report_render*.go, pkg/cli/audit_diff_render.go
pkg/cli/audit/
├── render/
│ ├── overview.go (from audit_report_render_overview.go)
│ ├── findings.go (from audit_report_render_findings.go)
│ ├── firewall.go (from audit_report_render_firewall.go)
│ ├── guard.go (from audit_report_render_guard.go)
│ ├── jobs.go (from audit_report_render_jobs.go)
│ ├── tools.go (from audit_report_render_tools.go)
│ ├── diff.go (from audit_diff_render.go)
│ └── base.go (shared rendering utilities)
Task 4: Refactor import_field_extractor.go to Reduce Complexity
Priority: Medium
Estimated Effort: Medium
Focus Area: Large File Refactoring & Maintainability
Description: pkg/parser/import_field_extractor.go (1,172 lines, 192 control flow statements) handles import resolution, schema validation, and observability configuration. The file has good test coverage (54%), but its high cyclomatic complexity makes it difficult to extend. Split it into domain-focused modules.
Acceptance Criteria:
- Create pkg/parser/import_observability.go (~300 lines)
- Create pkg/parser/import_validation.go (~250 lines)
- Reduce import_field_extractor.go to ~400 lines
- Tests pass (go test -v ./pkg/parser/)
Code Region: pkg/parser/import_field_extractor.go
Extraction phase:
Create pkg/parser/import_observability.go (~300 lines):
- extractOTLPEndpointsFromObsMap
- mergeObservabilityConfigs
- observabilityImportEndpoint struct
Create pkg/parser/import_validation.go (~250 lines):
- validateWithImportSchema
- validateObjectInput
- validateGitHubAppJSON
Refactor import_field_extractor.go (keep ~400 lines):
- importAccumulator struct
- processImportQueue core logic
- prepareFrontmatter
- computeImportRelPath
Testing phase:
Maintain test coverage:
- import_observability_test.go and import_validation_test.go
Reduce cyclomatic complexity:
Validation:
- make build && make fmt after first extraction
- go test -v ./pkg/parser/ after each module extraction
- make agent-report-progress before committing
Success criteria: Three focused files (<400 lines each), 60%+ test coverage, <100 control flow statements per file, all tests passing.
Phase 2: Implement linter
Create pkg/linters/largefile/largefile.go:
Phase 3: Add tests
Create pkg/linters/largefile/largefile_test.go:
Phase 4: Integration
Add linter to golangci-lint configuration (.golangci.yml):
Update make lint to include the new analyzer
Document exemptions for existing large files (add header comment):
Phase 5: Validation
- make build && make fmt
- make lint - expect warnings on existing large files
- make agent-report-progress
Success criteria: Guidelines document created, linter implemented and integrated, passing on current codebase with exemptions documented.