-
Notifications
You must be signed in to change notification settings - Fork 250
Description
Overview
The file pkg/parser/import_processor.go has grown to 1104 lines, well above the project's healthy threshold of 800 lines. It currently handles five distinct concerns in a single file: import parsing, BFS traversal, content extraction/merging, cycle detection, and remote import context. This task involves splitting it into smaller, focused files with clear boundaries.
Current State
- File:
pkg/parser/import_processor.go - Size: 1104 lines
- Test file: No dedicated
import_processor_test.go— coverage is spread across 9 separate test files (import_cache_test.go,import_cycle_test.go,import_error_test.go,import_remote_nested_test.go,import_syntax_test.go,import_topological_test.go, etc.) - Complexity: High — single BFS loop (lines 182–847) handles 20+ config merge operations; highly repetitive extract-merge pattern repeated ~15–20 times
Full File Analysis
Function / Type Inventory
| Symbol | Visibility | ~Lines | Responsibility |
|---|---|---|---|
ImportsResult |
exported struct | 20–52 | Aggregates all merged config fields from imports |
ImportInputDefinition |
exported struct | 59–65 | Input parameter definition |
ImportSpec |
exported struct | 68–74 | Single import spec (path + inputs) |
importLog |
var | 17 | Logger instance |
remoteImportOrigin |
unexported struct | 94–99 | Tracks GitHub repo context for nested imports |
importQueueItem |
unexported struct | 102–109 | BFS queue item with context |
ProcessImportsFromFrontmatter |
exported func | 82 | Public entry point |
ProcessImportsFromFrontmatterWithManifest |
exported func | 171 | Extended entry point with manifest |
ProcessImportsFromFrontmatterWithSource |
exported func | 177 | Extended entry point with source tracking |
parseRemoteOrigin |
unexported func | 117 | Parses workflowspec format into remote origin |
processImportsFromFrontmatterWithManifestAndSource |
unexported func | 182–847 | Core BFS loop — 665 lines |
findCyclePath |
unexported func | 851 | DFS helper to reconstruct cycle path |
dfsForCycle |
unexported func | 877 | DFS traversal for cycle detection |
topologicalSortImports |
unexported func | 918 | Kahn's algorithm topological sort |
extractImportPaths |
unexported func | 1072 | Extracts import paths from frontmatter |
Duplicate / Repetitive Patterns
Pattern 1 — Extract-from-Content (lines 635–804, ~170 lines)
Repeated ~15–20 times for different config sections (engines, steps, services, runtimes, permissions, safe-outputs, etc.):
xContent, err := extractXFromContent(string(content))
if err == nil && xContent != "" && xContent != "{}" {
xBuilder.WriteString(xContent + "\n")
}Pattern 2 — Array Deduplication with Set (lines 694–787, ~4 occurrences)
Identical dedup logic for bots, skip-roles, skip-bots, plugins, labels arrays.
Pattern 3 — Section Extraction (lines 288–295, 507–515, 933–938)
#Section suffix stripped from import path in 3 separate places.
Pattern 4 — Relative-to-Remote Path Resolution (lines 521–550, 506–587)
Duplicated logic converting relative import paths to remote GitHub paths.
Refactoring Strategy
Proposed File Splits
-
import_processor.go(trimmed to entry points only)- Exported structs:
ImportsResult,ImportInputDefinition,ImportSpec - Exported entry points:
ProcessImportsFromFrontmatter,ProcessImportsFromFrontmatterWithManifest,ProcessImportsFromFrontmatterWithSource - Internal entry:
processImportsFromFrontmatterWithManifestAndSource(trimmed) - Estimated LOC: ~200
- Exported structs:
-
import_traversal.go(BFS queue and core traversal logic)- Types:
importQueueItem,remoteImportOrigin - Functions:
parseRemoteOrigin, BFS loop body (currently lines 182–634) - Responsibility: Breadth-first traversal, queue management, remote context propagation
- Estimated LOC: ~350
- Types:
-
import_merger.go(content extraction and merging)- Functions: all
extractXFromContentcall sites consolidated with a helper, dedup logic,stringBuilderMergeutility - Responsibility: Merge 20+ config sections from imported files into
ImportsResult - Estimated LOC: ~250
- Functions: all
-
import_graph.go(cycle detection and topological sort)- Functions:
topologicalSortImports,findCyclePath,dfsForCycle - Responsibility: Dependency graph, cycle detection, ordering
- Estimated LOC: ~200
- Functions:
-
import_paths.go(path parsing and resolution)- Functions:
extractImportPaths, section-suffix parsing, relative-to-remote resolution helpers - Responsibility: All import path manipulation and resolution logic
- Estimated LOC: ~100
- Functions:
Shared Utilities to Extract
mergeStringArrayUniq(dst *[]string, src []string)— eliminates 4 copies of the set-based dedup loopextractSection(path string) (cleanPath, section string)— eliminates 3 copies of#Sectionsuffix parsing
Interface Abstractions
- Consider extracting an
importResolverinterface to decouple local vs. remote file fetching, improving unit testability without real filesystem/network access.
Test Coverage Plan
Existing tests cover behaviour end-to-end via the 9 test files. After splitting, verify each new file independently:
-
import_paths_test.goextractImportPathswith empty/nil frontmatter- Section suffix parsing (
workflow#Section) - Relative-to-remote path conversion
-
import_graph_test.gotopologicalSortImportswith linear, branching, diamond graphsfindCyclePath/dfsForCyclewith simple and complex cycles- Already partially covered by
import_cycle_test.goandimport_topological_test.go
-
import_merger_test.gomergeStringArrayUniqdeduplication helper- Merging of individual config sections (tools, steps, engines)
- Empty/nil content handling
-
import_traversal_test.go- Remote origin parsing (
parseRemoteOrigin) - BFS queue ordering
- Already partially covered by
import_remote_nested_test.go
- Remote origin parsing (
Target: ≥80% coverage for all new files; all existing tests must continue to pass.
Implementation Guidelines
- Preserve Behaviour: All existing
ProcessImports*functions must return identical results - Maintain Exports:
ImportsResult,ImportInputDefinition,ImportSpec, and the threeProcessImports*functions stay exported and unchanged - Add Tests First: Write unit tests for extracted helpers before moving code
- Incremental Splits: Split one module at a time, running
make test-unitafter each - Extract Helpers First: Start with
mergeStringArrayUniqandextractSectionto reduce repetition before splitting files - Build Tags: Every new
*_test.gofile must start with//go:build !integration - Linting: Run
make lintafter each split — watch for unused helper warnings
Acceptance Criteria
-
import_processor.gosplit into 4–5 focused files - Each new file is under 400 lines
-
mergeStringArrayUniqandextractSectionhelpers extracted and tested - All tests pass (
make test-unit) - No breaking changes to public API
- Code passes linting (
make lint) - Build succeeds (
make build)
Additional Context
- Repository Guidelines: See
AGENTS.md— "Prefer many smaller files grouped by functionality" - Validation Complexity: Target 100–200 lines per file, hard limit 300 (same principle applies here)
- Testing: Match build-tag and table-driven patterns in existing
pkg/parser/*_test.gofiles - Workflow run: §22308275991
Priority: Medium
Effort: Medium (well-defined boundaries, no API changes, existing test coverage provides safety net)
Expected Impact: Improved maintainability, easier code navigation, reduced merge conflicts, clearer ownership of each concern
Generated by Daily File Diet
- expires on Feb 25, 2026, 1:36 PM UTC