diff --git a/docs/adr/37706-quote-env-scalars-containing-colon-space.md b/docs/adr/37706-quote-env-scalars-containing-colon-space.md new file mode 100644 index 00000000000..11577ac920e --- /dev/null +++ b/docs/adr/37706-quote-env-scalars-containing-colon-space.md @@ -0,0 +1,41 @@ +# ADR-37706: Quote env scalars containing `: ` via post-hoc YAML text patching + +**Date**: 2026-06-08 +**Status**: Draft + +## Context + +The compiler emits GitHub Actions lock files (`*.lock.yml`) by assembling YAML largely through string building rather than a structured marshaller. Env values that contain the token sequence `": "` — for example `ANTHROPIC_CUSTOM_HEADERS: "x-aw-gw-github-repo: ${{ github.repository }}"` — were rendered as plain (unquoted) YAML scalars. YAML parsers interpret an unquoted `: ` inside a scalar as a nested mapping separator, producing invalid YAML that fails to parse. The affected values flow through several independent rendering paths: top-level `env` section extraction, step/job env rendering for engine-generated steps, and the shared `yamlStringValue` helper. A fix needed to make these values valid scalars without altering output for values that do not require quoting. + +## Decision + +We will guarantee that env values containing `": "` are emitted as quoted YAML scalars by centralizing the quoting rule in a shared helper, `quoteYAMLValueContainingColonSpace`, in `pkg/workflow/yaml_env_helpers.go`, and applying it across all env rendering paths. For text-assembled output we use a companion function, `quoteEnvValuesContainingColonSpace`, that post-processes already-generated YAML text line-by-line, detecting direct children of `env:` (and `- env:`) maps by indentation and quoting only their direct values. This in-place text-patching approach is applied to top-level `env` extraction (`frontmatter_extraction_yaml.go`), step/job env rendering (`compiler_yaml_step_conversion.go`), and the `yamlStringValue` helper (`engine_helpers.go`). Values that already begin with a quote, block, or flow indicator, or that do not contain `": "`, are left unchanged. + +## Alternatives Considered + +### Alternative 1: Serialize env blocks with a structured YAML marshaller +Instead of patching generated text, the env maps could be marshalled through a YAML library (e.g. `goccy/go-yaml`), which would quote scalars correctly by construction. This was not chosen because the compiler's rendering paths assemble YAML as strings with bespoke indentation, comment preservation (e.g. `uses:` version comments), and expression passthrough (`${{ ... }}`); routing env through a marshaller would risk reflowing or re-quoting output that other code and golden tests depend on, a far larger and riskier change than the targeted scalar fix. *[TODO: verify whether a localized marshaller for just the env sub-tree was evaluated.]* + +### Alternative 2: Quote every env value unconditionally +Every env value could be wrapped in quotes regardless of content, sidestepping detection logic. This was not chosen because it would change the output for the large set of values that are already valid unquoted, churning existing lock files and golden-file tests and reducing the readability of generated YAML. The chosen approach deliberately preserves output for values that do not require quoting. + +## Consequences + +### Positive +- Compiled lock files with `: `-containing env values (notably custom header values embedding expressions) are now valid, parseable YAML. +- The quoting rule lives in a single shared helper, so the three rendering paths apply identical behavior and future call sites can reuse it. +- Output is unchanged for env values that do not require quoting, minimizing churn to existing lock files and golden tests. + +### Negative +- `quoteEnvValuesContainingColonSpace` parses YAML by string/indentation heuristics rather than a real parser; it assumes compiler-generated, consistently indented input and only rewrites direct `env:` children, so it is brittle to future changes in how env blocks are emitted (nested mappings or reindentation could silently bypass it). +- The same quoting concern is now addressed in multiple places (text post-processing plus `yamlStringValue`), so the invariant "env scalars with `: ` are quoted" is enforced by convention across paths rather than at a single serialization boundary. +- Only the `": "` token is targeted; other YAML-significant scalar contents (leading `@`, `#`, `*`, `&`, trailing spaces) remain the responsibility of separate logic and are not covered by this change. + +### Neutral +- Both `env:` and inline `- env:` list forms are handled, reflecting that env can appear as a mapping key or on a list item. +- A focused regression test (`TestGenerateYAMLWithEnvironmentValueContainingColonSpace`) compiles a workflow with an `engine.env` colon-space value and asserts the output stays quoted and unmarshals as valid YAML. +- Non-string env values fall back to default `%v` formatting and are unaffected by the quoting helper. + +--- + +*This is a DRAFT ADR generated by the [Design Decision Gate](https://github.com/github/gh-aw/actions/runs/27112537413) workflow. The PR author must review, complete, and finalize this document before the PR can merge.* diff --git a/pkg/workflow/compiler_yaml_step_conversion.go b/pkg/workflow/compiler_yaml_step_conversion.go index 16372551865..fccca59463a 100644 --- a/pkg/workflow/compiler_yaml_step_conversion.go +++ b/pkg/workflow/compiler_yaml_step_conversion.go @@ -35,6 +35,7 @@ func ConvertStepToYAML(stepMap map[string]any) (string, error) { // Post-process to move version comments outside of quoted uses values // This handles cases like: uses: "slug@sha # v1" -> uses: slug@sha # v1 yamlStr = unquoteUsesWithComments(yamlStr) + yamlStr = quoteEnvValuesContainingColonSpace(yamlStr) // Add 6 spaces to the beginning of each line to match GitHub Actions step indentation lines := strings.Split(strings.TrimSpace(yamlStr), "\n") @@ -153,7 +154,11 @@ func (c *Compiler) renderStepFromMap(out *strings.Builder, step map[string]any, // For complex fields like "with" or "env" — sort keys for stable output. fmt.Fprintf(out, "%s:\n", field) for _, key := range slices.Sorted(maps.Keys(v)) { - fmt.Fprintf(out, "%s %s: %v\n", indent, key, v[key]) + if field == "env" { + fmt.Fprintf(out, "%s %s: %s\n", indent, key, formatStepEnvValueForYAML(v[key])) + } else { + fmt.Fprintf(out, "%s %s: %v\n", indent, key, v[key]) + } } default: fmt.Fprintf(out, "%s: %v\n", field, v) @@ -190,10 +195,22 @@ func (c *Compiler) renderStepFromMap(out *strings.Builder, step map[string]any, // Sort keys for stable output. fmt.Fprintf(out, "%s:\n", field) for _, key := range slices.Sorted(maps.Keys(v)) { - fmt.Fprintf(out, "%s %s: %v\n", indent, key, v[key]) + if field == "env" { + fmt.Fprintf(out, "%s %s: %s\n", indent, key, formatStepEnvValueForYAML(v[key])) + } else { + fmt.Fprintf(out, "%s %s: %v\n", indent, key, v[key]) + } } default: fmt.Fprintf(out, "%s: %v\n", field, v) } } } + +func formatStepEnvValueForYAML(value any) string { + strValue, ok := value.(string) + if !ok { + return fmt.Sprint(value) + } + return yamlStringValue(strValue) +} diff --git a/pkg/workflow/compiler_yaml_test.go b/pkg/workflow/compiler_yaml_test.go index 1992c460229..2ddc202bb11 100644 --- a/pkg/workflow/compiler_yaml_test.go +++ b/pkg/workflow/compiler_yaml_test.go @@ -11,6 +11,7 @@ import ( "testing" "github.com/github/gh-aw/pkg/testutil" + "github.com/goccy/go-yaml" ) func TestCompileWorkflowWithInvalidYAML(t *testing.T) { @@ -791,6 +792,52 @@ Test content.` } } +func TestGenerateYAMLWithEnvironmentValueContainingColonSpace(t *testing.T) { + tmpDir := testutil.TempDir(t, "yaml-env-colon-space-test") + + frontmatter := `--- +name: Test Workflow +on: push +permissions: + contents: read +engine: + id: copilot + env: + ANTHROPIC_CUSTOM_HEADERS: "x-aw-gw-github-repo: ${{ github.repository }}" +strict: false +--- + +# Test Workflow + +Test content.` + + testFile := filepath.Join(tmpDir, "test.md") + if err := os.WriteFile(testFile, []byte(frontmatter), 0644); err != nil { + t.Fatal(err) + } + + compiler := NewCompiler() + if err := compiler.CompileWorkflow(testFile); err != nil { + t.Fatalf("CompileWorkflow() error: %v", err) + } + + lockFile := filepath.Join(tmpDir, "test.lock.yml") + content, err := os.ReadFile(lockFile) + if err != nil { + t.Fatalf("Failed to read lock file: %v", err) + } + yamlStr := string(content) + + if !strings.Contains(yamlStr, `ANTHROPIC_CUSTOM_HEADERS: "x-aw-gw-github-repo: ${{ github.repository }}"`) { + t.Fatalf("Expected quoted env value in generated YAML, got:\n%s", yamlStr) + } + + var parsed map[string]any + if err := yaml.Unmarshal(content, &parsed); err != nil { + t.Fatalf("Generated lock file should be valid YAML: %v\nYAML:\n%s", err, yamlStr) + } +} + // TestGenerateYAMLWithConcurrency tests that concurrency is properly set func TestGenerateYAMLWithConcurrency(t *testing.T) { tmpDir := testutil.TempDir(t, "yaml-concurrency-test") diff --git a/pkg/workflow/engine_helpers.go b/pkg/workflow/engine_helpers.go index 99737d3882d..ce18c6b1c29 100644 --- a/pkg/workflow/engine_helpers.go +++ b/pkg/workflow/engine_helpers.go @@ -276,6 +276,9 @@ func yamlStringValue(value string) string { if len(value) == 0 { return value } + if quoted := quoteYAMLValueContainingColonSpace(value); quoted != value { + return quoted + } // Values starting with YAML flow indicators need quoting to be treated as strings. // '{' would be parsed as a YAML flow mapping, '[' as a YAML flow sequence. first := value[0] diff --git a/pkg/workflow/frontmatter_extraction_yaml.go b/pkg/workflow/frontmatter_extraction_yaml.go index bfdbd76b1a4..261f1d3cd42 100644 --- a/pkg/workflow/frontmatter_extraction_yaml.go +++ b/pkg/workflow/frontmatter_extraction_yaml.go @@ -79,6 +79,12 @@ func (c *Compiler) extractTopLevelYAMLSection(frontmatter map[string]any, key st // which causes validation errors since they start with numbers but contain spaces yamlStr = parser.QuoteCronExpressions(yamlStr) + // For top-level env values, quote plain scalars containing ": " because YAML + // treats this token sequence as a mapping separator in plain style. + if key == "env" { + yamlStr = quoteEnvValuesContainingColonSpace(yamlStr) + } + // Clean up null values - replace `: null` with `:` for cleaner output // GitHub Actions treats `workflow_dispatch:` and `workflow_dispatch: null` identically yamlStr = CleanYAMLNullValues(yamlStr) diff --git a/pkg/workflow/yaml_env_helpers.go b/pkg/workflow/yaml_env_helpers.go index 631a8e7b3b5..73a9416f68e 100644 --- a/pkg/workflow/yaml_env_helpers.go +++ b/pkg/workflow/yaml_env_helpers.go @@ -3,6 +3,7 @@ package workflow import ( "fmt" "sort" + "strconv" "strings" "github.com/github/gh-aw/pkg/logger" @@ -24,6 +25,77 @@ func formatYAMLEnv(indent, key, value string) string { return fmt.Sprintf("%s%s: %q\n", indent, key, value) } +func quoteYAMLValueContainingColonSpace(value string) string { + if value == "" || + strings.HasPrefix(value, "\"") || + strings.HasPrefix(value, "'") || + strings.HasPrefix(value, "|") || + strings.HasPrefix(value, ">") || + strings.HasPrefix(value, "{") || + strings.HasPrefix(value, "[") { + return value + } + if strings.Contains(value, ": ") { + return strconv.Quote(value) + } + return value +} + +// quoteEnvValuesContainingColonSpace patches YAML text in-place for env blocks, +// quoting direct env values that contain ": " so they remain valid scalars. +// +// Assumptions: +// - Input YAML is compiler-generated and consistently indented. +// - We only rewrite direct children of env: maps (not nested mappings). +// - Both "env:" and "- env:" are handled because env can appear either as a +// regular mapping key or inline on a list item in YAML syntax. +func quoteEnvValuesContainingColonSpace(yamlStr string) string { + lines := strings.Split(yamlStr, "\n") + inEnv := false + envIndent := 0 + envChildIndent := -1 + + for i, line := range lines { + trimmed := strings.TrimSpace(line) + if trimmed == "" { + continue + } + + indent := len(line) - len(strings.TrimLeft(line, " ")) + if inEnv && indent <= envIndent { + inEnv = false + envChildIndent = -1 + } + + if trimmed == "env:" || trimmed == "- env:" { + inEnv = true + envIndent = indent + envChildIndent = -1 + continue + } + if !inEnv || indent <= envIndent { + continue + } + if envChildIndent == -1 { + envChildIndent = indent + } + if indent != envChildIndent { + continue + } + + idx := strings.Index(line, ": ") + if idx < 0 { + continue + } + quotedValue := quoteYAMLValueContainingColonSpace(line[idx+2:]) + if quotedValue != line[idx+2:] { + lines[i] = line[:idx+2] + quotedValue + } + } + + return strings.Join(lines, "\n") +} + // writeHeadersToYAML writes a map of headers to YAML format with proper comma placement // indent is the indentation string to use for each header line (e.g., " ") func writeHeadersToYAML(yaml *strings.Builder, headers map[string]string, indent string) {