-
Notifications
You must be signed in to change notification settings - Fork 416
compiler: quote env scalars containing : in compiled YAML
#37706
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
df36e9d
b69f6b9
0f5110b
d8ee69d
9b44a7a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,41 @@ | ||
| # ADR-37706: Quote env scalars containing `: ` via post-hoc YAML text patching | ||
|
|
||
| **Date**: 2026-06-08 | ||
| **Status**: Draft | ||
|
|
||
| ## Context | ||
|
|
||
| The compiler emits GitHub Actions lock files (`*.lock.yml`) by assembling YAML largely through string building rather than a structured marshaller. Env values that contain the token sequence `": "` — for example `ANTHROPIC_CUSTOM_HEADERS: "x-aw-gw-github-repo: ${{ github.repository }}"` — were rendered as plain (unquoted) YAML scalars. YAML parsers interpret an unquoted `: ` inside a scalar as a nested mapping separator, producing invalid YAML that fails to parse. The affected values flow through several independent rendering paths: top-level `env` section extraction, step/job env rendering for engine-generated steps, and the shared `yamlStringValue` helper. A fix needed to make these values valid scalars without altering output for values that do not require quoting. | ||
|
|
||
| ## Decision | ||
|
|
||
| We will guarantee that env values containing `": "` are emitted as quoted YAML scalars by centralizing the quoting rule in a shared helper, `quoteYAMLValueContainingColonSpace`, in `pkg/workflow/yaml_env_helpers.go`, and applying it across all env rendering paths. For text-assembled output we use a companion function, `quoteEnvValuesContainingColonSpace`, that post-processes already-generated YAML text line-by-line, detecting direct children of `env:` (and `- env:`) maps by indentation and quoting only their direct values. This in-place text-patching approach is applied to top-level `env` extraction (`frontmatter_extraction_yaml.go`), step/job env rendering (`compiler_yaml_step_conversion.go`), and the `yamlStringValue` helper (`engine_helpers.go`). Values that already begin with a quote, block, or flow indicator, or that do not contain `": "`, are left unchanged. | ||
|
|
||
| ## Alternatives Considered | ||
|
|
||
| ### Alternative 1: Serialize env blocks with a structured YAML marshaller | ||
| Instead of patching generated text, the env maps could be marshalled through a YAML library (e.g. `goccy/go-yaml`), which would quote scalars correctly by construction. This was not chosen because the compiler's rendering paths assemble YAML as strings with bespoke indentation, comment preservation (e.g. `uses:` version comments), and expression passthrough (`${{ ... }}`); routing env through a marshaller would risk reflowing or re-quoting output that other code and golden tests depend on, a far larger and riskier change than the targeted scalar fix. *[TODO: verify whether a localized marshaller for just the env sub-tree was evaluated.]* | ||
|
|
||
| ### Alternative 2: Quote every env value unconditionally | ||
| Every env value could be wrapped in quotes regardless of content, sidestepping detection logic. This was not chosen because it would change the output for the large set of values that are already valid unquoted, churning existing lock files and golden-file tests and reducing the readability of generated YAML. The chosen approach deliberately preserves output for values that do not require quoting. | ||
|
|
||
| ## Consequences | ||
|
|
||
| ### Positive | ||
| - Compiled lock files with `: `-containing env values (notably custom header values embedding expressions) are now valid, parseable YAML. | ||
| - The quoting rule lives in a single shared helper, so the three rendering paths apply identical behavior and future call sites can reuse it. | ||
| - Output is unchanged for env values that do not require quoting, minimizing churn to existing lock files and golden tests. | ||
|
|
||
| ### Negative | ||
| - `quoteEnvValuesContainingColonSpace` parses YAML by string/indentation heuristics rather than a real parser; it assumes compiler-generated, consistently indented input and only rewrites direct `env:` children, so it is brittle to future changes in how env blocks are emitted (nested mappings or reindentation could silently bypass it). | ||
| - The same quoting concern is now addressed in multiple places (text post-processing plus `yamlStringValue`), so the invariant "env scalars with `: ` are quoted" is enforced by convention across paths rather than at a single serialization boundary. | ||
| - Only the `": "` token is targeted; other YAML-significant scalar contents (leading `@`, `#`, `*`, `&`, trailing spaces) remain the responsibility of separate logic and are not covered by this change. | ||
|
|
||
| ### Neutral | ||
| - Both `env:` and inline `- env:` list forms are handled, reflecting that env can appear as a mapping key or on a list item. | ||
| - A focused regression test (`TestGenerateYAMLWithEnvironmentValueContainingColonSpace`) compiles a workflow with an `engine.env` colon-space value and asserts the output stays quoted and unmarshals as valid YAML. | ||
| - Non-string env values fall back to default `%v` formatting and are unaffected by the quoting helper. | ||
|
|
||
| --- | ||
|
|
||
| *This is a DRAFT ADR generated by the [Design Decision Gate](https://github.com/github/gh-aw/actions/runs/27112537413) workflow. The PR author must review, complete, and finalize this document before the PR can merge.* |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -35,6 +35,7 @@ func ConvertStepToYAML(stepMap map[string]any) (string, error) { | |
| // Post-process to move version comments outside of quoted uses values | ||
| // This handles cases like: uses: "slug@sha # v1" -> uses: slug@sha # v1 | ||
| yamlStr = unquoteUsesWithComments(yamlStr) | ||
| yamlStr = quoteEnvValuesContainingColonSpace(yamlStr) | ||
|
|
||
| // Add 6 spaces to the beginning of each line to match GitHub Actions step indentation | ||
| lines := strings.Split(strings.TrimSpace(yamlStr), "\n") | ||
|
|
@@ -153,7 +154,11 @@ func (c *Compiler) renderStepFromMap(out *strings.Builder, step map[string]any, | |
| // For complex fields like "with" or "env" — sort keys for stable output. | ||
| fmt.Fprintf(out, "%s:\n", field) | ||
| for _, key := range slices.Sorted(maps.Keys(v)) { | ||
| fmt.Fprintf(out, "%s %s: %v\n", indent, key, v[key]) | ||
| if field == "env" { | ||
| fmt.Fprintf(out, "%s %s: %s\n", indent, key, formatStepEnvValueForYAML(v[key])) | ||
| } else { | ||
| fmt.Fprintf(out, "%s %s: %v\n", indent, key, v[key]) | ||
| } | ||
| } | ||
| default: | ||
| fmt.Fprintf(out, "%s: %v\n", field, v) | ||
|
|
@@ -190,10 +195,22 @@ func (c *Compiler) renderStepFromMap(out *strings.Builder, step map[string]any, | |
| // Sort keys for stable output. | ||
| fmt.Fprintf(out, "%s:\n", field) | ||
| for _, key := range slices.Sorted(maps.Keys(v)) { | ||
| fmt.Fprintf(out, "%s %s: %v\n", indent, key, v[key]) | ||
| if field == "env" { | ||
| fmt.Fprintf(out, "%s %s: %s\n", indent, key, formatStepEnvValueForYAML(v[key])) | ||
| } else { | ||
| fmt.Fprintf(out, "%s %s: %v\n", indent, key, v[key]) | ||
| } | ||
|
Comment on lines
+198
to
+202
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Duplicated 💡 DetailsBoth switch-case arms inside if field == "env" {
fmt.Fprintf(out, "%s %s: %s\n", indent, key, formatStepEnvValueForYAML(v[key]))
} else {
fmt.Fprintf(out, "%s %s: %v\n", indent, key, v[key])
}If |
||
| } | ||
| default: | ||
| fmt.Fprintf(out, "%s: %v\n", field, v) | ||
| } | ||
| } | ||
| } | ||
|
|
||
| func formatStepEnvValueForYAML(value any) string { | ||
| strValue, ok := value.(string) | ||
| if !ok { | ||
| return fmt.Sprint(value) | ||
| } | ||
| return yamlStringValue(strValue) | ||
| } | ||
|
Comment on lines
+210
to
+216
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -11,6 +11,7 @@ import ( | |
| "testing" | ||
|
|
||
| "github.com/github/gh-aw/pkg/testutil" | ||
| "github.com/goccy/go-yaml" | ||
| ) | ||
|
|
||
| func TestCompileWorkflowWithInvalidYAML(t *testing.T) { | ||
|
|
@@ -791,6 +792,52 @@ Test content.` | |
| } | ||
| } | ||
|
|
||
| func TestGenerateYAMLWithEnvironmentValueContainingColonSpace(t *testing.T) { | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [/tdd] This test validates the 💡 Additional test cases to addTop-level frontmatter := `---
name: Test Workflow
on: push
env:
ANTHROPIC_CUSTOM_HEADERS: "x-aw-gw-github-repo: ${{ github.repository }}"
---
# Test`
// Assert lock file contains quoted ANTHROPIC_CUSTOM_HEADERS and is valid YAML.Step-level Covering all three injection sites in the test suite ensures the fix is verified end-to-end across every code path that this PR modifies. |
||
| tmpDir := testutil.TempDir(t, "yaml-env-colon-space-test") | ||
|
|
||
| frontmatter := `--- | ||
| name: Test Workflow | ||
| on: push | ||
| permissions: | ||
| contents: read | ||
| engine: | ||
| id: copilot | ||
| env: | ||
| ANTHROPIC_CUSTOM_HEADERS: "x-aw-gw-github-repo: ${{ github.repository }}" | ||
| strict: false | ||
| --- | ||
|
|
||
| # Test Workflow | ||
|
|
||
| Test content.` | ||
|
|
||
| testFile := filepath.Join(tmpDir, "test.md") | ||
| if err := os.WriteFile(testFile, []byte(frontmatter), 0644); err != nil { | ||
| t.Fatal(err) | ||
| } | ||
|
|
||
| compiler := NewCompiler() | ||
| if err := compiler.CompileWorkflow(testFile); err != nil { | ||
| t.Fatalf("CompileWorkflow() error: %v", err) | ||
| } | ||
|
|
||
| lockFile := filepath.Join(tmpDir, "test.lock.yml") | ||
| content, err := os.ReadFile(lockFile) | ||
| if err != nil { | ||
| t.Fatalf("Failed to read lock file: %v", err) | ||
| } | ||
| yamlStr := string(content) | ||
|
|
||
| if !strings.Contains(yamlStr, `ANTHROPIC_CUSTOM_HEADERS: "x-aw-gw-github-repo: ${{ github.repository }}"`) { | ||
| t.Fatalf("Expected quoted env value in generated YAML, got:\n%s", yamlStr) | ||
| } | ||
|
|
||
| var parsed map[string]any | ||
| if err := yaml.Unmarshal(content, &parsed); err != nil { | ||
| t.Fatalf("Generated lock file should be valid YAML: %v\nYAML:\n%s", err, yamlStr) | ||
| } | ||
| } | ||
|
|
||
| // TestGenerateYAMLWithConcurrency tests that concurrency is properly set | ||
| func TestGenerateYAMLWithConcurrency(t *testing.T) { | ||
| tmpDir := testutil.TempDir(t, "yaml-concurrency-test") | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -3,6 +3,7 @@ package workflow | |
| import ( | ||
| "fmt" | ||
| "sort" | ||
| "strconv" | ||
| "strings" | ||
|
|
||
| "github.com/github/gh-aw/pkg/logger" | ||
|
|
@@ -24,6 +25,77 @@ func formatYAMLEnv(indent, key, value string) string { | |
| return fmt.Sprintf("%s%s: %q\n", indent, key, value) | ||
| } | ||
|
|
||
| func quoteYAMLValueContainingColonSpace(value string) string { | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [/tdd] 💡 Suggested table-driven unit testfunc TestQuoteYAMLValueContainingColonSpace(t *testing.T) {
cases := []struct{ in, want string }{
{"plain", "plain"},
{"x-gw: ${{ repo }}", `"x-gw: ${{ repo }}"`},
{`"already-quoted: val"`, `"already-quoted: val"`}, // no double-quoting
{"", ""},
{"no-colon-space", "no-colon-space"},
{"{flow: map}", "{flow: map}"}, // leading { is left alone
{"trailing: ", `"trailing: "`},
}
for _, c := range cases {
got := quoteYAMLValueContainingColonSpace(c.in)
if got != c.want {
t.Errorf("input %q: got %q, want %q", c.in, got, c.want)
}
}
}The "already-quoted" and prefix-guarded cases are especially important: a silent regression here would produce double-quoted output that is invalid YAML but hard to trace back to this function. |
||
| if value == "" || | ||
| strings.HasPrefix(value, "\"") || | ||
| strings.HasPrefix(value, "'") || | ||
| strings.HasPrefix(value, "|") || | ||
| strings.HasPrefix(value, ">") || | ||
| strings.HasPrefix(value, "{") || | ||
| strings.HasPrefix(value, "[") { | ||
| return value | ||
| } | ||
| if strings.Contains(value, ": ") { | ||
| return strconv.Quote(value) | ||
| } | ||
| return value | ||
| } | ||
|
|
||
| // quoteEnvValuesContainingColonSpace patches YAML text in-place for env blocks, | ||
| // quoting direct env values that contain ": " so they remain valid scalars. | ||
| // | ||
| // Assumptions: | ||
| // - Input YAML is compiler-generated and consistently indented. | ||
| // - We only rewrite direct children of env: maps (not nested mappings). | ||
| // - Both "env:" and "- env:" are handled because env can appear either as a | ||
| // regular mapping key or inline on a list item in YAML syntax. | ||
| func quoteEnvValuesContainingColonSpace(yamlStr string) string { | ||
| lines := strings.Split(yamlStr, "\n") | ||
| inEnv := false | ||
| envIndent := 0 | ||
| envChildIndent := -1 | ||
|
|
||
| for i, line := range lines { | ||
| trimmed := strings.TrimSpace(line) | ||
| if trimmed == "" { | ||
| continue | ||
| } | ||
|
|
||
| indent := len(line) - len(strings.TrimLeft(line, " ")) | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [/diagnose] 💡 One-line fixindent := len(line) - len(strings.TrimLeft(line, " \t"))Or, for full Unicode whitespace correctness: trimmedLeft := strings.TrimLeftFunc(line, unicode.IsSpace)
indent := len(line) - len(trimmedLeft) |
||
| if inEnv && indent <= envIndent { | ||
| inEnv = false | ||
| envChildIndent = -1 | ||
| } | ||
|
|
||
| if trimmed == "env:" || trimmed == "- env:" { | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [/diagnose] 💡 Alternative guard// Matches bare "env:" and "env: # comment" while rejecting
// "env: some_inline_value" (which the compiler never emits for env blocks).
if trimmed == "env:" || (strings.HasPrefix(trimmed, "env:") && (len(trimmed) == 4 || trimmed[4] == |
||
| inEnv = true | ||
| envIndent = indent | ||
| envChildIndent = -1 | ||
| continue | ||
| } | ||
| if !inEnv || indent <= envIndent { | ||
| continue | ||
| } | ||
| if envChildIndent == -1 { | ||
| envChildIndent = indent | ||
| } | ||
| if indent != envChildIndent { | ||
| continue | ||
| } | ||
|
|
||
| idx := strings.Index(line, ": ") | ||
| if idx < 0 { | ||
| continue | ||
| } | ||
| quotedValue := quoteYAMLValueContainingColonSpace(line[idx+2:]) | ||
| if quotedValue != line[idx+2:] { | ||
| lines[i] = line[:idx+2] + quotedValue | ||
| } | ||
| } | ||
|
|
||
| return strings.Join(lines, "\n") | ||
| } | ||
|
|
||
| // writeHeadersToYAML writes a map of headers to YAML format with proper comma placement | ||
| // indent is the indentation string to use for each header line (e.g., " ") | ||
| func writeHeadersToYAML(yaml *strings.Builder, headers map[string]string, indent string) { | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[/diagnose] The
if field == "env"quoting branch is duplicated here and again around line 198 in the "remaining fields" loop. If the quoting strategy changes (e.g., to also handlewith:values), both branches need to be updated in sync — a likely source of future drift.💡 Extract a shared helper
Then both loops call
c.writeMapEntry(out, indent, field, key, v[key])— one update point for any future changes.