Skip to content

compiler: quote env scalars containing : in compiled YAML#37706

Open
Copilot wants to merge 4 commits into
mainfrom
copilot/fix-compiler-env-var-quoting
Open

compiler: quote env scalars containing : in compiled YAML#37706
Copilot wants to merge 4 commits into
mainfrom
copilot/fix-compiler-env-var-quoting

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Jun 8, 2026

engine.env values containing : were emitted as plain YAML scalars in lock files, which YAML parsers interpret as nested mappings and reject. This change ensures those env values are emitted as quoted scalars so compiled workflows remain parseable.

  • Root cause addressed

    • Env rendering paths allowed plain scalar output for values containing : (notably engine/job env), producing invalid lock YAML in generated workflows.
  • Compiler changes

    • Centralized env value quoting logic for : -containing scalar values in shared YAML env helpers.
    • Applied the same quoting behavior across:
      • top-level env section extraction/serialization
      • step/job env rendering paths used by engine-generated steps
    • Kept output behavior unchanged for values that do not require quoting.
  • Regression coverage

    • Added a focused compiler YAML test for engine.env with:
      • ANTHROPIC_CUSTOM_HEADERS: "x-aw-gw-github-repo: ${{ github.repository }}"
    • Asserts the compiled lock output keeps the value quoted and valid YAML.
# before (invalid)
env:
  ANTHROPIC_CUSTOM_HEADERS: x-aw-gw-github-repo: ${{ github.repository }}

# after (valid)
env:
  ANTHROPIC_CUSTOM_HEADERS: "x-aw-gw-github-repo: ${{ github.repository }}"

Copilot AI and others added 2 commits June 8, 2026 02:13
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix quoting of env var values in compiled YAML compiler: quote env scalars containing : in compiled YAML Jun 8, 2026
Copilot AI requested a review from pelikhan June 8, 2026 02:16
@pelikhan pelikhan marked this pull request as ready for review June 8, 2026 02:24
Copilot AI review requested due to automatic review settings June 8, 2026 02:24
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 8, 2026

Design Decision Gate 🏗️ completed the design decision gate check.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 8, 2026

🧪 Test Quality Sentinel completed test quality analysis.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 8, 2026

🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 8, 2026

PR Code Quality Reviewer completed the code quality review.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a YAML compilation edge case where env values containing the token sequence ": " were emitted as plain scalars in generated .lock.yml workflows, which YAML parsers can misinterpret as nested mappings and reject. The change centralizes quoting logic and applies it across multiple env-rendering paths to keep compiled workflows parseable.

Changes:

  • Added shared helpers to quote env scalar values containing ": " when they would otherwise be emitted in plain style.
  • Applied the env quoting behavior to top-level env extraction and step YAML generation paths.
  • Added a regression test that validates the generated lock file both contains the quoted env line and successfully unmarshals as YAML.
Show a summary per file
File Description
pkg/workflow/yaml_env_helpers.go Introduces shared quoting utilities and a YAML post-processor for env: blocks.
pkg/workflow/frontmatter_extraction_yaml.go Applies env quoting post-processing when extracting the top-level env section.
pkg/workflow/engine_helpers.go Ensures yamlStringValue quotes scalars containing ": " for YAML safety.
pkg/workflow/compiler_yaml_test.go Adds a regression test ensuring engine.env with ": " compiles to valid quoted YAML.
pkg/workflow/compiler_yaml_step_conversion.go Applies env quoting post-processing to marshalled steps and adds step-env formatting helper.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 5/5 changed files
  • Comments generated: 1

Comment on lines +210 to +216
func formatStepEnvValueForYAML(value any) string {
strValue, ok := value.(string)
if !ok {
return fmt.Sprint(value)
}
return quoteYAMLValueContainingColonSpace(strValue)
}
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 8, 2026

🏗️ Design Decision Gate — ADR Required

This PR makes significant changes to core business logic (147 new lines in pkg/workflow/) but does not have a linked Architecture Decision Record (ADR).

📄 Draft ADR committed: docs/adr/37706-quote-env-scalars-containing-colon-space.md — review and complete it before merging.

🔒 This PR cannot merge until an ADR is linked in the PR body.

📋 What to do next
  1. Review the draft ADR committed to your branch — it was generated from the PR diff. The key decision it captures: fixing : -containing env scalars via post-hoc YAML text patching + a shared quoting helper, rather than switching to a structured YAML marshaller.
  2. Complete the missing sections — resolve the [TODO: verify] note, confirm the alternatives reflect what you actually weighed, and refine the rationale.
  3. Commit the finalized ADR to docs/adr/ on your branch.
  4. Reference the ADR in this PR body by adding a line such as:

    ADR: ADR-37706: Quote env scalars containing colon-space

Once an ADR is linked in the PR body, this gate will re-run and verify the implementation matches the decision.

❓ Why ADRs Matter

This change picks a specific strategy (text patching over structured serialization) with real trade-offs around brittleness and enforcing an invariant across multiple rendering paths. ADRs create a searchable, permanent record of why the codebase looks the way it does, so future contributors understand why env quoting is handled this way rather than at a single serialization boundary.

📋 Michael Nygard ADR Format Reference

An ADR must contain these four sections to be considered complete:

  • Context — What is the problem? What forces are at play?
  • Decision — What did you decide? Why?
  • Alternatives Considered — What else could have been done?
  • Consequences — What are the trade-offs (positive and negative)?

All ADRs are stored in docs/adr/ as Markdown files numbered by PR number.

🔒 This gate is advisory: the ADR must be linked in the PR body before merge.

🏗️ ADR gate enforced by Design Decision Gate 🏗️ · 84.6 AIC · ⌖ 10.5 AIC ·

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 8, 2026

🧪 Test Quality Sentinel Report

Test Quality Score: 100/100 — Excellent

Analyzed 1 test: 1 design (behavioral contract), 0 implementation, 0 guideline violations.

📊 Metrics & Test Classification (1 test analyzed)
Metric Value
New/modified tests analyzed 1
✅ Design tests (behavioral contracts) 1 (100%)
⚠️ Implementation tests (low value) 0 (0%)
Tests with error/edge cases 1 (100%)
Duplicate test clusters 0
Test inflation detected No
🚨 Coding-guideline violations 0

Test Classification Details

Test File Classification Issues Detected
TestGenerateYAMLWithEnvironmentValueContainingColonSpace pkg/workflow/compiler_yaml_test.go ✅ Design None

Language Support

Tests analyzed:

  • 🐹 Go (*_test.go): 1 test — unit (//go:build !integration)

Verdict

Check passed. 0% of new tests are implementation tests (threshold: 30%). The single new test verifies an end-to-end behavioral contract: that env values containing : are correctly quoted in compiled YAML output and that the resulting lock file is valid YAML.

📖 Understanding Test Classifications

Design Tests (High Value) verify what the system does:

  • Assert on observable outputs, return values, or state changes
  • Cover error paths and boundary conditions
  • Would catch a behavioral regression if deleted
  • Remain valid even after internal refactoring

Implementation Tests (Low Value) verify how the system does it:

  • Assert on internal function calls (mocking internals)
  • Only test the happy path with typical inputs
  • Break during legitimate refactoring even when behavior is correct
  • Give false assurance: they pass even when the system is wrong

Goal: Shift toward tests that describe the system’s behavioral contract — the promises it makes to its users and collaborators.

🧪 Test quality analysis by Test Quality Sentinel · 205.9 AIC · ⌖ 18.7 AIC ·

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Test Quality Sentinel: 100/100. Test quality is excellent — 0% of new tests are implementation tests (threshold: 30%). The new test TestGenerateYAMLWithEnvironmentValueContainingColonSpace is a well-structured behavioral contract test with proper build tag, error-path coverage, and end-to-end YAML validity assertions.

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skills-Based Review 🧠

Applied /diagnose and /tdd — approving with non-blocking suggestions.

📋 Key Themes & Highlights

Key Themes

  • Test coverage gaps: The regression test covers only the engine.env path; the top-level env: and step-level env: paths (which this PR also modifies) are not independently verified
  • DRY opportunity: Identical if field == "env" quoting branches appear in two loops in renderStepFromMap — a future maintenance risk
  • Brittle indent scanner: quoteEnvValuesContainingColonSpace uses space-only TrimLeft and an exact "env:" match — both are safe today but fragile under changed assumptions

Positive Highlights

  • ✅ Excellent PR description: root cause, fix scope, and before/after clearly stated
  • ✅ Centralized the quoting logic in yaml_env_helpers.go rather than scattering it across call sites
  • ✅ Regression test validates both the quoted form and YAML round-trip — a solid double check
  • quoteYAMLValueContainingColonSpace correctly guards against double-quoting already-quoted values
  • yamlStringValue integration is clean: early return preserves existing single-quote path for {/[ values

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 320.9 AIC · ⌖ 13.9 AIC

return fmt.Sprintf("%s%s: %q\n", indent, key, value)
}

func quoteYAMLValueContainingColonSpace(value string) string {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] quoteYAMLValueContainingColonSpace is the core of this fix but has no isolated unit tests — edge cases are only exercised indirectly through the end-to-end integration test.

💡 Suggested table-driven unit test
func TestQuoteYAMLValueContainingColonSpace(t *testing.T) {
    cases := []struct{ in, want string }{
        {"plain", "plain"},
        {"x-gw: ${{ repo }}", `"x-gw: ${{ repo }}"`},
        {`"already-quoted: val"`, `"already-quoted: val"`}, // no double-quoting
        {"", ""},
        {"no-colon-space", "no-colon-space"},
        {"{flow: map}", "{flow: map}"},   // leading { is left alone
        {"trailing: ", `"trailing: "`},
    }
    for _, c := range cases {
        got := quoteYAMLValueContainingColonSpace(c.in)
        if got != c.want {
            t.Errorf("input %q: got %q, want %q", c.in, got, c.want)
        }
    }
}

The "already-quoted" and prefix-guarded cases are especially important: a silent regression here would produce double-quoted output that is invalid YAML but hard to trace back to this function.

continue
}

indent := len(line) - len(strings.TrimLeft(line, " "))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/diagnose] strings.TrimLeft(line, " ") only strips spaces, so tab-indented input would compute indent = 0 for every line, causing the env-detection state machine to silently skip all quoting. The doc comment says "consistently indented" — worth hardening anyway so the function does not silently no-op when that assumption breaks.

💡 One-line fix
indent := len(line) - len(strings.TrimLeft(line, " \t"))

Or, for full Unicode whitespace correctness:

trimmedLeft := strings.TrimLeftFunc(line, unicode.IsSpace)
indent := len(line) - len(trimmedLeft)

envChildIndent = -1
}

if trimmed == "env:" || trimmed == "- env:" {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/diagnose] trimmed == "env:" is an exact match, so a line like env: # production headers — valid YAML and plausible in hand-edited generated output — would silently fall through, leaving its child values unquoted. Matching with trimmed == "env:" || strings.HasPrefix(trimmed, "env: #") (or a general HasPrefix(trimmed, "env:")) would be more resilient.

💡 Alternative guard
// Matches bare "env:" and "env: # comment" while rejecting
// "env: some_inline_value" (which the compiler never emits for env blocks).
if trimmed == "env:" || (strings.HasPrefix(trimmed, "env:") && (len(trimmed) == 4 || trimmed[4] == 

fmt.Fprintf(out, "%s:\n", field)
for _, key := range slices.Sorted(maps.Keys(v)) {
fmt.Fprintf(out, "%s %s: %v\n", indent, key, v[key])
if field == "env" {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/diagnose] The if field == "env" quoting branch is duplicated here and again around line 198 in the "remaining fields" loop. If the quoting strategy changes (e.g., to also handle with: values), both branches need to be updated in sync — a likely source of future drift.

💡 Extract a shared helper
// writeEnvMapEntry writes a single key: value line, quoting the value when needed for env fields.
func (c *Compiler) writeMapEntry(out *strings.Builder, indent, field, key string, val any) {
    if field == "env" {
        fmt.Fprintf(out, "%s    %s: %s\n", indent, key, formatStepEnvValueForYAML(val))
    } else {
        fmt.Fprintf(out, "%s    %s: %v\n", indent, key, val)
    }
}

Then both loops call c.writeMapEntry(out, indent, field, key, v[key]) — one update point for any future changes.

}
}

func TestGenerateYAMLWithEnvironmentValueContainingColonSpace(t *testing.T) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] This test validates the engine.env path well, but two other paths added in this PR go untested: the top-level env: section (via frontmatter_extraction_yaml.go) and the step-level env: (via renderStepFromMap). Each passes through different code, so a future regression in one path would not be caught by the other test.

💡 Additional test cases to add

Top-level env: section — add this alongside the existing test or as a separate function:

frontmatter := `---
name: Test Workflow
on: push
env:
  ANTHROPIC_CUSTOM_HEADERS: "x-aw-gw-github-repo: ${{ github.repository }}"
---
# Test`
// Assert lock file contains quoted ANTHROPIC_CUSTOM_HEADERS and is valid YAML.

Step-level env: via a run step with env — verifies the renderStepFromMap path. If the engine emits step-level env with : values, a test fixture that triggers that path would lock it in.

Covering all three injection sites in the test suite ensures the fix is verified end-to-end across every code path that this PR modifies.

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The colon-space quoting fix is correct and the test validates the core scenario. Two non-blocking concerns with the renderStepFromMap path.

Findings

formatStepEnvValueForYAML partial quoting (medium)

formatStepEnvValueForYAML only quotes strings containing ": ", but the writeYAMLEnv/formatYAMLEnv convention in this package is to unconditionally %q-quote all env values. Values that look like booleans (true), null, or numbers will be emitted unquoted through renderStepFromMap (the redact_secrets.go path) and subject to YAML type coercion. The pre-existing %v path had the same gap, but a dedicated named formatter creates a correctness contract that should be complete.

Duplicated env-handling blocks (low)

The if field == "env" guard is copy-pasted into both switch-case arms of renderStepFromMap. A future change to formatStepEnvValueForYAML must be mirrored in both places.

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • proxy.golang.org

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "proxy.golang.org"

See Network Configuration for more information.

🔎 Code quality review by PR Code Quality Reviewer · 43.2 AIC · ⌖ 13.3 AIC

if !ok {
return fmt.Sprint(value)
}
return quoteYAMLValueContainingColonSpace(strValue)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

formatStepEnvValueForYAML only quotes for ": ", leaving other YAML-unsafe values unprotected — this diverges from the writeYAMLEnv/formatYAMLEnv %q-always convention already established in this package.

💡 Details and suggested fix

This new helper only quotes a string when it contains ": ". That leaves these cases silently broken in the renderStepFromMap path (used by redact_secrets.go):

env value rendered YAML parses as
"true" KEY: true boolean true
"null" KEY: null null
"1.5" KEY: 1.5 float
"line1\nline2" (no ": ") KEY: line1 + bare line2 on next line malformed YAML

The existing writeYAMLEnv / formatYAMLEnv in this package use fmt.Sprintf("%q", value) to unconditionally double-quote — that convention is safer and already pays for itself in the codebase. The new function should follow it:

func formatStepEnvValueForYAML(value any) string {
    strValue, ok := value.(string)
    if !ok {
        return fmt.Sprint(value)
    }
    // Always double-quote, consistent with writeYAMLEnv/formatYAMLEnv in this package.
    // This handles ": ", booleans, null, numeric-looking strings, and embedded newlines.
    return strconv.Quote(strValue)
}

The pre-existing %v path in renderStepFromMap had the same gaps, but introducing a dedicated, named function implies correctness — that contract should be complete.

Comment on lines +198 to +202
if field == "env" {
fmt.Fprintf(out, "%s %s: %s\n", indent, key, formatStepEnvValueForYAML(v[key]))
} else {
fmt.Fprintf(out, "%s %s: %v\n", indent, key, v[key])
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicated if field == "env" block — identical logic appears at lines 157–161 in the ordered-field switch and again here in the remaining-field switch.

💡 Details

Both switch-case arms inside renderStepFromMap now contain:

if field == "env" {
    fmt.Fprintf(out, "%s    %s: %s\n", indent, key, formatStepEnvValueForYAML(v[key]))
} else {
    fmt.Fprintf(out, "%s    %s: %v\n", indent, key, v[key])
}

If formatStepEnvValueForYAML is updated (e.g., to fix the %q gap noted in the adjacent comment), the fix must be applied in both places. Extract a helper or consolidate the map[string]any handling to a single shared function to eliminate the duplication.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

compiler: env var values containing : are not quoted in compiled YAML, producing invalid output

3 participants