atenet, ateapi: make system namespace and component Service names configurable #350

Open
Jonathan Jamroga (jjamroga) wants to merge 6 commits into agent-substrate:main from jjamroga:jjamroga/atenet-configurable-resource-names

@jjamroga Jonathan Jamroga (jjamroga) commented Jun 29, 2026

Contributor

Problem

Several places in the substrate code hardcode the ate-system namespace and the canonical component Service names (atenet-router, dns, api):

  • cmd/atenet/internal/dns/dns.go: the reconcile loop's c.Client.Get(...) calls use package-level constants for the Service names and namespace.
  • cmd/atenet/internal/router/status.go: getRouterIP looks up the router's own Service by the literal "atenet-router".
  • cmd/ateapi/internal/controlapi/informer.go: AteletInformer is built with informers.WithNamespace("ate-system").
  • internal/ateclient/builder.go: kubectl-ate's port-forward helper hardcodes "ate-system" and "api" for Service/pod lookups.

Any deployment whose namespace or component Service names deviate from manifests/ate-install/ silently breaks across these paths.

Summary

Splits the hardcoded values into two categories with different solutions:

  • Service names become configurable flags (--router-service-name, --dns-service-name), defaulting to the canonical values. Deployments that rename Services pass the actual values.
  • Namespace is derived at runtime from the Kubernetes downward API (POD_NAMESPACE env var), with the canonical default as the fallback for non-k8s invocations (tests, local dev). No flag — atelet, ateapi, and atenet always share their pod's namespace in supported topologies, so a separate knob would be dead weight.

A small new internal/installdefaults package owns the canonical defaults so the constants aren't duplicated across dns/, router/, and controlapi/.

What changes

New package internal/installdefaults

Single source of truth for the names that match manifests/ate-install/:

  • SystemNamespace = "ate-system"
  • APIServiceName = "api"
  • RouterServiceName = "atenet-router"
  • DNSServiceName = "dns"
  • PodNamespaceEnv = "POD_NAMESPACE"
  • NamespaceFromPodEnv() helper for the env-var-with-fallback resolution shared by ateapi and atenet.

Removed duplicated constants

  • cmd/atenet/internal/dns/dns.go no longer defines serviceName / systemNamespace.
  • cmd/atenet/internal/router/router.go no longer defines DefaultRouterServiceName.
  • cmd/ateapi/internal/controlapi/informer.go no longer defines DefaultAteletNamespace.

New flags on atenet dns and atenet router

  • atenet dns:
    • --router-service-name (default installdefaults.RouterServiceName)
    • --dns-service-name (default installdefaults.DNSServiceName)
  • atenet router:
    • --router-service-name (default installdefaults.RouterServiceName) — used by /statusz to look up its own ClusterIP.
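
As an illustration of the flag behavior listed above, here is a minimal sketch using the standard library flag package. The real commands are cobra commands, so this is a stand-in rather than the PR's code; dnsOptions, parseDNSFlags, the help strings, and the "myrelease-" prefix are hypothetical names introduced for the example.

```go
package main

import (
	"flag"
	"fmt"
)

// Defaults mirror installdefaults.RouterServiceName and
// installdefaults.DNSServiceName as described in the PR.
const (
	defaultRouterServiceName = "atenet-router"
	defaultDNSServiceName    = "dns"
)

// dnsOptions is a hypothetical holder for the two Service names.
type dnsOptions struct {
	RouterServiceName string
	DNSServiceName    string
}

// parseDNSFlags parses the two Service-name flags, falling back to the
// canonical defaults when a flag is not supplied.
func parseDNSFlags(args []string) (dnsOptions, error) {
	var opts dnsOptions
	fs := flag.NewFlagSet("atenet dns", flag.ContinueOnError)
	fs.StringVar(&opts.RouterServiceName, "router-service-name",
		defaultRouterServiceName, "name of the router Service")
	fs.StringVar(&opts.DNSServiceName, "dns-service-name",
		defaultDNSServiceName, "name of the DNS Service")
	if err := fs.Parse(args); err != nil {
		return dnsOptions{}, err
	}
	return opts, nil
}

func main() {
	// A deployment that renames the router Service passes the actual
	// name; the DNS Service keeps its default here.
	opts, err := parseDNSFlags([]string{"--router-service-name=myrelease-atenet-router"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("router=%s dns=%s\n", opts.RouterServiceName, opts.DNSServiceName)
}
```

Because the defaults equal the previously hardcoded values, an invocation with no flags behaves as before.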

Downward-API namespace resolution

  • ateapi/main.go and atenet dns resolve their namespace via installdefaults.NamespaceFromPodEnv() instead of taking a flag. The install manifests inject POD_NAMESPACE via downward API.
  • AteletInformer signature gains the namespace parameter; tests updated.
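
To show why the namespace has to be a parameter, the sketch below models the lookup with plain structs instead of client-go informers. The pod type and ateletOnNode function are hypothetical stand-ins, not the AteletInformer API; only the error text is taken from the failure mode this PR describes.

```go
package main

import "fmt"

// pod is a stand-in for the few Pod fields this sketch needs.
type pod struct {
	Name      string
	Namespace string
	Node      string
}

// ateletOnNode returns the single atelet pod scheduled on node within
// namespace. A namespace-scoped informer only ever sees pods from its
// own namespace, which the filter below imitates.
func ateletOnNode(pods []pod, namespace, node string) (pod, error) {
	var matches []pod
	for _, p := range pods {
		if p.Namespace == namespace && p.Node == node {
			matches = append(matches, p)
		}
	}
	if len(matches) != 1 {
		return pod{}, fmt.Errorf("found %d atelet pods on node %q, expected 1",
			len(matches), node)
	}
	return matches[0], nil
}

func main() {
	// atelet runs in a non-canonical namespace (hypothetical "agents").
	pods := []pod{{Name: "atelet-abc", Namespace: "agents", Node: "node-1"}}

	// Scoped to the hardcoded namespace, the lookup finds nothing.
	_, err := ateletOnNode(pods, "ate-system", "node-1")
	fmt.Println(err)

	// Scoped to the namespace the pods actually run in, it succeeds.
	p, err := ateletOnNode(pods, "agents", "node-1")
	fmt.Println(p.Name, err)
}
```

In the PR the namespace passed in comes from installdefaults.NamespaceFromPodEnv(), so the scope follows the pod's own namespace.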

Cleanup in internal/ateclient/builder.go

  • kubectl-ate's port-forward helper now references installdefaults.SystemNamespace and installdefaults.APIServiceName instead of the literals.

Related

A separate gap in the dns-controller's integration paths is tracked in #348 — it only implements the GKE-style kube-dns ConfigMap update; the coredns ConfigMap integration listed in cmd/atenet/internal/dns/README.md under "Integration" isn't implemented. Out of scope for this PR.

Compatibility

Backward compatible. Default values match what was previously hardcoded; the canonical install layout is unchanged.

Test plan

  • go build ./...
  • go vet ./...
  • go test ./cmd/atenet/internal/... ./cmd/ateapi/internal/... ./internal/installdefaults/... (all pass)

The dns-controller (`atenet dns`) and router (`atenet router`) hardcoded
the substrate namespace ("ate-system") and the component Service names
("atenet-router", "dns") from the canonical install manifests under
`manifests/ate-install/`. Deployments that deviate from that layout —
running in a different namespace, renaming the Services, or composing
substrate into a larger install that rewrites resource names — silently
break: the dns-controller can't find atenet-router, the router can't
find itself for /statusz, and the cluster's actor DNS never gets
patched.

Expose the relevant names as flags on the cobra commands and as fields
on `dns.Controller` / `router.RouterConfig`. Defaults match the values
in `manifests/ate-install/` so existing deployments are unaffected:

  atenet dns:
    --system-namespace       (default "ate-system")
    --router-service-name    (default "atenet-router")
    --dns-service-name       (default "dns")

  atenet router:
    --router-service-name    (default "atenet-router")

The atelet pod informer hardcoded `ateletNamespace = "ate-system"`, so
ate-api-server could only locate atelet pods in that namespace.
Deployments that run atelet elsewhere — an alternative install layout
or a larger composition that relocates substrate components — leave the
informer's cache empty and ResumeActor fails with
`found 0 atelet pods on node "<node>", expected 1`.

Promote the constant to an exported default and accept the namespace as
a parameter to `AteletInformer`. Add an `--atelet-namespace` flag on
the ateapi binary (default DefaultAteletNamespace) that callers
override when needed.

@EItanya Eitan Yarmush (EItanya) left a comment

Collaborator

Functionality makes sense. There are some duplicated constants, and some env behavior I think we could take advantage of.

Comment thread cmd/ateapi/main.go Outdated
Comment thread cmd/atenet/internal/dns/dns.go Outdated
Comment thread cmd/atenet/internal/router/router.go Outdated
…_NAMESPACE

Addresses review comments on agent-substrate#350:

- New internal/installdefaults package owns SystemNamespace,
  RouterServiceName, DNSServiceName. dns, router, and controlapi/informer
  drop their duplicate Default* constants and reference installdefaults
  via the matching flag declarations and tests.

- Drop the --atelet-namespace flag on ateapi. The namespace is now
  resolved at startup from the POD_NAMESPACE env var (Kubernetes' downward
  API), falling back to installdefaults.SystemNamespace for non-k8s
  invocations (tests, local dev). atelet and ateapi share a namespace in
  every supported deployment topology, so a separate knob was dead weight.
Jonathan Jamroga (jjamroga) added a commit to jjamroga/substrate that referenced this pull request Jun 30, 2026
Same rationale as the prior atelet-namespace change: atenet, atenet-router,
and substrate's CoreDNS live in a single namespace in every supported
deployment topology, so a separate --system-namespace flag was dead
weight. Resolve from the POD_NAMESPACE env var (Kubernetes' downward
API) with installdefaults.SystemNamespace as the fallback for non-k8s
runs.

--router-service-name and --dns-service-name stay as flags because a
subchart deployment renames those Services with a release prefix, and
the binary can't derive that from pod metadata.
…ardcodes

Three follow-ups from the self-review:

- Extract the POD_NAMESPACE-with-SystemNamespace-fallback pattern into
  installdefaults.NamespaceFromPodEnv() so ateapi and atenet share a
  single implementation (also makes a third call site one line instead
  of four if anyone needs one).
- Add installdefaults.PodNamespaceEnv ("POD_NAMESPACE") and APIServiceName
  ("api") so the constant set covers every name in the canonical install
  layout that's referenced by Go code.
- Route internal/ateclient/builder.go's previously-hardcoded "ate-system"
  and "api" lookups through installdefaults, so kubectl-ate's port-forward
  no longer bypasses the new single source of truth.

ate-controller (ServiceAccount), ate-api-server-deployment (Deployment),
and "api.ate-system.svc" (JWT audience) are still hardcoded but their
configurability needs a real flag/discovery story and is out of scope
for this PR.
@jjamroga Jonathan Jamroga (jjamroga) marked this pull request as draft June 30, 2026 14:01
Jonathan Jamroga (jjamroga) added a commit to jjamroga/substrate that referenced this pull request Jun 30, 2026
Mirrors the downward-API injection already present on ate-api-server's
container. Without it, atenet's dns subcommand falls back to
installdefaults.SystemNamespace ("ate-system") in every deployment —
the canonical install works by coincidence, but anyone editing this
manifest to deploy substrate in another namespace would silently keep
looking up Services in ate-system.
@jjamroga Jonathan Jamroga (jjamroga) marked this pull request as ready for review June 30, 2026 15:50
Jonathan Jamroga (jjamroga) added a commit to jjamroga/substrate that referenced this pull request Jul 1, 2026
Jonathan Jamroga (jjamroga) added a commit to jjamroga/substrate that referenced this pull request Jul 1, 2026
Jonathan Jamroga (jjamroga) added a commit to jjamroga/substrate that referenced this pull request Jul 1, 2026
Eitan Yarmush (EItanya) pushed a commit to kagent-dev/substrate that referenced this pull request Jul 1, 2026
* atenet: make system namespace and component Service names configurable

The dns-controller (`atenet dns`) and router (`atenet router`) hardcoded
the substrate namespace ("ate-system") and the component Service names
("atenet-router", "dns") from the canonical install manifests under
`manifests/ate-install/`. Deployments that deviate from that layout —
running in a different namespace, renaming the Services, or composing
substrate into a larger install that rewrites resource names — silently
break: the dns-controller can't find atenet-router, the router can't
find itself for /statusz, and the cluster's actor DNS never gets
patched.

Expose the relevant names as flags on the cobra commands and as fields
on `dns.Controller` / `router.RouterConfig`. Defaults match the values
in `manifests/ate-install/` so existing deployments are unaffected:

  atenet dns:
    --system-namespace       (default "ate-system")
    --router-service-name    (default "atenet-router")
    --dns-service-name       (default "dns")

  atenet router:
    --router-service-name    (default "atenet-router")

* ateapi: make atelet namespace configurable via --atelet-namespace

The atelet pod informer hardcoded `ateletNamespace = "ate-system"`, so
ate-api-server could only locate atelet pods in that namespace.
Deployments that run atelet elsewhere — an alternative install layout
or a larger composition that relocates substrate components — leave the
informer's cache empty and ResumeActor fails with
`found 0 atelet pods on node "<node>", expected 1`.

Promote the constant to an exported default and accept the namespace as
a parameter to `AteletInformer`. Add an `--atelet-namespace` flag on
the ateapi binary (default DefaultAteletNamespace) that callers
override when needed.

* chart: pass system namespace and Service names to dns-controller and router

Wire the new flags added in the previous commit through the Helm
templates so the canonical-render defaults are overridden when the chart
is used as a subchart (e.g. the kagent-enterprise composition where
substrate.fullname prefixes all component Service names).

For atenet-dns the dns-controller now receives:
  --system-namespace={{ .Release.Namespace }}
  --router-service-name={{ include "substrate.fullname" (list "atenet-router" .) }}
  --dns-service-name={{ include "substrate.fullname" (list "dns" .) }}

For atenet-router the /statusz lookup gets:
  --router-service-name={{ include "substrate.fullname" (list "atenet-router" .) }}

When the release name equals the chart name ("substrate") these expand
to the canonical bare names, preserving existing behavior for top-level
installs.

* chart: pass --atelet-namespace to ate-api-server

Wire the new ateapi flag from the previous commit through the chart so
the atelet pod informer watches the chart's release namespace by default.
Canonical render (release name "substrate" in namespace "ate-system")
still produces "--atelet-namespace=ate-system", so behavior is unchanged
for top-level installs.

* chart: regenerate manifests/ate-install/ from current Helm chart

Re-runs `make helm-template` so the checked-in render matches the
chart. Brings in rustfs.yaml, the s3-backed atelet storage envvars,
the trimmed valkey manifest, and drops the no-longer-templated
sandboxconfig-gvisor and sandboxconfig-validation manifests.
`make verify-helm-template` now passes.

* review: centralize install defaults, derive atelet namespace from POD_NAMESPACE

Addresses review comments on agent-substrate#350:

- New internal/installdefaults package owns SystemNamespace,
  RouterServiceName, DNSServiceName. dns, router, and controlapi/informer
  drop their duplicate Default* constants and reference installdefaults
  via the matching flag declarations and tests.

- Drop the --atelet-namespace flag on ateapi. The namespace is now
  resolved at startup from the POD_NAMESPACE env var (Kubernetes' downward
  API), falling back to installdefaults.SystemNamespace for non-k8s
  invocations (tests, local dev). atelet and ateapi share a namespace in
  every supported deployment topology, so a separate knob was dead weight.

* review: derive atenet's system namespace from POD_NAMESPACE

Same rationale as the prior atelet-namespace change: atenet, atenet-router,
and substrate's CoreDNS live in a single namespace in every supported
deployment topology, so a separate --system-namespace flag was dead
weight. Resolve from the POD_NAMESPACE env var (Kubernetes' downward
API) with installdefaults.SystemNamespace as the fallback for non-k8s
runs.

--router-service-name and --dns-service-name stay as flags because a
subchart deployment renames those Services with a release prefix, and
the binary can't derive that from pod metadata.

* review: NamespaceFromPodEnv helper, APIServiceName const, ateclient hardcodes

Three follow-ups from the self-review:

- Extract the POD_NAMESPACE-with-SystemNamespace-fallback pattern into
  installdefaults.NamespaceFromPodEnv() so ateapi and atenet share a
  single implementation (also makes a third call site one line instead
  of four if anyone needs one).
- Add installdefaults.PodNamespaceEnv ("POD_NAMESPACE") and APIServiceName
  ("api") so the constant set covers every name in the canonical install
  layout that's referenced by Go code.
- Route internal/ateclient/builder.go's previously-hardcoded "ate-system"
  and "api" lookups through installdefaults, so kubectl-ate's port-forward
  no longer bypasses the new single source of truth.

ate-controller (ServiceAccount), ate-api-server-deployment (Deployment),
and "api.ate-system.svc" (JWT audience) are still hardcoded but their
configurability needs a real flag/discovery story and is out of scope
for this PR.

* chart: render ate-client ServiceAccount in every mode

The JWT install overlay (manifests/ate-install/jwt) references
ate-client.yaml as a top-level resource, but the chart previously
guarded the SA behind {{ if eq .Values.auth.mode "jwt" }} so
render-manifests.sh (mtls) never emitted it. That divergence broke
verify-helm-template after merging the upstream JWT fix that added a
hand-maintained manifests/ate-install/ate-client.yaml.

The SA is harmless in mtls installs (unused), so render it
unconditionally so the chart is the single source of truth.