Skip to content

feat(main): add observed versions of EtcdMembers#340

Open
Andrey Kolkov (androndo) wants to merge 1 commit into
mainfrom
feat/etcd-member-versions
Open

feat(main): add observed versions of EtcdMembers#340
Andrey Kolkov (androndo) wants to merge 1 commit into
mainfrom
feat/etcd-member-versions

Conversation

@androndo

@androndo Andrey Kolkov (androndo) commented Jul 3, 2026

Copy link
Copy Markdown
Collaborator

Observe the etcd version each member is actually running (via the Maintenance Status RPC against the member's own endpoint) into EtcdMember.status.version, surfaced as the "Running" print column. Add an informational VersionDrifted condition comparing the observed running version against the intended spec.version; it is left unset when intent is not yet known. Observation is best-effort and never gates readiness. Extract the operator-side dial config out of discoverMemberID into a shared etcdDialConfig helper.

Implements #338.

Summary by CodeRabbit

  • New Features

    • Added visibility into the actual etcd version running on each member.
    • Exposed a new status field and kubectl column so cluster state is easier to inspect at a glance.
    • Added an informational version-drift condition to show whether the running version matches the desired version.
  • Bug Fixes

    • Improved status reporting so version checks are best-effort and won’t block member readiness.
    • Preserved the last known version when live version checks fail.

Observe the etcd version each member is actually running (via the
Maintenance Status RPC against the member's own endpoint) into
EtcdMember.status.version, surfaced as the "Running" print column.
Add an informational VersionDrifted condition comparing the observed
running version against the intended spec.version; it is left unset when
intent is not yet known. Observation is best-effort and never gates
readiness. Extract the operator-side dial config out of discoverMemberID
into a shared etcdDialConfig helper.

Implements #338.

Signed-off-by: Andrey Kolkov <androndo@gmail.com>
@github-actions github-actions Bot added api-change controllers documentation Improvements or additions to documentation feature New feature or request labels Jul 3, 2026
@coderabbitai

coderabbitai Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

Adds observed etcd running-version tracking to EtcdMember: a new status.version field and MemberVersionDrifted condition, CRD schema and printer-column updates, an EtcdClusterClient.Status method, controller logic to observe version best-effort and set drift conditions, plus tests and documentation.

Changes

Observed member version and drift detection

Layer / File(s) Summary
API type and CRD schema
api/v1alpha2/etcdmember_types.go, charts/etcd-operator/crd-bases/etcd-operator.cozystack.io_etcdmembers.yaml
Adds MemberVersionDrifted constant, Version field to EtcdMemberStatus, a "Running" printer column, and matching CRD schema/column entries.
Etcd client Status method
controllers/etcd_client.go
Adds Status(ctx, endpoint) to EtcdClusterClient interface for fetching a member's server status/version.
Controller version observation and drift logic
controllers/etcdmember_controller.go
Introduces a ready flag gating version observation, a new etcdDialConfig helper reused by discoverMemberID and observeVersion, and logic to set/clear MemberVersionDrifted based on spec vs. observed version, with best-effort failure handling.
Fake etcd Status support and controller tests
controllers/testing_helpers_test.go, controllers/etcdmember_controller_test.go
Adds statusVersion/statusErr fields and Status method to fakeEtcd, plus tests covering version observation, drift detection/resolution, unknown-intent handling, and non-fatal observation errors.
Documentation
docs/concepts.md
Adds an "Observed member version" section describing intent-vs-runtime version tracking and drift semantics.

Estimated code review effort: 3 (Moderate) | ~25 minutes

Sequence Diagram(s)

sequenceDiagram
  participant Reconciler as EtcdMemberReconciler
  participant DialConfig as etcdDialConfig
  participant EtcdClient as EtcdClusterClient
  participant Status as EtcdMember.Status

  Reconciler->>Reconciler: updateStatus (ready=true)
  Reconciler->>DialConfig: etcdDialConfig(cluster)
  DialConfig-->>Reconciler: tlsCfg, user, pass
  Reconciler->>EtcdClient: Status(ctx, endpoint)
  EtcdClient-->>Reconciler: StatusResponse or error
  alt Status succeeds
    Reconciler->>Status: set Version, compare to spec.version
    Reconciler->>Status: set MemberVersionDrifted condition
  else Status fails
    Reconciler->>Status: keep previous Version (best-effort)
  end
Loading

Possibly related issues

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change: adding observed runtime versions to EtcdMember resources.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/etcd-member-versions

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the capability to observe and report the actual running etcd version for each member. It adds a Version field to EtcdMemberStatus and a VersionDrifted condition to detect intent-vs-reality version drift. The member controller now queries the member's etcd endpoint via the Maintenance Status API when the member is ready. These changes are accompanied by updated CRD definitions, comprehensive unit tests, and updated documentation. I have no feedback to provide as there are no review comments.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
controllers/etcdmember_controller.go (1)

1273-1300: 🚀 Performance & Scalability | 🔵 Trivial | 💤 Low value

Optional: version observation opens a fresh etcd connection every reconcile.

observeVersion dials a new etcd client (dial config + EtcdClientFactory + Status) on every Ready reconcile per member, purely to read a version that changes only across upgrades. At the 30s cadence this is tolerable, but you could reduce connection churn by observing on a longer interval, or only re-observing when a version change is plausible (e.g., after a Pod restart/image change). Best-effort behavior and error handling here look correct.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@controllers/etcdmember_controller.go` around lines 1273 - 1300,
`observeVersion` is opening a new etcd client on every Ready reconcile, which
creates unnecessary connection churn. Update
`EtcdMemberReconciler.observeVersion` to only run the
`etcdDialConfig`/`EtcdClientFactory`/`Status` flow when a version change is
actually plausible, or move this check to a longer-lived/less frequent path so
it does not happen every reconcile. Keep the existing best-effort error handling
and return the previous `member.Status.Version` on failure.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@controllers/etcdmember_controller.go`:
- Around line 1273-1300: `observeVersion` is opening a new etcd client on every
Ready reconcile, which creates unnecessary connection churn. Update
`EtcdMemberReconciler.observeVersion` to only run the
`etcdDialConfig`/`EtcdClientFactory`/`Status` flow when a version change is
actually plausible, or move this check to a longer-lived/less frequent path so
it does not happen every reconcile. Keep the existing best-effort error handling
and return the previous `member.Status.Version` on failure.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d7cc27f5-e658-4dd5-84a7-647669aeff81

📥 Commits

Reviewing files that changed from the base of the PR and between 1bf58a7 and ad944d6.

📒 Files selected for processing (7)
  • api/v1alpha2/etcdmember_types.go
  • charts/etcd-operator/crd-bases/etcd-operator.cozystack.io_etcdmembers.yaml
  • controllers/etcd_client.go
  • controllers/etcdmember_controller.go
  • controllers/etcdmember_controller_test.go
  • controllers/testing_helpers_test.go
  • docs/concepts.md

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api-change controllers documentation Improvements or additions to documentation feature New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant