Add draft project security threat-model document by potiuk · Pull Request #13293 · apache/cloudstack

potiuk · 2026-05-30T00:13:57Z

Summary

This PR adds an initial draft of a project-level security
threat-model document (draft-THREAT-MODEL.md) so that automated
security scanners running against this repository have a
maintainer-facing reference for which classes of findings are
in-scope vs. out-of-scope for the project.

The document follows the rubric format used by several other ASF
projects piloting improved security-model discoverability for
agentic scanners. Every claim carries a provenance tag:

(documented) — paraphrased from public artefacts (this repo or
the project website), cited inline.
(inferred) — synthesised from code structure or domain
knowledge; the PMC has not confirmed.
(maintainer) — confirmed by a CloudStack PMC member in response
to this draft. (Zero in this initial draft.)

Draft stats:

~88 documented claims
~64 inferred claims (each maps to a §14 question)
38 open questions for maintainers in §14

§14 is the highest-leverage section: answering each question
either promotes one (inferred) tag to (maintainer) or corrects
the underlying claim.

Why "draft-" prefix?

The file is named draft-THREAT-MODEL.md rather than
SECURITY-THREAT-MODEL.md because this is a proposal for the
PMC to review — please correct, reject, or discuss as needed.
Once the PMC ratifies (or substantially edits) the content, the
file can be renamed in a follow-up PR and a discoverability
scaffold (AGENTS.md → SECURITY.md → the model) added so
scanners can mechanically follow the chain.

What this is, and what it is not

This is not a security audit. It is a working triage document
— the reference a triager holds against an inbound report to
decide whether the report is about a CloudStack vulnerability or
about caller misuse / operator misconfiguration / an out-of-scope
concern.

The draft was generated by an automated agentic security scan
being piloted by the ASF Security team; the discoverability work
is independent of any specific scan run.

How to review

§14 first. Each answer either confirms one (inferred) tag or
replaces the inferred claim with the correct one.
After that, please skim §3 (out-of-scope) and §13 (triage
dispositions) — those govern how a vulnerability report would
be triaged.

Reply edits / corrections inline on the PR, or to the original
security@apache.org thread, whichever fits the PMC's workflow.

🤖 Generated with Claude Code

Adds a draft project-level security threat-model document (draft-THREAT-MODEL.md) at repo root, improving discoverability for automated security scanners running against this repository. The file follows the rubric format used by several other ASF projects piloting security-model discoverability. The "draft-" prefix signals this is a proposal for the PMC to review, correct, or reject, not a finalised maintainer-blessed model. Every claim carries a provenance tag (documented / inferred / maintainer) so reviewers can see where each claim originates; §14 collects open questions for the maintainers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

codecov · 2026-05-30T00:22:24Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 18.76%. Comparing base (7308dad) to head (dcc71cd).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files

@@             Coverage Diff              @@
##               main   #13293      +/-   ##
============================================
+ Coverage     18.10%   18.76%   +0.65%     
- Complexity    16752    17974    +1222     
============================================
  Files          6037     6160     +123     
  Lines        542796   552571    +9775     
  Branches      66456    67346     +890     
============================================
+ Hits          98291   103705    +5414     
- Misses       433460   437459    +3999     
- Partials      11045    11407     +362

Flag	Coverage Δ
uitests	`3.53% <ø> (+0.01%)`	⬆️
unittests	`19.96% <ø> (+0.68%)`	⬆️

Markdown / typos / table-shape fixes per the CI lint output. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

yadvr · 2026-06-01T07:19:43Z

There are a lot of details in the draft that need a better set of eyes, so I'm assigning @DaanHoogland and @vishesh92, who are also the PMC leads on this work.

potiuk · 2026-06-02T18:43:43Z

Thanks @DaanHoogland @yadvr @vishesh92 — agreed, let's make this (apache/cloudstack) the canonical project-level threat model and have the client/tooling repos inherit from it rather than each carrying a full copy.

Concretely, mirroring what we've done for other multi-repo PMCs:

apache/cloudstack/THREAT_MODEL.md is the single source of truth for the project-wide model: scope, trust boundaries, the management-server adversary model, in/out-of-scope classes, known non-findings, and triage dispositions.
The satellite repos (cloudstack-go, -cloudmonkey, -terraform-provider, -kubernetes-provider) get a short discoverability pointer — AGENTS.md → SECURITY.md → this model — plus, only where it adds something, a thin repo-specific addendum (e.g. the Go SDK's own input-trust surface) that references the parent instead of duplicating it.

So let's converge here first. None of the satellite PRs are merged, so re-pointing them to reference this model once its shape is settled is cheap — I'll repurpose those into pointer PRs (or close + reopen) once you're happy with the parent.

On "the fields we need": that's exactly the §14 "Open questions" section — each is a proposed answer for you to confirm, correct, or strike, grouped into waves so you can take a few at a time. Drop answers inline or here and I'll fold them in and promote the provenance tags. Happy to adjust the section set if CloudStack's shape calls for it.

…po copy Drop the standalone draft-THREAT-MODEL.md and wire the discoverability chain AGENTS.md -> SECURITY.md -> the project-wide model in apache/cloudstack (apache/cloudstack#13293), so scanners find one canonical model and this repo inherits it rather than duplicating it. Generated-by: Claude Code

vishesh92 · 2026-06-08T09:58:45Z

+**Q9.** Guest VM workloads — confirm that hypervisor-mediated side
+channels and resource-exhaustion-within-allocation are out of scope, and
+that the in-scope orchestration concerns are limited to "did CloudStack
+place the VM in the right VLAN / apply the right security group / route
+the right IP" (proposed)? *(maps to §3 item 5, §7, §9)*


This sounds right.
IMO, the only scenario where this would be a CloudStack problem is if CloudStack applies wrong or insecure settings while launching the guest VM, or performs some other action that causes the corresponding issues on the hypervisor. In that scenario, CloudStack needs to ensure it uses correct and secure settings for the hypervisor.
@DaanHoogland What do you think?

vishesh92 · 2026-06-08T10:03:29Z

+**Q11.** Confirm the unsupported-component list: `tools/marvin/`,
+`test/`, `developer/`, `quickcloud/`, `cloud-cli/`,
+`tools/{devcloud4,devcloud-kvm,appliance,checkstyle,transifex,bugs-wiki,...}`,
+`simulator` hypervisor plugin. Anything to add or remove? *(maps to §3
+item 7)*


@DaanHoogland do you think we need to include simulator & tools/appliance?

I think we need to exclude them and make that explicit. Later we might want to create tooling with the express purpose of checking security, but let's leave it out of scope for now.

vishesh92 · 2026-06-08T10:12:46Z

+**Q18.** 2FA — proposed: off by default, operator turns it on per
+domain / per user via `enable.2fa.*`. Confirm; and is "2FA disabled in
+production" a §10 violation or a deployment choice? *(maps to §5a,
+§10)*


IMO, this is a deployment choice. The correct global settings for this are:
enable.user.2fa - default is false. Determines whether two factor authentication is enabled or not. This can also be configured at domain level.
mandate.user.2fa - default is false. Determines whether to make the two factor authentication mandatory or not. This setting is applicable only when enable.user.2fa is true. This can also be configured at domain level.
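
A minimal sketch of how the two settings combine, assuming a domain-level value overrides the global one when it is set. The function and dictionary names are hypothetical and only illustrate the rule described above; this is not CloudStack's implementation.

```python
from typing import Mapping, Optional

# Global defaults as described above (both false).
GLOBAL_SETTINGS = {"enable.user.2fa": False, "mandate.user.2fa": False}


def effective_2fa_policy(domain_overrides: Optional[Mapping[str, bool]] = None) -> str:
    """Return 'disabled', 'optional', or 'mandatory' for a domain.

    Assumption: a value set at domain level takes precedence over the global one.
    """
    settings = {**GLOBAL_SETTINGS, **(domain_overrides or {})}
    if not settings["enable.user.2fa"]:
        # mandate.user.2fa only applies when enable.user.2fa is true.
        return "disabled"
    return "mandatory" if settings["mandate.user.2fa"] else "optional"
```

Under this reading, a default install reports `disabled`, which is why "2FA not enforced" lands as a deployment choice rather than a finding.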

vishesh92 · 2026-06-08T10:22:39Z

+**Q20.** Integration API port `:8096` — proposed: closed (port-zero) by
+default in production packaging, open only when explicitly configured;
+when open, it is unauthenticated by design. A report of "integration
+port allows admin commands without auth" is `OUT-OF-MODEL:
+non-default-build` *if* the operator opened it, else `VALID`. Confirm
+the default. *(maps to §5a, §10, §11a)*


The default should be 0 (disabled). But I need to confirm this.
@DaanHoogland any idea about this?

yes, it is set to 0 and only in test configurations it is set to 8096.
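
The disposition rule proposed in Q20 can be written down as a small decision function. This is only a sketch of the triage logic; the function name is hypothetical, and the two disposition strings are the ones quoted in the question.

```python
DISABLED = 0  # port value that means "integration API port closed"


def triage_open_integration_port(shipped_default_port: int) -> str:
    """Disposition for a report that the integration port is open and unauthenticated."""
    if shipped_default_port == DISABLED:
        # Shipped closed: the port can only be open because the operator configured it.
        return "OUT-OF-MODEL: non-default-build"
    # Shipped open: the unauthenticated port would be a CloudStack default.
    return "VALID"
```

With the default confirmed as 0, a report against a deployment that opened `:8096` falls into the first branch.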

potiuk · 2026-06-11T00:57:38Z

Thanks @vishesh92 — this is exactly the input the §14 questions were after; folding all of it in. How each lands:

Root admin (L959) — confirmed trusted operator with direct access anyway → stays out of the adversary model (§7); promoting (inferred) → (maintainer).
userdata (L970) — confirmed: the end user customizing their own guest OS → tenant-controlled data inside their own boundary, not a CloudStack-side injection surface (§6/§9).
Hypervisor settings (L965) — narrowing the in-model case to "CloudStack applies a wrong/insecure setting when launching/managing the guest" (a CloudStack responsibility); the hypervisor's own hardening stays out. Left flagged for @DaanHoogland.
proxy.header.verify (L984) — corrected: default false; only when the remote address is in proxy.cidr does CloudStack read proxy.header.names. Fixing §5a/§6.
2FA (L989) — confirmed deployment choice: enable.user.2fa (default false) + mandate.user.2fa (default false), domain-configurable → "2FA not enforced" is operator config, not a flaw (§10/§11a), with those exact knob names.
Password hashing (L1000) — the clincher for the §11a non-finding: default encode set excludes MD5,LDAP,PLAINTEXT (effective PBKDF2,SHA256SALT,SAML2), so md5/plaintext hashing of new passwords in a default install is out of model. I'll cite the exact default-vs-excluded set.

Two still on @DaanHoogland: L976 (whether simulator + tools/appliance are in scope — I'd lean out, dev/test paths) and L1007 (the default-0/disabled confirm).

I'll push the updated model with the confirmed items folded in. Thanks again — this is the review that makes it usable for triage.
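
The corrected proxy-header rule above can be sketched as follows. This is an illustration under stated assumptions, not CloudStack's code: the function and parameter names are hypothetical stand-ins for `proxy.header.verify`, `proxy.cidr`, and `proxy.header.names`, and taking the first comma-separated entry of a forwarded header is an assumption about the header format.

```python
import ipaddress
from typing import Iterable, Mapping


def client_address(remote_addr: str, headers: Mapping[str, str], *,
                   verify: bool = False,
                   proxy_cidrs: Iterable[str] = (),
                   header_names: Iterable[str] = ()) -> str:
    """Pick the address to attribute a request to."""
    if not verify:
        # Default (verify false): forwarded headers are never consulted.
        return remote_addr
    addr = ipaddress.ip_address(remote_addr)
    if not any(addr in ipaddress.ip_network(cidr) for cidr in proxy_cidrs):
        # The peer is not a trusted proxy, so its headers are ignored.
        return remote_addr
    for name in header_names:
        value = headers.get(name, "").strip()
        if value:
            # Assumption: the first entry is the original client.
            return value.split(",")[0].strip()
    return remote_addr
```

The point for triage is the second branch: a forwarded header sent by a peer outside the trusted CIDR list has no effect on the attributed address.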

Promotes six §14 questions from proposed to maintainer-confirmed per vishesh92's inline answers (apache#13293):

- Q8 root admin: confirmed trusted operator (direct access anyway) -> OUT-OF-MODEL: equivalent-harm (§3 item 4, §7).
- Q9 guest/hypervisor: side channels + in-allocation exhaustion out of scope; one in-model case is CloudStack applying wrong/insecure hypervisor settings (Daan to confirm boundary).
- Q10 userdata: end-user guest-OS customization, tenant-controlled data in their own boundary, not a CloudStack injection surface.
- Q17 proxy headers: proxy.header.verify default false; proxy.header.names read only when Remote_Addr in proxy.cidr.
- Q18 2FA: corrected the stale setting names to the real ones - enable.user.2fa + mandate.user.2fa (both default false, domain-configurable); 2FA-off is a deployment choice, not a §10 violation.
- Q19 password encoders: greenfield md5/plaintext hashing is OUT-OF-MODEL: non-default-build; effective set PBKDF2,SHA256SALT,SAML2.

Clears the pending-Q18/Q19 notes in §10/§11 and updates the tally. L976 (simulator/tools-appliance scope) and L1007 stay open on Daan.

Generated-by: Claude Opus 4.8 (1M context)

DaanHoogland · 2026-06-11T06:52:38Z

+**Q13.** Network-fabric assumptions — proposed: at least four logical
+networks (management, public, guest, storage), with the management
+network as the trusted control plane. Is that the canonical model, or
+do you support more compressed topologies (single-fabric) in production?
+*(maps to §5, §10)*


There are four logical networks that can each have multiple instances for use in different topologies, e.g. multiple zones. They can be combined in physical networks. All four types of the logical networks must be present for a functional system.

DaanHoogland · 2026-06-11T06:54:50Z

+**Q14.** Clock-skew assumption for signature v3 `expires` enforcement —
+proposed: operator's responsibility to keep client + management-server
+clocks roughly in sync. Confirm. *(maps to §5)*


Yes. (Reminder @vishesh92: this might be one we need to add to the security model page: https://cloudstack.apache.org/security)
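
A minimal sketch of why clock sync matters for `expires` enforcement. It assumes the timestamp is an ISO 8601 string with a numeric UTC offset and that the server simply compares it with its own clock; the function name and the format constant are illustrative, not taken from CloudStack.

```python
from datetime import datetime, timedelta, timezone

EXPIRES_FORMAT = "%Y-%m-%dT%H:%M:%S%z"  # assumed shape, e.g. 2026-06-11T07:05:00+0000


def is_expired(expires: str, server_now: datetime) -> bool:
    """True if the request's `expires` timestamp is in the past on the server's clock."""
    return server_now > datetime.strptime(expires, EXPIRES_FORMAT)


true_now = datetime(2026, 6, 11, 7, 0, 0, tzinfo=timezone.utc)
# The client signs a request that should be valid for five minutes.
expires = (true_now + timedelta(minutes=5)).strftime(EXPIRES_FORMAT)

# Server clock 10 minutes fast: a fresh request is rejected.
fresh_rejected = is_expired(expires, true_now + timedelta(minutes=10))

# Server clock 10 minutes slow: a request replayed 13 minutes later is still accepted.
replay_time = true_now + timedelta(minutes=13)
stale_accepted = not is_expired(expires, replay_time - timedelta(minutes=10))
```

Both failure modes come from the server's clock alone, which is why the model assigns clock sync to the operator rather than treating either outcome as a CloudStack flaw.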

DaanHoogland · 2026-06-11T06:56:07Z

+**Q15.** Confirm the filesystem-permissions inventory for sensitive
+files: JCEKS keystore, Root CA private key, JaSypt key + IV,
+`db.properties`. Who owns them, what mode? *(maps to §5, §10)*


@potiuk, do you expect a CSV-type list of all files in a functioning system?

DaanHoogland · 2026-06-11T07:03:50Z

+**Q16.** Confirm the "what CloudStack does not do to its host" inventory
+in §5: no child processes besides agent `Script` invocations / system
+VM provisioning; signal-handlers via servlet container default;
+environment-variable consumption confined to documented set. Anything to
+add? *(maps to §5)*


DaanHoogland · 2026-06-11T07:10:27Z

+**Q21.** API request size cap and cluster/agent RPC payload size cap —
+are these explicitly bounded, or "whatever Jetty / NIO defaults give"?
+*(maps to §6, §9)*


The UI server has an explicit request size cap in org.apache.cloudstack.ServerDaemon: DEFAULT_REQUEST_CONTENT_SIZE = 1048576 (1 MiB). For other components, the sizes are capped by the upstream components used.

DaanHoogland · 2026-06-11T07:16:52Z

+**Q22.** `api.throttling.*` and per-account resource limits — proposed:
+these are the entire DoS-protection surface, with no engine-level
+guard. Confirm. *(maps to §6, §9, §10)*


Confirmed; this is enforced at the API access check. Note that api.throttling.enabled defaults to false!
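
To make the consequence of that default concrete, here is a sketch of a per-account fixed-window throttle evaluated at the access check. The class, its parameters, and the default numbers are hypothetical; only the "disabled unless switched on" behaviour is taken from the comment above.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowThrottle:
    """Per-account call counter over a fixed time window (illustrative only)."""

    def __init__(self, enabled: bool = False, interval_seconds: float = 1.0,
                 max_calls: int = 25,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.enabled = enabled
        self.interval_seconds = interval_seconds
        self.max_calls = max_calls
        self._clock = clock
        self._windows: Dict[str, Tuple[float, int]] = {}

    def allow(self, account: str) -> bool:
        if not self.enabled:
            # With throttling off, every call passes the check.
            return True
        now = self._clock()
        start, count = self._windows.get(account, (now, 0))
        if now - start >= self.interval_seconds:
            start, count = now, 0  # a new window begins
        if count >= self.max_calls:
            return False
        self._windows[account] = (start, count + 1)
        return True
```

With `enabled=False`, `allow` never refuses, so in a default install the per-account resource limits are the only remaining bound.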

DaanHoogland · 2026-06-11T07:18:06Z

+**Q23.** Decompression behaviour on uploaded QCOW2 / RAW / OVA — proposed:
+no engine-side cap; per-account storage limits + hypervisor limits are
+the bound. Confirm. *(maps to §6, §9)*


DaanHoogland · 2026-06-11T07:19:56Z

+**Q24.** Same-host non-`cloudstack` UID — proposed: game-over, no defence
+claimed. Confirm. *(maps to §7, §9)*


@vishesh92 , I think there is a refusal to add a host with the same IP, does this include a UID check as well (or should it)?

DaanHoogland · 2026-06-11T07:21:33Z

+**Q25.** Side-channel observers (cache, branch, hypervisor-shared) — out
+of scope (proposed). *(maps to §7, §9)*


@potiuk I agree with cache and hypervisor-shared, (if I understand them correctly) but I do not understand “branch” in this context. Can you explain?

DaanHoogland · 2026-06-11T07:27:59Z

+**Q26.** Byzantine-internal-peer threshold — confirm CloudStack makes no
+BFT claim, so any compromised cluster peer or agent with a valid
+Root-CA-issued cert is unbounded (proposed). *(maps to §7, §9)*


Agreed. @vishesh92 we might want to add some issues/feature proposals in this area. This will only work in larger clusters, not in single- or dual-machine clusters (if I understand the Byzantine model correctly).
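
The cluster-size remark can be checked against the classical bound for Byzantine agreement, which needs n ≥ 3f + 1 participants to tolerate f faulty ones. The sketch below only evaluates that bound; protocols with other assumptions have different bounds, and nothing here describes existing CloudStack behaviour.

```python
def max_byzantine_faults(n: int) -> int:
    """Largest f that satisfies n >= 3f + 1 for a cluster of n peers."""
    if n < 1:
        raise ValueError("a cluster needs at least one peer")
    return (n - 1) // 3


for n in (1, 2, 3, 4, 7):
    print(f"{n} peers tolerate {max_byzantine_faults(n)} Byzantine peer(s)")
```

Clusters of one, two, or three peers tolerate zero faulty peers under this bound; four is the smallest size that tolerates one.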

DaanHoogland · 2026-06-11T07:41:45Z

+**Q27.** §8 P9 memory-safety — JVM-bounded; is the reachability
+boundary correctly "in-model for the JSON API + B5 input; out-of-model
+for native hypervisor SDK bugs that surface as `Throwable`"? *(maps to
+§8 P9, §9)*


§8 P9 says that "CloudStack's own server-side code is Java", implying it is only Java. This is not correct. No limitation on implementation languages is presumed. Claims about the JVM are correct.

For instance, OCaml and Python code can run on hypervisors, as well as Bash, and Go is used on the management server. This list may not be complete now or in the future.

DaanHoogland · 2026-06-11T07:44:03Z

+**Q28.** §8 P10 listing-scope — confirm the §10 invariant "`list*`
+responses are scoped to the principal's domain/account/project". And:
+is information leak via error messages / async-job status / event log
+an in-model concern, or accepted? *(maps to §8 P10, §9, §11)*


Regular system logs (log4j for instance) are exempt. Other than these, all information leaks are a concern.

DaanHoogland · 2026-06-11T07:44:55Z

+**Q29.** Data-at-rest encryption — confirm CloudStack delegates entirely
+to storage layer / hypervisor (LUKS, Ceph encryption, vSphere VM
+Encryption); no CloudStack-layer encryption of guest volumes. *(maps to
+§9)*


correct (@vishesh92 , please confirm)

DaanHoogland · 2026-06-11T07:57:44Z

+**Q30.** Constant-time comparison — confirm that *only* the API
+signature path uses `ConstantTimeComparator`. Login password compare,
+session cookie compare, console-token compare — none documented
+constant-time. Is that intentional? *(maps to §8, §9)*


I do not understand. Is the question whether the lack of documentation is intentional?
If that is indeed the question, then yes, this is a missing feature. (Pretty sure I am missing the point, cc @vishesh92.)
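
For context on what Q30 is asking, the sketch below contrasts an early-exit comparison with a constant-time one. It uses Python's standard `hmac.compare_digest` purely to illustrate the technique; it is not CloudStack's `ConstantTimeComparator`, and the helper names are hypothetical.

```python
import hmac


def early_exit_equal(a: bytes, b: bytes) -> bool:
    """Returns at the first differing byte, so run time depends on the match length."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False
    return True


def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Comparison whose run time does not depend on where the inputs differ."""
    return hmac.compare_digest(a, b)
```

Both functions return the same answers; the difference is that the first can leak, through response timing, how many leading bytes of a guessed secret were correct. The question is whether the login-password, session-cookie, and console-token comparisons should use the second kind.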

DaanHoogland · 2026-06-11T07:59:33Z

+**Q31.** Time-of-check-to-time-of-use between RBAC check at API entry
+and orchestration on agent fleet — confirm mid-job RBAC revocation is
+**not** retroactively enforced (proposed). *(maps to §9)*


agreed/confirmed

DaanHoogland · 2026-06-11T08:01:36Z

+**Q32.** TLS posture on `:8080` vs `:8443` — confirm production deploys
+behind TLS on `:8443` or behind a TLS-terminating reverse proxy; a bare
+`:8080` HTTP API is dev-only. *(maps to §5a, §10)*


DaanHoogland · 2026-06-11T08:02:51Z

+**Q33.** `security.encryption.key` reuse across environments — confirm
+that reusing the JaSypt key + IV across staging and production is a
+documented misuse. *(maps to §11)*


DaanHoogland · 2026-06-11T08:05:35Z

+**Q34.** Should this document live at `docs/threat-model.md` in
+`apache/cloudstack`, or as a page on `cloudstack.apache.org/security/`?
+Or both, with one canonical and the other linked? *(meta)*


In my not so humble opinion: cloudstack.apache.org/security should contain an excerpt of, and a link to, threat-model.md, the latter being the source of truth. @vishesh92 ?

DaanHoogland · 2026-06-11T08:07:39Z

+**Q35.** Is there an existing CloudStack threat-model document
+(Confluence, internal, or a `[SECURITY]`-tagged dev@ thread) that this
+should reconcile against rather than supersede? *(meta — §3.1a of the
+rubric)*


cloudstack.apache.org/security/ is the only security model at this moment, and this document should enhance it by providing a source of truth.

DaanHoogland · 2026-06-11T08:11:17Z

+**Q36.** What kind of change should trigger a revision (proposed list in
+§12 — confirm or correct)? *(meta, §12)*


One that I would add is a change in the extension mechanisms implemented by CloudStack.

DaanHoogland · 2026-06-11T08:14:16Z

+**Q38.** Confirm the structural decision to keep the four satellite repos
+as separate delta models (`cloudstack-go-threat-model-draft.md`,
+`cloudstack-cloudmonkey-threat-model-draft.md`,
+`cloudstack-terraform-provider-threat-model-draft.md`,
+`cloudstack-kubernetes-provider-threat-model-draft.md`) inheriting §3
+/ §4 / §7 from this document. *(meta, §3 item 9)*


Confirmed, these are not the system core. They cannot be used without the core, but the core can be used without them. There is in fact an added hierarchy among the repos, in that cloudstack-go is a dependency of the other three.

Fix lint failures flagged on draft-threat-model PR

f792725


yadvr requested review from DaanHoogland and vishesh92 June 1, 2026 07:16

DaanHoogland reviewed Jun 2, 2026
