Use Opus and environment variables for model selection#528

Open
hanna-paasivirta wants to merge 4 commits into main from chat-model-env

Conversation

hanna-paasivirta commented Jun 15, 2026

Contributor

Short Description

Fable was retired, which broke every chat service that pointed at it. This moves the main chat model off Fable and makes it configurable. Model selection now lives in one place, and optional env vars let us change the live model without a redeploy.

Fixes #533 and #534

Implementation Details

  • services/models.py owns the whole model story: a default (Opus), a per-service map, and preferred_chat_model(service).
  • Each service resolves its model from its own env var if set, otherwise its code default, otherwise the global default. There is one env var per service, no catch-all.
  • workflow_chat defaults to Sonnet, not Opus. It forces JSON/YAML output through structured outputs, and Opus handles that worse than Sonnet right now. The other services default to Opus.
  • The service yamls no longer set a model. They point a comment at models.py.
  • Optional env vars, one per agent: APOLLO_GLOBAL_CHAT_MODEL (planner), APOLLO_WORKFLOW_CHAT_MODEL, APOLLO_JOB_CHAT_MODEL. Unset by default. doc_agent has no var and runs on the default.
  • Tests for models.py live in a new services/tests/unit/ directory, since models.py is a shared root module not owned by any one service.
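
The resolution order described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: `preferred_chat_model`, `CHAT_MODEL_DEFAULT`, the three env var names, and the `claude-opus` / `claude-sonnet` aliases come from this thread, while the two dictionaries and the optional `env` parameter are assumptions added to keep the example self-contained and testable.

```python
import os
from typing import Mapping, Optional

# Global fallback. The alias comes from .env.example in this PR.
CHAT_MODEL_DEFAULT = "claude-opus"

# Assumed shape of the per-service code defaults: only workflow_chat differs.
SERVICE_DEFAULTS = {"workflow_chat": "claude-sonnet"}

# One env var per service. doc_agent deliberately has no entry.
SERVICE_ENV_VARS = {
    "global_chat": "APOLLO_GLOBAL_CHAT_MODEL",
    "workflow_chat": "APOLLO_WORKFLOW_CHAT_MODEL",
    "job_chat": "APOLLO_JOB_CHAT_MODEL",
}


def preferred_chat_model(service: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the env override if set, else the service default, else the global default."""
    env = os.environ if env is None else env
    var = SERVICE_ENV_VARS.get(service)
    # Treat an unset or blank variable as "no override".
    override = env.get(var, "").strip() if var else ""
    if override:
        return override
    return SERVICE_DEFAULTS.get(service, CHAT_MODEL_DEFAULT)
```

Under these assumptions, `preferred_chat_model("doc_agent")` always falls through to the global default, because no env var is mapped to that service.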

AI Usage

Please disclose how you've used AI in this work (it's cool, we just want to know!):

  • Code generation (copilot but not intellisense)
  • Learning or fact checking
  • Strategy / design
  • Optimisation / refactoring
  • Translation / spellchecking / doc gen
  • Other
  • I have not used AI

You can read more details in our Responsible AI Policy

hanna-paasivirta commented Jun 15, 2026

Contributor Author

I'm seeing garbled model outputs triggered by structured outputs in workflow_chat. I've never seen these in that service with Sonnet, but I did see them in job_chat, where I fixed them by adding a code edit tool and restricting structured outputs to that tool. To switch workflow_chat to Opus safely, I may need to make a similar architecture change there. To keep workflow_chat on Sonnet in the meantime, I'll make the model setting more fine-grained and set it per service instead.

hanna-paasivirta changed the title from "Use environment variables for model selection" to "Use Opus and environment variables for model selection" Jun 15, 2026
hanna-paasivirta marked this pull request as ready for review June 15, 2026 18:54

josephjclark left a comment

Collaborator

Hi @hanna-paasivirta - I just wanted to post some initial impressions. Unfortunately this project really doesn't sit well with me, and seeing the implementation makes me very nervous.

I'm only half-way through the review and need to give it some more time, but I wanted to post where I'd got to so far.

Also - despite very heavy AI commenting, there's no readme documentation about how this works (just some very spurious-looking env var defaults?). We should correct that, because the relationship between envs and config is murky.

@@ -1,5 +1,6 @@
config_version: 1.0
model: claude-fable
# The chat model is configured in services/models.py (the default; doc_agent has

Collaborator

this comment doesn't make sense in isolation: it only makes sense if you know that the model used to be set in config. It's confusing and we should remove it. Same for the other models.

Collaborator

I don't actually love it 😬 Intuitively it feels like this config should hold defaults for all values, and env vars can be used to override them (somehow, it's not entirely practical!)

The split now of some things being envs and some things being config values feels confusing, rigid and arbitrary
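
One way to picture the "config holds every default, env vars override" idea is the sketch below. It is purely illustrative: `load_settings` and the `APOLLO_<SERVICE>_<KEY>` naming rule are hypothetical and do not exist in the codebase or in this PR. Note that the override arrives as a plain string, which hints at why the reviewer calls the approach not entirely practical.

```python
import os
from typing import Any, Dict, Mapping, Optional


def load_settings(
    service: str,
    config: Mapping[str, Any],
    env: Optional[Mapping[str, str]] = None,
    prefix: str = "APOLLO",
) -> Dict[str, Any]:
    """Start from the service's config values and overlay any matching env vars."""
    env = os.environ if env is None else env
    settings = dict(config)
    for key in config:
        # Hypothetical naming rule, e.g. job_chat + model -> APOLLO_JOB_CHAT_MODEL
        var = f"{prefix}_{service}_{key}".upper()
        if env.get(var):
            # Env values are always strings; no type coercion is attempted here.
            settings[key] = env[var]
    return settings
```

With this rule, a config of `{"model": "claude-opus"}` for `job_chat` would be overridden by `APOLLO_JOB_CHAT_MODEL` and left untouched otherwise.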

Comment thread services/models.py
"""Resolve the main chat model for `service`.

Precedence: the service's env var if set, else its per-service default, else
CHAT_MODEL_DEFAULT. Each service's env var (e.g. APOLLO_WORKFLOW_CHAT_MODEL)

Collaborator

these comments are so so verbose. I think I need to start pushing back on them. The lightning codebase is probably more comment than code now.

Anyway this second sentence I don't like. It's repetitive, plus the "we can switch models without redeploying" thing is misleading. To change an env var you have to configure kubernetes and then restart the service.

It would be more accurate to say you can update it without a rebuild. But I wouldn't even say that at this level.

Comment thread .env.example
# redeploying. Accepts an alias (claude-opus, claude-sonnet) or a full model ID.
# APOLLO_GLOBAL_CHAT_MODEL= # global_chat planner
# APOLLO_WORKFLOW_CHAT_MODEL= # workflow_chat
# APOLLO_JOB_CHAT_MODEL= # job_chat

Collaborator

These sample env vars don't make sense, do they? What does job_chat resolve to?

@josephjclark

Collaborator

@hanna-paasivirta what I think makes more sense here is:

a) to hard-code each service to a particular model, as we do on main
b) to use env vars to drive the version of each model

So you'd have an env var OPUS_VERSION with a value of 4-8.

Then models.py would have a function like getModelVersion(name) or something. And job_chat calls getModelVersion('opus'), which returns claude-opus-4-8 (where the version suffix comes from env or a default).

The model name can still come from the config.yaml file for each service, which is where any service specific stuff lives. But it would only have a model name, not a version.

Basically this means that the env only drives the version number, not the model itself. Otherwise the code is much as it is on main right now, where the service itself makes the big decisions about which model to use, and the env just bumps the version to keep it modern.
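
A rough sketch of this proposal, written in Python to match models.py, might look like the following. The `OPUS_VERSION` variable, the `4-8` value, and the `claude-opus-4-8` result are taken from the comment above; the snake_case name `get_model_version`, the `DEFAULT_VERSIONS` table, and the optional `env` parameter are assumptions made for illustration only.

```python
import os
from typing import Mapping, Optional

# Assumed built-in defaults, used when the matching env var is unset.
DEFAULT_VERSIONS = {"opus": "4-8"}


def get_model_version(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Build a full model ID from a model name plus a version from env or a default."""
    env = os.environ if env is None else env
    if name not in DEFAULT_VERSIONS:
        raise ValueError(f"unknown model name: {name!r}")
    # e.g. "opus" -> OPUS_VERSION; a blank value falls back to the default.
    version = env.get(f"{name.upper()}_VERSION", "").strip() or DEFAULT_VERSIONS[name]
    return f"claude-{name}-{version}"
```

In this shape the service still chooses the model name (for example from its config.yaml), and the environment only moves the version suffix.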

As I've said on slack: this architecture would not have helped us with the fable switch-off: fable support needed more than just a version string, and a dynamic downgrade likely needs more too. If we want to be robust to models disappearing overnight (a very worrying precedent) we should put some thought into better rollback solutions (I'm aware there's another PR open for this).

hanna-paasivirta commented Jun 22, 2026

Contributor Author

As I've said on slack: this architecture would not have helped us with the fable switch-off: fable support needed more than just a version string, and a dynamic downgrade likely needs more too. If we want to be robust to models disappearing overnight (a very worrying precedent) we should put some thought into better rollback solutions (I'm aware there's another PR open for this).

We needed more because we were upgrading to a new model. The added protections work OK for existing models. But there's little guarantee that we will always be able to downgrade the model as easily in the future as we can now with fable/opus/sonnet.

@hanna-paasivirta

Contributor Author

Basically this means that the env only drives the version number, not the model itself.

I think there has never been a situation where the pointer to a model version (without specifying a snapshot) stopped working and required an intervention on our side. But to be fair, a model family had never been taken down before either.

Development

Successfully merging this pull request may close these issues.

Upgrade model to Opus

2 participants