Use Opus and environment variables for model selection#528
Use Opus and environment variables for model selection#528hanna-paasivirta wants to merge 4 commits into
Conversation
|
I'm seeing the garbled model outputs triggered by structured outputs in |
josephjclark
left a comment
There was a problem hiding this comment.
Hi @hanna-paasivirta - I just wanted to post some initial impressions. Unfortunately this project really doesn't sit well with me, and seeing the implementation makes me very nervous.
I'm only half-way through the review and need to give it some more time, but I wanted to post where I'd got to so far.
Also - despite very heavy AI commenting there's no readme documentation about how this works (just some very spurious looking env var defaults?). We should correct that because the relationship between envs and config is murky
| @@ -1,5 +1,6 @@ | |||
| config_version: 1.0 | |||
| model: claude-fable | |||
| # The chat model is configured in services/models.py (the default; doc_agent has | |||
There was a problem hiding this comment.
this comment doesn't make sense in isolation: it only makes sense if you know that the model used to be set in config. It's confusing and we should remove it. Same for the other models.
There was a problem hiding this comment.
I don't actually love it 😬 Intuitively it feels that this config should be defaults for all values, and env vars can be used to override it(somehow, it's not entirely practical!)
The split now of some things being envs and some things being config values feels confusing, rigid and arbitrary
| """Resolve the main chat model for `service`. | ||
|
|
||
| Precedence: the service's env var if set, else its per-service default, else | ||
| CHAT_MODEL_DEFAULT. Each service's env var (e.g. APOLLO_WORKFLOW_CHAT_MODEL) |
There was a problem hiding this comment.
these comments are so so verbose. I think I need to start pushing back on them. The lightning codebase is probably more comment than code now.
Anyway this second sentence I don't like. It's repetitive, plus the "we can switch models without redeploying" thing is misleading. To change an env var you have to configure kubernetes and then restart the service.
It would be more accurate to say you can update it without a rebuild. But I wouldn't even say that at this level.
| # redeploying. Accepts an alias (claude-opus, claude-sonnet) or a full model ID. | ||
| # APOLLO_GLOBAL_CHAT_MODEL= # global_chat planner | ||
| # APOLLO_WORKFLOW_CHAT_MODEL= # workflow_chat | ||
| # APOLLO_JOB_CHAT_MODEL= # job_chat |
There was a problem hiding this comment.
These sample env vars don't make sense do that? What does job_chat resolve to?
|
@hanna-paasivirta what I think makes more sense here is: a) to hard-code each service to a particular model, as we do on main So you'd have an env var Then The model name can still come from the config.yaml file for each service, which is where any service specific stuff lives. But it would only have a model name, not a version. Basically this means that the env only drives the version number, not the model itself. Otherwise the code is much as it is on main right now, where the service itself makes the big decisions about which model to use, and the env just bumps the version to keep it modern. As I've said on slack: this architecture would not have helped us with the fable switch-off: fable support needed more than just a version string, and a dynamic downgrade likely needs more to. If we want to be robust to models disappearing overnight (a very worrying precedent) we should put some thought into be better rollback solutions (I'm aware there's another PR open for this) |
We needed more because we were upgrading the model to a new one. The added protections work ok for existing models. But there's little guarantee that we can always downgrade the model easily in the future like we can now with fable/opus/sonnet |
I think there has never been a situation where the pointer to a model version (without specifying a snapshot) would stop work working and require an intervention on our side. But to be fair a model family had never been taken down before either. |
Short Description
Fable was retired, which broke every chat service that pointed at it. This moves the main chat model off Fable and makes it configurable. Model selection now lives in one place, and optional env vars let us change the live model without a redeploy.
Fixes #533 and #534
Implementation Details
services/models.pyowns the whole model story: a default (Opus), a per-service map, andpreferred_chat_model(service).models.py.APOLLO_GLOBAL_CHAT_MODEL(planner),APOLLO_WORKFLOW_CHAT_MODEL,APOLLO_JOB_CHAT_MODEL. Unset by default. doc_agent has no var and runs on the default.AI Usage
Please disclose how you've used AI in this work (it's cool, we just want to know!):
You can read more details in our Responsible AI Policy