[NV] Use Marlin for MiniMax M3 TP-only configs by jasonlizhengjian · Pull Request #1809 · SemiAnalysisAI/InferenceX

jasonlizhengjian · 2026-06-16T15:51:26Z

Stacked on #1781 and #1784.

Adds --moe-backend marlin for MiniMax-M3 B200/B300 TP-only vLLM launch paths when expert parallelism is disabled.

Validation:

bash -n benchmarks/single_node/fixed_seq_len/minimaxm3_fp8_b200.sh benchmarks/single_node/fixed_seq_len/minimaxm3_fp8_b300.sh benchmarks/single_node/fixed_seq_len/minimaxm3_fp8_b200_mtp.sh benchmarks/single_node/fixed_seq_len/minimaxm3_fp8_b300_mtp.sh
git diff --check
PyYAML parse for perf-changelog.yaml and .github/configs/nvidia-master.yaml
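
The checks above could be wrapped in a small helper script. This is only a sketch: the function names `check_shell_syntax` and `check_yaml_parses` are hypothetical, and the use of `yaml.safe_load` is an assumption, since the PR says only that the files were parsed with PyYAML.

```shell
#!/usr/bin/env bash
# Sketch of the validation steps as reusable functions (names are hypothetical).

# Syntax-check each script with `bash -n` (parse only, nothing is executed).
# Returns non-zero if any file fails to parse.
check_shell_syntax() {
  local status=0 script
  for script in "$@"; do
    if ! bash -n "$script" 2>/dev/null; then
      echo "syntax error: $script" >&2
      status=1
    fi
  done
  return "$status"
}

# Parse each YAML file with PyYAML.
# Assumes python3 and the yaml module are installed; safe_load is an assumption.
check_yaml_parses() {
  local file
  for file in "$@"; do
    python3 -c 'import sys, yaml; yaml.safe_load(open(sys.argv[1]))' "$file" || return 1
  done
}
```

From the repository root, this might be invoked as `check_shell_syntax benchmarks/single_node/fixed_seq_len/minimaxm3_fp8_b200.sh` followed by `check_yaml_parses perf-changelog.yaml .github/configs/nvidia-master.yaml`.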

Note

Low Risk
Benchmark-only vLLM serve flag change on a narrow parallelism branch; no auth, data, or production runtime impact.

Overview
For MiniMax-M3 MXFP8 B200/B300 fixed-sequence vLLM recipes (standard and EAGLE3 MTP), the TP-only launch path now passes --moe-backend marlin when expert parallelism is off (EP_SIZE ≤ 1 and DP attention is false). DP-attention and TP+EP branches are unchanged.
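
The selection rule described above can be sketched as follows. This is an illustration rather than the code in the PR: the helper `moe_backend_args`, its argument order, and the example `TP` value are assumptions, and the actual scripts inline the logic in an `if`/`else` that sets `PARALLEL_ARGS`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: the real scripts inline this logic in an if/else.

# Prints the extra MoE flag for the TP-only path, or nothing otherwise.
# $1 = expert-parallel size, $2 = "true"/"false" for DP attention.
moe_backend_args() {
  local ep_size="$1" dp_attention="$2"
  if [ "$ep_size" -le 1 ] && [ "$dp_attention" != "true" ]; then
    echo "--moe-backend marlin"
  fi
}

TP=8  # example value, not taken from the PR
PARALLEL_ARGS="--tensor-parallel-size=$TP $(moe_backend_args 1 false)"
echo "$PARALLEL_ARGS"
```

With expert parallelism or DP attention enabled, the helper prints nothing, which matches the statement that those branches are unchanged.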

perf-changelog.yaml records this for minimaxm3-fp8-b200-vllm, minimaxm3-fp8-b300-vllm, and the matching MTP config keys.

Reviewed by Cursor Bugbot for commit 0ddf2cd. Bugbot is set up for automated code reviews on this repo.


github-actions · 2026-06-16T15:51:41Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, PR authors should request a review and get a PR approval from their respective company's CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

functionstackx · 2026-06-16T18:54:42Z

  PARALLEL_ARGS="--tensor-parallel-size=$TP --enable-expert-parallel"
 else
-  PARALLEL_ARGS="--tensor-parallel-size=$TP"
+  PARALLEL_ARGS="--tensor-parallel-size=$TP --moe-backend marlin"


Thanks for the contribution! Can you add Marlin for Blackwell to the vLLM recipes?

github-actions · 2026-06-16T19:20:08Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=27630519240
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=27630519240

jasonlizhengjian · 2026-06-16T19:36:13Z

@functionstackx can you merge? thanks

functionstackx · 2026-06-16T19:41:56Z

Happy to once there is a PR to the vLLM recipes repo with the recipe changes, per #1809 (comment).

jasonlizhengjian · 2026-06-16T20:08:15Z

@functionstackx PR to vllm recipe: vllm-project/recipes#558

functionstackx · 2026-06-16T20:30:52Z

/reuse-sweep-run

jasonlizhengjian added 8 commits June 15, 2026 12:03

Update MiniMax M3 B300 vLLM serving settings

4514277

Update MiniMax M3 B300 changelog PR link

1546aeb

Update MiniMax M3 B200 B300 MTP settings

16ccdff

Update MiniMax M3 MTP changelog link

389037e

Merge remote-tracking branch 'origin/nv/jasonli/minimaxm3-b300-vllm-m…

c3550d5

…25-params' into nv/jasonli/minimaxm3-stack-base-1781-1784 # Conflicts: # perf-changelog.yaml

Merge remote-tracking branch 'origin/nv/jasonli/minimaxm3-b200-b300-m…

322e972

…tp-serving-settings' into nv/jasonli/minimaxm3-stack-base-1781-1784 # Conflicts: # perf-changelog.yaml

Use Marlin for MiniMax M3 TP-only configs

161adc0

Update MiniMax M3 Marlin changelog link

02f6f47

jasonlizhengjian requested a review from a team June 16, 2026 15:51

jasonlizhengjian requested review from jgangani and kedarpotdar-nv as code owners June 16, 2026 15:51

github-project-automation Bot added this to InferenceMAX Board Jun 16, 2026

Update MiniMax M3 Marlin changelog link

32c5fe0

jasonlizhengjian added the full-sweep-enabled label Jun 16, 2026 — with ChatGPT Codex Connector

Preserve append-only changelog

a580725

jasonlizhengjian changed the title ~~[WIP][NV] Use Marlin for MiniMax M3 TP-only configs~~ [NV] Use Marlin for MiniMax M3 TP-only configs Jun 16, 2026

functionstackx reviewed Jun 16, 2026

View reviewed changes

Merge branch 'main' into nv/jasonli/minimaxm3-marlin-tp-only

0ddf2cd

functionstackx merged commit d99c824 into main Jun 16, 2026
4 checks passed

functionstackx deleted the nv/jasonli/minimaxm3-marlin-tp-only branch June 16, 2026 20:31

github-project-automation Bot moved this to Done in InferenceMAX Board Jun 16, 2026

functionstackx merged 11 commits into main from nv/jasonli/minimaxm3-marlin-tp-only


2 participants
