[NV] Use Marlin for MiniMax M3 TP-only configs #1809

Merged
functionstackx merged 11 commits into main from nv/jasonli/minimaxm3-marlin-tp-only on Jun 16, 2026

@jasonlizhengjian jasonlizhengjian commented Jun 16, 2026

Collaborator

Stacked on #1781 and #1784.

Adds --moe-backend marlin for MiniMax-M3 B200/B300 TP-only vLLM launch paths when expert parallelism is disabled.

Validation:

  • bash -n benchmarks/single_node/fixed_seq_len/minimaxm3_fp8_b200.sh benchmarks/single_node/fixed_seq_len/minimaxm3_fp8_b300.sh benchmarks/single_node/fixed_seq_len/minimaxm3_fp8_b200_mtp.sh benchmarks/single_node/fixed_seq_len/minimaxm3_fp8_b300_mtp.sh
  • git diff --check
  • PyYAML parse for perf-changelog.yaml and .github/configs/nvidia-master.yaml
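
The checks listed above can be approximated locally with a small script. This is a minimal sketch, not the author's actual tooling: the helper names `check_shell_syntax` and `check_yaml_parses` are hypothetical, and the YAML check assumes `python3` with PyYAML is installed.

```shell
#!/usr/bin/env bash
# Hypothetical helpers mirroring the validation steps listed in the PR.

# Parse each script with `bash -n` (syntax check only, nothing is executed).
# Returns 1 if any file fails to parse.
check_shell_syntax() {
    local status=0 f
    for f in "$@"; do
        if ! bash -n "$f"; then
            echo "syntax error: $f" >&2
            status=1
        fi
    done
    return "$status"
}

# Load each YAML file with PyYAML's safe_load (assumes PyYAML is installed).
# Stops at the first file that fails to parse.
check_yaml_parses() {
    local f
    for f in "$@"; do
        python3 -c 'import sys, yaml; yaml.safe_load(open(sys.argv[1]))' "$f" || return 1
    done
}
```

From the repository root, one might run `check_shell_syntax` on the four `minimaxm3_fp8_*.sh` scripts, then `git diff --check` for whitespace errors, then `check_yaml_parses perf-changelog.yaml .github/configs/nvidia-master.yaml`.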

Note

Low Risk
Benchmark-only vLLM serve flag change on a narrow parallelism branch; no auth, data, or production runtime impact.

Overview
For MiniMax-M3 MXFP8 B200/B300 fixed-sequence vLLM recipes (standard and EAGLE3 MTP), the TP-only launch path now passes --moe-backend marlin when expert parallelism is off (EP_SIZE ≤ 1 and DP attention is false). DP-attention and TP+EP branches are unchanged.

perf-changelog.yaml records this for minimaxm3-fp8-b200-vllm, minimaxm3-fp8-b300-vllm, and the matching MTP config keys.
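
The branch selection described in this overview might look roughly like the following. It is a sketch under stated assumptions: `build_parallel_args` is a hypothetical helper, the `EP_SIZE > 1` test is inferred from the overview rather than copied from the scripts, and the DP-attention branch is left out because its flags are not shown in this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the parallelism flags for `vllm serve`.
# Arguments: $1 = tensor-parallel size (TP), $2 = expert-parallel size (EP_SIZE).
# The DP-attention branch is intentionally omitted (unchanged by this PR).
build_parallel_args() {
    local tp="$1" ep_size="$2"
    if [ "$ep_size" -gt 1 ]; then
        # TP+EP branch: unchanged by this PR.
        echo "--tensor-parallel-size=$tp --enable-expert-parallel"
    else
        # TP-only branch: this PR adds the Marlin MoE backend here.
        echo "--tensor-parallel-size=$tp --moe-backend marlin"
    fi
}

# Example: TP=8 with expert parallelism disabled.
PARALLEL_ARGS="$(build_parallel_args 8 1)"
echo "$PARALLEL_ARGS"
# prints: --tensor-parallel-size=8 --moe-backend marlin
```

Keeping the backend flag inside the TP-only branch means the TP+EP path keeps its existing backend selection.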

Reviewed by Cursor Bugbot for commit 0ddf2cd. Bugbot is set up for automated code reviews on this repo.

@github-actions

Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

@jasonlizhengjian jasonlizhengjian changed the title from "[WIP][NV] Use Marlin for MiniMax M3 TP-only configs" to "[NV] Use Marlin for MiniMax M3 TP-only configs" on Jun 16, 2026
  PARALLEL_ARGS="--tensor-parallel-size=$TP --enable-expert-parallel"
  else
- PARALLEL_ARGS="--tensor-parallel-size=$TP"
+ PARALLEL_ARGS="--tensor-parallel-size=$TP --moe-backend marlin"

Collaborator

Thanks for the contribution! Can you add Marlin for Blackwell to the vLLM recipes?

@jasonlizhengjian

Collaborator Author

@functionstackx can you merge? thanks

@functionstackx

Collaborator

Happy to, once there is a PR to the vLLM recipes repo with the recipe changes per #1809 (comment).

@jasonlizhengjian

Collaborator Author

@functionstackx PR to vllm recipe: vllm-project/recipes#558

@functionstackx

Collaborator

/reuse-sweep-run

@functionstackx functionstackx merged commit d99c824 into main Jun 16, 2026
4 checks passed
@functionstackx functionstackx deleted the nv/jasonli/minimaxm3-marlin-tp-only branch June 16, 2026 20:31