fix: include boundary token in top-p nucleus sampling by zhuxiaoxuhit · Pull Request #1171 · fishaudio/fish-speech

zhuxiaoxuhit · 2026-03-11T09:20:14Z

The current implementation uses cum_probs > top_p to build the removal mask,
which marks the token that first pushes the cumulative probability over top_p
as removed. This means the actual nucleus covers less probability mass than
intended.

Example with top_p=0.9 and probs [0.50, 0.35, 0.10, 0.05]:
cum_probs = [0.50, 0.85, 0.95, 1.00]
current: keeps tokens at 0.50+0.35=0.85 (under-samples the nucleus)
correct: keeps tokens at 0.50+0.35+0.10=0.95 (covers top_p)

The fix shifts the removal mask right by one position before applying it,
so the token that first crosses the threshold is included. This matches
the standard implementation in transformers (TopPLogitsWarper) and other
LLM inference libraries.
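
To make the off-by-one concrete, here is a minimal sketch of both mask variants on the example above. It uses plain Python lists rather than the tensor code in this repository, and the helper name `nucleus_keep_mask` and its `include_boundary` flag are hypothetical, introduced only for illustration; it assumes the probabilities are already sorted in descending order.

```python
from itertools import accumulate


def nucleus_keep_mask(sorted_probs, top_p, include_boundary=True):
    """Return a keep mask for probabilities sorted in descending order.

    Hypothetical helper for illustration; not the repository's logits_to_probs.
    """
    cum_probs = list(accumulate(sorted_probs))
    # Removal mask as described in the PR: True where the running total exceeds top_p.
    remove = [c > top_p for c in cum_probs]
    if include_boundary:
        # Shift the mask right by one position so the token that first
        # crosses top_p is kept; the first token is never removed.
        remove = ([False] + remove)[: len(remove)]
    return [not r for r in remove]


probs = [0.50, 0.35, 0.10, 0.05]
old_keep = nucleus_keep_mask(probs, 0.9, include_boundary=False)
new_keep = nucleus_keep_mask(probs, 0.9, include_boundary=True)
old_mass = sum(p for p, k in zip(probs, old_keep) if k)
new_mass = sum(p for p, k in zip(probs, new_keep) if k)

print(old_keep, round(old_mass, 2))  # [True, True, False, False] 0.85
print(new_keep, round(new_mass, 2))  # [True, True, True, False] 0.95
```

Without the shift the kept mass is 0.85, below top_p=0.9; with the shift the boundary token is kept and the nucleus covers 0.95.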

Nucleus sampling with cum_probs > top_p excludes the token that first pushes the cumulative probability over the threshold, so the actual nucleus covers less than top_p of the probability mass. Shift the mask right by one to include that boundary token, matching the standard implementation in transformers and other LLM toolkits.

Stardust-minus · 2026-03-23T08:16:25Z

This fix is still needed, but the PR has merge conflicts with the current main branch (logits_to_probs has been refactored). Could you please rebase onto the latest main? Thanks!

zhuxiaoxuhit · 2026-03-23T09:27:24Z

Hi @Stardust-minus, the merge conflicts have been resolved. Please review when you have a minute. Thanks!

github-actions · 2026-04-23T00:47:11Z

This PR is stale because it has been open for 30 days with no activity.

Merge branch 'main' into fix/top-p-nucleus-sampling

1ffc56a

github-actions Bot added the stale label Apr 23, 2026

zhuxiaoxuhit wants to merge 2 commits into fishaudio:main from zhuxiaoxuhit:fix/top-p-nucleus-sampling
