[BugFix] Fix token_penalty kernel #6069
base: develop
Thanks for your contribution!
Pull request overview
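This PR fixes a critical bug in the token penalty kernel. In the original code, the `ban_bad_words` kernel read batch 0's `bad_tokens` data for every batch entry instead of each entry's own `bad_tokens`. The fix computes the correct per-batch `bad_tokens` pointer offset inside the kernel.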
Changes:
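- Fixed the `ban_bad_words` kernel function so that it reads the `bad_tokens` data belonging to each batch entry
- Unified the code formatting, changing indentation from 4 spaces to 2 spaces (matching the project's clang-format configuration)
- Renamed the parameter `bad_words_list` to `bad_tokens` for naming consistency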
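The indexing fix described above can be illustrated with a small host-side sketch. This is not the PR's kernel: `ban_bad_words_cpu` is a hypothetical CPU stand-in in which one loop iteration plays the role of one CUDA block (`bi = blockIdx.x`). It assumes `bad_tokens` is a row-major `[bs, bad_words_len]` buffer, that negative ids are padding, and that banning a token means writing a large negative sentinel into its logit. The `per_batch_offset` flag exists only to contrast the buggy and fixed behaviour.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical CPU stand-in for the ban_bad_words kernel (illustration only).
// per_batch_offset == false: every entry reads batch 0's row (the bug).
// per_batch_offset == true:  each entry reads its own row (the fix).
inline void ban_bad_words_cpu(std::vector<float>& logits,
                              const std::vector<int64_t>& bad_tokens,
                              int bs, int vocab_size, int bad_words_len,
                              bool per_batch_offset) {
  for (int bi = 0; bi < bs; ++bi) {  // bi stands in for blockIdx.x
    float* logits_now = logits.data() + static_cast<int64_t>(bi) * vocab_size;
    const int64_t row_offset =
        per_batch_offset ? static_cast<int64_t>(bi) * bad_words_len : 0;
    const int64_t* bad_tokens_now = bad_tokens.data() + row_offset;
    for (int i = 0; i < bad_words_len; ++i) {
      const int64_t token = bad_tokens_now[i];
      // Assumed convention: ids outside [0, vocab_size) are padding.
      if (token < 0 || token >= vocab_size) continue;
      logits_now[token] = -1e10f;  // assumed "banned" sentinel value
    }
  }
}
```

With `bs = 2`, `bad_words_len = 2` and `bad_tokens = {1, -1, 3, -1}`, the fixed path bans token 1 for batch 0 and token 3 for batch 1, while the buggy path bans token 1 for both entries and never touches token 3.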
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| custom_ops/gpu_ops/token_penalty_multi_scores.cu | Fixes the batch-indexing bug in the `ban_bad_words` kernel and unifies the code formatting |
| custom_ops/gpu_ops/speculate_decoding/speculate_get_token_penalty_multi_scores.cu | Applies the same bug fix to the speculative decoding version |
| const int bi = blockIdx.x; | ||
| int tid = threadIdx.x; | ||
| T *logits_now = logits + bi * vocab_size; | ||
| const int64_t bad_tokens_now = bad_tokens + bi * bad_words_len; |
Copilot AI, Jan 16, 2026
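Type error: `bad_tokens_now` should be declared as `const int64_t *` (a pointer type), not `int64_t` (an integer type). This causes a compilation error, because later code tries to index it as an array (`bad_tokens_now[i]`). The correct declaration is: `const int64_t *bad_tokens_now = bad_tokens + bi * bad_words_len;`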
| const int64_t bad_tokens_now = bad_tokens + bi * bad_words_len; | |
| const int64_t *bad_tokens_now = bad_tokens + bi * bad_words_len; |
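As a rough illustration of why the pointer declaration in the suggested change is the one that compiles, the sketch below wraps the same expression in a hypothetical helper, `bad_tokens_row` (a name invented here, not part of the PR). Pointer arithmetic on a `const int64_t *` yields another `const int64_t *`, which can be indexed. C++ has no implicit pointer-to-integer conversion, so a `const int64_t` cannot be initialized from that expression.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical helper mirroring the corrected declaration: returns a pointer
// to the start of batch entry bi's row in a [bs, bad_words_len] buffer.
inline const int64_t* bad_tokens_row(const int64_t* bad_tokens, int bi,
                                     int bad_words_len) {
  return bad_tokens + static_cast<int64_t>(bi) * bad_words_len;
}

// Pointer plus integer is still a pointer (checked at compile time).
static_assert(
    std::is_same<decltype(static_cast<const int64_t*>(nullptr) + 1),
                 const int64_t*>::value,
    "pointer arithmetic must yield a pointer");

// The flagged form would not compile, because a pointer cannot initialize an integer:
//   const int64_t bad_tokens_now = bad_tokens + bi * bad_words_len;
```

The cast to `int64_t` before the multiplication is an extra precaution added in this sketch to avoid 32-bit overflow for large buffers. The reviewed code multiplies plain `int` values.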
| if (bi >= bs) return; | ||
| int tid = threadIdx.x; | ||
| T *logits_now = logits + token_idx * length; | ||
| const int64_t *bad_tokens_now = bad_tokens + bi * bad_words_length; |
Copilot AI, Jan 16, 2026
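The implementation in this file is correct (it uses a pointer type), but it is inconsistent with the implementation in the other file (token_penalty_multi_scores.cu line 137). Please make sure the two files stay consistent and both use the correct pointer type declaration.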
760c41c to a423edd
Motivation
Modifications
Usage or Command
Accuracy Tests
Checklist
[`[FDConfig]`, `[APIServer]`, `[Engine]`, `[Scheduler]`, `[PD Disaggregation]`, `[Executor]`, `[Graph Optimization]`, `[Speculative Decoding]`, `[RL]`, `[Models]`, `[Quantization]`, `[Loader]`, `[OP]`, `[KVCache]`, `[DataProcessor]`, `[BugFix]`, `[Docs]`, `[CI]`, `[Optimization]`, `[Feature]`, `[Benchmark]`, `[Others]`, `[XPU]`, `[HPU]`, `[GCU]`, `[DCU]`, `[Iluvatar]`, `[Metax]`]
`pre-commit` before commit.
`release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.