Conversation

@wenhuach21
Contributor

Description

Please briefly describe your main changes and the motivation.

Type of Change

  • Bug fix
  • New feature
  • Documentation update
  • Performance improvement
  • Code refactoring
  • Other (please specify):

Related Issues

Fixes or relates to #

Checklist Before Submitting

  • My code has been tested locally.
  • Documentation has been updated as needed.
  • New or updated tests are included where applicable.

Copilot AI review requested due to automatic review settings February 4, 2026 06:58
Contributor

Copilot AI left a comment

Pull request overview

This PR comments out most of the test cases in the scheme test file and adds a utility for dispatching a model block-wise across multiple devices, which the evaluation code now uses.

Changes:

  • Commented out extensive test cases in the scheme test file to focus on a single test_set_scheme test
  • Added a new dispatch_model_block_wise utility function for multi-device model dispatching
  • Updated evaluation code to use the new device dispatch mechanism
  • Changed a logging level from warning to info and added tie_weights() calls in multiple locations

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

File: Description

  • test/test_cpu/schemes/test_scheme.py: Commented out multiple test cases, leaving only test_set_scheme active
  • auto_round/utils/device.py: Added new dispatch_model_block_wise function for block-wise model dispatching across devices
  • auto_round/eval/evaluation.py: Updated prepare_model_for_eval to use new device dispatch utility and changed parameter name
  • auto_round/compressors/base.py: Changed logger level from warning to info, added tie_weights() call, and modified error raising
  • auto_round/auto_scheme/utils.py: Added tie_weights() call before device map inference
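
For readers unfamiliar with block-wise dispatching, the idea can be sketched in plain Python. This is an illustration only and is not the PR's dispatch_model_block_wise, whose implementation is not shown in this conversation. The helper names scale_max_memory and assign_blocks, the byte counts, and the greedy placement policy are all assumptions made for the example.

```python
from typing import Dict, List


def scale_max_memory(max_memory: Dict[str, int], ratio: float) -> Dict[str, int]:
    """Hypothetical helper: shrink each device budget to leave headroom."""
    return {device: int(limit * ratio) for device, limit in max_memory.items()}


def assign_blocks(block_sizes: List[int], max_memory: Dict[str, int]) -> Dict[int, str]:
    """Hypothetical helper: place consecutive blocks on devices in order.

    A block goes on the current device while it still fits; otherwise the
    next device is tried. Earlier devices are never revisited, so blocks
    that are adjacent in the model stay adjacent on a device.
    """
    devices = list(max_memory)
    placement: Dict[int, str] = {}
    device_idx = 0
    used = 0
    for block_id, size in enumerate(block_sizes):
        # Advance to the next device until the block fits or none remain.
        while device_idx < len(devices) and used + size > max_memory[devices[device_idx]]:
            device_idx += 1
            used = 0
        if device_idx == len(devices):
            raise MemoryError(f"block {block_id} does not fit on any remaining device")
        placement[block_id] = devices[device_idx]
        used += size
    return placement


if __name__ == "__main__":
    budgets = scale_max_memory({"cuda:0": 100, "cuda:1": 140}, ratio=0.5)
    print(assign_blocks([30, 30, 30], budgets))
```

Keeping each block whole on one device avoids splitting a block's layers across devices. The trade-off of this greedy policy is that it can leave memory unused on earlier devices.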

@wenhuach21 changed the title from "Fix 0204" to "support multiple device evaluation for activation quantized model" on Feb 4, 2026
wenhuach21 and others added 11 commits February 4, 2026 15:14
…ix_0204

# Conflicts:
#	auto_round/eval/eval_cli.py
Signed-off-by: n1ck-guo <heng.guo@intel.com>
Signed-off-by: n1ck-guo <heng.guo@intel.com>
Signed-off-by: n1ck-guo <heng.guo@intel.com>
# temporary tensors, other processes, and allocator fragmentation, reducing
# the chance of runtime OOM while still utilizing most available memory.
new_max_memory[device] = max_memory[device] * max_mem_ratio
new_max_memory = get_balanced_memory(
Contributor

As far as I remember, this will use every device listed in CUDA_VISIBLE_DEVICES. Setting CUDA_VISIBLE_DEVICES from the device_map information might help.
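
One way to act on this suggestion, sketched under assumptions: the helper names visible_devices_from_device_map and restrict_visible_devices are hypothetical, and the accepted device_map shapes (a comma-separated string such as "0,2", or a dict of module name to device) are guesses about what callers pass, not the project's actual API.

```python
import os
from typing import Dict, Union

DeviceMap = Union[str, Dict[str, Union[int, str]]]


def visible_devices_from_device_map(device_map: DeviceMap) -> str:
    """Hypothetical helper: collect the GPU indices named in a device_map."""
    entries = device_map.split(",") if isinstance(device_map, str) else device_map.values()
    ids = set()
    for entry in entries:
        text = str(entry).strip().lower()
        if text.startswith("cuda:"):
            text = text[len("cuda:"):]
        # Skip non-GPU entries such as "cpu", "disk", or "auto".
        if text.isdigit():
            ids.add(int(text))
    return ",".join(str(i) for i in sorted(ids))


def restrict_visible_devices(device_map: DeviceMap) -> str:
    """Hypothetical helper: export CUDA_VISIBLE_DEVICES if any GPU is named."""
    value = visible_devices_from_device_map(device_map)
    if value:
        os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value
```

Two caveats apply. The variable only takes effect if it is set before CUDA is initialized in the process. Once it is set, the visible devices are renumbered from zero, so a device_map that named physical GPU 2 would have to be rewritten in terms of the new indices.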
