support gemma4 use_hybrid_sliding by Jintao-Huang · Pull Request #129 · modelscope/mcore-bridge

Jintao-Huang · 2026-06-18T08:20:54Z

No description provided.

gemini-code-assist

Code Review

This pull request introduces a hybrid attention mode for Gemma4 in multimodal scenarios, allowing full attention layers to fallback to unfused attention with arbitrary masks while sliding layers utilize the configured backend with native sliding windows. The feedback recommends safely accessing the attention backend name using getattr to prevent potential AttributeErrors. Additionally, it is suggested to replace the mm_token_type_ids is None check with a more robust has_vision flag to correctly handle text-only batches that contain zero-valued token type tensors.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist · 2026-06-18T08:22:14Z

+            # In hybrid mode (backend != unfused), only full attention layers need arbitrary mask;
+            # sliding layers keep causal mask and use native flash sliding window.
+            # When backend is unfused, all layers use arbitrary mask.
+            if config.attention_backend.name == 'unfused' or not self.is_sliding:


To prevent potential AttributeError if config.attention_backend is None or does not have a name attribute (e.g., in certain testing or custom configurations), use getattr to safely retrieve the backend name.

Suggested change

if config.attention_backend.name == 'unfused' or not self.is_sliding:

if getattr(config.attention_backend, 'name', None) == 'unfused' or not self.is_sliding:

gemini-code-assist · 2026-06-18T08:22:14Z

-                    'errors in multimodal scenarios. Please always pass pure text data.')
-                text_config.use_bidirectional_attention = None
-            else:
+            if config.attention_backend.name == 'unfused':


Safely access the attention_backend name using getattr to avoid potential AttributeError if the backend is not configured or is None.

Suggested change

if config.attention_backend.name == 'unfused':

if getattr(config.attention_backend, 'name', None) == 'unfused':

gemini-code-assist · 2026-06-18T08:22:14Z

+        self.hybrid_attention_mode = (
+            text_config.use_bidirectional_attention == 'vision' and config.attention_backend.name != 'unfused')


Safely access the attention_backend name using getattr to avoid potential AttributeError if the backend is not configured or is None.

self.hybrid_attention_mode = (\n text_config.use_bidirectional_attention == 'vision' and getattr(config.attention_backend, 'name', None) != 'unfused')

gemini-code-assist · 2026-06-18T08:22:14Z

-        sliding_attention = sliding_attention | window_mask
+        # In hybrid mode, only skip sliding mask when no vision tokens (pure text batch).
+        # When vision tokens are present, fall back to all-unfused behavior for correctness.
+        use_hybrid_sliding = self.hybrid_attention_mode and mm_token_type_ids is None


If mm_token_type_ids is not None but contains no vision tokens (e.g., a tensor of all zeros, which is common for text-only batches in multimodal pipelines), checking mm_token_type_ids is None will incorrectly evaluate to False. This disables hybrid sliding and falls back to the slower manual sliding window mask. We should check if there are actually any vision tokens present in the batch.

has_vision = mm_token_type_ids is not None and (mm_token_type_ids > 0).any()\n use_hybrid_sliding = self.hybrid_attention_mode and not has_vision

gemini-code-assist · 2026-06-18T08:22:14Z

        attention_mask['sliding_attention'] = sliding_attention
        attention_mask['full_attention'] = full_attention
+        # Signal sliding layers to override attn_mask_type to arbitrary in hybrid mode with vision
+        attention_mask['_sliding_override_arbitrary'] = self.hybrid_attention_mode and mm_token_type_ids is not None


Use the has_vision boolean flag defined earlier to avoid overriding the attention mask to arbitrary when there are no actual vision tokens in the batch.

Suggested change

attention_mask['_sliding_override_arbitrary'] = self.hybrid_attention_mode and mm_token_type_ids is not None

attention_mask['_sliding_override_arbitrary'] = self.hybrid_attention_mode and has_vision

support gemma4 use_hybrid_sliding

731d56a

gemini-code-assist Bot reviewed Jun 18, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

support gemma4 use_hybrid_sliding#129

support gemma4 use_hybrid_sliding#129
Jintao-Huang wants to merge 1 commit into
modelscope:mainfrom
Jintao-Huang:support_gemma4_use_hybrid_sliding

Jintao-Huang commented Jun 18, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Jun 18, 2026

Uh oh!

gemini-code-assist Bot Jun 18, 2026

Uh oh!

gemini-code-assist Bot Jun 18, 2026

Uh oh!

gemini-code-assist Bot Jun 18, 2026

Uh oh!

gemini-code-assist Bot Jun 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

	if config.attention_backend.name == 'unfused' or not self.is_sliding:
	if getattr(config.attention_backend, 'name', None) == 'unfused' or not self.is_sliding:

		self.hybrid_attention_mode = (
		text_config.use_bidirectional_attention == 'vision' and config.attention_backend.name != 'unfused')

	attention_mask['_sliding_override_arbitrary'] = self.hybrid_attention_mode and mm_token_type_ids is not None
	attention_mask['_sliding_override_arbitrary'] = self.hybrid_attention_mode and has_vision

Conversation

Jintao-Huang commented Jun 18, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant