fix: reimplement BertModelWarper utility methods for transformers v5 compatibility#113

Open
djdarcy wants to merge 1 commit into storyicon:main from djdarcy:fix/transformers-v5-get-head-mask

Conversation

@djdarcy djdarcy commented Apr 2, 2026

TL;DR:

  • Fixes GroundingDINO model loading crash with transformers >= 5.0
  • Reimplements three utility methods (get_head_mask, get_extended_attention_mask, invert_attention_mask) as local class methods in BertModelWarper instead of copying them as bound references from the BertModel instance

Problem

BertModelWarper.__init__ previously copied three methods directly from the BertModel instance:

self.get_extended_attention_mask = bert_model.get_extended_attention_mask
self.invert_attention_mask = bert_model.invert_attention_mask
self.get_head_mask = bert_model.get_head_mask

Transformers v5.0 broke this pattern in multiple ways:

  1. get_head_mask was removed entirely (huggingface/transformers PR #41076), causing AttributeError: 'BertModel' object has no attribute 'get_head_mask'
  2. get_extended_attention_mask changed its 3rd parameter from device to dtype, causing TypeError: to() received an invalid combination of arguments - got (dtype=torch.device, )
  3. invert_attention_mask still works today but uses the same fragile bound-method pattern and references self.dtype internally
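The fragility of the bound-method pattern is easy to reproduce in isolation. The sketch below uses hypothetical names (not the actual transformers API): a copied bound method keeps resolving attributes on the original object, and the copy fails outright once upstream removes the method:

```python
class Upstream:
    """Stand-in for transformers' BertModel (hypothetical names)."""
    dtype = "float32"

    def helper(self):
        # `self` here is always the Upstream instance, never the wrapper.
        return self.dtype


class Wrapper:
    """Stand-in for BertModelWarper using the old copy pattern."""

    def __init__(self, upstream):
        # Copying a bound method reference: if a newer Upstream release
        # deletes `helper`, this line raises AttributeError at
        # construction time -- the same failure mode as the v5 crash above.
        self.helper = upstream.helper
```

Deleting `Upstream.helper` (as transformers v5 did with `get_head_mask`) makes `Wrapper(...)` raise `AttributeError` before any inference runs.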

Fix

All three methods are reimplemented as proper class methods on BertModelWarper. The logic is identical to the original transformers implementations -- these are simple mask reshaping and dtype casting operations with no model-state dependencies beyond self.config.

This makes BertModelWarper fully self-contained and independent of the transformers library version.
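A minimal sketch of the self-contained approach (method bodies adapted from the pre-v5 transformers logic and simplified for illustration, so details may differ from the actual patch; in particular, passing `dtype` as a parameter instead of reading `self.dtype` is an assumption):

```python
import torch


class BertModelWarper:  # sketch only; the real class wraps a BertModel
    def get_extended_attention_mask(self, attention_mask, input_shape,
                                    dtype=torch.float32):
        # input_shape kept for signature compatibility (unused here).
        # Expand [batch, seq] (or [batch, from, to]) masks to
        # [batch, 1, 1, seq] and turn 1/0 into additive 0 / large-negative.
        if attention_mask.dim() == 3:
            extended = attention_mask[:, None, :, :]
        elif attention_mask.dim() == 2:
            extended = attention_mask[:, None, None, :]
        else:
            raise ValueError(f"Wrong shape for attention_mask: {attention_mask.shape}")
        extended = extended.to(dtype=dtype)
        return (1.0 - extended) * torch.finfo(dtype).min

    def invert_attention_mask(self, encoder_attention_mask, dtype=torch.float32):
        # Same additive-mask trick for the encoder (cross-attention) mask.
        if encoder_attention_mask.dim() == 3:
            mask = encoder_attention_mask[:, None, :, :]
        else:
            mask = encoder_attention_mask[:, None, None, :]
        return (1.0 - mask.to(dtype=dtype)) * torch.finfo(dtype).min

    def get_head_mask(self, head_mask, num_hidden_layers):
        # The common case is head_mask=None; a list of Nones keeps the
        # per-layer indexing the BERT encoder expects.
        if head_mask is None:
            return [None] * num_hidden_layers
        if head_mask.dim() == 1:
            head_mask = head_mask[None, None, :, None, None]
            head_mask = head_mask.expand(num_hidden_layers, -1, -1, -1, -1)
        elif head_mask.dim() == 2:
            head_mask = head_mask[:, None, :, None, None]
        return head_mask
```

Because the methods only reshape and cast masks, they have no dependency on the wrapped model's internals, which is what makes the wrapper version-independent.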

Testing

  • Tested with transformers==5.3.0 -- GroundingDINO model loads and runs inference successfully
  • The reimplemented methods produce the same mask shapes and values as the originals

Closes #110
Closes #111

…compatibility

BertModelWarper previously copied get_head_mask, get_extended_attention_mask,
and invert_attention_mask as bound method references from the BertModel instance.
Transformers v5.0 broke this pattern:

- get_head_mask: removed entirely (huggingface/transformers PR #41076)
- get_extended_attention_mask: 3rd parameter changed from `device` to `dtype`,
  causing a TypeError when passing a torch.device
- invert_attention_mask: uses self.dtype internally, same fragile binding pattern

All three methods are now reimplemented locally as proper class methods. The logic
is identical to the original transformers implementations -- these are simple mask
reshaping and dtype casting operations with no model-state dependencies.

Closes storyicon#110
Closes storyicon#111
@AmrKhaledDar

It works thx a lot

magiak commented Apr 10, 2026

Confirming this PR fixes the 'BertModel' object has no attribute 'get_head_mask' error for me.

Environment:

  • ComfyUI 0.3.40
  • Python 3.11.10
  • torch 2.5.1+cu124
  • NVIDIA RTX 4090

Checked out the PR branch (git fetch origin pull/113/head:fix-113 && git checkout fix-113), restarted ComfyUI, and GroundingDinoSAMSegment now loads weights and produces correct masks from text prompts. Tested with SAM sam_vit_b + GroundingDINO_SwinT_OGC. 👍

Would be great to see this merged.
