fix: reimplement BertModelWarper utility methods for transformers v5 compatibility #113
Open
djdarcy wants to merge 1 commit into storyicon:main from
Conversation
…compatibility

BertModelWarper previously copied get_head_mask, get_extended_attention_mask, and invert_attention_mask as bound method references from the BertModel instance. Transformers v5.0 broke this pattern:

- get_head_mask: removed entirely (huggingface/transformers PR #41076)
- get_extended_attention_mask: 3rd parameter changed from `device` to `dtype`, causing a TypeError when passing a torch.device
- invert_attention_mask: uses self.dtype internally, same fragile binding pattern

All three methods are now reimplemented locally as proper class methods. The logic is identical to the original transformers implementations -- these are simple mask reshaping and dtype casting operations with no model-state dependencies.

Closes storyicon#110
Closes storyicon#111
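The three reimplemented helpers are small, stateless tensor transforms. As one illustration, here is a minimal sketch of the mask-inversion logic described above, written as a standalone function; the PR itself defines these as methods on BertModelWarper, and its exact code may differ:

```python
import torch

def invert_attention_mask(encoder_attention_mask: torch.Tensor,
                          dtype: torch.dtype) -> torch.Tensor:
    """Turn a 0/1 attention mask into an additive mask for softmax logits."""
    if encoder_attention_mask.dim() == 3:
        # [batch, from_seq, to_seq] -> [batch, 1, from_seq, to_seq]
        extended = encoder_attention_mask[:, None, :, :]
    elif encoder_attention_mask.dim() == 2:
        # [batch, to_seq] -> [batch, 1, 1, to_seq]
        extended = encoder_attention_mask[:, None, None, :]
    else:
        raise ValueError(f"Wrong mask shape: {tuple(encoder_attention_mask.shape)}")
    extended = extended.to(dtype=dtype)
    # 1 (attend) -> 0.0; 0 (masked) -> most negative value for this dtype
    return (1.0 - extended) * torch.finfo(dtype).min
```

Because the dtype is passed in explicitly, nothing here reads state off a live BertModel, which is what makes a local copy safe across transformers versions.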
It works thx a lot
Confirming this PR fixes the 'BertModel' object has no attribute 'get_head_mask' error for me. Environment: ComfyUI 0.3.40. Would be great to see this merged.
TLDR version:
- Fixes breakage under transformers >= 5.0
- Reimplements the affected utility methods (get_head_mask, get_extended_attention_mask, invert_attention_mask) as local class methods in BertModelWarper instead of copying them as bound references from the BertModel instance

Problem
BertModelWarper.__init__ previously copied three methods directly from the BertModel instance. Transformers v5.0 broke this pattern in multiple ways:

- get_head_mask was removed entirely (huggingface/transformers PR #41076), causing AttributeError: 'BertModel' object has no attribute 'get_head_mask'
- get_extended_attention_mask changed its 3rd parameter from `device` to `dtype`, causing TypeError: to() received an invalid combination of arguments - got (dtype=torch.device, )
- invert_attention_mask still works today but uses the same fragile bound-method pattern and references self.dtype internally

Fix
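A toy reproduction of why the bound-reference pattern fails (class names are illustrative stand-ins, not the real transformers classes):

```python
class BertModelV4:
    """Stand-in for a pre-v5 BertModel that still has get_head_mask."""
    def get_head_mask(self, head_mask, num_hidden_layers):
        return [None] * num_hidden_layers if head_mask is None else head_mask

class BertModelV5:
    """Stand-in for a v5 BertModel: get_head_mask was removed entirely."""

class FragileWarper:
    def __init__(self, bert_model):
        # Copying a bound reference ties the wrapper to whatever methods
        # (and signatures) this exact library version happens to expose.
        self.get_head_mask = bert_model.get_head_mask

FragileWarper(BertModelV4())   # works: the method still exists
try:
    FragileWarper(BertModelV5())
except AttributeError as exc:
    print(exc)   # the v5 stand-in has no get_head_mask attribute
```

The signature change in get_extended_attention_mask fails the same way, just later: the copy succeeds, then the call site passes a torch.device where v5 now expects a dtype.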
All three methods are reimplemented as proper class methods on BertModelWarper. The logic is identical to the original transformers implementations -- these are simple mask reshaping and dtype casting operations with no model-state dependencies beyond self.config. This makes BertModelWarper fully self-contained and independent of the transformers library version.

Some basic testing done
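A minimal sketch of what the self-contained shape can look like, mirroring the non-causal mask logic from older transformers releases. The class name and constructor are illustrative, and the library method's input_shape parameter is omitted here for brevity:

```python
import torch

class SelfContainedWarper:
    """Toy sketch: utility methods defined locally instead of copied
    from a BertModel instance at __init__ time."""

    def get_head_mask(self, head_mask, num_hidden_layers):
        # No head pruning requested: one None per layer
        if head_mask is None:
            return [None] * num_hidden_layers
        # [num_heads] -> broadcastable [layers, batch, heads, seq, seq]
        if head_mask.dim() == 1:
            head_mask = head_mask[None, None, :, None, None]
            head_mask = head_mask.expand(num_hidden_layers, -1, -1, -1, -1)
        return head_mask

    def get_extended_attention_mask(self, attention_mask, dtype=torch.float32):
        # Takes a dtype rather than a device, matching the v5 change above
        if attention_mask.dim() == 3:
            extended = attention_mask[:, None, :, :]
        elif attention_mask.dim() == 2:
            extended = attention_mask[:, None, None, :]
        else:
            raise ValueError(f"Wrong mask shape: {tuple(attention_mask.shape)}")
        extended = extended.to(dtype=dtype)
        # Padded positions become a large negative bias, zeroed by softmax
        return (1.0 - extended) * torch.finfo(dtype).min
```

Since nothing here touches the wrapped model's attributes, the wrapper keeps working even if transformers moves or removes its own copies of these helpers.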
- transformers==5.3.0 -- GroundingDINO model loads and runs inference successfully

Closes #110
Closes #111