Pull requests: huggingface/transformers

Pull requests list

Fix cross-attention cache layer type for T5Gemma2 long inputs
#45540 opened Apr 21, 2026 by Beichen-Ma
4 of 6 tasks
Revert #45045: changes break modular's purpose
#45539 opened Apr 21, 2026 by Cyrilvallez Member
[Sam3LiteText] Remove unnecessary modules/configs
#45535 opened Apr 20, 2026 by yonigozlan Member
ALM base model class
#45534 opened Apr 20, 2026 by eustlb Contributor Draft
1 of 2 tasks
[Model] Add SLANet Model Support
#45532 opened Apr 20, 2026 by zhang-prog Contributor
[CB] Changes for long generation
#45530 opened Apr 20, 2026 by remi-or Collaborator Draft
[Trainer] Add ddp_static_graph option
#45519 opened Apr 20, 2026 by KeitaW
4 of 5 tasks
T5Gemma2: fix prepare_decoder_input_ids_from_labels
#45516 opened Apr 19, 2026 by Tokarak
2 of 6 tasks
[Qwen3.5] Fix Qwen3.5 linear attention multi-token cached forward
#45513 opened Apr 19, 2026 by kashif Contributor
6 tasks
[OutputRecorder] re.search on layer_name
#45512 opened Apr 19, 2026 by eustlb Contributor
Add V-JEPA 2.1 inference support
#45497 opened Apr 17, 2026 by davevanveen
5 of 6 tasks
[WIP] Major processing refactor
#45493 opened Apr 17, 2026 by zucchini-nlp Member
Add ctsm model
#45490 opened Apr 17, 2026 by kashif Contributor
6 tasks
Align gemma3n cache sharing to gemma4
#45489 opened Apr 17, 2026 by Cyrilvallez Member
Fix model parallel issue for altclip model and ChineseClip model
#45487 opened Apr 17, 2026 by kaixuanliu Contributor