Merged
Commits
85 commits
- 23631b1 feat: add LWDetr model (sbucaille, Aug 28, 2025)
- 0eb69e6 fix: changed LwDetrVit base classes from VitDet to ViT (sbucaille, Sep 21, 2025)
- 12fa5e2 tests: added tests for LWDetr (sbucaille, Sep 22, 2025)
- aceb10c refactor: fix all issues and created docs (sbucaille, Sep 22, 2025)
- bd48206 tests: added missing lw_detr_vit tests (sbucaille, Sep 22, 2025)
- 06c7d70 docs: add lwdetr docs (sbucaille, Sep 22, 2025)
- a2ef8c3 fix: fixed implementation error and associated tests (sbucaille, Sep 26, 2025)
- 9faeaee chore: removed testing lib in imports (sbucaille, Sep 26, 2025)
- 0fad340 refactor: replace LwDetrImageProcessor with DeformableDetrImageProcessor (sbucaille, Sep 30, 2025)
- a89f8f2 refactor: remove two-stage detection and bounding box reparameterizat… (sbucaille, Sep 30, 2025)
- e942891 refactor: rename LwDetrCSPRepLayer to LwDetrC2FLayer (sbucaille, Sep 30, 2025)
- f9e9631 refactor: introduce LwDetrMLP for feedforward layers in decoder (sbucaille, Sep 30, 2025)
- 9cf545d refactor: replace build_position_encoding with LwDetrSinePositionEmbe… (sbucaille, Sep 30, 2025)
- 199b2bc refactor: remove use_cae parameter and related logic from configurati… (sbucaille, Sep 30, 2025)
- ab6096f refactor: remove unused variables and simplify certain instructions (sbucaille, Sep 30, 2025)
- b95de12 refactor: removed unnecessary one line instruction method with_pos_embed (sbucaille, Oct 18, 2025)
- 0af67c2 refactor: use llama attention formatting for hidden shape (sbucaille, Oct 18, 2025)
- 15625a5 docs: add comments about group detr (sbucaille, Oct 18, 2025)
- 22d66d2 fix: removed wrong sigmoid and fixed init for class_embed (sbucaille, Oct 18, 2025)
- 5b7f657 refactor: removed unused positional embeddings classes and weights fr… (sbucaille, Oct 19, 2025)
- b075292 chore: removed unused import (sbucaille, Oct 19, 2025)
- d863a68 chore: make style and repo-consistency after positional embeddings re… (sbucaille, Oct 19, 2025)
- d6fdd91 refactor: removed unused drop path rate (sbucaille, Oct 21, 2025)
- 13ad4a8 fix: ingest latest changes from rebase (sbucaille, Oct 21, 2025)
- 8147b45 fix: attn_implementation setter (sbucaille, Oct 21, 2025)
- 25fbaab fix: is causal set to False (sbucaille, Oct 21, 2025)
- d5b24a6 refactor: renamed ffn to mlp and moved layer norm out of mlp (sbucaille, Oct 21, 2025)
- fa6deed fix: check model inputs (sbucaille, Oct 21, 2025)
- 75b3f1f fix: moved super init call in LwDetrConfig (sbucaille, Oct 21, 2025)
- f998abb fix: super class in GradientCheckpointingLayer (sbucaille, Oct 21, 2025)
- e164b8a fix: replaced RTDetr occurences by LwDetr in test modeling file (sbucaille, Oct 21, 2025)
- 05afaa7 refactor: removed head_mask from LwDetrViT (sbucaille, Oct 21, 2025)
- ff48821 docs: added release date in docs (sbucaille, Oct 21, 2025)
- c627a2a fix: added missing attention mask argument (sbucaille, Oct 21, 2025)
- 755c5b8 chore: make style & repo-consistency (sbucaille, Oct 21, 2025)
- 8b72816 fix: ensure tensor dtype consistency in loss calculations and test cases (sbucaille, Oct 21, 2025)
- 2b9ebff docs: fixed model release date (sbucaille, Oct 22, 2025)
- 6e4f583 refactor: removed unnecessary module cloning (sbucaille, Oct 30, 2025)
- 97c1d37 tests: added missing _prepare_for_class method and removed batching_e… (sbucaille, Oct 30, 2025)
- abae375 tests: added xlarge integration test (sbucaille, Oct 30, 2025)
- e037e63 chore: added lw_detr reference in image processing auto (sbucaille, Nov 7, 2025)
- 9deff21 chore: removed unnecessary properties from LwDetrConfig (sbucaille, Nov 7, 2025)
- be4fc9f fix: fix for latest main changes (sbucaille, Nov 22, 2025)
- 7556efc fix: apply modular changes from mail (sbucaille, Dec 3, 2025)
- 77a94e7 docs: update model doc and docstrings (sbucaille, Dec 3, 2025)
- fec5db9 fix: style (sbucaille, Dec 3, 2025)
- 4cdc807 fix: update output values in convert script (sbucaille, Dec 3, 2025)
- df6f2ed feat: added proper last_hidden_states in LwDetrDecoderOutput and sepa… (sbucaille, Dec 4, 2025)
- 138b009 fix: guard accelerate imports (sbucaille, Dec 9, 2025)
- d71dbb8 fix: removed LWDetrConfig attribute map and changed LwDetrAttention i… (sbucaille, Dec 9, 2025)
- f9e60b4 fix: parameterize amap based on config (sbucaille, Dec 9, 2025)
- 635f527 fix: remove redundant decorator (sbucaille, Dec 9, 2025)
- eeac74a chore: moved LwDetrViT to LwDetr single modular file (sbucaille, Dec 9, 2025)
- 514536e fix: remove unnecessary attribute_map in LwDetrViT (sbucaille, Dec 9, 2025)
- e82f6d5 chore: simplified LwDetr modules methods with proper hidden_states re… (sbucaille, Dec 9, 2025)
- 13e9aa3 fix: replaced hardcoded value by variable (sbucaille, Dec 9, 2025)
- 082715b tests: added VitDet and attention tests (sbucaille, Dec 9, 2025)
- 74c47a7 fix: modular conversion (sbucaille, Dec 9, 2025)
- 865739f tests: moved LwDetrViT tests to test_modeling_lw_detr file (sbucaille, Dec 9, 2025)
- dec88cd docs: add lwdetr advances in docs (sbucaille, Dec 10, 2025)
- b7821a3 refactor: removed arguments to classes as much as possible and rely o… (sbucaille, Dec 12, 2025)
- c9809f8 Merge branch 'main' into add_lw_detr (Cyrilvallez, Jan 8, 2026)
- ad93a7b reapply style, remove LlamaAttention inheritance to remove decorator (Cyrilvallez, Jan 8, 2026)
- e5a0446 chore: updated licence and year (sbucaille, Jan 10, 2026)
- 003d63d fix: removed torch.nn.functional from modular (sbucaille, Jan 10, 2026)
- 99490ca docs: removed redundant docstring arguments covered by autodocstring … (sbucaille, Jan 10, 2026)
- 11727ab refactor: removed backbone api statements (sbucaille, Jan 10, 2026)
- 8b8feb8 fix: added back num_key_value_groups in LwDetrAttention (sbucaille, Jan 10, 2026)
- c68e713 chore: removed unnecessary copied from statement (sbucaille, Jan 10, 2026)
- d0cfb7a chore: moved LwDetrViT modules above LwDetr modules (sbucaille, Jan 10, 2026)
- c03d5ee tests: removed unnecessary overwrite and “test_” attributes (sbucaille, Jan 10, 2026)
- 3342ea8 docs: added missing docs (sbucaille, Jan 10, 2026)
- b01b7c5 Merge remote-tracking branch 'upstream/main' into add_lw_detr (sbucaille, Jan 10, 2026)
- 8e2753e style: remove unnecessary parentheses (sbucaille, Jan 10, 2026)
- 06ac9fb docs: added back logits docstring (sbucaille, Jan 10, 2026)
- b150f23 docs: added docs dates (sbucaille, Jan 10, 2026)
- 6f89388 Merge branch 'main' into add_lw_detr (Cyrilvallez, Jan 12, 2026)
- 8a7818f style details (Cyrilvallez, Jan 12, 2026)
- e5c20a0 unessecary utf8 (Cyrilvallez, Jan 12, 2026)
- c745f90 might as well skip all config checks (Cyrilvallez, Jan 12, 2026)
- 098fb4d embeddings are large, increase model_split_percents (Cyrilvallez, Jan 12, 2026)
- 93ba55a fix device issue (Cyrilvallez, Jan 12, 2026)
- aabdb76 update logits (Cyrilvallez, Jan 12, 2026)
- 0493e06 set device in expectations (Cyrilvallez, Jan 12, 2026)
- 1312458 add to toctree (Cyrilvallez, Jan 12, 2026)
2 changes: 2 additions & 0 deletions docs/source/en/_toctree.yml

```diff
@@ -817,6 +817,8 @@
         title: LeViT
       - local: model_doc/lightglue
         title: LightGlue
+      - local: model_doc/lw_detr
+        title: LW-DETR
       - local: model_doc/mask2former
         title: Mask2Former
```
127 changes: 127 additions & 0 deletions docs/source/en/model_doc/lw_detr.md
<!--Copyright 2026 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.

⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.

-->
*This model was released on 2024-04-05 and added to Hugging Face Transformers on 2026-01-10.*

<div style="float: right;">
<div class="flex flex-wrap space-x-1">
<img alt="PyTorch" src="https://img.shields.io/badge/PyTorch-DE3412?style=flat&logo=pytorch&logoColor=white">
</div>
</div>

# LW-DETR

[LW-DETR](https://huggingface.co/papers/2407.17140) proposes a light-weight Detection Transformer (DETR) architecture designed to compete with and surpass the dominant YOLO series for real-time object detection. It achieves a new state-of-the-art balance between speed (latency) and accuracy (mAP) by combining recent transformer advances with efficient design choices.
> **Review comment** (Member): May be nice to (very) briefly describe or mention the recent advances and design choices :)
The LW-DETR architecture is characterized by its simple and efficient structure: a plain ViT Encoder, a Projector, and a shallow DETR Decoder.
It enhances the DETR architecture for efficiency and speed using the following core modifications:
1. Efficient ViT Encoder: Uses a plain ViT with interleaved window/global attention and a window-major organization to drastically reduce attention complexity and latency.
2. Richer Input: Aggregates multi-level features from the encoder and uses a C2f Projector (YOLOv8) to pass two-scale features ($1/8$ and $1/32$).
3. Faster Decoder: Employs a shallow 3-layer DETR decoder with deformable cross-attention for lower latency and faster convergence.
4. Optimized Queries: Uses a mixed-query scheme combining learnable content queries and generated spatial queries.
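The interleaved window/global attention in point 1 can be illustrated with a minimal, standalone sketch of window partitioning. This is plain PyTorch for illustration, not the library's actual implementation; the function name and shapes are illustrative:

```python
import torch

def window_partition(x, window_size):
    """Split a (B, H, W, C) feature map into non-overlapping windows.

    Window attention layers attend within each (window_size x window_size)
    window; global attention layers attend over all H * W tokens.
    """
    B, H, W, C = x.shape
    x = x.view(B, H // window_size, window_size, W // window_size, window_size, C)
    # window-major layout: all tokens belonging to one window are contiguous
    windows = x.permute(0, 1, 3, 2, 4, 5).reshape(-1, window_size * window_size, C)
    return windows

x = torch.randn(2, 32, 32, 64)        # toy feature map (B, H, W, C)
windows = window_partition(x, 8)      # 2 * 4 * 4 windows of 8 * 8 = 64 tokens each
global_tokens = x.reshape(2, -1, 64)  # global layers see all 32 * 32 = 1024 tokens
```

Restricting most layers to 64-token windows instead of 1024 global tokens is what cuts the quadratic attention cost, with a few global layers interleaved to retain full-image context.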

You can find all the available LW-DETR checkpoints under the [stevenbucaille](https://huggingface.co/stevenbucaille) organization.
The original code can be found [here](https://github.com/Atten4Vis/LW-DETR).

> [!TIP]
> This model was contributed by [stevenbucaille](https://huggingface.co/stevenbucaille).
>
> Click on the LW-DETR models in the right sidebar for more examples of how to apply LW-DETR to different object detection tasks.


The example below demonstrates how to perform object detection with the [`Pipeline`] and [`AutoModel`] classes.

<hfoptions id="usage">
<hfoption id="Pipeline">

```python
from transformers import pipeline
import torch

pipeline = pipeline(
    "object-detection",
    model="stevenbucaille/lwdetr_small_60e_coco",
    dtype=torch.float16,
    device_map=0
)

pipeline("http://images.cocodataset.org/val2017/000000039769.jpg")
```

</hfoption>
<hfoption id="AutoModel">

```python
from transformers import AutoImageProcessor, AutoModelForObjectDetection
from PIL import Image
import requests
import torch

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

image_processor = AutoImageProcessor.from_pretrained("stevenbucaille/lwdetr_small_60e_coco")
model = AutoModelForObjectDetection.from_pretrained("stevenbucaille/lwdetr_small_60e_coco")

# prepare image for the model
inputs = image_processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

results = image_processor.post_process_object_detection(outputs, target_sizes=torch.tensor([image.size[::-1]]), threshold=0.3)

for result in results:
    for score, label_id, box in zip(result["scores"], result["labels"], result["boxes"]):
        score, label = score.item(), label_id.item()
        box = [round(i, 2) for i in box.tolist()]
        print(f"{model.config.id2label[label]}: {score:.2f} {box}")
```

</hfoption>
</hfoptions>
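To visualize detections, the boxes returned by the post-processing step can be drawn with Pillow. The sketch below uses dummy detections standing in for real model output (a real run returns label ids to map through `model.config.id2label`, and scores/boxes as tensors):

```python
from PIL import Image, ImageDraw

# dummy detections standing in for post-processed model output;
# boxes are (x_min, y_min, x_max, y_max) in pixel coordinates
results = [{"scores": [0.95], "labels": ["cat"], "boxes": [[20, 30, 120, 160]]}]

image = Image.new("RGB", (200, 200), "white")  # stand-in for the input image
draw = ImageDraw.Draw(image)
for result in results:
    for score, label, box in zip(result["scores"], result["labels"], result["boxes"]):
        draw.rectangle(box, outline="red", width=2)
        draw.text((box[0], box[1] - 12), f"{label}: {score:.2f}", fill="red")
```

With a real image and model, substitute the loaded `image` and the `results` from `post_process_object_detection` above.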


## Resources

A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with LW-DETR.

<PipelineTag pipeline="object-detection"/>

- Scripts for finetuning [`LwDetrForObjectDetection`] with [`Trainer`] or [Accelerate](https://huggingface.co/docs/accelerate/index) can be found [here](https://github.com/huggingface/transformers/tree/main/examples/pytorch/object-detection).
- See also: [Object detection task guide](../tasks/object_detection).

## LwDetrConfig

[[autodoc]] LwDetrConfig

## LwDetrViTConfig

[[autodoc]] LwDetrViTConfig

## LwDetrModel

[[autodoc]] LwDetrModel
- forward

## LwDetrForObjectDetection

[[autodoc]] LwDetrForObjectDetection
- forward

## LwDetrViTBackbone

[[autodoc]] LwDetrViTBackbone
- forward