This is the official PyTorch implementation of Grounded Teacher:
"Context Aware Grounded Teacher for Source Free Object Detection" by Tajamul Ashraf, Rajes Manna, Partha Sarathi Purkayastha, Tavaheed Tariq, and Janibul Bashir.
- [04/26/2025] 🔥 We achieved a new SoTA of 50.8 box mAP on Cityscapes to Foggy Cityscapes!
- [04/21/2025] We released an arXiv version. See our updated arXiv paper for more details!
Grounded Teacher (GT) is a framework for Source-Free Object Detection (SFOD) designed to tackle context bias and the performance drop of the student model. It models contextual relationships using a dedicated relational context module and leverages them to mitigate inherent biases. GT applies augmentations to closely related classes, both across and within domains, enhancing underrepresented-class performance while minimizing the effect on dominant classes. An expert foundational branch supervises the student model, improving prediction quality under the SFOD setting.
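Like most teacher-student SFOD pipelines, the framework keeps the teacher's weights as an exponential moving average (EMA) of the student's. The sketch below is purely illustrative of that standard update; the function and variable names are ours, not identifiers from this repository:

```python
import copy

import torch
import torch.nn as nn


def ema_update(teacher: nn.Module, student: nn.Module, alpha: float = 0.999) -> None:
    """Move each teacher parameter toward the student's: t = alpha*t + (1-alpha)*s."""
    with torch.no_grad():
        for t_param, s_param in zip(teacher.parameters(), student.parameters()):
            t_param.mul_(alpha).add_(s_param, alpha=1.0 - alpha)


# Toy usage: the teacher starts as a frozen copy of the student.
student = nn.Linear(4, 2)
teacher = copy.deepcopy(student)
ema_update(teacher, student, alpha=0.99)
```

A large `alpha` (e.g. 0.999) keeps the teacher slowly varying, which is what makes its pseudo-labels stable enough to supervise the student.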
🔥 Check out our website for an overview!
Due to dependency conflicts, this project requires two separate environments. Follow the instructions below to set up each environment correctly.
Source environment requirements:
- Python >= 3.8
- PyTorch == 1.7.1 and a torchvision that matches the PyTorch installation
- Linux, CUDA >= 11.0
- Install Faster-RCNN:

```shell
cd Source/lib
python setup.py build develop
```

- Other requirements:

```shell
cd ../
pip install -r requirements.txt
```

⚠️ For 30XX-series GPUs, set the CUDA architecture:

```shell
export TORCH_CUDA_ARCH_LIST="8.0"
```

Expert environment requirements:
- Python >= 3.6
- PyTorch >= 1.5 and a torchvision that matches the PyTorch installation
- Detectron2 == 0.6
- Other requirements:

```shell
cd Expert
pip install -r assets/requirements/requirements.txt
```
You can download the medical datasets from here.
City to Foggy dataset structure:

```
├── cityscapes/
│   ├── gtFine/
│   │   ├── train/
│   │   ├── test/
│   │   └── val/
│   └── leftImg8bit/
│       ├── train/
│       ├── test/
│       └── val/
└── cityscapes_foggy/
    ├── gtFine/
    │   ├── train/
    │   ├── test/
    │   └── val/
    └── leftImg8bit/
        ├── train/
        ├── test/
        └── val/
```
Other datasets must follow the Pascal VOC format structure:
```
datasets/
└── VOC_format_dataset/
    ├── Annotations/          # XML annotation files
    ├── ImageSets/
    │   └── Main/
    │       ├── train.txt     # List of training image IDs
    │       └── test.txt      # List of testing image IDs
    └── JPEGImages/           # Original images
```
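For reference, a Pascal VOC annotation file can be read with the Python standard library alone. This sketch is not part of the repository; the class name and coordinates in the sample are made up for illustration:

```python
import xml.etree.ElementTree as ET


def parse_voc_xml(xml_text: str):
    """Parse a Pascal VOC annotation into a list of (class_name, (xmin, ymin, xmax, ymax))."""
    root = ET.fromstring(xml_text)
    objects = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        box = obj.find("bndbox")
        coords = tuple(int(float(box.findtext(k))) for k in ("xmin", "ymin", "xmax", "ymax"))
        objects.append((name, coords))
    return objects


# Minimal example annotation in VOC format.
sample = """
<annotation>
  <filename>000001.jpg</filename>
  <object>
    <name>mass</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>
"""
print(parse_voc_xml(sample))  # [('mass', (48, 240, 195, 371))]
```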
For the VGG backbone, we use converted weights from CMT.
- Download the pretrained weights from Google Drive Link
- Place the weights file at:
checkpoints/vgg16_bn-6c64b313_converted.pth
- Download the pretrained model checkpoint from Google Drive Link.
- Place the weights file at:
Expert/pretrained/biomedparse_v1.pt
The implementation follows a step-by-step process for domain adaptation in medical image analysis and requires switching between the two environments along the way. We demonstrate the DDSM to RSNA adaptation.
First, ensure the VOC_MEDICAL path is correctly set in `Grounded_Teacher/Source/lib/datasets/config_dataset.py`.
Download `vgg16_caffe.pth` and then update its path in `Grounded_Teacher/Source/lib/model/utils/config.py`.
- Train on the source domain:

```shell
cd Source
python trainval_pretrain_adv.py \
    --dataset voc_medical_train \
    --dataset_t voc_medical \
    --net vgg16 \
    --log_ckpt_name "DDSMSource" \
    --save_dir "output"
```
- 🏷️ Generate pseudo-labels:

```shell
python psudo_label_generation.py \
    --dataset_t voc_medical \
    --net vgg16 \
    --log_ckpt_name "PseudoL_ddsm2rsna" \
    --save_dir "output" \
    --load_name "output/vgg16/DDSMSource/lg_adv_session_1_epoch_6_step_10000.pth"
```

- Generate a new RSNA directory containing source labels by executing the `scripts/GenerateSF.ipynb` notebook:

```shell
cd ../scripts
python generateSF.py
```
Make sure to switch your environment from `source_train` to the other (Expert) environment.
- Generate expert labels:

```shell
cd ../Expert
python prediction.py --root "<DATASET_PATH>"
```

This will create the file `rsna_results.txt`.
- Update configuration files:
    - Set `TRAIN_LABEL` to RSNA_sf with Source pseudo-labels
    - Set `TRAIN_UNLABEL` to RSNA
    - Set `TEST` to RSNA with ground truth
    - Set `EXPERT_PATH` to RSNA Expert pseudo-labels
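The fragment below illustrates what these four settings might look like in a config file. The key names mirror the list above, but the nesting, value formats, and the results path are assumptions on our part; consult `configs/faster_rcnn_VGG_cross_city.yaml` for the actual schema:

```yaml
# Hypothetical fragment -- match the real key names and nesting in configs/.
DATASETS:
  TRAIN_LABEL: ("RSNA_sf",)     # RSNA images with Source pseudo-labels
  TRAIN_UNLABEL: ("RSNA",)      # unlabeled RSNA images
  TEST: ("RSNA",)               # RSNA with ground-truth annotations
EXPERT_PATH: "Expert/rsna_results.txt"   # Expert pseudo-labels from prediction.py
```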
- Run training:

```shell
python train_net.py \
    --num-gpus 1 \
    --config configs/faster_rcnn_VGG_cross_city.yaml \
    OUTPUT_DIR output/ddsm2rsna
```
- Calculate FROC:

```shell
python eval.py --setting ddsm2rsna --root output/ddsm2rsna
```
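Conceptually, each FROC operating point is the recall achieved at a fixed number of false positives per image (the tables below report R@0.3, i.e. recall at 0.3 FPs/image). The sketch below is a toy illustration of that computation, not code from `eval.py`; it assumes each detection has already been matched against ground truth upstream:

```python
def recall_at_fppi(detections, num_gt, num_images, fppi_target=0.3):
    """One FROC operating point: recall at a given false-positive rate per image.

    detections: list of (score, is_true_positive) pairs pooled over all images,
    where each ground-truth lesion is matched to at most one detection.
    """
    detections = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    best_recall = 0.0
    # Sweep the score threshold from high to low; keep the recall at the
    # lowest threshold whose FP rate still satisfies the target.
    for _score, is_tp in detections:
        if is_tp:
            tp += 1
        else:
            fp += 1
        if fp / num_images <= fppi_target:
            best_recall = tp / num_gt
    return best_recall


# Toy example: 3 lesions over 10 images, 6 pooled detections.
dets = [(0.9, True), (0.8, False), (0.7, True), (0.4, False), (0.3, False), (0.2, False)]
print(recall_at_fppi(dets, num_gt=3, num_images=10))  # 2 of 3 lesions at <= 0.3 FPs/image
```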
| backbone | training stage | R@0.3 | logs & weights |
|---|---|---|---|
| vgg16 | source_only | 0.31 | logs & weights |
| vgg16 | cross_domain_mae | 0.28 | logs & weights |
| backbone | training stage | R@0.3 | logs & weights |
|---|---|---|---|
| vgg16 | source_only | 0.30 | logs & weights |
| vgg16 | cross_domain_mae | 0.43 | logs & weights |
| backbone | training stage | R@0.3 | logs & weights |
|---|---|---|---|
| vgg16 | source_only | 0.24 | logs & weights |
| vgg16 | cross_domain_mae | 0.43 | logs & weights |
Our implementation builds upon the following excellent repositories and research contributions:
- CAT, AASFOD, and BiomedParse:
We thank the authors for making their work publicly available.
If you find this work useful, please cite our paper:
```bibtex
@misc{ashraf2025contextawaregroundedteacher,
    title={Context Aware Grounded Teacher for Source Free Object Detection},
    author={Tajamul Ashraf and Rajes Manna and Partha Sarathi Purkayastha and Tavaheed Tariq and Janibul Bashir},
    year={2025},
    eprint={2504.15404},
    archivePrefix={arXiv},
    primaryClass={cs.CV},
    url={https://arxiv.org/abs/2504.15404},
}
```
