[Feature] Add Evaluation & Benchmarking ScriptAdd files via upload by Diksha-3905 · Pull Request #49 · apple/ml-fastvlm

Diksha-3905 · 2025-07-27T06:43:34Z

Summary
This PR introduces a new benchmark.py script to evaluate FastVLM model checkpoints with:

Time-to-First-Token (TTFT) measurement

Latency per image

Simple accuracy metric (placeholder for VQA/captioning tasks)

CLI interface for easy use with different checkpoints and datasets
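The metrics listed above can be sketched as follows. This is only an illustration under stated assumptions, not the code in benchmark.py: `token_stream` is a hypothetical callable that starts generation and yields tokens one at a time, and the "simple accuracy" placeholder is assumed here to be case-insensitive exact match.

```python
import time
from typing import Callable, Iterator, List, Sequence, Tuple


def time_generation(
    token_stream: Callable[[], Iterator[str]],
) -> Tuple[float, float, List[str]]:
    """Return (ttft_seconds, total_latency_seconds, tokens) for one call.

    `token_stream` is a hypothetical zero-argument callable that starts
    generation and yields tokens as they are produced.
    """
    start = time.perf_counter()
    ttft = None
    tokens: List[str] = []
    for token in token_stream():
        if ttft is None:
            # Time-to-First-Token: elapsed time until the first token arrives.
            ttft = time.perf_counter() - start
        tokens.append(token)
    total = time.perf_counter() - start
    if ttft is None:
        # No tokens were produced; fall back to the total latency.
        ttft = total
    return ttft, total, tokens


def exact_match_accuracy(
    predictions: Sequence[str], references: Sequence[str]
) -> float:
    """Fraction of predictions equal to their reference (assumed metric)."""
    pairs = list(zip(predictions, references))
    if not pairs:
        return 0.0
    hits = sum(p.strip().lower() == r.strip().lower() for p, r in pairs)
    return hits / len(pairs)
```

On a CUDA device, GPU work is asynchronous, so a real measurement would likely need `torch.cuda.synchronize()` before each clock read; the sketch omits that to stay dependency-free.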

Key Changes
Added benchmark.py script with modular design:

Supports folder-based image datasets.

Uses build_model_and_transforms to load any FastVLM checkpoint.

Computes and logs TTFT, latency, and simple accuracy.

Outputs a summary of benchmark results.

Integrated with torchvision transforms for preprocessing.
Included CLI arguments for model path, image directory, and device selection.
Created a minimal dataset loader (ImageFolderDataset) for quick evaluation.
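A folder-based loader like the one described above might look roughly like this. The class name comes from the PR, but the body is a hypothetical sketch: the extension list, the `loader` parameter, and the `(filename, data)` return shape are all assumptions, and the default loader returns raw bytes so the example runs without an imaging library.

```python
from pathlib import Path
from typing import Callable, Optional, Tuple

# Assumed set of image extensions; the actual script may accept others.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}


class ImageFolderDataset:
    """Index the image files found directly inside one directory."""

    def __init__(
        self,
        img_dir: str,
        loader: Optional[Callable[[Path], object]] = None,
    ) -> None:
        self.img_dir = Path(img_dir)
        if not self.img_dir.is_dir():
            raise NotADirectoryError(f"Not a directory: {self.img_dir}")
        # Sort for a deterministic order across runs.
        self.paths = sorted(
            p
            for p in self.img_dir.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
        )
        # Hypothetical hook: pass e.g. a PIL-based loader plus transforms.
        self.loader = loader or (lambda path: path.read_bytes())

    def __len__(self) -> int:
        return len(self.paths)

    def __getitem__(self, index: int) -> Tuple[str, object]:
        path = self.paths[index]
        return path.name, self.loader(path)
```

In practice the `loader` hook would be the place to open the image and apply the torchvision transforms mentioned above.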

Usage
python benchmark.py \
  --model checkpoints/fastvlm_0.5b_stage3 \
  --img-dir ./sample_images \
  --device cuda
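The three flags in the command above could be parsed with `argparse` along these lines. The flag names are taken from the usage shown; the help strings, the required/optional split, and the `cuda` default are assumptions rather than the script's confirmed behavior.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags shown in the PR's usage example."""
    parser = argparse.ArgumentParser(
        description="Benchmark a FastVLM checkpoint on a folder of images."
    )
    parser.add_argument(
        "--model", required=True, help="Path to the model checkpoint."
    )
    parser.add_argument(
        "--img-dir", required=True, help="Directory containing input images."
    )
    # Assumed default; the PR reports testing on both CUDA and CPU.
    parser.add_argument(
        "--device", default="cuda", help="Device to run on, e.g. cuda or cpu."
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.model, args.img_dir, args.device)
```

Note that `argparse` exposes `--img-dir` as `args.img_dir`, converting the hyphen to an underscore.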

Future Enhancements
Add COCO/VQA dataset loaders.

Integrate BLEU, CIDEr, and other standard metrics.

Support batch inference and multi-GPU evaluation.

Generate JSON/CSV reports and visual plots.
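As one possible shape for the JSON/CSV reporting item above, the sketch below writes per-image results to both formats using only the standard library. Nothing here exists in the PR: the function name, the file names, and the row keys in the usage example are hypothetical, and all rows are assumed to share the same keys.

```python
import csv
import json
from pathlib import Path
from typing import Dict, List, Tuple


def write_reports(rows: List[Dict[str, object]], out_dir: str) -> Tuple[Path, Path]:
    """Write benchmark rows to benchmark.json and benchmark.csv in out_dir.

    Assumes every row has the same keys as the first row.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    json_path = out / "benchmark.json"
    json_path.write_text(json.dumps(rows, indent=2))

    csv_path = out / "benchmark.csv"
    fieldnames = list(rows[0].keys()) if rows else []
    with csv_path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

    return json_path, csv_path
```

A call might look like `write_reports([{"image": "a.jpg", "ttft_s": 0.12, "latency_s": 0.80}], "reports")`, where the keys are illustrative only.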

Testing
Verified with the FastVLM-0.5B checkpoint on sample images.

Works on CUDA and CPU devices.

Checklist
Code compiles and runs without errors.

Tested basic benchmarking on sample images.

Added CLI interface and documented usage.

Diksha-3905 · 2025-07-27T06:44:43Z

feat: add evaluation & benchmarking script for FastVLM models

Introduced benchmark.py to evaluate model checkpoints.
Measures Time-to-First-Token (TTFT), average latency, and simple accuracy.
Added CLI interface for easy benchmarking with any image folder.
Implemented a lightweight dataset loader (ImageFolderDataset).
Prepared for future metrics (BLEU, CIDEr) and dataset integrations (COCO, VQA).

Usage:
python benchmark.py --model checkpoints/fastvlm_0.5b_stage3 --img-dir ./sample_images --device cuda

Add files via upload

2c67b9c

Diksha-3905 wants to merge 1 commit into apple:main from Diksha-3905:main
