deepanshu9012/Image-Caption-Generator

Image Caption Generator with Transformers & Streamlit

Overview

This project is an Image Caption Generator built as an interactive web application with Streamlit. It uses a pre-trained model from the Hugging Face Transformers library to generate a descriptive caption for an uploaded image.

How It Works

The application uses an image-to-text pipeline backed by the ydshieh/vit-gpt2-coco-en model. This model pairs a Vision Transformer (ViT) encoder, which turns the image into visual features, with a GPT-2 decoder, which generates the caption text from those features.

The application is wrapped in a Streamlit interface: the user uploads an image, and the generated caption is displayed as soon as the model finishes running.
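The captioning step described above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the function names `load_captioner`, `extract_caption`, and `caption_image` are hypothetical, and the sketch assumes an installed Transformers version that still provides the `image-to-text` pipeline task and returns a list of dictionaries with a `generated_text` key.

```python
import sys


def load_captioner(model_name="ydshieh/vit-gpt2-coco-en"):
    """Build the image-to-text pipeline (downloads the model on first use)."""
    # Imported lazily so the helpers below work without the heavy dependency.
    from transformers import pipeline

    return pipeline("image-to-text", model=model_name)


def extract_caption(outputs):
    """Return the first caption from pipeline output, or "" if there is none.

    Assumes output shaped like [{"generated_text": "..."}].
    """
    if not outputs:
        return ""
    return outputs[0].get("generated_text", "").strip()


def caption_image(captioner, image):
    """Run the captioner on one image and return a clean caption string."""
    return extract_caption(captioner(image))


if __name__ == "__main__" and len(sys.argv) > 1:
    from PIL import Image

    # Convert to RGB so grayscale or RGBA files are handled uniformly.
    picture = Image.open(sys.argv[1]).convert("RGB")
    print(caption_image(load_captioner(), picture))
```

Saved under a hypothetical name such as caption.py, this could be tried from the terminal by passing an image path as the only argument.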

Key Technologies

  • Streamlit: For building the interactive web UI.
  • Hugging Face Transformers: For accessing the pre-trained ViT-GPT2 model.
  • Pillow (PIL): For image processing.
  • PyTorch: As the backend framework for the model.

Setup and Usage

  1. Install the required libraries:

    pip install streamlit transformers torch Pillow

  2. Save the application code as a Python file (e.g., app.py).

  3. Run the application from your terminal:

    streamlit run app.py

  4. Upload an image through the web interface to see the generated caption.
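For orientation, an app.py along these lines would fit the steps above. This is a hedged sketch under stated assumptions, not the repository's actual file: the names `ALLOWED_TYPES`, `first_caption`, and `main` are hypothetical, the accepted file types are a guess, and it assumes a Streamlit version that provides `st.cache_resource` and a Transformers version that still supports the `image-to-text` pipeline task.

```python
import sys

# Assumed set of accepted upload types; adjust as needed.
ALLOWED_TYPES = ["jpg", "jpeg", "png"]


def first_caption(outputs):
    """Pick the caption text out of output shaped like [{"generated_text": ...}]."""
    if not outputs:
        return "No caption generated."
    return outputs[0].get("generated_text", "").strip()


def main():
    # Heavy imports live here so the helper above can be reused without them.
    import streamlit as st
    from PIL import Image
    from transformers import pipeline

    @st.cache_resource
    def load_captioner():
        # Intended to keep the model loaded across Streamlit reruns.
        return pipeline("image-to-text", model="ydshieh/vit-gpt2-coco-en")

    st.title("Image Caption Generator")
    uploaded = st.file_uploader("Upload an image", type=ALLOWED_TYPES)

    if uploaded is not None:
        image = Image.open(uploaded).convert("RGB")
        st.image(image, caption="Uploaded image")
        with st.spinner("Generating caption..."):
            outputs = load_captioner()(image)
        st.success(first_caption(outputs))


# Build the UI only when launched via `streamlit run`, where Streamlit is
# already imported; a plain import or `python app.py` does nothing.
if __name__ == "__main__" and "streamlit" in sys.modules:
    main()
```

The first caption request is likely to be slow because the model weights have to be downloaded and loaded; later uploads in the same session should reuse the cached pipeline.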

About

In this project, we developed an Image Caption Generator.
