This project is a Colab notebook demo that analyzes a food-plate image to detect food items, estimate quantities and calories, compute total macronutrients, give a simple nutritional score, and offer suggestions to improve meal balance. It uses an image-to-text pipeline (image encoded as base64), a Large Language Model (via LangChain/OpenAI wrappers) for interpretation, and Gradio for the interactive UI.
- `Food_Calorie_Tracker.ipynb` — main notebook (encoding, LLM prompt, Gradio UI)
- `.env` (recommended) — contains `OPENAI_API_KEY` if using OpenAI
- `requirements.txt` (optional) — list of Python dependencies
Recommended Python packages (create a virtual environment before installing):
- pillow
- gradio
- langchain-openai (or `langchain` plus a compatible OpenAI client)
- langchain-core
- python-dotenv
- requests (optional)
Example requirements.txt snippet:
pillow
gradio
langchain-openai
langchain-core
python-dotenv
requests
- Create and activate a virtual environment (PowerShell example):
python -m venv .venv; .\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
- Create a `.env` file in the notebook directory with your OpenAI API key:
OPENAI_API_KEY=your_api_key_here
- Open `Food_Calorie_Tracker.ipynb` in Jupyter or VS Code and run the cells top-to-bottom. The Gradio demo cell launches the interactive UI (the notebook uses `demo.launch(share=True)`).
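The notebook loads the key with python-dotenv's `load_dotenv()`. To illustrate what that call does, here is a minimal stdlib-only stand-in (the helper name `load_env` is hypothetical, not from the notebook):

```python
import os

def load_env(path: str = ".env") -> None:
    # Minimal stand-in for python-dotenv's load_dotenv():
    # reads KEY=VALUE lines, skips blanks and comments, and exports
    # values without overwriting variables that are already set.
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

if os.path.exists(".env"):
    load_env()
api_key = os.environ.get("OPENAI_API_KEY")
```

In practice, prefer the real `load_dotenv()`; the sketch above only shows why a missing or malformed `.env` leads to `OPENAI_API_KEY` being unset.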
- Upload a plate image (clear, top-down photo works best).
- Select Meal Type (Breakfast/Lunch/Dinner) and Diet Type (Vegan/Vegetarian/Non-Vegetarian).
- Click "Analyze" and wait for the streamed LLM output — the UI shows progress as the model responds.
- User uploads a plate image in Gradio (the notebook encodes it to base64).
- The notebook builds a structured nutrition prompt and embeds the base64 image as an `image_url` object.
- The prompt is sent to an LLM via LangChain's ChatOpenAI wrapper; responses are streamed back so the UI shows incremental progress.
- Output is expected in a fixed markdown/table format (Items Detected, Total Nutrition, Nutritional Score, What's Missing).
- Results are displayed in the Gradio Markdown output area.
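The encoding and message-building steps above can be sketched as follows. The helper names are hypothetical; the `image_url` data-URL shape follows the OpenAI multimodal chat format, and `image/jpeg` is an assumption about the upload type:

```python
import base64

def encode_image(path: str) -> str:
    # Read the uploaded image file and return its base64 string.
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")

def build_message_content(b64: str, prompt: str) -> list:
    # Multimodal message content: the text prompt plus the image
    # embedded as a data URL inside an image_url block.
    return [
        {"type": "text", "text": prompt},
        {
            "type": "image_url",
            "image_url": {"url": f"data:image/jpeg;base64,{b64}"},
        },
    ]
```

A list like this would then be passed as the content of a `HumanMessage` to the ChatOpenAI wrapper.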
- The notebook uses a strict format in the prompt asking the model to return:
- A table of detected items with Quantity, Calories, Protein, Carbs, Fat
- Total nutrition summary
- Nutritional score (X/10 components + overall)
- Three suggestions to improve balance
- Keeping this fixed format makes downstream parsing easier.
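Because the format is fixed, simple parsers can pull fields out of the reply. As a sketch, assuming the Nutritional Score section contains a line like `Overall: 7/10` (the exact wording is an assumption, not guaranteed by the notebook):

```python
import re

def extract_overall_score(markdown: str):
    # Pull the first "X/10" score out of the model's markdown reply.
    # Returns None if no such pattern is found, which signals that the
    # model drifted from the requested format.
    match = re.search(r"(\d+(?:\.\d+)?)\s*/\s*10", markdown)
    return float(match.group(1)) if match else None
```

A `None` return is a useful trigger to re-prompt the model with stricter formatting instructions.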
- The notebook relies on the LLM to interpret images via base64 embedding inside text prompts — accuracy depends on LLM multimodal capability.
- Calories and portions are approximate; not a certified nutrition tool.
- Using `demo.launch(share=True)` exposes a public Gradio URL while the notebook is running; be careful with sensitive images.
- Streaming depends on the installed LangChain/OpenAI wrapper supporting `llm.stream`.
- Network and API costs: using OpenAI consumes tokens and may be billable.
- Uploaded images are embedded in prompts and sent to the model provider. If you need local-only processing, replace the LLM step with a local vision model; removing `share=True` only disables the public URL, it does not keep data local.
- Never commit `.env` to source control.
- Test with a few plate photos of varying complexity: single-item (banana), mixed plate (rice + curry + veg), and multi-serving plates.
- Inspect the raw LLM output if parsing fails; adjust the prompt to enforce exact formatting.
- If streaming fails, switch to a non-streamed Chat API call to get the whole output at once.
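The streaming-with-fallback pattern suggested above can be sketched as a small wrapper. It assumes `llm` follows LangChain's ChatOpenAI interface (`.stream` yielding chunks with a `.content` attribute, `.invoke` returning a message with `.content`); the function name is hypothetical:

```python
def analyze_with_fallback(llm, messages) -> str:
    # Prefer streaming so the UI can show incremental progress; if the
    # installed wrapper lacks .stream (or refuses to stream), fall back
    # to a single non-streamed call and return the whole output at once.
    try:
        parts = []
        for chunk in llm.stream(messages):
            parts.append(chunk.content)
        return "".join(parts)
    except (AttributeError, NotImplementedError):
        return llm.invoke(messages).content
```

In the notebook you would yield each chunk to the Gradio output instead of accumulating silently, but the fallback logic is the same.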
- Integrate a local vision model (CLIP/YOLO) to detect items and send structured labels to the LLM.
- Add a local calorie database to convert detected items to better calorie estimates.
- Wrap the notebook into a Flask/FastAPI service for production deployment and add tests.