Skip to content

Feature request: Enable image inputs for Custom Providers (vision parity + simple toggle) #388

@Sander-Chen

Description

@Sander-Chen

Hi ChainForge team — huge thanks for the amazing tool and the new Media Node image-input workflow. It’s been great with the built-in providers. 🙏

What I’m trying to do
Use a Custom Provider that wraps a third-party model which does support vision in other popular clients. With built-in providers (e.g., GPT-4o / Gemini / Claude / Ollama), image understanding works as expected in ChainForge.

Repro steps (minimal, product-level)

  1. Add a Media Node, upload a real-life photo (e.g., a cat).
  2. In a Prompt Node, include {image} so I can connect the Media Node → Prompt Node.
  3. Run with a built-in vision model → ✅ It correctly says it’s a cat (or otherwise describes the photo).
  4. Switch to my Custom Provider (wrapping a vision-capable third-party model) and ask the same question like “What’s in this image?” → ❌ It returns unrelated text as if no image was sent (e.g., says “this is a Python tutorial”), i.e., hallucinations consistent with the image not being received.

Expected
Custom Providers receive the same image input that built-in providers do, so vision-capable third-party models can recognize what’s in the photo.

Actual
When using a Custom Provider, responses look like the model didn’t get the image at all, leading to off-topic/hallucinated text.

Why I think this matters / related notes
Release notes for v0.3.6 say image inputs are currently limited to OpenAI, Anthropic, Google, and Ollama. This matches what I’m seeing (works with built-ins, not with Custom Provider). I’m hoping Custom Providers can get parity here without me having to modify code. ([GitHub][1])

Requests (feature & docs)

  • Please enable image inputs for Custom Providers so they can receive Media Node images just like built-ins.
  • If possible, make this a simple UI/setting toggle, e.g., “Image support: enable/disable” for a Custom Provider—so users don’t need to patch code.
  • If this is already possible, could you add user-facing documentation that explains the recommended way to get Media Node images into a Custom Provider (no deep implementation details needed—just how to turn it on or what option/parameter to use)?

Environment

  • ChainForge version: v0.3.6
  • Install: local (pip install chainforgechainforge serve)
  • OS/Browser: Windows 11 + Chrome (also reproducible on my setup)
  • Custom Provider: wraps a third-party model that handles vision correctly in other clients

Thanks again for all your work on ChainForge — the Media Node feature is already super helpful, and adding Custom Provider image support (ideally via a simple toggle) would make it even more powerful. ❤️

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions