CryAndRRich/insta-flow-edit

InstaFlowEdit

Python PyTorch License: MIT FLUX.1-dev SD3 InstaFlow

Overview

insta-flow-edit provides clean, plug-and-play implementations of state-of-the-art, training-free, text-based image editing algorithms, all unified under a single API and running on top of flow-matching / rectified-flow generative models.

Flow-matching models parameterise a straight-line ODE between data $x_0$ and noise $\varepsilon$:

$$x_t = (1-t)\,x_0 + t\,\varepsilon, \qquad v_\theta(x_t,\,t) \approx \varepsilon - x_0$$

This linearity makes them well suited to lightweight, inversion-free editing: the source latent trajectory can be offset, injected into, or gradient-guided toward the target semantics without any fine-tuning.
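The straight-line path can be checked numerically. The sketch below is a dependency-free illustration rather than code from this repository: it uses plain Python lists in place of latent tensors, and the helper names `interpolate`, `target_velocity`, and `euler_integrate` are hypothetical. It assumes an ideal velocity field that equals $\varepsilon - x_0$ everywhere, in which case a single Euler step from noise ($t=1$) back to $t=0$ recovers the data point.

```python
import random


def interpolate(x0, eps, t):
    """Point on the straight line between data x0 (t=0) and noise eps (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, eps)]


def target_velocity(x0, eps):
    """Constant velocity of that line: d x_t / dt = eps - x0."""
    return [b - a for a, b in zip(x0, eps)]


def euler_integrate(x_start, velocity_fn, t_start, t_end, num_steps):
    """Integrate dx/dt = velocity_fn(x, t) from t_start to t_end with Euler steps."""
    x = list(x_start)
    dt = (t_end - t_start) / num_steps
    t = t_start
    for _ in range(num_steps):
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        t += dt
    return x


rng = random.Random(0)
x0 = [rng.gauss(0.0, 1.0) for _ in range(4)]   # stand-in for a data latent
eps = [rng.gauss(0.0, 1.0) for _ in range(4)]  # stand-in for Gaussian noise

v = target_velocity(x0, eps)

# Assumption: an ideal model returns the same velocity at every (x, t).
recovered = euler_integrate(eps, lambda x, t: v, 1.0, 0.0, num_steps=1)
```

A learned $v_\theta$ only approximates this field, which is presumably why the samplers expose a step count such as `NFE` instead of always using one step.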

Backbone Models

| Model | Architecture | Native Resolution | Link |
|---|---|---|---|
| FLUX.1-dev | Diffusion Transformer (DiT) | 1024 × 1024 | [GitHub] [HuggingFace] |
| Stable Diffusion 3 | Multimodal DiT (MMDiT) | 1024 × 1024 | [GitHub] [HuggingFace] |
| InstaFlow | UNet (2-rectified flow) | 512 × 512 | [GitHub] [arXiv] [HuggingFace] |

Editing Methods

| Method | FLUX | SD3 | InstaFlow | Link |
|---|---|---|---|---|
| FlowEdit | [x] | [x] | [x] | [arXiv] [GitHub] [ProjectPage] |
| FireFlow | [x] | [x] | [x] | [arXiv] [GitHub] |
| FlowChef | [x] | [x] | [x] | [arXiv] [GitHub] [ProjectPage] |
| UniEdit-Flow | [x] | [x] | [x] | [arXiv] [GitHub] [ProjectPage] |
| FlowAlign | [x] | [x] | [x] | [arXiv] [GitHub] |
| TweezeEdit | [x] | [x] | [x] | [arXiv] [GitHub] |
| DVRF | [x] | [x] | [x] | [arXiv] [GitHub] |
| CVC | [x] | [x] | [x] | [arXiv] |
| ChordEdit | [x] | [x] | [x] | [arXiv] [GitHub] [ProjectPage] |
| VeloEdit | [x] | [x] | [x] | [arXiv] [GitHub] |
| FlowSlider | [x] | [x] | [x] | [arXiv] [HuggingFace] |

Quick Start

Installation

pip install torch torchvision diffusers transformers accelerate \
            sentencepiece protobuf tqdm pillow

Data Format

Place source images in data/images/ and create data/dataset.csv:

name,source_prompt,target_prompts
cat,"a photo of a cat sitting on a sofa","a photo of a dog sitting on a sofa"
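To sanity-check a dataset file before running an edit, the CSV can be parsed with the standard library. This is a minimal sketch, not the repository's `utils.load_data` implementation; `read_dataset` is a hypothetical helper, and it treats `target_prompts` as a single string because the example row above holds only one target prompt.

```python
import csv
import io


def read_dataset(handle):
    """Return a dict mapping each image name to its (source, target) prompts."""
    rows = {}
    for row in csv.DictReader(handle):
        rows[row["name"]] = (row["source_prompt"], row["target_prompts"])
    return rows


# In-memory copy of the example file; a real run would instead use
# open("data/dataset.csv", newline="") as the handle.
sample = io.StringIO(
    "name,source_prompt,target_prompts\n"
    'cat,"a photo of a cat sitting on a sofa","a photo of a dog sitting on a sofa"\n'
)
dataset = read_dataset(sample)
```

Quoting the prompts, as in the example row, keeps commas inside a prompt from being read as column separators.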

Usage

import torch
from src import get_sampler
from utils.load_data import load_data

# Load image + prompts (use resize_size=512 for InstaFlow)
image, src_prompt, tgt_prompt = load_data(
    image_dir="data/images",
    csv_path="data/dataset.csv",
    image_name="cat.jpg",
    resize_size=1024,
)

# Pick backbone ("flux" | "sd3" | "instaflow") and method
model = get_sampler("flux", "flowedit", model_key="black-forest-labs/FLUX.1-dev")

# Edit
result = model.sample(image, src_prompt, tgt_prompt, NFE=28, tar_cfg_scale=5.5, src_cfg_scale=1.5)

Switch backbone or method with a single line:

model = get_sampler("sd3", "fireflow", model_key="stabilityai/stable-diffusion-3-medium-diffusers")
model = get_sampler("instaflow", "flowalign", model_key="XCLiu/2_rectified_flow_from_sd_1_5")
model = get_sampler("flux", "tweezeedit", model_key="black-forest-labs/FLUX.1-dev")
model = get_sampler("sd3", "chordedit", model_key="stabilityai/stable-diffusion-3-medium-diffusers")

Pre-tuned hyperparameters for every (method, backbone) pair are available in configs/config.py.
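One way to consume per-pair hyperparameters is to store them as keyword arguments and unpack them into `sample`. The sketch below is an assumption about how such a lookup could be organised, not the actual layout of configs/config.py: `EXAMPLE_CONFIGS` and `lookup_hyperparameters` are hypothetical names, and the single entry simply reuses the values from the Usage example above.

```python
# Hypothetical layout; the real configs/config.py may be organised differently.
EXAMPLE_CONFIGS = {
    ("flowedit", "flux"): {"NFE": 28, "tar_cfg_scale": 5.5, "src_cfg_scale": 1.5},
}


def lookup_hyperparameters(method, backbone, configs=EXAMPLE_CONFIGS):
    """Return a copy of the keyword arguments stored for one (method, backbone) pair."""
    key = (method.lower(), backbone.lower())
    if key not in configs:
        raise KeyError(f"no hyperparameters recorded for {key}")
    # Copy so callers can tweak values without mutating the shared table.
    return dict(configs[key])


kwargs = lookup_hyperparameters("flowedit", "flux")

# With a sampler obtained from get_sampler(...), the values would be passed as:
#     result = model.sample(image, src_prompt, tgt_prompt, **kwargs)
```

Returning a copy keeps one-off experiments, such as lowering `NFE` for a quick preview, from leaking into later calls.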


Acknowledgements

This project adapts and unifies implementations from the following works:

FlowEdit - FireFlow - FlowChef - UniEdit-Flow - FlowAlign - TweezeEdit - DVRF - CVC - ChordEdit - VeloEdit - FlowSlider
