Skip to content

Fix checkpoint path bug, mask channel inconsistency, and a few related cleanups#240

Open
kyroX-D wants to merge 11 commits intonikopueringer:mainfrom
kyroX-D:main
Open

Fix checkpoint path bug, mask channel inconsistency, and a few related cleanups#240
kyroX-D wants to merge 11 commits intonikopueringer:mainfrom
kyroX-D:main

Conversation

@kyroX-D
Copy link
Copy Markdown
Contributor

@kyroX-D kyroX-D commented Apr 26, 2026

Hey! Was reading through the codebase and noticed a few things that
looked off figured I'd fix them and send a PR rather than just sit
on it. Happy to split this up if you'd prefer smaller PRs, just let me know.

bugs

CHECKPOINT_DIR was defined twice in CorridorKeyModule/backend.py
The absolute path at the top of the file was silently shadowed by a
relative redefinition further down. This broke model lookup and the
auto-download path whenever the tool was launched from a directory
other than the repo root. Removed the second definition.

Video alpha masks were read from the wrong channel
clip_manager.py and backend/frame_io.py pulled BGR[2] (red) out of
video mask frames, while image masks went through the blue channel /
grayscale path elsewhere. For grayscale-encoded sources the two paths
happen to agree, but for any color-encoded mask they silently disagree.
Both video paths now go through cv2.cvtColor(BGR2GRAY), which also
matches what the existing VideoMaMa mask pipeline does. Also fixed the
docstring in frame_io.py that claimed index 2 in BGR was the blue
channel it's red.

Frame sequences were sorted lexicographically in the CLI path
For non-zero-padded sequences (frame_2.png, frame_10.png) that puts
frame_10 before frame_2, so frames got processed out of order.
backend/clip_state.py already used natural sort via natsorted();
the CLI just hadn't caught up. Both paths now use the same helper from
backend/natural_sort.py.

Cleanup branch in the failed MLX download pointed at the wrong path
When the download failed mid-stream, the code tried to remove model_path
but the partial file lived under cache_path (.tmp). The wrong file
was checked, so the leftover .tmp was never actually cleaned up.

Cleanups

  • Dropped the leftover __main__ block at the bottom of clip_manager.py.
    It raised NotImplementedError for the wizard action and was fully
    superseded by corridorkey_cli.py.
  • Removed a duplicate import numpy as np from inside run_inference()
    already imported at module level.
  • Replaced three traceback.print_exc() sites with logger.exception()
    so stack traces flow through RichHandler and respect the configured
    logging setup instead of bypassing it onto stderr.
  • Moved the ffmpeg_tools import out of the per-frame loop up to the
    module header where it belongs.
  • Added pytest-env to the dev dependency group. pyproject.toml
    already declared env = [...] under [tool.pytest.ini_options] to
    set OPENCV_IO_ENABLE_OPENEXR=1, but the option was silently ignored
    without pytest-env installed, so EXR-related tests failed locally
    on Windows.
  • A few minor lint/style tweaks via ruff --fix (unused argparse
    import, sorted import block, trailing whitespace).

Testing

uv run pytest tests/ — 363 passed, 5 skipped (GPU/MLX-only, expected).
On Windows the EXR tests pass too once pytest-env is installed via
uv sync.

Let me know if anything looks weird or you want anything reworked!

kyroX-D added 11 commits April 14, 2026 05:45
Introduce a prefetch thread that reads frames ahead of GPU processing and a background write pool for disk output. This prevents the GPU from idling during disk I/O, significantly improving throughput on 4K EXR sequences.
The CUDA allocator was forced to release and re-acquire memory blocks on every single frame, causing unnecessary overhead and fragmentation. Cache clearing is already handled at model-switch boundaries in service.py._ensure_model().
CHECKPOINT_DIR was defined twice the second, relative definition silently shadowed the correct absolute one above it. This broke model lookup whenever the tool was run from a different working directory than the repo root. The cleanup branch in the failed MLX download path also pointed at
model_path instead of cache_path, so the partial .tmp file was
never actually removed. Also drops an orphaned comment that sat in the dead-code region right after `return "torch"`.
Refactor auto-download logic to use cache path for cleanup.
Several related cleanups in clip_manager.py and the matching
helper in backend/frame_io.py:

- Video alpha masks now go through cv2.cvtColor(BGR2GRAY) instead of grabbing BGR[2]. Image masks already used grayscale, so both paths agree now. Misleading "blue channel" docstring in
frame_io.py corrected too index 2 in BGR is red.

- Frame sequences now sort via backend.natural_sort.natsorted().
  Lexicographic sort put frame_10 before frame_2 for non-padded
  sequences. backend/clip_state.py already did this; CLI path
  caught up.

- Dropped the dead __main__ block at the end it raised
  NotImplementedError and was superseded by corridorkey_cli.py.

- Removed a duplicate `import numpy as np` from inside
  run_inference(); already imported at module level.

- Replaced three traceback.print_exc() sites with
  logger.exception() so stack traces flow through RichHandler.

- Moved the ffmpeg_tools import out of the per-frame loop up to
  the module header.
read_video_mask_at() pulled BGR[2] (red) and the docstring claimed
it was the blue channel. Neither was right: image masks elsewhere
go through grayscale, so video masks now do the same via
cv2.cvtColor(BGR2GRAY). Docstring updated to match.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant