Skip to content

fix: NDViewer black screen during live TIFF acquisition#521

Open
maragall wants to merge 2 commits intoCephla-Lab:masterfrom
maragall:fix/ndviewer-tiff-race-condition
Open

fix: NDViewer black screen during live TIFF acquisition#521
maragall wants to merge 2 commits intoCephla-Lab:masterfrom
maragall:fix/ndviewer-tiff-race-condition

Conversation

@maragall
Copy link
Contributor

@maragall maragall commented Mar 24, 2026

Summary

NDViewer displayed all-black images during live acquisition for both Individual TIFF and OME-TIFF save modes.

Root cause: Save jobs are dispatched to a subprocess asynchronously via job_runner.dispatch(). The NDViewer filepath notification was emitted immediately after dispatch, before the file was written to disk. When NDViewer attempted to read the file ~200ms later, it did not exist, and the viewer silently fell back to zero-filled images.

Fix: TIFF save jobs (SaveImageJob, SaveOMETiffJob) now return a TiffWriteResult after writing completes, mirroring the existing ZarrWriteResult pattern. NDViewer is notified only after the subprocess confirms the write via the job-completion callback. For OME-TIFF, the result includes a page index so NDViewer reads the correct plane from the multi-page stack.

Depends on: Cephla-Lab/ndviewer_light#35 (page-indexed TIFF reading)

Changes

  • Add TiffWriteResult dataclass in job_processing.py
  • SaveImageJob and SaveOMETiffJob return TiffWriteResult instead of bool
  • Add signal_tiff_frame_written callback to MultiPointControllerFunctions
  • Handle TiffWriteResult in _process_job_result() to notify NDViewer post-write
  • Remove premature NDViewer notification from _signal_new_image_fn()

Test plan

  • python main_hcs.py --simulation with INDIVIDUAL_IMAGES — NDViewer displays images
  • Repeat with OME_TIFF — NDViewer displays images
  • Repeat with ZARR_V3 — NDViewer displays images
  • Existing unit tests pass (pytest tests/)

maragall and others added 2 commits March 24, 2026 13:08
Save jobs run in a subprocess asynchronously. The NDViewer notification
was fired immediately after dispatching the job, before the file was
written to disk. When NDViewer tried to read ~200ms later, the file
didn't exist, so it silently fell back to zero-filled (black) images.

This affected Individual TIFF and OME-TIFF save modes. The Zarr path
was not affected because it already notified via job-completion callback.

Changes:
- Add TiffWriteResult dataclass (mirrors ZarrWriteResult)
- SaveImageJob and SaveOMETiffJob now return TiffWriteResult after write
- Add signal_tiff_frame_written callback to MultiPointControllerFunctions
- Handle TiffWriteResult in _process_job_result() to notify NDViewer
- Remove premature NDViewer notification from _signal_new_image_fn()
- Add _signal_tiff_frame_written_fn() that emits after write completes
- OME-TIFF results include page index for correct plane reading

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes NDViewer showing black frames during live TIFF (individual + OME-TIFF) acquisition by deferring viewer notifications until the saving subprocess confirms a frame has been written to disk.

Changes:

  • Introduces a TiffWriteResult result type for TIFF save jobs, analogous to the existing Zarr result pattern.
  • Updates SaveImageJob / SaveOMETiffJob to return TiffWriteResult and wires a new signal_tiff_frame_written callback through the multipoint pipeline.
  • Removes the premature NDViewer notification from _signal_new_image_fn() and notifies NDViewer only from post-write callbacks.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

File Description
software/control/gui_hcs.py Stops emitting NDViewer TIFF registration on capture; adds a post-write TIFF callback handler.
software/control/core/multi_point_worker.py Handles TiffWriteResult job outputs and forwards them via a new callback to the GUI.
software/control/core/multi_point_utils.py Extends MultiPointControllerFunctions with a signal_tiff_frame_written callback.
software/control/core/job_processing.py Adds TiffWriteResult; changes TIFF job return types to carry viewer-notification metadata (including OME page index).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +155 to +156
"""Result from a SaveImageJob, containing frame info for viewer notification."""

Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TiffWriteResult is used by both SaveImageJob and SaveOMETiffJob, but the docstring currently only mentions SaveImageJob. Updating the docstring to reflect all producers (and that filepath may include a page selector like "#") would prevent confusion for future consumers.

Suggested change
"""Result from a SaveImageJob, containing frame info for viewer notification."""
"""Result from SaveImageJob or SaveOMETiffJob, containing frame info for viewer notification.
The ``filepath`` may optionally include a page selector fragment (e.g. ``"#<idx>"``)
to indicate a specific page/frame within a multi-page TIFF/OME-TIFF file.
"""

Copilot uses AI. Check for mistakes.
Comment on lines +153 to +163
@dataclass
class TiffWriteResult:
"""Result from a SaveImageJob, containing frame info for viewer notification."""

filepath: str
fov: int
time_point: int
z_index: int
channel_name: str
region_id: int = 0

Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New behavior relies on SaveImageJob/SaveOMETiffJob returning a TiffWriteResult that is then interpreted by MultiPointWorker to notify the viewer. There are existing tests for ZarrWriteResult and OME-TIFF saving; adding unit tests that assert (1) SaveImageJob.run() returns a TiffWriteResult with the actual on-disk path/extension and (2) SaveOMETiffJob returns a page-indexed filepath would help prevent regressions of the black-screen race.

Copilot uses AI. Check for mistakes.
Comment on lines +170 to +171
filepath = utils_acquisition.get_image_filepath(
info.save_directory, info.file_id, info.configuration.name, np.uint16
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SaveImageJob._make_result() hardcodes np.uint16 when constructing the filepath. If the captured image dtype is not uint16 (e.g., uint8 RGB or any non-uint16 format), utils_acquisition.save_image() will write a different extension/path than the one returned in TiffWriteResult, and NDViewer will be notified with a non-existent filepath. Construct the filepath using the actual image dtype (or otherwise mirror the exact path used by save_image()).

Suggested change
filepath = utils_acquisition.get_image_filepath(
info.save_directory, info.file_id, info.configuration.name, np.uint16
# Use the actual image dtype (if available) so the filepath matches the one used by save_image().
image_dtype = (
self.capture_image.image_array.dtype
if self.capture_image.image_array is not None
else np.uint16
)
filepath = utils_acquisition.get_image_filepath(
info.save_directory, info.file_id, info.configuration.name, image_dtype

Copilot uses AI. Check for mistakes.
Comment on lines 188 to +194
# Simulated disk I/O mode - encode to buffer, throttle, discard
if is_simulation_enabled():
bytes_written = simulated_tiff_write(image)
self._log.debug(
f"SaveImageJob {self.job_id}: simulated write of {bytes_written} bytes " f"(image shape={image.shape})"
)
return True
return result
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In simulated disk I/O mode, SaveImageJob returns a TiffWriteResult (triggering viewer notification) but simulated_tiff_write() explicitly discards the buffer and does not create a file on disk. This will cause NDViewer to try to read a filepath that will never exist when SIMULATED_DISK_IO_ENABLED is true. Consider suppressing signal_tiff_frame_written in this mode or writing a real temporary file when the viewer is active.

Copilot uses AI. Check for mistakes.
signal_zarr_frame_written: Callable[[int, int, int, str, int], None] = lambda *a, **kw: None
# TIFF frame written callback - called when subprocess completes writing a frame
# Args: (filepath, fov, time_point, z_index, channel_name, region_id)
signal_tiff_frame_written: Callable[[str, int, int, int, str, int], None] = lambda *a, **kw: None
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

signal_tiff_frame_written uses a parameter named region_id typed as int, but region identifiers in multipoint scans appear to be region names/IDs coming from scan_region_names (strings). Using int typing here is misleading and can cause accidental key mismatches (e.g., dicts keyed by region name). Consider typing this as str (or a Union[str, int]) consistently across the callback and TiffWriteResult.

Suggested change
signal_tiff_frame_written: Callable[[str, int, int, int, str, int], None] = lambda *a, **kw: None
signal_tiff_frame_written: Callable[[str, int, int, int, str, Union[str, int]], None] = lambda *a, **kw: None

Copilot uses AI. Check for mistakes.
Comment on lines +439 to +441
region_offset = self._ndviewer_region_fov_offset.get(region_id)
if region_offset is None:
return
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If region_id is not found in _ndviewer_region_fov_offset, this silently returns with no logging. This makes NDViewer failures hard to diagnose (and differs from the earlier warning behavior). Consider logging a warning (including region_id and available keys) before returning, or at least log at debug level.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants