Skip to content

gpuResampSlc.cu: fix out of bounds reading bug in transformTile#214

Closed
scottstanie wants to merge 1 commit intoisce-framework:developfrom
scottstanie:fix-gpu-LUT2d-bounds
Closed

gpuResampSlc.cu: fix out of bounds reading bug in transformTile#214
scottstanie wants to merge 1 commit intoisce-framework:developfrom
scottstanie:fix-gpu-LUT2d-bounds

Conversation

@scottstanie
Copy link
Member

@scottstanie scottstanie commented Feb 9, 2026

(note: have not been able to recompile this on the GPU machine to fully test. just makes logical sense now)

I was getting the following error sporadically when running CUDA versions of geo2rdr + gpuResampleSlc

  Reading in image data for tile 38 of 39
  Interpolating  tile 38 of 39
  terminate called after throwing an instance of 'isce3::cuda::except::CudaError<cudaError>'
    what():  Error in file .../cxx/isce3/cuda/core/gpuLUT1d.cu, line 77, function isce3::cuda::core::gpuLUT1d< <template-parameter-1-1>
  >::~gpuLUT1d() [with T = double]: cudaError 700 (an illegal memory access was encountered)
  zsh: IOT instruction (core dumped)

I thought it was a "my image is too big problem", since shrinking the linesPerBlock worked (sometimes). Other times, shrinking it didnt work, but just changing it did.

The CPU version in cxx/isce3/image/ResampSlc.cpp:322-327 has the correct bounds check:

      if ((iRowResampled < chipHalf) ||
              (iRowResampled >= (inLength - chipHalf)))
          continue;
      if ((iColResampled < chipHalf) || (
                  iColResampled >= (inWidth - chipHalf)))
          continue;
``
However, the GPU version has only `>`.
The chip reading loop iterates iChipRow from 0 to chipSize - 1 (0 to 8), accessing row:
```c++
    iTileRow = iRowResamp + iChipRow - chipHalf

The maximum iTileRow is iRowResamp + 8 - 4 = iRowResamp + 4. With the current bounds check (iRowResamp + 4 > inReadableLength, it will skip), the kernel proceeds when iRowResamp = inReadableLength - 4. That gives:

    max iTileRow = (inReadableLength - 4) + 4 = inReadableLength  // out of bounds

The valid row indices are 0 to inReadableLength - 1, so this reads one row past the end of the tile buffer.

I was getting the following error sporadicall when running CUDA-versions
of geo2rdr and gpuResampleSlc

  Reading in image data for tile 38 of 39
  Interpolating  tile 38 of 39
  terminate called after throwing an instance of 'isce3::cuda::except::CudaError<cudaError>'
    what():  Error in file .../cxx/isce3/cuda/core/gpuLUT1d.cu, line 77, function isce3::cuda::core::gpuLUT1d< <template-parameter-1-1>
  >::~gpuLUT1d() [with T = double]: cudaError 700 (an illegal memory access was encountered)
  zsh: IOT instruction (core dumped)

The CPU version in cxx/isce3/image/ResampSlc.cpp:322-327 has the correct bounds check:

      if ((iRowResampled < chipHalf) ||
              (iRowResampled >= (inLength - chipHalf)))
          continue;
      if ((iColResampled < chipHalf) || (
                  iColResampled >= (inWidth - chipHalf)))
          continue;

However, the GPU version has only `>`.
The chip reading loop iterates iChipRow from 0 to chipSize - 1 (0 to 8), accessing row:
    iTileRow = iRowResamp + iChipRow - chipHalf

The maximum iTileRow is iRowResamp + 8 - 4 = iRowResamp + 4.
With the current bounds check (iRowResamp + 4 > inReadableLength → skip), the kernel proceeds when iRowResamp = inReadableLength - 4. That gives:

    max iTileRow = (inReadableLength - 4) + 4 = inReadableLength  ← OUT OF BOUNDS

The valid row indices are 0 to inReadableLength - 1, so this reads one row past the end of the tile buffer.
@Tyler-g-hudson
Copy link
Contributor

Howdy, Scott!

That's an older version of resample that we have since refactored and replaced - this code is currently regarded as legacy/deprecated.
Presently, the most up-to-date version of Resample is at Resample.cu in the same folder as gpuResampSlc.cu, and we have not seen the same errors in this function. The Python interface for this function is richer and can be accessed at python/packages/isce3/image/v2/resample_slc.py under the function resample_slc_blocks or, if you prefer a more direct wrapper to the GPU which requires you to do your own tiling, under python/packages/isce3/cuda/image/v2/resample_slc.py at the gpu_resample_to_coords function.

@scottstanie
Copy link
Member Author

ah, so that's why this hasn't already come up.
Thanks for the explanation! i'll close this.

@scottstanie
Copy link
Member Author

Oh one thing I meant to check (maybe a @hfattahi ): is there a plan to add the phase flattening to the V2 resampler?
I forgot that having the flatten option was the sole reason I was using the older resampSlc version, since i'm using this to make a flattened, coregistered SLC stack (like in https://github.com/opera-adt/COMPASS/blob/main/src/compass/s1_resample.py ).

@hfattahi
Copy link
Contributor

Oh one thing I meant to check (maybe a @hfattahi ): is there a plan to add the phase flattening to the V2 resampler? I forgot that having the flatten option was the sole reason I was using the older resampSlc version, since i'm using this to make a flattened, coregistered SLC stack (like in https://github.com/opera-adt/COMPASS/blob/main/src/compass/s1_resample.py ).

@scottstanie Sure we should probably have that for completeness even though we don't have an immediate use for NISAR. @Tyler-g-hudson almost had the flattening in the original development and got dropped due to lower priority. The range offsets are already there so in theory all needed should be already available to use and flatten the phase after resampling.

@Tyler-g-hudson
Copy link
Contributor

@scottstanie To elaborate on what @hfattahi said, I've got two branches on my fork: One which enables demodulation and remodulation (see #237) and one that is later intended to add flattening to the same here.. The flattening branch is less complete but contains several python bindings to perform flattening on a resampled SLC.

@Tyler-g-hudson
Copy link
Contributor

Tyler-g-hudson commented Mar 16, 2026

Neither of these branches are implemented in GPU, but the CPU functionality is there if it's helpful. The flattening one is untested so there are still some fixes that need to be done on it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants