Description
When loading a FAST binary file using FASTOutputFile from the openfast_toolbox.io module and converting it to a DataFrame, a dimension mismatch error occurs in non-buffered mode. The error originates from openfast_toolbox/io/fast_output_file.py during the data scaling step.
Error Trigger Code:
from openfast_toolbox.io import FASTOutputFile
out_file = FASTOutputFile("test.outb").toDataFrame() # Fails here
Error Message:
ValueError: operands could not be broadcast together with shapes (NT, NumOutChans) (NumOutChans, 1)
Affected File:
openfast_toolbox/io/fast_output_file.py
Steps to Reproduce
- Import FASTOutputFile and load a FAST binary file:
from openfast_toolbox.io import FASTOutputFile
out_file = FASTOutputFile("test.outb").toDataFrame()
- Ensure the file is in a compressed format (e.g., FileFmtID_WithTime or FileFmtID_WithoutTime).
- The error occurs during the data scaling step in non-buffered mode (use_buffer=False).
Root Cause
In fast_output_file.py, the scaling arrays ColOff and ColScl are incorrectly shaped as column vectors ((NumOutChans, 1)), while the data array data has shape (NT, NumOutChans). This violates NumPy broadcasting rules when performing element-wise operations:
# In fast_output_file.py (non-buffered mode):
data = (data - ColOff) / ColScl # Shapes: (NT,138) vs (138,1)
• Buffered mode works because it uses 1D arrays and applies scaling column-by-column.
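The shape mismatch can be reproduced without a FAST file. The sketch below uses small synthetic arrays as stand-ins (the sizes and values are assumptions for illustration, not data read from test.outb) and only mirrors the shapes described above:

```python
import numpy as np

# Stand-in sizes for illustration; the report above involves 138 channels.
NT, NumOutChans = 5, 3

# Data laid out as (time steps, channels), as described in the report.
data = np.arange(NT * NumOutChans, dtype=np.float64).reshape(NT, NumOutChans)

# Scaling arrays shaped as column vectors, (NumOutChans, 1).
ColOff = np.zeros((NumOutChans, 1), dtype=np.float32)
ColScl = np.ones((NumOutChans, 1), dtype=np.float32)

try:
    scaled = (data - ColOff) / ColScl
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

With these shapes NumPy compares the trailing dimensions (3 vs 1, then 5 vs 3) and raises because 5 and 3 are incompatible. If NT happened to equal NumOutChans, the same expression would broadcast without an error and apply each offset along a row instead of a column, so the failure may not show up for every file.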
Proposed Fix
Adjust the dimensions of ColOff and ColScl to align with broadcasting rules.
Option 1: Flatten to 1D Arrays
Modify the code in fast_output_file.py to convert ColOff and ColScl to 1D arrays:
# Before (line ~X in fast_output_file.py):
ColScl = fread(fid, NumOutChans, 'float32') # Shape: (138, 1)
ColOff = fread(fid, NumOutChans, 'float32') # Shape: (138, 1)
# After:
ColScl = fread(fid, NumOutChans, 'float32').flatten() # Shape: (138,)
ColOff = fread(fid, NumOutChans, 'float32').flatten() # Shape: (138,)
Option 2: Transpose Scaling Arrays
Alternatively, transpose ColOff and ColScl to row vectors:
data = (data - ColOff.T) / ColScl.T # Shapes: (NT,138) vs (1,138)
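As a quick sanity check of both options, the following sketch compares them against an explicit column-by-column loop. The arrays are synthetic stand-ins (random values and small sizes are assumptions), and the loop here indexes columns from 0 because no time column is included:

```python
import numpy as np

rng = np.random.default_rng(0)
NT, NumOutChans = 5, 3  # stand-in sizes for illustration

data = rng.normal(size=(NT, NumOutChans))
# Column-vector scaling arrays, shaped (NumOutChans, 1) as in the report.
ColOff = rng.normal(size=(NumOutChans, 1)).astype(np.float32)
ColScl = rng.uniform(1.0, 2.0, size=(NumOutChans, 1)).astype(np.float32)

# Option 1: flatten to 1D, shape (NumOutChans,).
option1 = (data - ColOff.flatten()) / ColScl.flatten()

# Option 2: transpose to row vectors, shape (1, NumOutChans).
option2 = (data - ColOff.T) / ColScl.T

# Reference: scale one column at a time, with no broadcasting involved.
reference = np.empty_like(data)
for iCol in range(NumOutChans):
    reference[:, iCol] = (data[:, iCol] - ColOff[iCol, 0]) / ColScl[iCol, 0]

print(np.allclose(option1, reference), np.allclose(option2, reference))
```

Both options give the same result on these shapes. Option 1 has the side benefit that ColOff and ColScl end up with the same 1D shape that buffered mode uses.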
Why Buffered Mode Works
In buffered mode (use_buffer=True):
• ColOff and ColScl are 1D arrays (shape (NumOutChans,)).
• Scaling is applied column-wise in a loop, avoiding broadcasting:
for iCol in range(NumOutChans):
    data[:, iCol+1] = (data[:, iCol+1] - ColOff[iCol]) / ColScl[iCol]
Impact
Affected Users: Anyone using FASTOutputFile.toDataFrame() in non-buffered mode.