Add Python port of the MassData Parameter Support API#42
Conversation
…nit tests Agent-Logs-Url: https://github.com/NEVSTOP-LAB/CSM-MassData-Parameter-Support/sessions/23799959-ca0d-43ce-a390-5e22aa6c0f3f Co-authored-by: nevstop <8196752+nevstop@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Adds a Python implementation of the CSM MassData Parameter Support API so Python clients can interoperate with the existing LabVIEW VIs and the C port using the same <MassData>Start:<N>;Size:<N>[;DataType:<T>] reference-string format.
Changes:
- Introduces
python/csm_massdata.pyimplementing the MassData ring-buffer encode/decode APIs with status codes aligned to the C/LabVIEW contract. - Adds a
unittestsuite underpython/_test/mirroring the C test coverage (plus a concurrency test). - Adds Python-focused documentation (
python/README.md) and updates.gitignoreto exclude Python bytecode/caches underpython/.
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| python/csm_massdata.py | Implements the Python MassData API (ring buffer, argument encode/decode, datatype parsing, status reporting). |
| python/_test/test_csm_massdata.py | Adds unittest coverage for API behavior, error cases, overwrite detection, and multi-threaded usage. |
| python/README.md | Documents usage, API mapping to LabVIEW/C, and how to run tests. |
| .gitignore | Ignores Python cache/bytecode artifacts under the new python/ directory. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| while i < n and "0" <= text[i] <= "9": | ||
| i += 1 | ||
| if i == 0: | ||
| return None, 0 | ||
| value = int(text[:i]) | ||
| if value > _UINT64_MAX: | ||
| return None, i |
There was a problem hiding this comment.
_parse_uint64 converts the full digit substring with int(text[:i]) before checking against _UINT64_MAX. A crafted argument containing an extremely long digit run can force large big-int allocation/CPU (DoS risk) even though values outside uint64 are rejected. Consider parsing incrementally with an overflow check (or rejecting more than 20 digits) so you can fail fast without constructing an arbitrary-precision integer.
| while i < n and "0" <= text[i] <= "9": | |
| i += 1 | |
| if i == 0: | |
| return None, 0 | |
| value = int(text[:i]) | |
| if value > _UINT64_MAX: | |
| return None, i | |
| value = 0 | |
| max_before_mul = _UINT64_MAX // 10 | |
| max_last_digit = _UINT64_MAX % 10 | |
| while i < n and "0" <= text[i] <= "9": | |
| digit = ord(text[i]) - ord("0") | |
| if value > max_before_mul or ( | |
| value == max_before_mul and digit > max_last_digit | |
| ): | |
| while i < n and "0" <= text[i] <= "9": | |
| i += 1 | |
| return None, i | |
| value = value * 10 + digit | |
| i += 1 | |
| if i == 0: | |
| return None, 0 |
| # 验证 DataType 值仅包含合法字符。 | ||
| for ch in data_type: | ||
| if ch in ";<>": | ||
| return CsmMassDataStatus.ERR_PARSE, 0, 0, "" |
There was a problem hiding this comment.
When parsing DataType:, the C implementation returns ERR_BUFFER_TOO_SMALL if the datatype string won’t fit in CSM_MASSDATA_MAX_DATATYPE_LEN (including NUL). The Python parser currently accepts any length, so an overlong DataType from another language would be treated as OK, diverging from the cross-language contract. Consider enforcing len(data_type)+1 <= CSM_MASSDATA_MAX_DATATYPE_LEN during parsing and mapping overflow to ERR_BUFFER_TOO_SMALL.
| return CsmMassDataStatus.ERR_PARSE, 0, 0, "" | |
| return CsmMassDataStatus.ERR_PARSE, 0, 0, "" | |
| # 与 C 端保持一致:DataType 必须能放入固定长度缓冲区,包含末尾 NUL。 | |
| if len(data_type) + 1 > CSM_MASSDATA_MAX_DATATYPE_LEN: | |
| return CsmMassDataStatus.ERR_BUFFER_TOO_SMALL, 0, 0, "" |
| def _coerce_data(data: Optional[Union[bytes, bytearray, memoryview]]) -> Optional[bytes]: | ||
| """将允许的数据输入归一化为 ``bytes``;无效输入返回 ``None``。 | ||
|
|
||
| 与 C 端 ``data_size == 0`` 时允许 ``data == NULL`` 的语义对应: | ||
| 本函数允许 ``data is None``,视作零长度数据。 | ||
| """ | ||
| if data is None: | ||
| return b"" | ||
| if isinstance(data, (bytes, bytearray, memoryview)): | ||
| return bytes(data) | ||
| return None |
There was a problem hiding this comment.
_coerce_data always converts bytearray/memoryview inputs to a new bytes object, and then the code copies again into the ring buffer. For large payloads this doubles peak memory and adds an extra full copy. Consider keeping the input as a memoryview/bytes-like object and teaching _ring_write to write from it directly to avoid the intermediate allocation.
| self.buffer: bytearray = bytearray(CSM_MASSDATA_DEFAULT_CACHE_SIZE) | ||
| self.capacity: int = CSM_MASSDATA_DEFAULT_CACHE_SIZE |
There was a problem hiding this comment.
The module eagerly allocates a 50 MiB bytearray at import time without handling MemoryError. On constrained environments this will raise during import, making it impossible to receive the intended ERR_NO_MEMORY status from APIs. Consider catching MemoryError in _State.__init__ and initializing to an empty/disabled state (capacity 0) so public APIs can return ERR_NO_MEMORY consistently.
| self.buffer: bytearray = bytearray(CSM_MASSDATA_DEFAULT_CACHE_SIZE) | |
| self.capacity: int = CSM_MASSDATA_DEFAULT_CACHE_SIZE | |
| try: | |
| self.buffer: bytearray = bytearray(CSM_MASSDATA_DEFAULT_CACHE_SIZE) | |
| self.capacity: int = CSM_MASSDATA_DEFAULT_CACHE_SIZE | |
| except MemoryError: | |
| # 在受限环境中允许模块继续导入;公开 API 可基于 capacity == 0 | |
| # 统一返回 ERR_NO_MEMORY,而不是在导入阶段直接失败。 | |
| self.buffer = bytearray() | |
| self.capacity = 0 |
| with _state.lock: | ||
| if len(src) > _state.capacity: | ||
| return CsmMassDataStatus.ERR_CACHE_TOO_SMALL, "" | ||
| start_cursor = _state.write_total | ||
| if src: | ||
| _ring_write(src) | ||
| _state.write_total += len(src) | ||
| _state.last_write = CsmMassDataOperation(start_cursor, len(src)) | ||
|
|
||
| if data_type is not None: | ||
| argument = (f"{_PREFIX}Start:{start_cursor};Size:{len(src)};" | ||
| f"DataType:{data_type}") | ||
| else: | ||
| argument = f"{_PREFIX}Start:{start_cursor};Size:{len(src)}" | ||
| # 与 C 端一致:包含末尾 NUL 后不得超过 ``CSM_MASSDATA_MAX_ARGUMENT_LEN``。 | ||
| if len(argument) + 1 > CSM_MASSDATA_MAX_ARGUMENT_LEN: | ||
| return CsmMassDataStatus.ERR_BUFFER_TOO_SMALL, "" | ||
| return CsmMassDataStatus.OK, argument |
There was a problem hiding this comment.
_state.write_total is unbounded, but the on-the-wire Start:<N> field is specified/parsed as a 64-bit unsigned integer. After enough writes, start_cursor will exceed 2**64-1, producing reference strings that the parser (and C/LabVIEW) would reject. Consider emulating uint64_t wraparound by keeping write_total masked to 64 bits (and updating the decode-side residency checks accordingly to match the C implementation).
|
@copilot apply changes based on the comments in this thread |
…round Agent-Logs-Url: https://github.com/NEVSTOP-LAB/CSM-MassData-Parameter-Support/sessions/c40cd282-956a-404b-9866-3e3e838998a2 Co-authored-by: nevstop <8196752+nevstop@users.noreply.github.com>
已在 a73f5f6 中处理评审中提出的 5 处问题:
新增 3 个单元测试覆盖以上场景,全部 15 个用例通过。 |
Following the C port in #40, add a Python implementation of the CSM MassData Parameter Support plugin so Python clients can interoperate with the LabVIEW VIs and C library through the same on-the-wire reference string format.
Layout
New
python/directory mirroringc/:python/csm_massdata.py— implementationpython/_test/test_csm_massdata.py—unittestsuitepython/README.md— Chinese docs parallelingc/README.md.gitignore— exclude__pycache__/*.pycunderpython/Implementation notes
CsmMassDataStatusinteger values match the LabVIEW VIs and Ccsm_massdata_status_tone-for-one. Reference strings are byte-identical:<MassData>Start:<N>;Size:<N>[;DataType:<T>].bytearrayinitialized at module import (no lazy-init race), with all public APIs serialized through a singlethreading.Lock._State.__init__catchesMemoryErrorso the module can still be imported on memory-constrained hosts; APIs then returnERR_NO_MEMORYconsistently whencapacity == 0.write_totalis masked to 64 bits to mirror the Cuint64_tcursor; the decode-side residency check uses modular reverse distance(write_total - start) & (2**64 - 1), so wraparound past2**64is handled correctly on both encode and decode and stays compatible with the C/LabVIEW wire format.Start/Sizeincrementally with per-step overflow detection (no big-int allocation on long digit runs — DoS-resistant), enforceslen(DataType) + 1 <= CSM_MASSDATA_MAX_DATATYPE_LENand returnsERR_BUFFER_TOO_SMALLon overflow to match the C cross-language contract, and rejects;/</>in the optionalDataTypevalue.(status, argument),(status, data)). Input data acceptsbytes/bytearray/memoryviewand is forwarded as a single-bytememoryviewstraight into the ring buffer, avoiding an intermediatebytes(...)copy on large payloads.Example
Tests
unittestcases mirrorc/_test/vs/test_main.c(config/status, plain & typed round-trips, parse errors, overwrite detection, cache-too-small, datatype bounds) plus a concurrent encode/decode test exercising the lock under 8 threads. Additional cases cover the parser's fast rejection of overlong digit runs, the on-parseDataTypelength limit, andbytearray/memoryviewinputs.Per the convention established in #40, the root
README.md/README(zh-cn).mdare intentionally left untouched.