feat(serial): add keep-alive and diagnostics to the Serial transport, fix driver race #333

Closed
hakueon wants to merge 29 commits into scratchfoundation:develop from aluxrobot:feature/serial-transport

@hakueon hakueon commented May 25, 2026

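feat(serial): add keep-alive and diagnostics to the Serial transport, fix driver race

Overview

  • Added a keep-alive feature to the Serial transport that prevents device timeouts while idle, plus development diagnostic options.
  • Fixed a Read/Write race condition that occurs with the .NET 8 + CH340/CP210x driver combination, improving wireless DFU reliability.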

Main changes

1. Serial Transport Keep-Alive and Diagnostics

  • New connection parameters:
    • peripheralType (string, optional): identifies the device class ("codetinker", "connect", "technic", etc.)
    • keepAliveIntervalMs (integer, optional): while idle, re-sends the last TX packet at this interval to prevent device timeouts
    • wireTrace (boolean, optional): enables hex dumps for transport-level debugging
  • New runtime RPC:
    • setKeepAlive { intervalMs }: lets the client toggle keep-alive dynamically before and after a firmware update
  • Thread safety:
    • SemaphoreSlim serializes HandleWrite and keep-alive ticks
    • The Timer's async-void callback uses skip-on-busy semantics, so resends are suppressed automatically during write bursts
    • Structured Trace logs track keep-alive behavior

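2. SerialPort Read/Write Driver Race Condition Fix

  • Cause: with .NET 8 SerialPort + CH340/CP210x drivers, a blocking Read that is in flight while a Write is in progress throws a spurious TimeoutException
  • Symptom: with keep-alive at 33ms, the client disconnects after ~1.4s and wireless DFU fails
  • Fix:
    • ReadLoop: polls SerialPort.BytesToRead and calls Read only when data is available (removes the blocking timeout)
    • DoWrite: switched from BaseStream.WriteAsync to the synchronous SerialPort.Write (goes through the same cache layer)
    • ioLock serializes Read/Write; ThreadPool dispatch preserves the async signature
    • Waiting on a WaitHandle while idle removes the cosmetic 500ms TimeoutException
  • Verification: in both wired and wireless DFU, the TimeoutException burst is gone, bootloader entry succeeds, and reconnection after a firmware update works normally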

3. Alux Fork-Specific Documentation

  • Added Serial guides under Documentation/Alux/

4. Project Settings and Guide Cleanup

  • Added CLAUDE.md (coding rules)
  • Added ref/ debug captures to .gitignore
  • Cleaned up comments (to follow the CLAUDE.md guide)

hakueon and others added 29 commits May 22, 2026 12:36
Introduce design documents for a new USB Serial transport extension
on scratch-link 2.x. SerialTransport.md describes the server side:
a /scratch/serial JSON-RPC endpoint on ws://localhost:20111, targeting
CH340 (VID 0x1A86 / PID 0x7523) on Windows first via
System.IO.Ports.SerialPort and WMI-based VID/PID enumeration.
scratch-link-fork-plan.md is the aluxcoding-scratch client-side
counterpart aligned to the same wire contract.

Uses Serial-specific RPC vocabulary (serialDidReceiveData,
serialDidDisconnect, startReading/stopReading) so callers cannot
confuse Serial events with BLE or BT events.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirrors the BLE/BT session pattern with a /scratch/serial JSON-RPC
endpoint. CH340 (VID 0x1A86 / PID 0x7523) is first-class via WMI-based
VID/PID enumeration; other CDC devices work through the same path with
appropriate discovery filters.
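
As an illustration of the VID/PID matching step, here is a minimal Python sketch that extracts the IDs from a Windows PnP device ID string of the usual `USB\VID_xxxx&PID_xxxx\...` form. The function names are hypothetical, and the sample device ID in the usage note is made up; the fork itself does this in C# through WMI, and its actual matching code is not shown in this PR text.

```python
import re

# Windows PnP device IDs for USB devices embed the IDs as VID_xxxx&PID_xxxx (hex).
_VID_PID = re.compile(r"VID_([0-9A-Fa-f]{4})&PID_([0-9A-Fa-f]{4})")

# CH340 identifiers quoted in the commit message.
CH340 = (0x1A86, 0x7523)


def parse_vid_pid(pnp_device_id):
    """Return (vid, pid) as integers, or None if the ID carries no VID/PID pair."""
    match = _VID_PID.search(pnp_device_id)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


def is_ch340(pnp_device_id):
    """True when the device ID matches the CH340 VID/PID pair."""
    return parse_vid_pid(pnp_device_id) == CH340
```

For example, `is_ch340(r"USB\VID_1A86&PID_7523\5&2F3A1B&0&2")` is true, while a non-USB ID such as `ACPI\PNP0501\1` yields `None` from `parse_vid_pid`.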

Why a separate listener port (20211): this fork is meant to coexist
with the stock scratch-link on user machines (e.g. existing Scratch 3.0
users), so the default 20110/20111 cannot be reused.

Why sync SerialPort.Read on a Task-Run thread instead of
BaseStream.ReadAsync: .NET 6's SerialStream does not override the
async read overloads in a way that survives CH340's open-time DTR/RTS
toggling; the first ReadAsync reliably throws ERROR_OPERATION_ABORTED
even with the cancellation token unbound. Synchronous Read with a
500ms ReadTimeout + TimeoutException-as-loop-tick matches the working
test_cs reference and lets shutdown be observed via the CTS.

DtrEnable/RtsEnable are pinned low explicitly because some firmware
(codetinker on CH340) treats DTR/RTS transitions as reset signals.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sets fork identity across assembly/package metadata, manifests, tray
UI, and icon assets so users can distinguish this binary from the
stock scratch-link they may have installed in parallel (see the port
20211 split in the Serial transport commit).

ScratchVersion.targets gains a floor of 1.0.0 when GitInfo finds no
semver tag, so an untagged build ships as "Alux Scratch Link 1.0.0.x"
instead of "0.0.x". Real releases are still cut by tagging.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SerialTransport.md and scratch-link-fork-plan.md moved to a sibling
documents/ folder outside this repo so the fork's design notes can be
edited alongside the related client-side work without churning this
tree.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
UpgradeLog*.htm, _UpgradeReport_Files/, and Backup*/ are produced by
VS's project upgrade flow and are IDE-local — they should not enter
the repo when someone happens to open the solution after a toolchain
bump.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds brand/alux-l.svg as the source of truth for app/tray/MSIX icons,
plus brand/build_icons.py to regenerate them. Generated files are
already committed; the script only needs to run when the SVG changes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Rebrands the title and removes upstream-only content (macOS/Safari,
semantic-release, CFBundle notes). Adds a fork notice with the AGPL-3.0
attribution to scratchfoundation/scratch-link, documents the Serial
transport and the coexistence port (20211), and points to
brand/build_icons.py for icon regeneration. Translated to Korean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Captures the VS 2026-specific workload names, the missing .NET 6 SDK
and Windows App Runtime 1.3 manual installs, and the startup-project
gotcha (scratch-link-win, not the wapproj) that causes MddBootstrap
to fail on F5.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Replaces brand/alux-l.svg with brand/labs-l.svg as the icon source of
truth and regenerates all derived ICO/PNG assets. Wrong source file
was committed earlier.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
feat(serial-transport): add USB Serial device support and rebrand the fork
- TargetFramework net6.0 → net8.0-windows10.0.22621.0
- RuntimeIdentifiers win10-* → win-* (.NET 8 RID policy change)
- WindowsAppSDK 1.3.230331000 → 1.5.240311000
- SDK.BuildTools 10.0.22621.756 → 10.0.22621.3233
- System.IO.Ports 6.0.0 → 8.0.0, System.Management 6.0.2 → 8.0.0
- PublishProfiles win10-* → win-*: file names and RIDs made consistent
- wapproj AssetTargetFallback net6 → net8, publish profile paths fixed

No separate Windows App Runtime 1.3 install is needed (1.5+ already ships with Windows)
No separate install of the end-of-life .NET 6 SDK is needed (SDK 10.x builds net8.0 directly)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
WinUI 3 requires both Windows 8 and Windows 10 supportedOS GUIDs.
Without the Win 10 GUID, the OS may run the app in Win 8 compat mode,
causing incorrect DPI/input/WinRT behavior on Windows 10 1809+ targets.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Moved to external documents directory. No longer tracked in the repo.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Debug_Win: WindowsAppSDKSelfContained=true so F5 works on a fresh dev
machine without pre-installing Windows App Runtime 1.5.

Release_Win (MSIX): publish profiles override to false so the MSIX
declares a runtime dependency and end-user installers stay lean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previous condition only matched 'Debug_Win' but VS default config is
'Debug', so fresh builds got SelfContained=false and showed the
Windows App Runtime install dialog on every first run.

Setting true unconditionally in csproj; all publish profiles already
override to false for MSIX packaging so release builds are unaffected.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Windows-only fork does not need Debug_Mac, Release_Mac,
Release_DevID_Mac, Release_MAS_Mac configurations or the
scratch-link-mac project entry. Drops the VS config dropdown
from 6 configs to 2 (Debug_Win, Release_Win).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
VS 2026 MSBuild resolves $(NuGetPackageRoot) differently from the CLI,
causing the Exists() condition on the GitInfo.targets import to fail
silently. This leaves GitVersion undefined and breaks the build.

Add a fallback GitVersion target in ScratchVersion.targets that sets
safe defaults (SemVer 0.0.0, triggering the 1.0.0 floor). GitInfo's
own GitVersion is imported later via nuget.g.targets and overrides
this fallback when the package loads correctly ("last definition wins").

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AnyCPU platform does not support self-contained mode and throws
'The platform AnyCPU is not supported for Self Contained mode'.

Self-contained (bundles App Runtime DLLs) is now enabled only when
building with x64, x86, or ARM64. AnyCPU falls back to false, which
is fine since the wapproj always builds with a specific platform anyway.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Bumps Microsoft.WindowsAppSDK from 1.5.240311000 to 1.8.260508005 and
Microsoft.Windows.SDK.BuildTools from 10.0.22621.3233 to 10.0.28000.1839.
Windows App Runtime 1.8 is more widely pre-installed on Windows 11 via
Windows Update, eliminating the runtime-install dialog on most systems.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Updates README to reflect actual project structure:
- .NET 8 / WindowsAppSDK 1.8 (was .NET 6 / 1.2/1.3)
- Repo directory tree with all components
- Accurate system requirements (WAR 1.8, Win 10 1809+)
- Dev setup quick-start pointing to WindowsDevSetup-VS2026.md
- Build configuration table (Debug_Win / Release_Win)
- Corrected packaging section

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Rename package.json name/author/version to ALUX, Inc. / 1.0.0
- Update csproj and appxmanifest Company/PublisherDisplayName to ALUX, Inc.
- Update README: AluxLabs branding, Alux product extensions, repo link to dev guide
- Move dev setup guide into Documentation/ for proper git tracking

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Upgrade .NET 6 → .NET 8, WindowsAppSDK 1.3 → 1.8
- Fix RID naming for .NET 8 (win10-* → win-*)
- Add fallback GitVersion target for VS 2026 MSBuild compatibility
- Add Windows 10 compatibility GUID to app.manifest
- Update project metadata to ALUX, Inc.
- Move dev setup guide into Documentation/
- Rewrite README for current project state
Group all fork-specific documentation under a dedicated subfolder so
that upstream scratchfoundation/scratch-link docs and Alux fork docs
are visually and structurally distinguishable. This also lets upstream
syncs ignore Documentation/Alux/ as a whole.

Contents:
- SerialApiReference.md: complete JSON-RPC API for the serial transport
  (discover, connect, write, startReading, stopReading, setKeepAlive,
  disconnect; notifications; recovery policy; pathHint matching
  semantics).
- SerialKeepAliveGuide.md: explains the codetinker 1s-timeout problem,
  the keep-alive timer design, the DFU-safe interaction model
  (write-time reset, setKeepAlive toggle), and the wireTrace
  diagnostic option.
- WindowsDevSetup-VS2026.md: moved into Alux/ alongside the new guides.

README links the new folder so upstream-vs-fork doc origin is obvious.
… setKeepAlive, and wireTrace diagnostic

Adds three connect-time options and one runtime RPC to the serial
transport so that wireless devices like Codetinker, which signal a
timeout alarm unless TX packets arrive less than 1s apart, can be
driven safely, while transports that have to send arbitrary byte
streams (e.g. firmware update) can still suppress the keep-alive on
demand.

Connect parameters:
- peripheralType (string, optional): identifies the device class
  ("codetinker", "connect", "technic", …). Logged at connect time;
  reserved for future per-device policy.
- keepAliveIntervalMs (int, optional): if set, the most recently sent
  TX packet is re-sent at this interval to keep the device from
  timing out during idle periods. Null or non-positive disables.
- wireTrace (bool, optional): emits per-write/per-read hex dumps via
  Trace.WriteLine for transport-level debugging. Off by default.
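
A minimal Python sketch of how a client might assemble these options into a connect request. The three parameter names and the `connect` method name come from this commit message; the JSON-RPC 2.0 envelope, the helper name `build_connect_request`, and the omission of any device-selection fields are assumptions made for illustration, not the fork's client code.

```python
import json


def build_connect_request(request_id, peripheral_type=None,
                          keep_alive_interval_ms=None, wire_trace=False):
    """Build a JSON-RPC "connect" request carrying the optional serial parameters."""
    params = {}
    if peripheral_type is not None:
        params["peripheralType"] = peripheral_type
    # Null or non-positive disables keep-alive on the server, so this sketch
    # simply leaves the field out in those cases (a client-side choice).
    if keep_alive_interval_ms is not None and keep_alive_interval_ms > 0:
        params["keepAliveIntervalMs"] = keep_alive_interval_ms
    # wireTrace is off by default, so only send it when explicitly enabled.
    if wire_trace:
        params["wireTrace"] = True
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "connect",
        "params": params,
    })
```

Calling `build_connect_request(1, "codetinker", 33, True)` produces a request whose `params` hold all three options; calling it with only an id produces empty `params`, which leaves keep-alive and tracing disabled.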

Runtime RPC:
- setMode-style toggle via setKeepAlive { intervalMs }: lets the
  client disable keep-alive before a firmware update (intervalMs=null)
  and re-enable it after (intervalMs=33). The cached last-TX packet
  is preserved across the toggle, the call is idempotent, and
  stopping blocks on any in-flight tick so no resend ever races
  with a disconnect.
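
The disable-then-re-enable pattern around a firmware update could be wrapped on the client roughly as follows. This is a Python sketch under stated assumptions: `send` is a hypothetical callable that delivers one JSON-RPC message, the envelope fields are assumed, and only the `setKeepAlive` method name, the `intervalMs` parameter, and the 33 ms resume value come from the commit message.

```python
import itertools
from contextlib import contextmanager

_request_ids = itertools.count(1)  # hypothetical id source for this sketch


def set_keep_alive_request(interval_ms):
    """Build a setKeepAlive request; interval_ms=None asks the server to disable it."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "setKeepAlive",
        "params": {"intervalMs": interval_ms},
    }


@contextmanager
def keep_alive_suspended(send, resume_interval_ms=33):
    """Disable keep-alive for the duration of the block, then re-enable it."""
    send(set_keep_alive_request(None))
    try:
        yield
    finally:
        # Re-enable even if the firmware update raised, so the device does
        # not time out once the line goes idle again.
        send(set_keep_alive_request(resume_interval_ms))
```

A caller would wrap the update in `with keep_alive_suspended(send): ...` and stream firmware chunks inside the block. Because the call is described as idempotent, re-enabling in `finally` should be harmless even when keep-alive was already running.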

Thread-safety:
- SemaphoreSlim serializes HandleWrite vs. keep-alive ticks so two
  DoWrite invocations never overlap and corrupt the stream.
- The keep-alive timer fires an async-void callback with skip-on-busy
  semantics (WaitAsync(0)), making it idle-only — write bursts (DFU
  chunks) automatically suppress resends until the line goes idle.
- HandleWrite resets the timer on every write so the resend never
  fires mid-burst even at the boundaries.
- StopKeepAlive uses Timer.Dispose(WaitHandle) to block until the
  current tick finishes; ResetKeepAliveTimer swallows
  ObjectDisposedException for the race window with stop.
- Dispose(bool) is overridden to tear down the timer and dispose the
  semaphore in the correct order; platform subclasses still close
  their port handle.
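
The skip-on-busy idea can be sketched in Python, with a non-blocking `threading.Lock.acquire(blocking=False)` standing in for `SemaphoreSlim.WaitAsync(0)`. The class and method names below are hypothetical, and the timer itself, the reset-on-write step, and disposal are left out; this only shows how a tick gives way to a client write instead of queuing behind it.

```python
import threading


class KeepAliveSketch:
    """Serializes client writes and keep-alive resends over one write lock."""

    def __init__(self, do_write):
        self._do_write = do_write          # hypothetical low-level write function
        self._write_lock = threading.Lock()
        self._last_packet = None           # most recently sent TX packet
        self.skipped = 0                   # ticks dropped because a write was busy

    def handle_write(self, packet):
        # Client writes block until the lock is free: they must always go out.
        with self._write_lock:
            self._last_packet = packet
            self._do_write(packet)

    def on_keep_alive_tick(self):
        # Non-blocking acquire: if a write is in progress, skip this tick
        # rather than queue a resend behind it.
        if not self._write_lock.acquire(blocking=False):
            self.skipped += 1
            return False
        try:
            if self._last_packet is None:
                return False               # nothing has been sent yet
            self._do_write(self._last_packet)
            return True
        finally:
            self._write_lock.release()
```

With this shape a burst of writes (such as DFU chunks) keeps the lock busy, so ticks that land during the burst are dropped and resends resume only once the line is idle.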

Structured Trace logs for start, stop, double-start, and resend
failure surface keep-alive behaviour in production logs.
… to eliminate driver race

.NET 8 SerialPort + CH340/CP210x drivers throw spurious TimeoutException
on the read side whenever a Write fires while a blocking Read is in
flight, regardless of whether the Write uses BaseStream.WriteAsync or
the synchronous SerialPort.Write. Symptom in production: with
keep-alive on at 33ms, the burst comes at the write cadence and the
client disconnects after ~1.4s, breaking wireless DFU. Diagnosed by
disabling keep-alive — the burst disappears and only normal 500ms idle
timeouts remain (caught and ignored).

Fix: guarantee Read and Write are never in flight at the same time on
the same handle.

ReadLoop:
- Poll SerialPort.BytesToRead (a cheap status query that does not hold
  the I/O surface) and only call Read when bytes are actually
  available, so Read returns immediately rather than blocking on a
  timeout.
- Read is wrapped in ioLock so it never overlaps DoWrite.
- When idle, wait on the cancellation token's WaitHandle for ~10ms
  instead of issuing a blocking Read; this also removes the cosmetic
  500ms idle TimeoutException spam previously seen in the debugger.

DoWrite:
- Switch from BaseStream.WriteAsync to the synchronous SerialPort.Write
  so Read and Write travel through the same SerialPort cache layer
  rather than mixing BaseStream and cache-layer APIs.
- Wrap the synchronous Write in ioLock and dispatch on the ThreadPool
  via Task.Run to preserve the async signature.
- Re-check IsOpen inside the lock; catch InvalidOperationException
  (which the synchronous Write can throw on a closed port) and map it
  to the existing invalid-request error path.

Because ReadLoop only acquires ioLock when data is present, the lock
hold time is dominated by the immediate-return Read; keep-alive
writes at 33ms see sub-millisecond contention.
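
The read/write arrangement described above can be sketched in Python as follows. `FakePort` is a hypothetical in-memory stand-in for a serial port, `threading.Event.wait` stands in for waiting on the cancellation token's WaitHandle, and `ValueError` stands in for the invalid-request error path; none of these names come from the PR's C# code.

```python
import threading


class FakePort:
    """Hypothetical in-memory port used only to make the sketch runnable."""

    def __init__(self, pending=b""):
        self.rx = bytearray(pending)
        self.tx = []
        self.open = True

    def bytes_to_read(self):
        return len(self.rx)

    def read(self, count):
        data, self.rx = bytes(self.rx[:count]), self.rx[count:]
        return data

    def write(self, payload):
        self.tx.append(payload)


def read_loop(port, io_lock, stop_event, on_data, idle_wait_s=0.01):
    """Poll for available bytes; take the lock only when a read returns at once."""
    while not stop_event.is_set():
        if port.bytes_to_read() > 0:       # cheap status query, no lock held
            with io_lock:                  # never overlaps do_write
                data = port.read(port.bytes_to_read())
            on_data(data)
        else:
            stop_event.wait(idle_wait_s)   # idle: short wait instead of a blocking read


def do_write(port, io_lock, payload):
    """Synchronous write under the same lock, re-checking the port state inside it."""
    with io_lock:
        if not port.open:
            raise ValueError("port is closed")
        port.write(payload)
```

Because the loop holds the lock only for a read that already has data waiting, a writer arriving every 33 ms would rarely find it taken for long, which is the contention property the commit message describes.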

Verified in production with both wired and wireless (USB dongle) DFU:
the TimeoutException burst is gone, bootloader entry succeeds, and
post-firmware reinit reconnects cleanly.
Remove the upstream auto-generated changelog file and add a Claude
behavior-rules document (CLAUDE.md) tailored to this project.
Personal debug captures (VS DebugView output, browser console logs,
etc.) live under ref/ during diagnosis but should never be committed.
Excluding the whole directory so accidental 'git add -A' won't pull
them in.
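Clean up the comments in the Serial-related code added or modified in
this session so that they follow the CLAUDE.md §4 rules.

- Remove mentions of specific protocols/devices from shared code
  (scratch-link-common): drop device-specific examples such as
  "codetinker" from the SerialOpenParams.PeripheralType and
  StartKeepAlive docs and replace them with generalized wording.
- Shorten regular comments (//) to one line: move multi-line
  explanations of design intent to the commit log / docs and keep only
  a one-line WHY inline (writeSemaphore/stateLock/wireTrace field
  comments, the HandleSetKeepAlive idempotent comment, the
  StopKeepAlive block comment, the Dispose comment, the OnKeepAliveTick
  idle-only/re-check/escape comments, the WinSerialSession
  ioLock/DoWrite/ReadLoop polling comments, etc.).
- Simplify XML docs: remove self-evident WHAT descriptions and caller
  references ("Called from X"), and keep only non-obvious behavior,
  constraints, and side effects in a single <summary>. Keep
  <param>/<returns> short to satisfy StyleCop SA1611/SA1615.

No behavior or API changes. The build is clean (no new warnings related to our code).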
@github-actions
Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by posting a pull request comment in the format below.


I have read the CLA Document and I hereby sign the CLA


You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

@hakueon hakueon closed this May 25, 2026
@github-actions github-actions Bot locked and limited conversation to collaborators May 25, 2026