Background
internal/session/ 模块包含 8 个源文件(~3,952 行),负责 5 状态机会话生命周期、SQLite 持久化、每用户配额管理和确定性 session key 生成。模块已在 issue 243 中完成 testability/code-quality 分析。本次为 首次从 SOLID + DRY + coupling 角度的系统性分析,发现 6 个独立于已有 issue 的架构问题。
Scope: solid, dry, coupling — cycle 2 (模块分析通过 1)
Key files: manager.go, pool.go, store.go, key.go
Finding Summary
| Category |
Critical |
High |
Medium |
Low |
| SOLID/SRP |
0 |
1 |
1 |
0 |
| SOLID/DIP |
0 |
1 |
0 |
0 |
| DRY |
0 |
0 |
2 |
0 |
| Coupling |
0 |
0 |
1 |
0 |
| 合计 |
0 |
2 |
4 |
0 |
Findings
SOLID: SRP
manager-mixed-responsibilities
Severity: High | Confidence: High | ROI: High
Location: manager.go (1,025 lines), Manager struct (lines 36-52)
Problem: Manager 承担 5+ 职责:(1) session CRUD + 缓存, (2) 5 状态机转换, (3) worker 生命周期管理 (attach/detach/terminate), (4) 后台 GC (zombie/max_lifetime/idle), (5) metrics 发射。其中 GC 逻辑(runGC + gc + ResetGCInterval, lines 841-969)共 128 行,是独立的定时任务调度,完全可以从 Manager 中提取。
transitionState (73 行) 混合了 6 个关注点:内存状态更新、idle 过期设置、DB 持久化 + rollback、worker 终止、metrics 更新、通知回调。尤其是 worker 终止块(lines 269-294)在 session 层直接调用 worker.Terminate() + metrics.WorkerExecDuration,违反了层次边界。
Current Pattern:
// manager.go:244-316 — transitionState handles 6 concerns
func (m *Manager) transitionState(ms *managedSession, to events.SessionState, reason string) error {
// 1. Update in-memory state
from := ms.info.State
ms.info.State = to
// 2. Set idle expiry
if to == events.StateIdle { ms.info.IdleExpiresAt = ... }
// 3. DB persist + rollback on failure
rollback, err := m.updateSession(ms)
// 4. Worker termination — session layer calling into worker layer!
if to == events.StateTerminated || to == events.StateDeleted {
if ms.worker != nil {
go func() { ms.worker.Terminate(base.GracefulShutdownTimeout) }()
m.releaseWorkerQuota(ms) // pool interaction
metrics.WorkerExecDuration.Observe(...) // metrics
}
}
// 5. Metrics emission
metrics.SessionsActive.WithLabelValues(...).Dec()
// 6. Notification callback
if m.StateNotifier != nil { m.StateNotifier(...) }
}
Proposed Fix: 分阶段提取:
// Step 1: Extract GC into SessionGC
type SessionGC struct {
store Store
sessions func() map[string]*managedSession // read-only access
mu *sync.RWMutex
interval time.Duration
// ...
}
// Step 2: Extract worker lifecycle from transitionState
func (m *Manager) handleWorkerOnTransition(ms *managedSession, to events.SessionState) {
if ms.worker == nil { return }
if to == events.StateTerminated || to == events.StateDeleted {
go ms.worker.Terminate(base.GracefulShutdownTimeout)
m.releaseWorkerQuota(ms)
}
}
// Step 3: Extract metrics helper
func (m *Manager) recordTransition(from, to events.SessionState) {
metrics.SessionsActive.WithLabelValues(string(from)).Dec()
metrics.SessionsActive.WithLabelValues(string(to)).Inc()
}
Estimated Impact: manager.go 减少 ~150 行,GC 和 worker lifecycle 可独立测试
Acceptance Criteria:
SOLID: DIP
concrete-poolmanager-dependency
Severity: High | Confidence: High | ROI: Medium
Location: manager.go:44 (pool *PoolManager), pool.go (no interface)
Problem: Manager 依赖具体 *PoolManager 而非接口。这导致:
- 测试 Manager 必须构造真实 PoolManager(虽然简单,但限制了 mock 能力)
- 无法替换为 no-op pool 或分布式 quota 实现
- PoolManager 的 metrics 耦合传导到 Manager 测试中
Current Pattern:
// manager.go:44 — concrete dependency
type Manager struct {
pool *PoolManager // concrete type, no interface
// ...
}
// pool.go — no interface defined
type PoolManager struct {
mu sync.Mutex
totalCount int
userCount map[string]int
// ...
}
Proposed Fix: 定义 Pool 接口并在 Manager 中使用:
type Pool interface {
Acquire(userID string) error
Release(userID string)
AcquireMemory(userID string, mem int64) error
ReleaseMemory(userID string, mem int64)
Stats() PoolStats
UpdateLimits(maxSize int, maxPerUser int)
}
type PoolStats struct {
Total, MaxTotal int
PerUser map[string]int
}
Estimated Impact: Manager 可使用 mock Pool 测试,解锁替代 quota 实现
Acceptance Criteria:
DRY
metrics-emission-pattern-repeated
Severity: Medium | Confidence: High | ROI: High
Location: manager.go:transitionState (lines 289-294), AttachWorker (lines 447-449), detachWorkerUnchecked (lines 481-486), Delete (lines 556-558), gc (lines 935-937)
Problem: metrics.SessionsActive.WithLabelValues(string(state)).Dec() / .Inc() 模式在 5+ 个位置重复,每个位置还伴随 metrics.WorkersRunning 和 metrics.WorkerExecDuration 的更新。新增状态或修改 metrics 标签需要修改所有位置。
Current Pattern:
// Repeated in transitionState, AttachWorker, detachWorkerUnchecked, Delete, gc:
metrics.SessionsActive.WithLabelValues(string(from)).Dec()
metrics.SessionsActive.WithLabelValues(string(to)).Inc()
metrics.WorkersRunning.WithLabelValues(wt).Dec()
metrics.WorkerExecDuration.WithLabelValues(wt).Observe(elapsed.Seconds())
Proposed Fix:
func recordStateTransition(from, to events.SessionState) {
metrics.SessionsActive.WithLabelValues(string(from)).Dec()
metrics.SessionsActive.WithLabelValues(string(to)).Inc()
}
func recordWorkerDone(wt string, elapsed time.Duration) {
metrics.WorkersRunning.WithLabelValues(wt).Dec()
metrics.WorkerExecDuration.WithLabelValues(wt).Observe(elapsed.Seconds())
}
Estimated Impact: ~25 行减少,metrics 变更只改 helper
Acceptance Criteria:
lock-boilerplate-repeated-fifteen-times
Severity: Medium | Confidence: High | ROI: Medium
Location: manager.go — Get, Transition, TransitionWithInput, AttachWorker, GetWorker, detachWorkerUnchecked, Delete, DeletePhysical, ClearContext, UpdateWorkerSessionID, DebugSnapshot, ListActive, WorkerHealthStatuses, TerminateAllWorkers, Lock, Unlock
Problem: "Manager.mu.RLock → lookup sessions map → managedSession.mu.RLock/Lock → operate → unlock" 样板代码重复约 15 次。getManagedSession helper 已集中了 lookup + cache-miss 逻辑,但锁获取/释放的 ordering boilerplate 在每个方法中仍重复。
Current Pattern:
// Repeated ~15 times across Manager methods:
func (m *Manager) SomeMethod(ctx context.Context, id string, ...) error {
m.mu.RLock()
ms, ok := m.sessions[id]
m.mu.RUnlock()
if !ok {
ms = m.getManagedSession(ctx, id)
if ms == nil { return ErrSessionNotFound }
}
ms.mu.Lock()
defer ms.mu.Unlock()
// ... actual logic (often 5-10 lines)
}
Proposed Fix: 提供 withSession 辅助方法统一锁获取:
func (m *Manager) withSession(ctx context.Context, id string, write bool, fn func(*managedSession) error) error {
ms := m.getManagedSession(ctx, id)
if ms == nil { return ErrSessionNotFound }
if write {
ms.mu.Lock()
defer ms.mu.Unlock()
} else {
ms.mu.RLock()
defer ms.mu.RUnlock()
}
return fn(ms)
}
Estimated Impact: ~60 行锁样板减少,统一锁 ordering 保证
Acceptance Criteria:
Coupling
nil-guard-pattern-repeated-eight-times
Severity: Medium | Confidence: Medium | ROI: Medium
Location: manager.go:329, 354, 410, 460, 494, 600, 628, 652
Problem: 8 个公共方法以 if m == nil { return ErrSessionNotFound } 开头。这种模式表明 Manager 可能在初始化完成前被调用,或作为可选参数传递。nil-guard 本身是防御性编程,但重复 8 次意味着调用约定不够清晰。
Current Pattern:
// Lines 329, 354, 410, 460, 494, 600, 628, 652:
func (m *Manager) SomeMethod(...) error {
if m == nil {
return ErrSessionNotFound
}
// ...
}
Proposed Fix: 选项 A — 确保 Manager 在所有调用路径上都已初始化,移除 nil-guard。选项 B — 提供 SafeManager wrapper:
type SafeManager struct { *Manager }
func (m SafeManager) Transition(...) error {
if m.Manager == nil { return ErrSessionNotFound }
return m.Manager.Transition(...)
}
Estimated Impact: 消除 8 处重复防御性检查,明确调用约定
Acceptance Criteria:
Implementation Priority
| Finding |
Priority |
Effort |
Risk |
Impact |
| metrics-emission-pattern-repeated |
P1 |
Small |
Low |
~25 行减少,统一 metrics 调用 |
| manager-mixed-responsibilities |
P2 |
Large |
Medium |
~150 行减少,GC/worker 可独立测试 |
| lock-boilerplate-repeated |
P2 |
Medium |
Low |
~60 行减少,统一锁 ordering |
| concrete-poolmanager-dependency |
P3 |
Medium |
Low |
解锁 mock Pool 测试 |
| nil-guard-pattern-repeated |
P3 |
Small |
Low |
明确调用约定 |
| poolmanager-metrics-coupling |
P4 |
Small |
Low |
可选 metrics(Low ROI 已丢弃) |
Recommended starting point: metrics-emission-pattern-repeated — 最小改动,高回报,消除 5+ 处重复
Out of Scope
getManagedSession context.Background() — 已在 issue 243 跟踪
- Phantom mock DeleteExpiredEvents — 已在 issue 243 跟踪
- GC decomposition — 已在 issue 243 跟踪 (code-quality aspect)
DerivePlatformSessionKey switch on platform type — Low ROI, 小函数
- PoolManager metrics coupling — Low ROI(可从 concrete-poolmanager-dependency 顺便解决)
Verification
Background
internal/session/模块包含 8 个源文件(~3,952 行),负责 5 状态机会话生命周期、SQLite 持久化、每用户配额管理和确定性 session key 生成。模块已在 issue 243 中完成 testability/code-quality 分析。本次为 首次从 SOLID + DRY + coupling 角度的系统性分析,发现 6 个独立于已有 issue 的架构问题。Scope: solid, dry, coupling — cycle 2 (模块分析通过 1)
Key files:
manager.go,pool.go,store.go,key.goFinding Summary
Findings
SOLID: SRP
manager-mixed-responsibilities
Severity: High | Confidence: High | ROI: High
Location:
manager.go(1,025 lines),Managerstruct (lines 36-52)Problem:
Manager承担 5+ 职责:(1) session CRUD + 缓存, (2) 5 状态机转换, (3) worker 生命周期管理 (attach/detach/terminate), (4) 后台 GC (zombie/max_lifetime/idle), (5) metrics 发射。其中 GC 逻辑(runGC+gc+ResetGCInterval, lines 841-969)共 128 行,是独立的定时任务调度,完全可以从 Manager 中提取。transitionState(73 行) 混合了 6 个关注点:内存状态更新、idle 过期设置、DB 持久化 + rollback、worker 终止、metrics 更新、通知回调。尤其是 worker 终止块(lines 269-294)在 session 层直接调用worker.Terminate()+metrics.WorkerExecDuration,违反了层次边界。Current Pattern:
Proposed Fix: 分阶段提取:
Estimated Impact: manager.go 减少 ~150 行,GC 和 worker lifecycle 可独立测试
Acceptance Criteria:
SessionGC独立 struct,Manager 持有*SessionGC引用transitionState≤ 40 行,只处理状态 + 持久化recordTransitionhelper 统一make test零回归SOLID: DIP
concrete-poolmanager-dependency
Severity: High | Confidence: High | ROI: Medium
Location:
manager.go:44(pool *PoolManager),pool.go(no interface)Problem:
Manager依赖具体*PoolManager而非接口。这导致:Current Pattern:
Proposed Fix: 定义
Pool接口并在 Manager 中使用:Estimated Impact: Manager 可使用 mock Pool 测试,解锁替代 quota 实现
Acceptance Criteria:
Pool接口定义在pool.go或interfaces.goManager.pool字段类型改为Pool接口PoolManager实现Pool接口DRY
metrics-emission-pattern-repeated
Severity: Medium | Confidence: High | ROI: High
Location:
manager.go:transitionState(lines 289-294),AttachWorker(lines 447-449),detachWorkerUnchecked(lines 481-486),Delete(lines 556-558),gc(lines 935-937)Problem:
metrics.SessionsActive.WithLabelValues(string(state)).Dec()/.Inc()模式在 5+ 个位置重复,每个位置还伴随metrics.WorkersRunning和metrics.WorkerExecDuration的更新。新增状态或修改 metrics 标签需要修改所有位置。Current Pattern:
Proposed Fix:
Estimated Impact: ~25 行减少,metrics 变更只改 helper
Acceptance Criteria:
metrics.SessionsActive调用集中到 helper 函数metrics.WorkersRunning+WorkerExecDuration调用集中到 helperlock-boilerplate-repeated-fifteen-times
Severity: Medium | Confidence: High | ROI: Medium
Location:
manager.go—Get,Transition,TransitionWithInput,AttachWorker,GetWorker,detachWorkerUnchecked,Delete,DeletePhysical,ClearContext,UpdateWorkerSessionID,DebugSnapshot,ListActive,WorkerHealthStatuses,TerminateAllWorkers,Lock,UnlockProblem: "Manager.mu.RLock → lookup sessions map → managedSession.mu.RLock/Lock → operate → unlock" 样板代码重复约 15 次。
getManagedSessionhelper 已集中了 lookup + cache-miss 逻辑,但锁获取/释放的 ordering boilerplate 在每个方法中仍重复。Current Pattern:
Proposed Fix: 提供
withSession辅助方法统一锁获取:Estimated Impact: ~60 行锁样板减少,统一锁 ordering 保证
Acceptance Criteria:
withSessionhelper 方法覆盖 read/write 锁获取withSession替代手动锁go test -race ./internal/session/...通过Coupling
nil-guard-pattern-repeated-eight-times
Severity: Medium | Confidence: Medium | ROI: Medium
Location:
manager.go:329, 354, 410, 460, 494, 600, 628, 652Problem: 8 个公共方法以
if m == nil { return ErrSessionNotFound }开头。这种模式表明 Manager 可能在初始化完成前被调用,或作为可选参数传递。nil-guard 本身是防御性编程,但重复 8 次意味着调用约定不够清晰。Current Pattern:
Proposed Fix: 选项 A — 确保 Manager 在所有调用路径上都已初始化,移除 nil-guard。选项 B — 提供
SafeManagerwrapper:Estimated Impact: 消除 8 处重复防御性检查,明确调用约定
Acceptance Criteria:
Implementation Priority
Recommended starting point: metrics-emission-pattern-repeated — 最小改动,高回报,消除 5+ 处重复
Out of Scope
getManagedSessioncontext.Background() — 已在 issue 243 跟踪DerivePlatformSessionKeyswitch on platform type — Low ROI, 小函数Verification
make test通过,无回归make lint不产生新警告go test -race ./internal/session/...通过