# Unworked Review Issues

**Run:** 2026-05-10 22:45:00
**Task:** TASK-027
**Total:** 10 (0 critical, 6 major, 4 minor)

## Major

1. [ ] **security-reviewer** | `src/webserver.cpp:1541` | broken-access-control
   `lookup_v2()` is defined but never called from `finalize_answer()` or any other dispatch site. All live request dispatch still uses the v1 `registered_resources_str` / `registered_resources_regex` path exclusively. Resources registered through the v2 table have no effect on actual request routing until the v1 cutover happens. TASK-027 explicitly defers the dispatch cutover to a follow-up task (Cycle K) to keep this diff reviewable; the v2 table is a shadow/test table in this PR. Any access-control logic placed only in the v2 table (e.g. a catch-all deny prefix route) would silently have no effect on live traffic in the current state. CWE-284.
   *Recommendation:* Wire `lookup_v2()` into `finalize_answer()` as the primary dispatch mechanism in the follow-up cutover task (TASK-036 or a dedicated TASK-028 gate). Until then, document clearly in API-facing comments that routes registered via `register_path` / `on_*` / `route()` are stored in both v1 and v2 tables, but only the v1 table drives actual dispatch.
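
   The hazard can be shown with a minimal self-contained sketch (hypothetical names throughout; the real tables are the v1 `registered_resources_str` map and the v2 radix/cache pipeline). A rule placed only in the shadow table, such as a catch-all deny, never influences dispatch while dispatch reads v1 alone:

   ```cpp
   #include <map>
   #include <string>

   // Hypothetical stand-ins for the v1 and v2 route tables described above.
   // Dispatch consults only the v1 table, mirroring the current state of the PR.
   struct toy_server {
       std::map<std::string, std::string> v1_routes;  // drives dispatch
       std::map<std::string, std::string> v2_routes;  // shadow table, unused by dispatch

       void register_path(const std::string& path, const std::string& handler) {
           v1_routes[path] = handler;  // public API writes both tables
           v2_routes[path] = handler;
       }

       std::string dispatch(const std::string& path) const {
           auto it = v1_routes.find(path);  // v2_routes is never consulted here
           return it != v1_routes.end() ? it->second : "404";
       }
   };

   int main() {
       toy_server s;
       s.register_path("/public", "ok_handler");
       // An access-control rule placed only in the v2 table has no effect:
       // dispatch still answers 404 for /admin instead of the v2 deny handler.
       s.v2_routes["/admin"] = "deny_all";
       return s.dispatch("/admin") == "404" ? 0 : 1;
   }
   ```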

2. [ ] **test-quality-reviewer** | `test/unit/lookup_pipeline_test.cpp:1` | missing-test
   The acceptance criterion "path-piece extraction populates `http_request`" has no end-to-end test. `lookup_v2()` returns `captured_params` in the `lookup_result`, but no test verifies that those captures are subsequently written into the `http_request` as args (equivalent to the v1 `mr->dhr->set_arg` calls at webserver.cpp:1907). Because the live dispatch still uses v1, this gap cannot be filled until `lookup_v2` is wired into the dispatch path. Once Cycle K (dispatch cutover) lands, a live-dispatch integration test must verify that `GET /users/42/posts` results in `http_request::get_arg("id") == "42"`.
   *Recommendation:* After `lookup_v2` is wired into dispatch, add an integration test that starts the webserver, issues a real request to `/users/42/posts`, and asserts `http_request::get_arg("id") == "42"`. Pair this with the TASK-028 routing-semantics regression gate.
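
   As a unit-level stopgap until the live-dispatch test is possible, the propagation step itself can be pinned down in isolation. The shapes below are hypothetical approximations of `lookup_result::captured_params` and the `http_request` arg table, not the real types:

   ```cpp
   #include <cassert>
   #include <map>
   #include <string>
   #include <utility>
   #include <vector>

   // Hypothetical shapes approximating lookup_result::captured_params and the
   // http_request arg table; the real test must eventually go through a live
   // server once lookup_v2() drives dispatch.
   using captures = std::vector<std::pair<std::string, std::string>>;

   struct fake_request {
       std::map<std::string, std::string> args;
       void set_arg(const std::string& k, const std::string& v) { args[k] = v; }
       std::string get_arg(const std::string& k) const {
           auto it = args.find(k);
           return it != args.end() ? it->second : "";
       }
   };

   // The step the missing test must cover: every captured parameter from the
   // radix match is written into the request, mirroring the v1 set_arg loop.
   void apply_captures(fake_request& req, const captures& caps) {
       for (const auto& [name, value] : caps) req.set_arg(name, value);
   }

   int main() {
       fake_request req;
       apply_captures(req, {{"id", "42"}});  // as if matching GET /users/42/posts
       assert(req.get_arg("id") == "42");
       return 0;
   }
   ```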

3. [ ] **security-reviewer** | `src/webserver.cpp:1890` | denial-of-service
   The v1 route cache (`route_cache_list` / `route_cache_map`) and the v2 `route_cache_v2` use different locking patterns. In `unregister_impl_()` the v1 cache is cleared inline (direct lock + list.clear() + map.clear()) while `registered_resources_mutex` (unique_lock) is already held, whereas in `register_impl_()` and `on_methods_()` the invalidation is delegated to `invalidate_route_cache()` after releasing the table lock. If a v1-path thread holds `route_cache_mutex` and tries to acquire `registered_resources_mutex` while another thread holds `registered_resources_mutex` and waits on `route_cache_mutex`, a deadlock can occur. CWE-833. This will be resolved when the v1 cache is removed in the dispatch-cutover follow-up (TASK-036), and the lock-order discipline should be audited at that point. In the interim, consolidating the v1 cache clearing inside `invalidate_route_cache()` would remove the asymmetry and reduce audit surface.
   *Recommendation:* Track as a known lock-order risk during the v1/v2 coexistence period. Resolve by removing the v1 cache entirely when the dispatch cutover (Cycle K) ships.
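
   One interim shape for the consolidation, sketched with stand-in mutexes (the real ones are `registered_resources_mutex` and `route_cache_mutex`): every writer goes through a single helper that takes both locks via `std::scoped_lock`, whose deadlock-avoidance acquisition makes the inline-clear vs delegated-clear asymmetry moot:

   ```cpp
   #include <mutex>
   #include <thread>
   #include <vector>

   // Hypothetical stand-ins for registered_resources_mutex / route_cache_mutex.
   std::mutex registered_resources_mutex;
   std::mutex route_cache_mutex;
   int cache_generation = 0;

   // Consolidated invalidation: std::scoped_lock acquires both mutexes with a
   // deadlock-avoidance algorithm, so no caller can establish a conflicting
   // lock order by clearing the cache inline.
   void invalidate_route_cache() {
       std::scoped_lock lk(registered_resources_mutex, route_cache_mutex);
       ++cache_generation;  // stand-in for list.clear() + map.clear()
   }

   int main() {
       std::vector<std::thread> ts;
       for (int i = 0; i < 8; ++i)
           ts.emplace_back([] { for (int j = 0; j < 1000; ++j) invalidate_route_cache(); });
       for (auto& t : ts) t.join();
       return cache_generation == 8000 ? 0 : 1;
   }
   ```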

4. [ ] **security-reviewer** | `src/httpserver/detail/radix_tree.hpp:79` | denial-of-service
   `radix_node<T>` uses `std::unordered_map<std::string, ...>` for `children_`. On most libc++ / libstdc++ implementations `std::hash<std::string>` is not hash-randomized by default. An attacker who can register many route segments that hash-collide can degrade lookup from O(1) to O(n) per hash probe. Under the shared lock on `route_table_mutex_` this enables per-request CPU spikes proportional to the collision depth. CWE-400. 
   *Recommendation:* Use `std::map` (ordered, collision-free) for `children_`, or supply a per-process randomized hash seeded via `std::random_device` at startup. Alternatively cap the number of children per node or the radix tree depth.
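
   A sketch of the randomized-hash option, assuming nothing about the real tree beyond its `unordered_map` keys (the mixing step is a boost-hash_combine-style finalizer, chosen here for illustration):

   ```cpp
   #include <cstddef>
   #include <functional>
   #include <random>
   #include <string>
   #include <unordered_map>

   // Sketch of a per-process randomized hasher for radix_node children_.
   struct seeded_hash {
       // One random seed per process, drawn lazily at first use; an attacker
       // can no longer precompute colliding route segments offline.
       static std::size_t seed() {
           static const std::size_t s = std::random_device{}();
           return s;
       }
       std::size_t operator()(const std::string& key) const {
           std::size_t h = std::hash<std::string>{}(key);
           // boost::hash_combine-style mixing of the seed into the hash.
           h ^= seed() + 0x9e3779b97f4a7c15ULL + (h << 6) + (h >> 2);
           return h;
       }
   };

   int main() {
       std::unordered_map<std::string, int, seeded_hash> children;
       children["users"] = 1;
       children["posts"] = 2;
       return children.at("users") == 1 && children.at("posts") == 2 ? 0 : 1;
   }
   ```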

5. [ ] **performance-reviewer** | `src/httpserver/detail/radix_tree.hpp:232` | memory-allocation
   `tokenize()` is called on every `find()` invocation and internally calls `http_utils::tokenize_url(std::string{path})`, which (a) copies the `string_view` into a `std::string`, (b) allocates a `std::vector<std::string>` for the segments, and (c) allocates each segment as an individual `std::string`. These heap allocations sit on the per-request critical path for every parameterised-route lookup and every cold cache miss.
   *Recommendation:* Tokenize inline within `find()` using `std::string_view` iteration over the path. Use transparent lookup (heterogeneous hashing, C++20) on `children_` to avoid segment string copies during descent. This eliminates the vector allocation and the per-segment heap copies.
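
   A minimal sketch of the `string_view` tokenization (the function name is illustrative, not the real API). Segments are views into the caller's path, so there is no temporary `std::string` for the path and no per-segment copies; the vector itself remains, and a callback-based visitor would remove even that:

   ```cpp
   #include <cstddef>
   #include <string_view>
   #include <vector>

   // Allocation-light tokenization: each segment is a view into `path`.
   std::vector<std::string_view> tokenize_view(std::string_view path) {
       std::vector<std::string_view> out;
       while (!path.empty()) {
           if (path.front() == '/') { path.remove_prefix(1); continue; }
           std::size_t cut = path.find('/');
           out.push_back(path.substr(0, cut));
           path.remove_prefix(cut == std::string_view::npos ? path.size() : cut);
       }
       return out;
   }

   int main() {
       auto segs = tokenize_view("/users/42/posts");
       return segs.size() == 3 && segs[1] == "42" ? 0 : 1;
   }
   ```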

## Minor

6. [ ] **performance-reviewer** | `src/webserver.cpp:1545` | memory-allocation
   `cache_key key{method, path}` at line 1545 copies `path` (a `const std::string&`) into a new `std::string` inside `cache_key::path` on every call to `lookup_v2`, including every warm-cache hit. For paths longer than SSO (~15 bytes) this is a heap allocation on every request.
   *Recommendation:* Add a `find_by_view()` overload to `route_cache` that takes `(http_method, std::string_view)` and performs the map lookup without constructing a `cache_key` by value. The key copy is only required on cache insert.
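
   The `find_by_view()` idea can be sketched with C++20 transparent hashing on a toy cache (names are illustrative, and the method component of the real `cache_key` is omitted for brevity): the map is probed with a `string_view` and the key is only materialised as a `std::string` on insert.

   ```cpp
   #include <cstddef>
   #include <functional>
   #include <string>
   #include <string_view>
   #include <unordered_map>

   // is_transparent on both the hasher and the equality comparator enables
   // heterogeneous find() (C++20): no std::string temporary on the warm path.
   struct sv_hash {
       using is_transparent = void;
       std::size_t operator()(std::string_view s) const {
           return std::hash<std::string_view>{}(s);
       }
   };
   struct sv_equal {
       using is_transparent = void;
       bool operator()(std::string_view a, std::string_view b) const { return a == b; }
   };

   using toy_cache = std::unordered_map<std::string, int, sv_hash, sv_equal>;

   int find_by_view(const toy_cache& c, std::string_view path) {
       auto it = c.find(path);  // heterogeneous lookup, no key construction
       return it != c.end() ? it->second : -1;
   }

   int main() {
       toy_cache c;
       c.emplace("/users/42/posts", 7);  // the copy happens once, at insert time
       return find_by_view(c, "/users/42/posts") == 7 ? 0 : 1;
   }
   ```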

7. [ ] **performance-reviewer** | `src/httpserver/detail/radix_tree.hpp:64` | memory-allocation
   `radix_match<T>::captures` is a `std::vector<std::pair<std::string,std::string>>`. In `find()`, `caps` is built incrementally then potentially assigned to `best_prefix_caps` (a copy at line 172) and again moved into `out.captures`. For routes with N wildcard segments this produces 2 copies of the captures vector before it lands in `lookup_result`. In `lookup_v2` the captures then flow into `cache_value::captured_params` (another copy at line 1601-1602), for 3 total copies between radix match and cache insert.
   *Recommendation:* Eliminate the intermediate `best_prefix_caps` copy by tracking a slice count. In `lookup_v2`, move `result.captured_params` directly into the `cache_value` struct instead of assigning by value.
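
   The copy-vs-move distinction at the cache-insert step, sketched with a hypothetical `cache_value` shape (the real one lives in webserver.cpp):

   ```cpp
   #include <cassert>
   #include <string>
   #include <utility>
   #include <vector>

   using captures = std::vector<std::pair<std::string, std::string>>;

   struct cache_value {
       captures captured_params;
   };

   cache_value make_entry_copying(const captures& caps) {
       cache_value v;
       v.captured_params = caps;  // deep-copies every string in the vector
       return v;
   }

   cache_value make_entry_moving(captures&& caps) {
       cache_value v;
       v.captured_params = std::move(caps);  // steals the buffer, no copies
       return v;
   }

   int main() {
       captures caps{{"id", "42"}};
       cache_value moved = make_entry_moving(std::move(caps));
       assert(moved.captured_params.size() == 1);
       assert(moved.captured_params[0].second == "42");
       return 0;
   }
   ```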

8. [ ] **security-reviewer** | `src/webserver.cpp:476` | insecure-design
   In `on_methods_()`, the `param_and_prefix_routes_.find(key, existing)` call returns a `const T* entry` (non-owning pointer into the tree). Between find and insert the caller holds `route_table_mutex_` (unique_lock, so no concurrent mutation), but `insert()` unconditionally replaces the terminus. If a prefix route and a parameterised exact route share the same path string, the merge logic may pick the wrong terminus and silently drop a registered handler. CWE-362 (semantic, not a true data race).
   *Recommendation:* Separate the find/insert paths by also checking `existing.entry` for the prefix_terminus case. Or expose a dedicated `update-terminus` method on `radix_tree<T>` that performs an atomic read-modify-write on the correct terminus field.
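
   The "update-terminus" option can be sketched as an upsert that hands the caller a mutable reference to the (possibly new) entry instead of replacing it wholesale, so two terminus kinds on the same path coexist. All names here are illustrative, not the real `radix_tree<T>` API:

   ```cpp
   #include <functional>
   #include <map>
   #include <string>

   // Illustrative terminus with one slot per registration kind.
   struct terminus {
       std::string exact_handler;   // set by exact / parameterised registration
       std::string prefix_handler;  // set by prefix-route registration
   };

   struct toy_table {
       std::map<std::string, terminus> entries;

       // Read-modify-write under the caller's unique_lock: merge into the
       // existing terminus rather than overwriting it.
       void upsert(const std::string& path, const std::function<void(terminus&)>& merge) {
           merge(entries[path]);  // creates the entry if absent, mutates in place
       }
   };

   int main() {
       toy_table t;
       t.upsert("/files", [](terminus& e) { e.prefix_handler = "serve_dir"; });
       t.upsert("/files", [](terminus& e) { e.exact_handler = "list_dir"; });
       // Both registrations survive; a replace-style insert would drop one.
       return t.entries["/files"].prefix_handler == "serve_dir" &&
              t.entries["/files"].exact_handler == "list_dir" ? 0 : 1;
   }
   ```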

9. [ ] **architecture-alignment-checker** | `src/httpserver/detail/webserver_impl.hpp:207` | pattern-violation
   The v2 LRU cache mutex is named `route_cache_mutex` (no trailing underscore) whereas the project's member-naming convention uses a trailing underscore for private member data (e.g. `route_table_mutex_`; the pre-existing `registered_resources_mutex` is another outlier that lacks one). The architecture spec names it `route_cache_mutex_` (with underscore).
   *Recommendation:* Rename `route_cache_mutex` to `route_cache_mutex_` throughout `webserver_impl.hpp` and `webserver.cpp` to align with the architectural spec's documented identifier and the broader member-naming convention.

10. [ ] **security-reviewer** (iter2) | `src/webserver.cpp:1688` | broken-access-control (CWE-22, CWE-284)
    **OUT-OF-SCOPE for TASK-027.** `normalize_path()` and `should_skip_auth()` were introduced before TASK-027 (commits 86d4631 / d8b055e); TASK-027 did not modify either function. The finding is valid but predates this task.
    Two sub-issues are flagged:
    (a) `normalize_path` does not collapse consecutive slashes (`//`), so a wildcard skip_path like `/public/*` fails to match `/public//secret` after normalization, which could admit a path that should be auth-protected or block one that should be skipped.
    (b) The contract that the `path` argument to `should_skip_auth` must already be MHD-unescaped (by `unescaper_func`, registered via `MHD_OPTION_UNESCAPE_CALLBACK`) is not documented at the call site. If a future caller passes a raw percent-encoded path, percent-encoded dot sequences (`%2e`, `%2f`) survive `normalize_path` and the auth-skip decision may be wrong.
    *Recommendation:* In a follow-up task targeting the auth/skip-auth subsystem: (1) collapse consecutive `/` separators in `normalize_path` before the segment-split loop; (2) add a comment at the `should_skip_auth` call site (webserver.cpp ~line 2002) and at the function declaration documenting the MHD-unescape precondition.
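
    A sketch of the recommended pre-pass for sub-issue (a), as a standalone helper (the name is illustrative; in the real fix this logic belongs inside `normalize_path` before the segment-split loop, and it assumes the input is already MHD-unescaped per sub-issue (b)):

    ```cpp
    #include <string>
    #include <string_view>

    // Collapse runs of '/' so that "/public//secret" and "/public/secret"
    // normalize identically before any skip_path wildcard matching.
    std::string collapse_slashes(std::string_view path) {
        std::string out;
        out.reserve(path.size());
        for (char c : path) {
            if (c == '/' && !out.empty() && out.back() == '/') continue;
            out.push_back(c);
        }
        return out;
    }

    int main() {
        // "/public//secret" now matches what the "/public/*" skip_path expects.
        return collapse_slashes("/public//secret") == "/public/secret" ? 0 : 1;
    }
    ```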