Skip to content

ospfd: Implement rfc4222 Recommendation 1 and 2#20936

Open
dfedyk wants to merge 6 commits intoFRRouting:masterfrom
LabNConsulting:working/df/RFC4222-rec2-rec1
Open

ospfd: Implement rfc4222 Recommendation 1 and 2#20936
dfedyk wants to merge 6 commits intoFRRouting:masterfrom
LabNConsulting:working/df/RFC4222-rec2-rec1

Conversation

@dfedyk
Copy link
Copy Markdown

@dfedyk dfedyk commented Feb 25, 2026

Implement RFC 4222 Recommendation 1 and Recommendation 2 includes documentation and topotests for each. See documentation for more details.

@frrbot frrbot Bot added documentation ospf tests Topotests, make check, etc labels Feb 25, 2026
@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented Feb 25, 2026

Greptile Summary

This PR implements RFC 4222 Recommendation 1 (per-packet-type DSCP marking via ip ospf dscp all|low-control) and Recommendation 2 (dead-timer reset on any control packet via ip ospf dead-timer-reset any-control), including documentation, VTY commands, operational display, and topotests for both features.

  • P1 – flaky test: test_ospf_dead_timer queries R2's reset counter immediately after flooding 49 loopbacks on R1, with no sleep in between; the LSUs have not been processed yet and resets will be 0, causing a spurious failure.
  • P2 – default branch inconsistency: in ospf_tos_for_type, the default case reads dscp_low_control directly without the configured-check / dscp_ospf_all fallback that the DB-Desc/LS-Req/LS-Upd case applies.

Confidence Score: 4/5

Safe to merge after fixing the missing sleep in test_ospf_dead_timer; core C implementation is correct.

One P1 finding remains: the inactive-timer topotest will be flaky in CI because it checks the reset counter without sleeping after flooding routes. The C implementation itself (DSCP marking, inactivity timer restart, byte-order handling) is correct. The P2 default-branch inconsistency in ospf_tos_for_type is a minor architectural nit that will not affect currently deployed packet types.

tests/topotests/ospf_rfc4222_inactive_tmr/test_ospf_rfc4222_inactive_tmr.py — missing sleep before counter assertion

Important Files Changed

Filename Overview
ospfd/ospf_packet.c Adds ospf_maybe_restart_inactivity() (RFC4222 Rec2) and ospf_tos_for_type() (RFC4222 Rec1); default branch in ospf_tos_for_type skips the low-control configured-check fallback
ospfd/ospf_vty.c Adds DEFPY commands for DSCP and dead-timer-reset; config-write guards are now independent (idempotency fix applied); DSCP operational display and neighbor reset counter added
ospfd/ospf_interface.c Initialises new dead_timer_any, dscp_ospf_all, dscp_low_control params (UNSET + default value) and adds them to ospf_free_if_params guard
ospfd/ospf_nsm.c Adds ospf_nsm_restart_inactivity_timer() helper that cancels and re-arms the inactivity timer; straightforward
ospfd/ospf_neighbor.h Adds dead_timer_resets uint64_t counter to ospf_neighbor struct
tests/topotests/ospf_rfc4222_inactive_tmr/test_ospf_rfc4222_inactive_tmr.py Tests dead-timer reset counter; missing sleep between LSU flood and counter check makes test_ospf_dead_timer flaky
tests/topotests/ospf_rfc4222_dscp/test_ospf_rfc4222_dscp.py Tests on-wire DSCP markings and operational display; ack_ok/hello_ok/low_ctrl_ok initialised correctly; duplicate pytest import is harmless

Comments Outside Diff (1)

  1. tests/topotests/ospf_rfc4222_inactive_tmr/test_ospf_rfc4222_inactive_tmr.py, line 1354-1367 (link)

    P1 No sleep before querying the reset counter — test will be flaky

    Routes are added to R1 and then R2's neighbor-detail output is captured immediately. OSPF must still detect the new prefixes, generate LSAs, transmit LSUs over the link, and have R2 process them before dead_timer_resets > 0 can be true. Without any sleep the assertion will race and the test will sporadically fail in CI.

    Add a sleep after the loopback additions, similar to how the DSCP test uses topotest.sleep(3, "Gathering Packets") before reading results:

    topotest.sleep(3, "Wait for LSU flood")
    out = r2.vtysh_cmd("show ip ospf neighbor detail")
    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: tests/topotests/ospf_rfc4222_inactive_tmr/test_ospf_rfc4222_inactive_tmr.py
    Line: 1354-1367
    
    Comment:
    **No sleep before querying the reset counter — test will be flaky**
    
    Routes are added to R1 and then R2's neighbor-detail output is captured immediately. OSPF must still detect the new prefixes, generate LSAs, transmit LSUs over the link, and have R2 process them before `dead_timer_resets > 0` can be true. Without any sleep the assertion will race and the test will sporadically fail in CI.
    
    Add a sleep after the loopback additions, similar to how the DSCP test uses `topotest.sleep(3, "Gathering Packets")` before reading results:
    
    ```python
    topotest.sleep(3, "Wait for LSU flood")
    out = r2.vtysh_cmd("show ip ospf neighbor detail")
    ```
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: tests/topotests/ospf_rfc4222_inactive_tmr/test_ospf_rfc4222_inactive_tmr.py
Line: 1354-1367

Comment:
**No sleep before querying the reset counter — test will be flaky**

Routes are added to R1 and then R2's neighbor-detail output is captured immediately. OSPF must still detect the new prefixes, generate LSAs, transmit LSUs over the link, and have R2 process them before `dead_timer_resets > 0` can be true. Without any sleep the assertion will race and the test will sporadically fail in CI.

Add a sleep after the loopback additions, similar to how the DSCP test uses `topotest.sleep(3, "Gathering Packets")` before reading results:

```python
topotest.sleep(3, "Wait for LSU flood")
out = r2.vtysh_cmd("show ip ospf neighbor detail")
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: ospfd/ospf_packet.c
Line: 441-443

Comment:
**`default` branch skips the configured-check fallback**

The `OSPF_MSG_DB_DESC`/`OSPF_MSG_LS_REQ`/`OSPF_MSG_LS_UPD` case correctly falls back to `dscp_ospf_all` when `dscp_low_control` is not explicitly configured. The `default` branch reads `dscp_low_control` unconditionally, so any unknown future packet type gets the raw (possibly un-configured) low-control value instead of inheriting `all`. The same fallback guard used above should be applied here for consistency.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (5): Last reviewed commit: "doc: add OSPF DSCP control packet markin..." | Re-trigger Greptile

Copy link
Copy Markdown

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

20 files reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

Comment thread ospfd/ospf_vty.c Outdated
Comment thread ospfd/ospf_packet.c Outdated
@dfedyk dfedyk force-pushed the working/df/RFC4222-rec2-rec1 branch 4 times, most recently from 85dbd8b to 9130018 Compare February 27, 2026 19:18
@riw777 riw777 self-requested a review March 3, 2026 15:54
Copy link
Copy Markdown
Member

@riw777 riw777 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good ... it doesn't look like the lint errors are in the changed code (?)

/* 
<stdin>:359: trailing whitespace.
/* 
<stdin>:371: trailing whitespace.
/* 
<stdin>:383: trailing whitespace.
/* 
warning: 4 lines add whitespace errors.
Report for ospf_packet.c | 10 issues
===============================================
< WARNING: Block comments use * on subsequent lines
< #422: FILE: /tmp/f1-1891220/ospf_packet.c:422:
< ERROR: trailing whitespace
< #1301: FILE: /tmp/f1-1891220/ospf_packet.c:1301:
< ERROR: trailing whitespace
< #1552: FILE: /tmp/f1-1891220/ospf_packet.c:1552:
< ERROR: trailing whitespace
< #1800: FILE: /tmp/f1-1891220/ospf_packet.c:1800:
< ERROR: trailing whitespace
< #2194: FILE: /tmp/f1-1891220/ospf_packet.c:2194:\nDetails at [https://ci1.netdef.org/browse/FRR-PULLREQ3-CHECKOUT-14881](https://ci1.netdef.org/browse/FRR-PULLREQ3-CHECKOUT-14881)```

@riw777
Copy link
Copy Markdown
Member

riw777 commented Mar 17, 2026

@dfedyk can you fix the lint stuff here?

@github-actions github-actions Bot added the rebase PR needs rebase label Mar 26, 2026
@dfedyk dfedyk force-pushed the working/df/RFC4222-rec2-rec1 branch from 9429ebf to 9fc5690 Compare March 27, 2026 21:13
@dfedyk
Copy link
Copy Markdown
Author

dfedyk commented Mar 27, 2026

Sorry for delay I had some vacation and STDs meetings back now.

@dfedyk dfedyk force-pushed the working/df/RFC4222-rec2-rec1 branch from 9fc5690 to fab2a7b Compare March 30, 2026 13:59
@riw777
Copy link
Copy Markdown
Member

riw777 commented Mar 31, 2026

FYI, the linter looks wrong on the one remaining suggested change ...

@dfedyk dfedyk force-pushed the working/df/RFC4222-rec2-rec1 branch from fab2a7b to a2afe17 Compare March 31, 2026 19:44
@Jafaral
Copy link
Copy Markdown
Member

Jafaral commented Apr 6, 2026

@greptile review

@Jafaral Jafaral self-requested a review April 6, 2026 18:24
Comment thread tests/topotests/ospf_rfc4222_dscp/test_ospf_rfc4222_dscp.py
Comment thread ospfd/ospf_vty.c Outdated
@dfedyk dfedyk force-pushed the working/df/RFC4222-rec2-rec1 branch 2 times, most recently from ac108f2 to 3d4b8e9 Compare April 8, 2026 15:12
@dfedyk dfedyk force-pushed the working/df/RFC4222-rec2-rec1 branch 2 times, most recently from 1ad96ce to 6cfe883 Compare April 15, 2026 21:31
@github-actions github-actions Bot added size/XXL and removed size/XL labels Apr 15, 2026
@frrbot frrbot Bot added the babel label Apr 15, 2026
@github-actions
Copy link
Copy Markdown

This pull request has conflicts, please resolve those before we can evaluate the pull request.

RFC 4222 Recommendation 2: When packet prioritization
isn’t feasible, refresh the OSPF neighbor inactivity
timer upon receipt of any valid OSPF unicast packet, or
any OSPF packet sent to AllSPFRouters on a
point-to-point link (not just Hellos), to avoid
unnecessary adjacency drops under congestion. This
extends the adjacency on low-speed links where Hellos
might be delayed or blocked by other control traffic.

Signed-off-by: Don Fedyk <dfedyk@labn.net>
@dfedyk dfedyk force-pushed the working/df/RFC4222-rec2-rec1 branch from 6cfe883 to e3695c0 Compare April 15, 2026 22:04
@Jafaral
Copy link
Copy Markdown
Member

Jafaral commented Apr 20, 2026

@greptile review

@dfedyk dfedyk force-pushed the working/df/RFC4222-rec2-rec1 branch from e3695c0 to 8599a3e Compare April 20, 2026 21:35
Copy link
Copy Markdown
Member

@riw777 riw777 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, waiting on @Jafaral 's review

@Jafaral
Copy link
Copy Markdown
Member

Jafaral commented Apr 21, 2026

@greptile review

Comment thread ospfd/ospf_packet.c
Comment on lines +419 to +444
static inline uint8_t ospf_tos_for_type(struct ospf_interface *oi, uint8_t type)
{
uint8_t dscp;

switch (type) {
case OSPF_MSG_HELLO:
case OSPF_MSG_LS_ACK:
/* Always highest priority */
dscp = OSPF_IF_PARAM(oi, dscp_ospf_all);
break;

case OSPF_MSG_DB_DESC:
case OSPF_MSG_LS_REQ:
case OSPF_MSG_LS_UPD:
/* Bulk control traffic: low control */
dscp = OSPF_IF_PARAM(oi, dscp_low_control);
break;

default:
dscp = OSPF_IF_PARAM(oi, dscp_low_control);
break;
}

dscp &= 0x3f; /* clamp to 6 bits */
return (uint8_t)(dscp << 2); /* DSCP in bits 7..2, ECN = 0 */
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 dscp_ospf_all never applied to bulk traffic despite documentation claiming it does

The command help text for ip ospf dscp all (line 3399 in ospf_vty.c) explicitly states: "DSCP used for all OSPF control packets (Hellos, Acks, and others when low-control not set)". However, ospf_tos_for_type hardwires DB-Desc, LS-Req, and LS-Upd to always read dscp_low_control, regardless of whether dscp_low_control was explicitly configured.

With the defaults both at IPTOS_PREC_INTERNETCONTROL >> 2 (CS6 = 48), an operator who runs only ip ospf dscp all 46 will get Hello/Ack at DSCP 46 while bulk traffic continues to use DSCP 48 — a higher precedence class, which inverts the QoS priority the RFC is designed to achieve. The bulk-traffic path should fall back to dscp_ospf_all when dscp_low_control is not explicitly configured:

case OSPF_MSG_DB_DESC:
case OSPF_MSG_LS_REQ:
case OSPF_MSG_LS_UPD:
default:
    /* Fall back to dscp_ospf_all when low-control is not explicitly set */
    if (OSPF_IF_PARAM_CONFIGURED(oi->params, dscp_low_control) ||
        OSPF_IF_PARAM_CONFIGURED(IF_DEF_PARAMS(oi->ifp), dscp_low_control))
        dscp = OSPF_IF_PARAM(oi, dscp_low_control);
    else
        dscp = OSPF_IF_PARAM(oi, dscp_ospf_all);
    break;

The existing test_ospf_dscp_all_and_low_control test always configures both parameters together (all=46 and low-control=40), so it does not exercise or catch this gap.

Prompt To Fix With AI
This is a comment left during a code review.
Path: ospfd/ospf_packet.c
Line: 419-444

Comment:
**`dscp_ospf_all` never applied to bulk traffic despite documentation claiming it does**

The command help text for `ip ospf dscp all` (line 3399 in `ospf_vty.c`) explicitly states: *"DSCP used for all OSPF control packets (Hellos, Acks, **and others when low-control not set**)"*. However, `ospf_tos_for_type` hardwires DB-Desc, LS-Req, and LS-Upd to always read `dscp_low_control`, regardless of whether `dscp_low_control` was explicitly configured.

With the defaults both at `IPTOS_PREC_INTERNETCONTROL >> 2` (CS6 = 48), an operator who runs only `ip ospf dscp all 46` will get Hello/Ack at DSCP 46 while bulk traffic continues to use DSCP 48 — a **higher** precedence class, which inverts the QoS priority the RFC is designed to achieve. The bulk-traffic path should fall back to `dscp_ospf_all` when `dscp_low_control` is not explicitly configured:

```c
case OSPF_MSG_DB_DESC:
case OSPF_MSG_LS_REQ:
case OSPF_MSG_LS_UPD:
default:
    /* Fall back to dscp_ospf_all when low-control is not explicitly set */
    if (OSPF_IF_PARAM_CONFIGURED(oi->params, dscp_low_control) ||
        OSPF_IF_PARAM_CONFIGURED(IF_DEF_PARAMS(oi->ifp), dscp_low_control))
        dscp = OSPF_IF_PARAM(oi, dscp_low_control);
    else
        dscp = OSPF_IF_PARAM(oi, dscp_ospf_all);
    break;
```

The existing `test_ospf_dscp_all_and_low_control` test always configures **both** parameters together (all=46 *and* low-control=40), so it does not exercise or catch this gap.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is currently design is intent- do not couple value to precedence meaning since a higher DSCP value does not always mean a higher precedence. This is basically a user beware configuration and you should know what you are doing. If we think the documentation needs to be more explicit then I would rather update the documentation than make a change where the system forces a value which would not allow flexibility through all values. If we implement this it would prevent users setting to some DSCP values which could have local meaning. There are DSCP values are available for local/experimental use for example (e.g., 1–7, 9, 11, 13, 15, 17, .).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@greptile, look at the reply above

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point — you're right that coupling the fallback logic to a precedence assumption would be wrong. DSCP is just a 6-bit marking; its meaning is locally defined, and forcing dscp_ospf_all as a fallback would break operators using local/experimental codepoints for bulk traffic.

The documentation wording "and others when low-control not set" is the real issue here. If the help text were updated to say something like "DSCP used for Hello and LS-Ack packets" (without implying a fallback), the behavior and docs would be consistent and no code change is needed.

Tip: You can give me more context with [.greptile/rules](https://www.greptile.com/docs/code-review/custom-standards) so I get this right next time.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will review and update the documentation.

Copy link
Copy Markdown
Author

@dfedyk dfedyk Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have found a side effect and I am updating the operational display to clear up the confusion.

@dfedyk dfedyk force-pushed the working/df/RFC4222-rec2-rec1 branch from 8599a3e to 02fcc94 Compare April 22, 2026 03:43
@github-actions github-actions Bot added size/XXL and removed size/XL labels Apr 22, 2026
@Jafaral
Copy link
Copy Markdown
Member

Jafaral commented Apr 23, 2026

@greptile review

dfedyk added 5 commits April 23, 2026 16:32
Topotest for RFC 4222 Rec 2 any ospf control
resets dead interval

Signed-off-by: Don Fedyk <dfedyk@labn.net>
Add user documentation for OSPF interface:

  ip ospf dead-timer-reset any-control

This command implements RFC 4222 Recommendation 2 dead timer
reset behavior.

Signed-off-by: Don Fedyk <dfedyk@labn.net>
Implement RFC 4222 Rec 1: classify OSPF packets into
high-priority (Hellos, LSA Acks) and low-priority classes,
and prioritize high-priority processing to maintain critical
control packet handling during congestion. All control
packet can have a DSCP value set, or explicitly low control
can have a different DSCP value.

Signed-off-by: Don Fedyk <dfedyk@labn.net>
This test sets up two nodes and changes the DSCP
then checks the resulting message dscp.

Signed-off-by: Don Fedyk <dfedyk@labn.net>
Add the ip ospf dscp (all|low-control) (0-63) interface command to the
OSPF user documentation. This command allows operators to set DSCP
values for OSPF control packets to support QoS prioritization.

Signed-off-by: Don Fedyk <dfedyk@labn.net>
@dfedyk dfedyk force-pushed the working/df/RFC4222-rec2-rec1 branch from 02fcc94 to d184f17 Compare April 23, 2026 21:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants