Bug Report: System Freeze with Kernel 6.8.0-107 and Commit 07fa2bf #24

@EmilioMezaE

Summary

The intrepid-socketcan-kernel-module causes complete system freezes when used with kernel 6.8.0-107-generic due to incorrect socket buffer handling introduced in commit 07fa2bf ("Fix skb leak").

Environment

  • Kernel: 6.8.0-107-generic (Ubuntu 22.04 HWE)
  • Module: intrepid-socketcan-kernel-module commit 07fa2bf
  • Device: ValueCAN
  • Python: 3.10
  • python-can: 4.6.1
  • iso_tp: 1.23.0

Impact

  • Severity: CRITICAL - Complete system freeze requiring hard reboot
  • Scope: Affects multiple machines
  • Trigger: Any ISO-TP multi-frame communication over socketcan
  • Timeline: Issue appeared after kernel upgrade to 6.8.0-107 (April 10, 2026)

Symptoms

  1. Running UDS commands over socketcan causes immediate system freeze
  2. Basic CAN operations work fine (candump can0)
  3. Same commands work with NeoVI interface
  4. No kernel panic messages - system completely unresponsive

Root Cause

The Buggy Commit

Commit: 07fa2bf573754f97e1b21827ed90eb83085921fd
Date: Tue Mar 24 14:02:24 2026
Author: Kyle Schwarz <kschwarz@intrepidcs.com>
Message: "Fix skb leak"

The Bug

The commit changed socket buffer freeing logic in intrepid_CAN_netdevice_xmit() and intrepid_ETH_netdevice_xmit():

BEFORE (correct):

exit:
    if (ret == NETDEV_TX_OK && !consumed)
        consume_skb(skb);
    wake_up_interruptible(&tx_wait);

AFTER (incorrect):

exit:
    dev_kfree_skb(skb);  // ❌ ALWAYS frees, even when tx fails!
    wake_up_interruptible(&tx_wait);

Why This Causes a Freeze

  1. When transmit buffer is full, the driver returns NETDEV_TX_BUSY
  2. The kernel expects the skb to remain valid for retry
  3. The buggy code calls dev_kfree_skb() unconditionally
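  4. The stack still holds the skb and resubmits it on retry, which creates a use-after-free or double-free condition
  5. Kernel 6.8.0-107 appears to be less tolerant of this corruption than earlier kernels (possibly stricter memory debugging) → freeze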
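
The sequence above can be illustrated with a small user-space model. This is only a sketch, not driver code: `mock_skb`, `send_with_one_retry()` and the `MODEL_TX_*` values are hypothetical stand-ins for `struct sk_buff`, the stack's requeue logic and the kernel's `NETDEV_TX_*` codes, and the model counts releases instead of freeing memory so that the misuse stays observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return codes mirrored from the kernel's netdev_tx_t, for this model only. */
enum model_tx { MODEL_TX_OK = 0x00, MODEL_TX_BUSY = 0x10 };

/* Hypothetical stand-in for struct sk_buff: counts events, frees nothing. */
struct mock_skb {
    int releases;   /* how many times the buffer was released */
    int stale_uses; /* how many times it was touched after a release */
};

static void mock_free(struct mock_skb *skb)
{
    assert(skb != NULL);
    skb->releases++;
}

/* Any access after a release would be a use-after-free in real code. */
static void mock_touch(struct mock_skb *skb)
{
    assert(skb != NULL);
    if (skb->releases > 0)
        skb->stale_uses++;
}

/* Exit path as described for commit 07fa2bf: release regardless of ret. */
static enum model_tx xmit_unconditional(struct mock_skb *skb, bool tx_full)
{
    enum model_tx ret = tx_full ? MODEL_TX_BUSY : MODEL_TX_OK;
    mock_touch(skb); /* the driver reads the frame */
    mock_free(skb);
    return ret;
}

/* Exit path that leaves the skb alone when it asks for a retry. */
static enum model_tx xmit_keep_on_busy(struct mock_skb *skb, bool tx_full)
{
    enum model_tx ret = tx_full ? MODEL_TX_BUSY : MODEL_TX_OK;
    mock_touch(skb);
    if (ret != MODEL_TX_BUSY)
        mock_free(skb);
    return ret;
}

typedef enum model_tx (*xmit_fn)(struct mock_skb *, bool);

/* Model of the stack: the first attempt finds the buffer full,
 * then the same skb is handed to the driver again. */
static struct mock_skb send_with_one_retry(xmit_fn xmit)
{
    struct mock_skb skb = { 0, 0 };
    if (xmit(&skb, true) == MODEL_TX_BUSY)
        xmit(&skb, false);
    return skb;
}
```

In this model, `send_with_one_retry(xmit_unconditional)` ends with two releases and one stale access, while `send_with_one_retry(xmit_keep_on_busy)` ends with exactly one release and no stale access.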

Technical Details

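The consumed flag tracked whether the skb had already been dealt with earlier in the function, for example after its contents were copied into the driver's internal buffer. As the pre-change condition shows, the old exit path released the skb only when the transmit succeeded (ret == NETDEV_TX_OK) and nothing had consumed it yet, so on NETDEV_TX_BUSY the skb stayed valid for the retry. The new code removes both checks.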

Relevant kernel documentation (Documentation/networking/netdevices.rst; the wording below is a paraphrase):

ndo_start_xmit:
  ...
  If the driver returns NETDEV_TX_BUSY, it must not free the skb.
  The network stack will call it again to retry.

Reproduction Steps

  1. Install kernel 6.8.0-107-generic
  2. Clone and build intrepid-socketcan-kernel-module at commit 07fa2bf
  3. Load the module: sudo insmod intrepid.ko
  4. Start the icsscand daemon
  5. Configure interface: sudo ip link set can0 up
  6. Run a UDS request over socketcan
  7. Result: System freezes completely

The Fix

Revert commit 07fa2bf:

cd intrepid-socketcan-kernel-module
git revert --no-commit 07fa2bf
make clean && make
sudo rmmod intrepid
sudo insmod intrepid.ko

Test result: System no longer freezes. Commands complete normally.

Proper Fix Needed

The original "skb leak" that commit 07fa2bf was trying to fix needs to be addressed differently. Suggested approach:

exit:
    if (ret == NETDEV_TX_OK) {
        // Success - the driver owns the skb and must release it
        dev_consume_skb_any(skb);
    } else if (ret == NETDEV_TX_BUSY) {
        // Busy - the stack keeps the skb and will retry, so DON'T free;
        // the skb is released once a retry succeeds
    } else {
        // Any other return value - the frame is dropped, free the skb
        dev_kfree_skb_any(skb);
    }
    wake_up_interruptible(&tx_wait);

Or simpler:

exit:
    if (ret != NETDEV_TX_BUSY) {
        dev_kfree_skb_any(skb);
    }
    wake_up_interruptible(&tx_wait);
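
To compare the exit paths discussed in this report, the following user-space sketch counts how often each one would release the skb the wrong number of times. It rests on an assumption that the report does not confirm: `consumed == true` is taken to mean the skb was already released before the exit label. The policy functions, `scenarios` table and `MODEL_TX_*` values are hypothetical names for illustration, and the model does not try to locate the original leak that 07fa2bf was aimed at.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return codes mirrored from the kernel's netdev_tx_t, for this model only. */
enum model_tx { MODEL_TX_OK = 0x00, MODEL_TX_BUSY = 0x10 };

/* An exit-path policy answers: is the skb released at the exit label? */
typedef bool (*exit_policy)(enum model_tx ret, bool consumed);

/* Condition shown under BEFORE in this report. */
static bool policy_pre_change(enum model_tx ret, bool consumed)
{
    return ret == MODEL_TX_OK && !consumed;
}

/* Unconditional release shown under AFTER in this report. */
static bool policy_07fa2bf(enum model_tx ret, bool consumed)
{
    (void)ret;
    (void)consumed;
    return true;
}

/* The simpler suggestion: release unless a retry was requested. */
static bool policy_skip_on_busy(enum model_tx ret, bool consumed)
{
    (void)consumed;
    return ret != MODEL_TX_BUSY;
}

struct scenario {
    enum model_tx ret;
    bool consumed; /* ASSUMPTION: true = already released earlier */
};

static const struct scenario scenarios[] = {
    { MODEL_TX_OK,   false }, /* sent, skb still held at exit */
    { MODEL_TX_OK,   true  }, /* sent, skb already released earlier */
    { MODEL_TX_BUSY, false }, /* buffer full, the stack will retry */
};

/* Count scenarios where the total number of releases is wrong:
 * exactly one is expected on OK, none on BUSY. */
static int count_violations(exit_policy policy)
{
    int violations = 0;
    assert(policy != NULL);
    for (size_t i = 0; i < sizeof(scenarios) / sizeof(scenarios[0]); i++) {
        const struct scenario *s = &scenarios[i];
        int releases = (s->consumed ? 1 : 0)
                     + (policy(s->ret, s->consumed) ? 1 : 0);
        int expected = (s->ret == MODEL_TX_BUSY) ? 0 : 1;
        if (releases != expected)
            violations++;
    }
    return violations;
}
```

Under that assumption the pre-change condition produces no violations, the unconditional release produces two, and the simpler suggestion still produces one (a double release when `consumed` is true). If the driver at 07fa2bf still has paths that release the skb before the exit label, the fix may need to keep the `!consumed` check as well.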

Additional Notes

  • The bug was latent until kernel 6.8.0-107 was released
  • Older kernels may have been more permissive with memory errors
  • The freeze occurs specifically during ISO-TP flow control handling
  • User-space ISO-TP (iso_tp package) doesn't trigger the bug since it doesn't hit this code path the same way

Recommendations

  1. ✅ Immediate: Revert commit 07fa2bf
  2. ⏳ Short-term: Fix the actual skb leak properly
  3. ⏳ Long-term: Add automated testing with different kernel versions
  4. ⏳ Long-term: Add buffer overflow/underflow checks

Date: April 10, 2026
