sched: add SCHED_MIC, a hybrid-core-aware scheduler derived from ULE by laffer1 · Pull Request #385 · MidnightBSD/src

laffer1 · 2026-06-12T02:28:43Z

Summary

Adds SCHED_MIC, a new scheduler derived from ULE that weighs heterogeneous
CPU core classes when placing threads. ULE remains the GENERIC default; SCHED_MIC
is opt-in (options SCHED_MIC, or the sample sys/amd64/conf/MIC config).

On x86 hybrid CPUs it prefers cores in this order:

1. P-cores / AMD 3D V-Cache CCD cores (physical)
2. E-cores / AMD compute-CCD cores / AMD mobile "C" cores (physical)
3. the second SMT thread of a busy core ("hyperthreaded")
4. Intel LP-E cores (last)

The preference is a soft, tunable bias folded into the existing
cpu_search_lowest() load comparison, applied only on the placement path
(sched_pickcpu). The long-term balancer and work stealing stay class-blind, so
real-load balancing is preserved. On homogeneous hardware and non-x86 arches it
behaves like ULE (byte-identical with kern.sched.smt_busy_penalty=0).
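
To make the "soft bias" concrete, here is a minimal sketch of how a class cost might be folded into a per-CPU load value. It is not the code from sched_mic.c: the function name `effective_load`, the class codes for eff and LP, and the x256 load scaling are assumptions for illustration. Only the default weights (64, 512, 192) come from the tunables listed in this description.

```c
#include <stdbool.h>

/* Hypothetical class codes; only CPU_CLASS_PERF == 1 is stated in this PR. */
enum { CLASS_PERF = 1, CLASS_EFF = 2, CLASS_LP = 3 };

/* Default tunable values quoted in the PR description. */
static int class_weight_eff = 64;
static int class_weight_lp = 512;
static int smt_busy_penalty = 192;

/*
 * Cost of placing a thread on a CPU: the scaled run-queue load plus a
 * class bias. The x256 scaling per runnable thread is an assumption.
 */
int
effective_load(int nthreads, int core_class, bool sibling_busy)
{
	int load = nthreads * 256;

	if (core_class == CLASS_EFF)
		load += class_weight_eff;
	else if (core_class == CLASS_LP)
		load += class_weight_lp;
	if (sibling_busy)
		load += smt_busy_penalty;
	return (load);
}
```

Under these assumptions an idle E-core (cost 64) loses to an idle P-core (cost 0) but still beats a P-core that already runs one thread (cost 256), which is what keeps the preference a bias rather than a hard rule.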

Detection

Per-CPU classification runs at AP startup via smp_rendezvous in an SI_SUB_SMP
SYSINIT (#ifdef SCHED_MIC in mp_x86.c), stored in a new cpu_core_class[]
array that defaults to "performance" (so anything unrecognized behaves like ULE):

| Case | Method | Reliability |
|---|---|---|
| Intel P vs E | CPUID `0x1A` core type | architectural |
| Intel LP-E | small core with no L3, gated by `kern.sched.detect_lpe` | heuristic |
| AMD X3D / C-cores | larger per-CCD L3 (`0x8000001D`) preferred; symmetric/single-CCD → all perf | heuristic |
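
As an illustration of the Intel rows, the sketch below decodes the core type from CPUID leaf `0x1A` (EAX bits 31:24, where Intel documents `0x20` as Atom and `0x40` as Core) and applies the "no L3" LP-E heuristic. The function name `classify_intel`, its parameters, and the class codes for eff and LP are hypothetical; the real probe lives in mp_x86.c and may be structured differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical class codes; only CPU_CLASS_PERF == 1 is stated in this PR. */
enum { CLASS_PERF = 1, CLASS_EFF = 2, CLASS_LP = 3 };

/* CPUID.1AH:EAX[31:24] core type values documented by Intel. */
#define CORE_TYPE_ATOM	0x20	/* E-core */
#define CORE_TYPE_CORE	0x40	/* P-core */

/*
 * Map one CPU's leaf 0x1A EAX value to a class. has_l3 and detect_lpe
 * stand in for the cache probe and the kern.sched.detect_lpe tunable.
 */
int
classify_intel(uint32_t leaf1a_eax, bool has_l3, bool detect_lpe)
{
	uint32_t core_type = leaf1a_eax >> 24;

	if (core_type != CORE_TYPE_ATOM)
		return (CLASS_PERF);	/* P-core or unrecognized: default */
	if (detect_lpe && !has_l3)
		return (CLASS_LP);	/* heuristic: small core without L3 */
	return (CLASS_EFF);
}
```

With this logic, an Atom-type core that keeps its L3 is classed eff regardless of `detect_lpe`, which is consistent with the Alder Lake result reported under Testing.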

Tunables (`kern.sched.*`)

class_weight_eff (64), class_weight_lp (512), smt_busy_penalty (192, 0 =
stock ULE), prefer_compute (swap AMD cache/compute CCD), detect_lpe, and a
read-only core_class dump.

Testing

Both sched_mic.c and mp_x86.c build clean under -Werror; a MINIMAL-based
SCHED_MIC kernel links into a complete kernel ELF with all symbols resolved.
Intel detection validated against real Alder Lake silicon (Core i7-1260P): 8
P-core threads → perf, 8 E-cores → eff, no LP-E (E-cores keep L3, so the LP
heuristic does not misfire).
Not yet exercised on hardware: LP-E (needs Meteor Lake / Core Ultra), AMD X3D,
AMD C-cores, and live placement behavior (needs a booted MIC kernel).

Notes / open questions

The long-term balancer is class-blind by default, so under sustained saturation
it spreads work off the preferred cores to equalize load, softening the
placement preference. Flipping one sched_lowest() argument in
sched_balance_group makes the balancer reinforce packing instead — a
hardware-dependent tradeoff. See scheduler.md for details.
scheduler.md is included at the repo root as a design doc; drop it if you'd
rather it not live in the tree.
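
The class-blind versus class-aware distinction in the first note can be sketched as a single flag on the lowest-load search. This is a simplified stand-in, not the tree-walking cpu_search_lowest(): `struct cpu_state`, `pick_lowest`, and the x256 scaling are assumptions made for the example.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical per-CPU snapshot used only by this sketch. */
struct cpu_state {
	int nthreads;	/* runnable threads on this CPU */
	int class_cost;	/* bias for this CPU's class, 0 for preferred cores */
};

/* Return the index of the cheapest CPU, or -1 if n == 0. */
int
pick_lowest(const struct cpu_state *cpus, int n, bool class_aware)
{
	int best = -1, best_load = INT_MAX;

	for (int c = 0; c < n; c++) {
		int load = cpus[c].nthreads * 256;	/* assumed scaling */

		/* Placement passes true; a class-blind balancer passes false. */
		if (class_aware)
			load += cpus[c].class_cost;
		if (load < best_load) {
			best_load = load;
			best = c;
		}
	}
	return (best);
}
```

With the flag off, two idle CPUs tie and the first one wins whatever its class; with it on, the idle preferred core wins. Passing the flag from the balancer as well would make it reinforce packing, which is the tradeoff the note describes.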

🤖 Generated with Claude Code

Summary by Sourcery

Introduce a new optional hybrid-core-aware scheduler (SCHED_MIC) derived from ULE and integrate x86 core classification, tunables, and documentation to prefer performance cores while remaining compatible with existing behavior.

New Features:

Add the SCHED_MIC scheduler as an alternative to ULE with hybrid-core-aware thread placement.
Classify x86 CPUs into performance, efficiency, and low-power classes using CPUID-based detection for use by the scheduler.
Expose new sysctls and a sample amd64 kernel configuration to enable and tune SCHED_MIC behavior.
Add a top-level scheduler.md design document describing SCHED_MIC behavior and configuration.

Enhancements:

Extend x86 CPUID cache enumeration helpers and SMP topology structures to support hybrid core classification without changing default schedulers.

SCHED_MIC is a copy of the ULE scheduler that weighs heterogeneous CPU core classes when placing threads. On x86 CPUs with hybrid topologies it prefers, in order: P-cores / AMD 3D V-Cache CCD cores, then E-cores / AMD compute-CCD / AMD mobile "C" cores, then the second SMT thread of a busy core, then Intel LP-E cores last.

The preference is a soft, tunable bias folded into the existing cpu_search_lowest() load comparison, applied only on the placement path (sched_pickcpu); the long-term balancer and work stealing stay class-blind. On homogeneous hardware, and on non-x86 architectures, SCHED_MIC behaves like ULE. ULE remains the default; SCHED_MIC is opt-in (options SCHED_MIC, or the sample sys/amd64/conf/MIC config).

Core class is detected per-CPU at AP startup via an smp_rendezvous in an SI_SUB_SMP SYSINIT (#ifdef SCHED_MIC in mp_x86.c) and stored in a new cpu_core_class[] array, defaulting to "performance" so unrecognized and non-hybrid CPUs behave like ULE:

- Intel P vs E: CPUID 0x1A core type (architectural).
- Intel LP-E: small core with no L3 (heuristic, kern.sched.detect_lpe).
- AMD X3D / Cx: larger per-CCD L3 via CPUID 0x8000001D is preferred (heuristic); symmetric/single-CCD parts stay all-perf.

Tunables under kern.sched.*: class_weight_eff, class_weight_lp, smt_busy_penalty (0 = stock ULE SMT behavior), prefer_compute (swap AMD cache/compute CCD preference), detect_lpe, and a read-only core_class dump.

The Intel detection path was validated against real Alder Lake silicon (Core i7-1260P): 8 P-core threads classed perf, 8 E-cores classed eff, no LP-E (E-cores retain L3, so the LP heuristic does not misfire).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

sourcery-ai · 2026-06-12T02:28:50Z

Reviewer's Guide

Introduces SCHED_MIC, a new scheduler forked from ULE that biases thread placement based on per-CPU hybrid core classes, with x86-specific CPUID-based detection and new sysctls, while leaving ULE behavior unchanged on homogeneous or non-x86 systems. The main implementation adds SCHED_MIC’s core-class-aware placement logic to the ULE codepath in a new sched_mic.c file, wires in per-CPU core classification via mp_x86.c and smp.h, and exposes tunables and debug output via new kern.sched.* sysctls and a scheduler.md design doc.

Flow diagram for SCHED_MIC core classification and placement

flowchart TD
  Boot["AP startup (SI_SUB_SMP)"] --> MicClassify[mic_classify]
  MicClassify --> Rendezvous[smp_rendezvous]
  Rendezvous --> MicProbe[mic_probe_cpu]
  MicProbe --> CpuCoreClass["populate cpu_core_class per CPU (Intel/AMD heuristics)"]

  subgraph Runtime_Placement
    SchedAdd[sched_add] --> SchedPickcpu[sched_pickcpu]
    SchedPickcpu --> SchedLowest["sched_lowest (class_aware=1)"]
    SchedLowest --> CpuSearch[cpu_search_lowest]
    CpuSearch --> ClassCost[sched_class_cost]
    ClassCost -->|uses| CoreClassArray[cpu_core_class]
    CpuSearch --> ChosenCpu["return target CPU id"]
  end

  CpuCoreClass -->|read by| ClassCost

File-Level Changes

| Change | Details | Files |
|---|---|---|
| Add SCHED_MIC scheduler as a ULE-derived, hybrid-core-aware scheduler implementation. | Introduce new sys/kern/sched_mic.c by copying sched_ule.c and injecting hybrid core class awareness into cpu_search_lowest()/sched_pickcpu via sched_class_cost and related helpers. Add hybrid-core placement tunables (class_weight_eff, class_weight_lp, smt_busy_penalty, prefer_compute, detect_lpe, core_class) under kern.sched.*, and wire SCHED_MIC into sys/conf/options, files, NOTES, and a sample amd64 MIC kernel config. Keep long-term load balancer and work stealing class-blind while making only the placement path (sched_pickcpu) class-aware, ensuring behavior matches ULE on non-hybrid systems when smt_busy_penalty=0. | `sys/kern/sched_mic.c` `sys/conf/options` `sys/conf/files` `sys/conf/NOTES` `sys/amd64/conf/MIC` |
| Classify x86 CPUs into hybrid core classes for SCHED_MIC using CPUID and cache enumeration. | Add cpu_core_class[] array and CPU_CLASS_* constants to describe performance, efficiency, and LP core classes, defaulting all CPUs to performance class for non-hybrid behavior. Implement mic_probe_cpu and mic_classify SYSINIT in mp_x86.c (under #ifdef SCHED_MIC) to probe per-CPU hybrid core type and L3 size via CPUID 0x1A and deterministic cache leaves, then derive per-CPU classes for Intel hybrid (P/E/LP-E) and AMD X3D / compute CCD / C-cores. Extend specialreg.h with CPUID_CACHE_* helper macros and CPUID_HYBRID_NATIVE_MODEL_MASK to parse cache descriptors and hybrid model IDs used by the classifier. | `sys/sys/smp.h` `sys/x86/x86/mp_x86.c` `sys/x86/include/specialreg.h` |
| Document SCHED_MIC’s design and configuration and provide a dedicated kernel config. | Add scheduler.md at the repository root describing SCHED_MIC’s goals, core classes, placement algorithm, detection heuristics, and tunables for reviewers and future maintainers. Introduce a new amd64 kernel config MIC that enables options SCHED_MIC and any required settings to build a MIC kernel. Clarify that ULE remains the default scheduler and SCHED_MIC is opt-in, preserving existing behavior by default. | `scheduler.md` `sys/amd64/conf/MIC` |

Possibly linked issues

#0: They implement the requested hybrid-core-aware ULE-derived scheduler, including Intel/AMD core classification and tunable core-class placement preferences.

gemini-code-assist

Code Review

This pull request introduces SCHED_MIC, a hybrid-core-aware scheduler derived from ULE that optimizes thread placement on x86 CPUs with heterogeneous cores by weighing four core priority classes. Feedback on the implementation highlights a critical issue where the global cpu_core_class array is not explicitly initialized, potentially leaving CPUs with an invalid default class of 0 instead of CPU_CLASS_PERF (1) during early returns or AMD core classification. It is recommended to initialize the array to CPU_CLASS_PERF at the start of mic_classify.

gemini-code-assist · 2026-06-12T02:30:40Z

+static void
+mic_classify(void *arg __unused)
+{
+	uint64_t maxl3;
+	int i;
+	bool amd, intel, asymmetric;
+
+	if (mp_ncpus <= 1)
+		return;			/* defaults: every CPU is class perf. */


The cpu_core_class array is defined globally but is not explicitly initialized to CPU_CLASS_PERF (1) before classification. Since global arrays in C default to 0 (which is not a valid class constant, as CPU_CLASS_PERF is 1), any early returns in mic_classify (e.g., if mp_ncpus <= 1, if the vendor is neither Intel nor AMD, or if the CPU has symmetric L3) will leave the CPUs with a class of 0. Additionally, the AMD classification path only sets non-preferred cores to CPU_CLASS_EFF, leaving the preferred cores at their default value (which would remain 0 instead of CPU_CLASS_PERF).

To ensure correct and robust default behavior across all platforms and vendors, explicitly initialize the entire cpu_core_class array to CPU_CLASS_PERF at the beginning of mic_classify.

static void
mic_classify(void *arg __unused)
{
	uint64_t maxl3;
	int i;
	bool amd, intel, asymmetric;

	for (i = 0; i < MAXCPU; i++)
		cpu_core_class[i] = CPU_CLASS_PERF;

	if (mp_ncpus <= 1)
		return;			/* defaults: every CPU is class perf. */

laffer1 · 2026-06-12

Thanks, but cpu_core_class is not default-zero: it is statically initialized to CPU_CLASS_PERF for every element at its definition in sys/kern/sched_mic.c:

uint8_t cpu_core_class[MAXCPU] = { [0 ... MAXCPU - 1] = CPU_CLASS_PERF };

(This file holds the definition; mp_x86.c only sees the extern, which is likely why it looked uninitialized here.)

Because of that initializer, every path is correct without any class-0:

Early returns (mp_ncpus <= 1, non-Intel/AMD vendor, no L3, symmetric L3) leave all CPUs at CPU_CLASS_PERF — that's the intended "behave like ULE" default, and the in-function comments say so.

The AMD path only needs to demote the smaller-L3 die to CPU_CLASS_EFF; the preferred die stays CPU_CLASS_PERF from the initializer by design.

So there's no uninitialized-class bug here; an explicit re-init in mic_classify() would be redundant with the static initializer.
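
The argument above can be checked with a small userland sketch: a range-designated static initializer plus an AMD-style pass that only demotes the smaller-L3 CPUs. This is an illustration, not the mp_x86.c code. `classify_amd`, the `l3_bytes` input array, the reduced `MAXCPU`, and the value of `CPU_CLASS_EFF` are assumptions; the `[0 ... N]` designator is a GNU C extension supported by GCC and Clang.

```c
#include <stdint.h>

#define MAXCPU		8	/* small value for the sketch only */
#define CPU_CLASS_PERF	1	/* value stated in the review thread */
#define CPU_CLASS_EFF	2	/* hypothetical value */

/* Every element starts as perf, as in the definition quoted above. */
uint8_t cpu_core_class[MAXCPU] = { [0 ... MAXCPU - 1] = CPU_CLASS_PERF };

/*
 * Demote CPUs whose L3 is smaller than the largest L3 seen.
 * l3_bytes[] stands in for per-CPU CPUID 0x8000001D probe results.
 */
void
classify_amd(const uint64_t *l3_bytes, int ncpus)
{
	uint64_t maxl3 = 0;
	int i, asymmetric = 0;

	if (ncpus <= 1)
		return;			/* defaults stand: all perf */
	for (i = 0; i < ncpus; i++)
		if (l3_bytes[i] > maxl3)
			maxl3 = l3_bytes[i];
	for (i = 0; i < ncpus; i++)
		if (l3_bytes[i] != maxl3)
			asymmetric = 1;
	if (maxl3 == 0 || !asymmetric)
		return;			/* no L3 or symmetric: all perf */
	for (i = 0; i < ncpus; i++)
		if (l3_bytes[i] < maxl3)
			cpu_core_class[i] = CPU_CLASS_EFF;
}
```

Every early return leaves the array at its initializer value, so no element ever holds 0, and CPUs beyond `ncpus` also stay perf.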

github-actions · 2026-06-12T02:32:04Z

Claude finished @laffer1's task

Code Review for SCHED_MIC Scheduler

Reviewing PR #385 for correctness, memory safety, bounds checking, style(9) compliance, and security issues.

Review Summary

✅ Overall Assessment: High Quality Implementation

This is a well-structured scheduler implementation with proper defensive programming practices. The code follows MidnightBSD coding standards and shows careful attention to bounds checking and initialization.

Key Findings

✅ Initialization Issue Resolution

The Gemini review incorrectly flagged an initialization issue. The cpu_core_class array IS properly initialized in sys/kern/sched_mic.c:316:

uint8_t cpu_core_class[MAXCPU] = { [0 ... MAXCPU - 1] = CPU_CLASS_PERF };

This means all CPUs default to CPU_CLASS_PERF (1) as intended, not 0.

✅ Bounds Checking

Excellent bounds checking throughout:

All array accesses use MAXCPU-bounded indices via CPU_FOREACH(i) macro
Static arrays properly sized: cpu_core_class[MAXCPU], mic_hybrid[MAXCPU], etc.
No buffer overflow risks identified

✅ Memory Management

No malloc/free operations requiring symmetry analysis
Uses static global arrays and stack variables appropriately
No unsafe string operations (strcpy, sprintf, etc.)

✅ Style(9) Compliance

Proper 8-space tab indentation throughout
Reasonable line lengths (inspected samples under 80 characters)
Consistent BSD kernel coding style

✅ Security Assessment

No obvious security vulnerabilities
Proper input validation in sysctls
No user-controllable array indexing beyond validated bounds
Safe CPUID usage with proper vendor checks

Technical Details

Architecture

The implementation correctly:

Derives from ULE with minimal changes focused on hybrid core awareness
Preserves ULE behavior on homogeneous systems (smt_busy_penalty=0)
Uses proper #ifdef SCHED_MIC guards in mp_x86.c

Core Classification Logic

sys/x86/x86/mp_x86.c:1032-1079 implements sound detection:

Intel: Uses architectural CPUID 0x1A for P/E core detection
AMD: Heuristic based on L3 cache size differences for X3D/compute CCDs
Safe fallbacks to performance class for unknown configurations

Scheduler Integration

sys/kern/sched_mic.c:714-743 adds class cost calculation:

Lightweight addition to existing ULE cpu_search_lowest() path
Tunable bias weights via sysctls
Maintains load balancing properties of ULE

Recommendations

✅ Approved for Merge

This implementation demonstrates:

Careful attention to defensive programming
Proper BSD kernel coding standards
Safe handling of heterogeneous CPU detection
Minimal risk surface area

No critical issues identified. The scheduler should provide meaningful improvements on hybrid x86 systems while maintaining ULE compatibility.

sourcery-ai

Hey - I've found 1 issue, and left some high level feedback:

sched_mic.c is almost a full copy of sched_ule.c with a few hybrid-specific changes; consider factoring the shared logic into common helpers or a shared base implementation to avoid divergence and make future scheduler bugfixes/features easier to apply consistently.
sched_class_cost() calls smt_sibling_busy() inside the innermost cpu_search_lowest() loop, which re-walks the cg mask for every candidate CPU and can become O(n^2) per search on larger SMT groups; consider caching SMT busy state per group or per CPU (e.g., via precomputed sibling indices or a transient bitmap) to reduce this overhead.
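
One way the caching idea in the second point might look is a transient busy bitmap built once per search, so that each sibling check becomes a mask test. This sketch is limited to 64 CPUs and uses plain `uint64_t` masks; the kernel would use cpuset_t, and `busy_bitmap` and `smt_sibling_busy_cached` are hypothetical names, not functions from the PR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Build a bitmap of CPUs with at least one runnable thread (up to 64). */
uint64_t
busy_bitmap(const int *nthreads, int ncpus)
{
	uint64_t busy = 0;

	for (int c = 0; c < ncpus && c < 64; c++)
		if (nthreads[c] > 0)
			busy |= UINT64_C(1) << c;
	return (busy);
}

/*
 * True if any SMT sibling of CPU c is busy. sibling_mask is the bitmap of
 * CPUs sharing c's physical core, including c itself; c must be 0..63.
 */
bool
smt_sibling_busy_cached(uint64_t busy, uint64_t sibling_mask, int c)
{
	return ((busy & sibling_mask & ~(UINT64_C(1) << c)) != 0);
}
```

The bitmap costs one pass over the CPUs per search, after which each candidate's sibling check is constant time. The snapshot can go stale while the search runs, which is presumably acceptable for a placement heuristic that already reads loads without locking.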

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- sched_mic.c is almost a full copy of sched_ule.c with a few hybrid-specific changes; consider factoring the shared logic into common helpers or a shared base implementation to avoid divergence and make future scheduler bugfixes/features easier to apply consistently.
- sched_class_cost() calls smt_sibling_busy() inside the innermost cpu_search_lowest() loop, which re-walks the cg mask for every candidate CPU and can become O(n^2) per search on larger SMT groups; consider caching SMT busy state per group or per CPU (e.g., via precomputed sibling indices or a transient bitmap) to reduce this overhead.

## Individual Comments

### Comment 1
<location path="sys/kern/sched_mic.c" line_range="713" />
<code_context>
+ * The result is folded into "load" before it is summed into the per-group
+ * total, so the bias steers both subtree selection and the final CPU pick.
+ */
+static int
+sched_class_cost(const struct cpu_group *cg, int c, int l)
+{
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting the per-CPU effective load computation in cpu_search_lowest() into a dedicated helper to separate cost policy from traversal logic and make the search code easier to follow.

The inner logic of `cpu_search_lowest()` is now carrying a lot of policy (preference bias, randomization, hybrid class cost, SMT tweaks) on top of the topology traversal. You can reduce the cognitive load without changing behavior by extracting the per‑CPU “effective load” calculation into a small helper, so `cpu_search_lowest()` only orchestrates traversal and compares scalars.

For example, you can refactor the leaf CPU loop like this:

```c
/* New helper: encapsulate all per-CPU load/cost heuristics. */
static inline int
cpu_effective_load(const struct cpu_group *cg, int c,
    const struct cpu_search *s, int base_load)
{
	int load, penalty;

	/*
	 * Prefer cs_prefer. When the thread is already running there, drop
	 * its own contribution before scaling so the decrement reaches load.
	 */
	penalty = 0;
	if (c == s->cs_prefer) {
		if (__predict_false(s->cs_running))
			base_load--;		/* matches existing behavior */
		penalty = 128;
	}

	load = base_load * 256;

	/* Hybrid/E/LP cost. */
	if (s->cs_class)
		load += sched_class_cost(cg, c, base_load);

	/* Balancing-time SMT group bias (for >1 load in threaded group). */
	if (__predict_false(s->cs_running) &&
	    (cg->cg_flags & CG_FLAG_THREAD) &&
	    base_load >= 128 && (base_load & 128) != 0)
		load += 128;

	/* Randomization. */
	load -= sched_random() % 128;

	return load - penalty;
}
```

Then the leaf part of `cpu_search_lowest()` becomes much easier to follow:

```c
for (c = cg->cg_last; c >= cg->cg_first; c--) {
	if (!CPU_ISSET(c, &cg->cg_mask))
		continue;
	tdq = TDQ_CPU(c);
	l = TDQ_LOAD(tdq);

	/* Reject this CPU early based on load/pri/mask. */
	if (l > s->cs_load ||
	    (atomic_load_char(&tdq->tdq_lowpri) <= s->cs_pri &&
	     (!s->cs_running || c != s->cs_prefer)) ||
	    !CPU_ISSET(c, s->cs_mask))
		continue;

	load = cpu_effective_load(cg, c, s, l);
	total += load;

	if (load < bload || (load == bload && load < r->csr_load)) {
		bload = load;
		r->csr_cpu = c;
		r->csr_load = load;
	}
}
```

This keeps all existing heuristics (including hybrid-class penalties and SMT behavior) but:

- Moves hybrid/SMT/randomization details into a narrow helper.
- Leaves `cpu_search_lowest()` primarily responsible for traversal, filtering, and comparing scalar costs.
- Makes it easier to reason about or adjust cost policy in one place (`cpu_effective_load` and `sched_class_cost`) without re-reading the tree search logic each time.
</issue_to_address>

gemini-code-assist Bot reviewed Jun 12, 2026

sourcery-ai Bot reviewed Jun 12, 2026

laffer1 merged commit 8856847 into master Jun 12, 2026
6 of 14 checks passed

laffer1 deleted the sched_mic branch June 15, 2026 13:54
