Add cargo-udeps dead code detection #20
Open
Jackson57279 wants to merge 2 commits into
Wire unused-dependency checks into make, CI, and contributor docs so unused Cargo deps are caught before merge. Co-authored-by: Cursor <cursoragent@cursor.com>
1 issue found across 4 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="scripts/check-udeps.sh">
<violation number="1" location="scripts/check-udeps.sh:19">
P1: Add `--all-features` to the udeps invocation; otherwise feature-gated dependencies are skipped and can be misreported as unused.</violation>
</file>
  cargo install cargo-udeps --locked
fi

cargo +nightly udeps --workspace --all-targets "$@"
<file context>
@@ -0,0 +1,19 @@
+ cargo install cargo-udeps --locked
+fi
+
+cargo +nightly udeps --workspace --all-targets "$@"
</file context>
Whole-engine perf sweep. Verified against existing bit-exact GEMM/attention tests (595 pass; the 2 failing tests, kv_cache dtype sizing and a Makefile/wasm check, are pre-existing and unrelated).
- flash_attention: dot_product_f32 avx512/avx2 use 4 independent accumulators + a 16/8-wide remainder loop, breaking the single-chain FMA latency bottleneck on short head_dim (64/96/128) loops; f16 dot uses 2 accumulators.
- flash_attention: vectorize f32 KvElem::axpy (decode V-accumulation) with AVX-512/AVX2 FMA instead of a scalar loop.
- tensor/kernels: add dot4_f32_avx512 / dot_f32_avx512 (16-wide) and dispatch the Q4_K/Q6_K decode-once GEMM and dot_f32_fast to them when avx512f+vl are present. Doubles dot lanes on the hottest quantized matmul path for AVX-512 hardware (Skylake-SP target).
- tensor/kernels: gemm_f32_cpu inner loop replaced its autovectorization-blocking black_box "prefetch" with SIMD dot_product_f32 over the contiguous transposed column.
- Cargo.toml: codegen-units = 1 for whole-crate inlining/LTO scope.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
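The multi-accumulator idea in the first bullet can be illustrated without intrinsics. The following is a minimal portable sketch, not the PR's kernel: `dot_f32_unrolled` is a hypothetical name, and it uses four scalar accumulators where the commit describes four SIMD registers. Each accumulator forms its own dependency chain, so consecutive multiply-adds do not have to wait on one another.

```rust
/// Dot product with four independent accumulators (portable scalar sketch).
/// `dot_f32_unrolled` is a hypothetical name, not a function from the PR.
pub fn dot_f32_unrolled(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());

    // Four partial sums: each lane only depends on its own previous value.
    let mut acc = [0.0f32; 4];
    let mut chunks_a = a.chunks_exact(4);
    let mut chunks_b = b.chunks_exact(4);
    for (ca, cb) in (&mut chunks_a).zip(&mut chunks_b) {
        for lane in 0..4 {
            acc[lane] += ca[lane] * cb[lane];
        }
    }

    // Combine the partial sums, then handle the tail that did not fill a chunk.
    let mut sum = (acc[0] + acc[1]) + (acc[2] + acc[3]);
    for (x, y) in chunks_a.remainder().iter().zip(chunks_b.remainder()) {
        sum += x * y;
    }
    sum
}
```

Splitting the sum across accumulators changes the floating-point summation order relative to a single running total, so results can differ in the last bits from a naive loop.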
1 issue found across 3 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="oxidize-core/src/compute/tensor/kernels.rs">
<violation number="1" location="oxidize-core/src/compute/tensor/kernels.rs:559">
P1: AVX-512 dispatch check is weaker than callee target-feature requirements. Add `avx2` and `fma` to the runtime gate (or drop them from target_feature) to avoid illegal-instruction UB.</violation>
</file>
debug_assert_eq!(a.len(), b.len());
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
{
    if is_x86_feature_detected!("avx512f") && is_x86_feature_detected!("avx512vl") {
<file context>
@@ -478,11 +478,87 @@ unsafe fn dot4_f32_avx2(
debug_assert_eq!(a.len(), b.len());
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
{
+ if is_x86_feature_detected!("avx512f") && is_x86_feature_detected!("avx512vl") {
+ return unsafe { dot_f32_avx512(a.as_ptr(), b.as_ptr(), a.len()) };
+ }
</file context>
Suggested change
- if is_x86_feature_detected!("avx512f") && is_x86_feature_detected!("avx512vl") {
+ if is_x86_feature_detected!("avx512f")
+     && is_x86_feature_detected!("avx512vl")
+     && is_x86_feature_detected!("avx2")
+     && is_x86_feature_detected!("fma")
+ {
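The rule behind this review comment is that the runtime gate should check every feature the callee enables through `#[target_feature]`. Below is a minimal sketch of that pattern, under stated assumptions: it uses AVX2+FMA rather than AVX-512 so it compiles on older stable toolchains, and `dot_f32`, `dot_f32_avx2_fma`, and `dot_f32_scalar` are hypothetical names, not the functions in `kernels.rs`.

```rust
/// Hypothetical kernel compiled with AVX2 and FMA enabled.
/// Safety: the caller must have verified that both features are available.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2,fma")]
unsafe fn dot_f32_avx2_fma(a: &[f32], b: &[f32]) -> f32 {
    use std::arch::x86_64::*;
    let n = a.len().min(b.len());
    let mut acc = _mm256_setzero_ps();
    let mut i = 0;
    // Process 8 floats per iteration with a fused multiply-add.
    while i + 8 <= n {
        let va = _mm256_loadu_ps(a.as_ptr().add(i));
        let vb = _mm256_loadu_ps(b.as_ptr().add(i));
        acc = _mm256_fmadd_ps(va, vb, acc);
        i += 8;
    }
    // Horizontal sum of the 8 lanes, then the scalar tail.
    let mut lanes = [0.0f32; 8];
    _mm256_storeu_ps(lanes.as_mut_ptr(), acc);
    let mut sum: f32 = lanes.iter().sum();
    while i < n {
        sum += a[i] * b[i];
        i += 1;
    }
    sum
}

/// Scalar fallback used when the SIMD path is unavailable.
fn dot_f32_scalar(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Dispatcher: the gate lists exactly the features the kernel enables.
pub fn dot_f32(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") && is_x86_feature_detected!("fma") {
            // Sound because both enabled features were just detected.
            return unsafe { dot_f32_avx2_fma(a, b) };
        }
    }
    dot_f32_scalar(a, b)
}
```

Keeping the detected feature list and the `enable = "..."` list identical makes the `unsafe` call easy to audit: a reviewer can compare the two strings side by side.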
Summary
- Adds `cargo-udeps` to detect unused Cargo dependencies across the Rust workspace
- Adds `scripts/check-udeps.sh` and `make udeps` for local runs (requires nightly)
- Adds a `dead-code` CI job on Ubuntu
- Wires `make udeps` into `make ci` and documents the check in CONTRIBUTING.md

Motivation
Agent Readiness flagged missing dead-code detection tooling. This wires in the standard Rust approach (`cargo-udeps`) so unused dependencies are caught in CI and locally before merge.

Testing
- `make udeps`: passes on master (All deps seem to have been used.)
- `scripts/check-udeps.sh`: executable, installs `cargo-udeps` when missing

Follow-ups
- Add `cargo-udeps` to pre-commit (pre-push stage) once `.pre-commit-config.yaml` lands on master

Made with Cursor
Summary by cubic
Adds workspace-wide `cargo-udeps` checks (CI + `make udeps` via `scripts/check-udeps.sh` on nightly) and refactors hot dot/GEMM paths with AVX-512/AVX2 kernels to speed up attention and matmuls.
axpy; `gemm_f32_cpu` now uses SIMD dot; set `codegen-units = 1` for better inlining/LTO.
Written for commit 8ba9ad3. Summary will update on new commits.