Skip to content

Reduce cache footprint by decoupling degeneracy-dependent data#387

Open
lkdvos wants to merge 12 commits intomainfrom
ld-caching
Open

Reduce cache footprint by decoupling degeneracy-dependent data#387
lkdvos wants to merge 12 commits intomainfrom
ld-caching

Conversation

@lkdvos
Copy link
Member

@lkdvos lkdvos commented Mar 23, 2026

Refactors the internal block structure representation for TensorMap by splitting FusionBlockStructure into two separately-cached structs:

  • SectorStructure (cached by sector structure, shared across spaces with the same sectors): coupled sectors and valid fusion tree pairs as Dictionaries.Indices.
  • DegeneracyStructure (cached per HomSpace): total dimension, block sizes/ranges, and sub-block strides as plain Vectors.

The public blockstructure(W) and subblockstructure(W) combine these on the fly into a Dictionary.

Previously, FusionBlockStructure stored a fusiontreelist, a fusiontreeindices lookup dict, and a fusiontreestructure vector — the tree pairs were stored twice and lookup required manual indirection. Dictionaries.jl handles this naturally via its token system.

A secondary benefit: DegeneracyStructure{N} is parameterized only by the number of indices N, not the sector type. Kernels that only need sizes/strides/offsets can therefore be compiled once and reused across different symmetry groups — a potential route to much faster precompilation.

Also moves the tensor structure computation out of homspace.jl into a new tensorstructure.jl.


Open questions:

  • Should we replace all remaining internal dicts with Dictionaries.jl types to avoid maintaining two APIs?
  • Are there other places that would benefit from a similar sector/degeneracy split?

@lkdvos lkdvos linked an issue Mar 23, 2026 that may be closed by this pull request
@codecov
Copy link

codecov bot commented Mar 24, 2026

Codecov Report

❌ Patch coverage is 96.31579% with 7 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/tensors/tensorstructure.jl 96.45% 5 Missing ⚠️
src/tensors/braidingtensor.jl 0.00% 2 Missing ⚠️
Files with missing lines Coverage Δ
ext/TensorKitMooncakeExt/utility.jl 85.71% <ø> (ø)
src/TensorKit.jl 14.28% <ø> (+0.64%) ⬆️
src/auxiliary/dicts.jl 47.80% <100.00%> (-3.62%) ⬇️
src/spaces/homspace.jl 90.19% <100.00%> (-4.71%) ⬇️
src/tensors/abstracttensor.jl 47.16% <100.00%> (-7.72%) ⬇️
src/tensors/tensor.jl 78.61% <100.00%> (-6.79%) ⬇️
src/tensors/tensoroperations.jl 93.78% <100.00%> (-3.65%) ⬇️
src/tensors/treetransformers.jl 96.29% <100.00%> (+0.06%) ⬆️
src/tensors/braidingtensor.jl 0.00% <0.00%> (-68.30%) ⬇️
src/tensors/tensorstructure.jl 96.45% <96.45%> (ø)

... and 20 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Base.parent(h::Hashed) = h.val

# hash overload
Base.hash(h::Hashed{T, Hash}, seed::UInt) where {T, Hash} = Hash(parent(h), seed)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it a bit confusing if type parameters encode a function, that they still use capital letters. I was looking for a type called Hash, before realising that this is actually the custom hash function encoded in the type. I understand that using hash for the type parameter is out of the question.

A related question is whether you know if there is any benefits or downsides of this design compared to using the function as a field, i.e.

struct Hashed{T, H, E}
    val::T
    hashf::H
    eqf::E
end

and then doing things like h.hashf(parent(h), seed). I am asking because that is how InnerProductVec over at KrylovKit works, and if it has downsides (other than the stylistic differences), I should probably change this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually don't think it does, if these are singleton types (like Function) the size remains the same, so AFAIK this is equivalent and just a style choice. I'll refactor along with the required rebase :)

@@ -0,0 +1,276 @@
# Block and fusion tree ranges: structure information for building tensors
#--------------------------------------------------------------------------
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though some of this information is very much tailored towards the tensor construction, in my mind spaces come before tensors, and could in principle exist on their own. However, the current code in the spaces subfolder is broken without the definitions in this file, so I would prefer this file to live in the spaces folder. (There is not a single mention of TensorMap in this file).

end
for (w₁, w₂) in zip(domain(W₁), domain(W₂))
isdual(w₁) == isdual(w₂) || return false
isequal(sectors(w₁), sectors(w₂)) || return false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the return value of sectors was really intended for direct comparison:

julia> isequal(sectors(SU2Space(0=>2)), sectors(SU2Space(0=>2)))
false

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe isempty(symdiff(sectors(w₁), sectors(w₂))) ?

Comment on lines +155 to +157
bs = Indices{I}()
for s in sectors(P), c in ⊗(s...)
set!(bs, c)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this covers the N==0 case, and is also overly complicated for the N==1 case.

Comment on lines +84 to +103
for c in _blocksectors(W)
push!(bs, c)
codom_start = length(trees) + 1
n₁ = 0
for f₂ in fusiontrees(dom, c)
if n₁ == 0
# First f₂ for this sector: enumerate codomain trees and record how many there are.
for f₁ in fusiontrees(codom, c)
push!(trees, (f₁, f₂))
end
n₁ = length(trees) - codom_start + 1
else
# Subsequent f₂s: the codomain trees are already in the list at
# codom_start:codom_start+n₁-1, so read them back instead of recomputing.
for j in codom_start:(codom_start + n₁ - 1)
push!(trees, (trees[j][1], f₂))
end
end
end
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say there are quite a few unnecessary +1 and -1s because of your choice of using the starting index, instead of the offset:

Suggested change
for c in _blocksectors(W)
push!(bs, c)
codom_start = length(trees) + 1
n₁ = 0
for f₂ in fusiontrees(dom, c)
if n₁ == 0
# First f₂ for this sector: enumerate codomain trees and record how many there are.
for f₁ in fusiontrees(codom, c)
push!(trees, (f₁, f₂))
end
n₁ = length(trees) - codom_start + 1
else
# Subsequent f₂s: the codomain trees are already in the list at
# codom_start:codom_start+n₁-1, so read them back instead of recomputing.
for j in codom_start:(codom_start + n₁ - 1)
push!(trees, (trees[j][1], f₂))
end
end
end
end
for c in _blocksectors(W)
push!(bs, c)
offset = length(trees)
n₁ = 0
for f₂ in fusiontrees(dom, c)
if n₁ == 0
# First f₂ for this sector: enumerate codomain trees and record how many there are.
for f₁ in fusiontrees(codom, c)
push!(trees, (f₁, f₂))
end
n₁ = length(trees) - offset
else
# Subsequent f₂s: the codomain trees are already in the list at
# offset .+ (1:n₁), so read them back instead of recomputing.
for j in offset .+ (1:n₁)
push!(trees, (trees[j][1], f₂))
end
end
end
end

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fusionblockstructure should reuse data that is degeneracy-independent

2 participants