ECAN-THRML

License: MIT · Python 3.10+ · Version 0.1.0

Compile ECAN attention flow to Lattice Boltzmann on thermodynamic hardware — bridging metta-attention and Extropic/thrml.

Overview

ECAN-THRML compiles Hyperon's ECAN (Economic Attention Network) attention diffusion into a D2Q9 Lattice Boltzmann simulation that maps directly onto TSU hardware. You give it an attention field with STI sources; it runs LBM collision and streaming, and returns the steady-state attention distribution. A Cole-Hopf transform layer enables solving Hamilton-Jacobi-Bellman (HJB) equations for goal-directed attention routing on the same LBM→TSU pipeline. LTI (Long-Term Importance) is supported as a passive scalar advected by the STI velocity field (whitepaper §5.4: ∂C/∂t + ∇·(Cu) = 0). 59 tests verify mass conservation, symmetry, convergence, and agreement with upstream metta-attention.

New to ECAN? (30-second primer)

ECAN (Economic Attention Network) is Hyperon's attention allocation mechanism. Each atom in the knowledge graph carries two attention values:

| Component | Meaning | Analogy |
|---|---|---|
| STI (Short-Term Importance) | How much attention an atom has right now | Activation level in a neural network |
| LTI (Long-Term Importance) | How valuable an atom is to keep in memory | Frequency of past usefulness |

Attention flows like an incompressible fluid: high-STI atoms spread attention to their neighbors, governed by a diffusion rate α. The current default implementation (metta-attention) uses discrete diffusion on CPU; Hyperon's design also envisions incompressible fluid dynamics on GPU. This project compiles the same diffusion to Lattice Boltzmann on thermodynamic hardware.

New to Lattice Boltzmann? (30-second primer)

The Lattice Boltzmann Method (LBM) simulates fluid dynamics on a discrete grid. Instead of solving Navier-Stokes equations directly, each grid point carries particle distribution functions — probabilities of particles moving in each direction.

In D2Q9 (2D grid, 9 velocities):

| Step | What happens | TSU mapping |
|---|---|---|
| Collision | Distributions relax toward local equilibrium: f_i += (f_eq - f_i) / τ | pbit thermal relaxation (hardware-native) |
| Streaming | Each distribution shifts to the neighboring site | Physical neighbor wiring (hardwired connections) |
| Extraction | Density and velocity recovered: ρ = Σf_i | Readout |

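The relaxation time τ controls viscosity: ν = (τ - 0.5) / 3, which requires τ > 0.5. Because ν grows with τ, a larger τ spreads density faster, while a smaller τ relaxes each site toward local equilibrium more quickly. τ is the parameter that is matched to ECAN's attention spread rate.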
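As a quick illustration of the formula ν = (τ - 0.5) / 3, the following sketch converts between τ and ν in plain Python. The helper names `viscosity_from_tau` and `tau_from_viscosity` are hypothetical and are not part of the package; only the formula itself comes from this README.

```python
def viscosity_from_tau(tau: float) -> float:
    """Kinematic viscosity for a BGK relaxation time (lattice units)."""
    if tau <= 0.5:
        # nu would be zero or negative, which is not a valid BGK setting
        raise ValueError("tau must be greater than 0.5")
    return (tau - 0.5) / 3.0


def tau_from_viscosity(nu: float) -> float:
    """Inverse relation: relaxation time for a given viscosity."""
    return 3.0 * nu + 0.5


# Larger tau gives larger viscosity, i.e. faster diffusive spreading.
for tau in (0.6, 0.8, 1.1):
    print(f"tau={tau:.1f} -> nu={viscosity_from_tau(tau):.4f}")
```

The guard on τ > 0.5 reflects that the viscosity must stay positive.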

Technical summary: ECAN models attention as incompressible fluid on a knowledge graph. The D2Q9 LBM collision step (BGK relaxation) maps to TSU pbit thermal relaxation; the streaming step maps to TSU physical neighbor wiring. The bridge module converts between metta-attention's discrete STI diffusion rate α and LBM's relaxation time τ via align_tau. A Cole-Hopf transform layer (hjb.py) linearizes HJB equations into diffusion equations, enabling goal-directed attention routing on the same LBM→TSU pipeline. 59 tests verify that LBM steady states match metta-attention's discrete diffusion within 10%, Cole-Hopf roundtrip correctness, and LTI advection conservation.

Why this matters

Hardware-native attention diffusion

ECAN's attention spread is a diffusion problem — iteratively distributing STI across graph neighbors. Each architecture handles this differently:

| | CPU (metta-attention) | GPU (envisioned) | TSU (this project) |
|---|---|---|---|
| Parallelism | Sequential per-node | Advection-projection | Bipartite-parallel relaxation |
| Bottleneck | Node count × iterations | Memory bandwidth | Mixing time (lattice size) |
| Best fit | Small graphs | Large dense graphs | Graph topology fits on-chip¹ |

On a TSU, the attention lattice compiles into collision cells updated via bipartite block-Gibbs passes: half the cells relax simultaneously per pass, so one iteration takes two passes (roughly 2τ₀, where τ₀ is the time one pass takes). Pbit thermal noise drives the distributions toward Boltzmann equilibrium, and physical wiring handles streaming.

¹ The TSU uses an L×L grid with ~12 neighbor connections per cell. D2Q9 needs only 8 neighbors per cell, which fits within those ~12, so the hardware connectivity is sufficient. Graphs exceeding a single chip require multi-chip partitioning.

Energy efficiency

The TSU architecture paper (arXiv:2510.23972) reports ~10,000× lower energy per sample vs GPU baselines on image generation benchmarks (DTM vs GPU VAE; E_cell ≈ 2 femtojoules). LBM attention diffusion is expected to benefit similarly but has not been independently benchmarked.

Installation

git clone --recurse-submodules https://github.com/xiaohanma-oss/ECAN-THRML.git
cd ECAN-THRML
pip install -e .                 # core only (jax + numpy)
pip install -e ".[dev]"          # + pytest and matplotlib

The metta-attention submodule provides test baselines. If you cloned without --recurse-submodules, run git submodule update --init.

Quick start

Python API

from ecan_thrml import init_field, add_source, run_lbm, macroscopic

# Create a 64×64 attention field
f = init_field(64, 64, rho0=1.0)

# Add a high-attention "goal node" at center
f = add_source(f, cy=32, cx=32, amount=10.0, radius=3.0)

# Run LBM simulation (tau controls viscosity)
f_final, rho_history = run_lbm(f, tau=0.8, n_steps=200)

# Extract attention density and flow velocity
rho, ux, uy = macroscopic(f_final)

Bridge: metta-attention ↔ LBM

from ecan_thrml.bridge import ring_to_lbm, lbm_to_sti, align_tau

# Convert metta-attention's diffusion rate to LBM relaxation time
tau = align_tau(alpha=0.4, n_neighbors=2)   # → 1.1

# Map STI values to LBM field and back
f = ring_to_lbm([400, 200, 200, 200])      # 4-node ring
# ... run LBM steps ...
sti_values = lbm_to_sti(f)                  # back to STI
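The README does not spell out the formula behind `align_tau`. One mapping that reproduces the quoted value (α = 0.4, 2 neighbors → 1.1) treats α / n_neighbors as a per-link diffusivity D and applies τ = 3D + 0.5, the same relation listed for `tau_from_nu`. The sketch below is that assumption only; `align_tau_sketch` is a hypothetical name and may differ from what bridge.py actually does.

```python
def align_tau_sketch(alpha: float, n_neighbors: int) -> float:
    """Assumed mapping: per-link diffusivity D = alpha / n_neighbors, tau = 3*D + 0.5."""
    if n_neighbors <= 0:
        raise ValueError("n_neighbors must be positive")
    diffusivity = alpha / n_neighbors   # assumption, not taken from bridge.py
    return 3.0 * diffusivity + 0.5


print(align_tau_sketch(alpha=0.4, n_neighbors=2))
```

Under this assumption a larger α always yields a larger τ, and α = 0 gives the inviscid limit τ = 0.5.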

HJB solver (goal-directed attention routing)

import jax.numpy as jnp
from ecan_thrml import solve_hjb, value_to_density

# Initial cost field: high cost everywhere
V0 = jnp.full((64, 64), 3.0)

# Solve HJB with a goal node at center (V=0)
V_final, V_history, diag = solve_hjb(V0, nu=0.5, n_steps=100,
                                      goals=[(32, 32, 0.0)])

# Convert value function to attention density
rho = value_to_density(V_final, diag['epsilon'])
# → density peaks at goal, decays with routing distance
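To show what the call above computes, here is a minimal NumPy sketch of the same idea. It swaps the LBM engine for a plain explicit finite-difference diffusion step, assumes ε = 2ν, and clamps the goal cell every step. `solve_hjb_sketch` is a hypothetical stand-in, not the package's `solve_hjb`, and it uses ν = 0.2 because the explicit scheme is only stable for ν ≤ 0.25 on a unit grid.

```python
import numpy as np


def solve_hjb_sketch(V0, nu, n_steps, goal):
    """Diffuse phi = exp(-V/eps) with a clamped goal cell, then map back to V."""
    eps = 2.0 * nu                           # assumption: eps = 2*nu linearizes the PDE
    phi = np.exp(-V0 / eps)                  # Cole-Hopf forward transform
    gy, gx, goal_value = goal
    goal_phi = np.exp(-goal_value / eps)
    for _ in range(n_steps):
        phi[gy, gx] = goal_phi               # hold the goal at its target value
        lap = (np.roll(phi, 1, 0) + np.roll(phi, -1, 0) +
               np.roll(phi, 1, 1) + np.roll(phi, -1, 1) - 4.0 * phi)
        phi = phi + nu * lap                 # explicit step, periodic boundaries
    phi[gy, gx] = goal_phi
    return -eps * np.log(phi)                # Cole-Hopf inverse transform


V = solve_hjb_sketch(np.full((33, 33), 3.0), nu=0.2, n_steps=100,
                     goal=(16, 16, 0.0))
print(V[16, 16], V[16, 17], V[0, 0])         # goal, next to goal, far corner
```

The value is lowest at the goal and rises with distance from it, so the corresponding density exp(−V/ε) peaks at the goal.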

Run tests

pytest tests/ -v                 # all 59 tests

How it works

Each grid point carries 9 particle distributions (D2Q9 model). At each time step:

1. Collision — distributions relax toward local equilibrium:
   f_i += (f_eq - f_i) / τ           ← TSU: pbit thermal relaxation

2. Streaming — each f_i shifts to the neighboring site:
   f_i(x + e_i) ← f_i(x)            ← TSU: physical neighbor wiring

3. Extraction — density and velocity recovered:
   ρ = Σ f_i,   u = Σ f_i·e_i / ρ   ← readout

The kinematic viscosity is ν = (τ - 0.5) / 3.
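The three steps can be sketched in a few lines of NumPy. This is an illustration built on the standard D2Q9 velocity set and weights, not the code in ecan_thrml/lbm.py; the `_sketch` function names are hypothetical and only loosely mirror the API table.

```python
import numpy as np

# D2Q9 velocities (ex, ey) and weights: rest, 4 axis-aligned, 4 diagonal
E = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]])
W = np.array([4 / 9] + [1 / 9] * 4 + [1 / 36] * 4)


def macroscopic_sketch(f):
    """Extraction: density and velocity from f of shape (9, ny, nx)."""
    rho = f.sum(axis=0)
    ux = np.tensordot(E[:, 0], f, axes=1) / rho
    uy = np.tensordot(E[:, 1], f, axes=1) / rho
    return rho, ux, uy


def equilibrium_sketch(rho, ux, uy):
    """Second-order equilibrium distribution for each of the 9 directions."""
    eu = E[:, 0, None, None] * ux + E[:, 1, None, None] * uy
    usq = ux ** 2 + uy ** 2
    return W[:, None, None] * rho * (1 + 3 * eu + 4.5 * eu ** 2 - 1.5 * usq)


def step_sketch(f, tau):
    rho, ux, uy = macroscopic_sketch(f)
    f = f + (equilibrium_sketch(rho, ux, uy) - f) / tau      # 1. collision (BGK)
    for i, (ex, ey) in enumerate(E):                         # 2. streaming
        f[i] = np.roll(f[i], shift=(ey, ex), axis=(0, 1))    # periodic boundaries
    return f


ny = nx = 16
f = equilibrium_sketch(np.ones((ny, nx)), np.zeros((ny, nx)), np.zeros((ny, nx)))
f[:, 8, 8] *= 2.0                       # attention bump: density 2 at the centre
mass_before = f.sum()
for _ in range(50):
    f = step_sketch(f, tau=0.8)
rho, ux, uy = macroscopic_sketch(f)     # 3. extraction
print(mass_before, rho.sum(), rho.max())
```

Collision only redistributes mass among the 9 directions at a site and streaming only moves it between sites, so the total density is conserved while the bump spreads out.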

| ECAN concept | LBM construct | TSU hardware |
|---|---|---|
| Atom (with STI) | Grid point density ρ | Sampling cell state |
| STI value | Local density ρ(x) | pbit occupation probability |
| Attention diffusion (α) | Viscous flow (ν = (τ-0.5)/3) | Thermal relaxation rate |
| Neighbor spread | Streaming: f_i(x) → f_i(x+e_i) | Physical neighbor wiring |
| Local equilibrium | BGK collision: f → f_eq | pbit Boltzmann relaxation |
| Attention field | D2Q9 distribution functions | 9 coupled pbit groups per cell |
| Steady-state STI | Converged density field | Thermal equilibrium |
| Knowledge graph topology | Lattice geometry + boundaries | On-chip wiring layout |
| Goal-directed routing cost | HJB value function V(x) | Cole-Hopf → LBM density |
| Optimal attention policy | −∇V | Density gradient (emergent) |

Note on TSU mapping: The collision → pbit relaxation and streaming → physical wiring correspondences are this project's interpretive construction. The TSU architecture paper (arXiv:2510.23972) describes p-bit Gibbs sampling on an L×L grid with ~12 sparse neighbor connections; we map BGK collision relaxation onto that update mechanism.

API reference

Engine (ecan_thrml.lbm)

Core operations:

| Function | Description |
|---|---|
| `equilibrium(rho, ux, uy)` | Compute equilibrium distribution |
| `collide(f, tau)` | BGK collision step |
| `collide_passive(g, tau_g, ux_ext, uy_ext)` | Passive scalar collision (external velocity) |
| `stream(f)` | Streaming step (periodic boundaries) |
| `step(f, tau)` | One full LBM step (collide + stream) |
| `step_coupled(f, tau, g, tau_g)` | Coupled STI + LTI step |
| `macroscopic(f)` | Extract ρ, ux, uy from distributions |
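For the passive-scalar path, a common construction is a second D2Q9 field g whose equilibrium is linear in an externally supplied velocity. The sketch below assumes that construction and a uniform velocity field; it is not taken from `collide_passive`, and `passive_equilibrium` and `passive_step` are hypothetical names. Note that this scheme adds a diffusion term with coefficient (τ_g - 0.5) / 3 on top of pure advection.

```python
import numpy as np

E = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]])        # (ex, ey)
W = np.array([4 / 9] + [1 / 9] * 4 + [1 / 36] * 4)


def passive_equilibrium(C, ux, uy):
    """Linear equilibrium: g_eq_i = w_i * C * (1 + 3 e_i.u)."""
    eu = E[:, 0, None, None] * ux + E[:, 1, None, None] * uy
    return W[:, None, None] * C * (1.0 + 3.0 * eu)


def passive_step(g, tau_g, ux, uy):
    C = g.sum(axis=0)                                      # scalar concentration (LTI)
    g = g + (passive_equilibrium(C, ux, uy) - g) / tau_g   # collision, external velocity
    for i, (ex, ey) in enumerate(E):                       # streaming, periodic
        g[i] = np.roll(g[i], shift=(ey, ex), axis=(0, 1))
    return g


ny, nx = 32, 96
ux = np.full((ny, nx), 0.1)            # assumed uniform drift in +x
uy = np.zeros((ny, nx))
C0 = np.zeros((ny, nx))
C0[16, 32] = 1.0                       # one unit of LTI in a single cell
g = passive_equilibrium(C0, ux, uy)
for _ in range(20):
    g = passive_step(g, tau_g=0.8, ux=ux, uy=uy)
C = g.sum(axis=0)
x_cm = (C * np.arange(nx)).sum() / C.sum()
print(C.sum(), x_cm)
```

The total of C stays constant, and its centre of mass drifts along x at the imposed velocity, which is the behaviour the conservation law ∂C/∂t + ∇·(Cu) = 0 describes.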

Initialization & sources:

| Function | Description |
|---|---|
| `init_field(ny, nx, rho0)` | Create uniform equilibrium field |
| `add_source(f, cy, cx, amount, radius=)` | Inject attention at a point; accepts the `radius` keyword shown in Quick start |
| `run_lbm(f, tau, n_steps, g=, tau_g=)` | Full simulation loop (optional dual-field) |

Analysis:

| Function | Description |
|---|---|
| `total_mass(f)` | Sum of density (conservation check) |
| `kinetic_energy(f)` | Total kinetic energy |
| `diagnose_convergence(rho_history)` | Steady-state detection |
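Steady-state detection can be done in several ways. One simple criterion, sketched here, reports the first step at which the largest per-cell density change drops below a tolerance. The function name, the `tol` parameter, and the returned dictionary are all hypothetical; the actual `diagnose_convergence` may use a different criterion and return type.

```python
import numpy as np


def diagnose_convergence_sketch(rho_history, tol=1e-6):
    """Return the first step whose max density change falls below tol."""
    delta = None
    for t in range(1, len(rho_history)):
        delta = float(np.max(np.abs(rho_history[t] - rho_history[t - 1])))
        if delta < tol:
            return {"converged": True, "step": t, "delta": delta}
    return {"converged": False, "step": None, "delta": delta}


# Synthetic history: a bump whose amplitude halves every step.
bump = np.zeros((8, 8))
bump[4, 4] = 1.0
history = [1.0 + 0.5 ** t * bump for t in range(30)]
result = diagnose_convergence_sketch(history)
print(result)
```

A max-norm criterion is strict, since a single slowly changing cell keeps the run marked as unconverged; an averaged norm would be a looser alternative.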

Bridge (ecan_thrml.bridge)

| Function | Description |
|---|---|
| `align_tau(alpha, n_neighbors)` | Convert metta-attention diffusion rate α → LBM τ |
| `ring_to_lbm(sti_values)` | Convert ring STI values → LBM equilibrium field |
| `lbm_to_sti(f)` | Extract STI values from LBM density field |
| `ring_to_lbm_dual(sti_values, lti_values)` | Convert ring STI+LTI values → dual LBM fields |
| `lbm_to_lti(g)` | Extract LTI values from passive scalar field |

HJB solver (ecan_thrml.hjb)

Solves HJB equations via the Cole-Hopf transform: with ε = 2ν, the substitution φ = exp(−V/ε) linearizes ∂V/∂t + |∇V|²/2 = ν∇²V into ∂φ/∂t = ν∇²φ (linear diffusion → LBM).
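The linearization can be checked directly by substituting V = −ε ln φ into the HJB equation. The derivation below is standard calculus; whether hjb.py fixes ε = 2ν internally or exposes ε through `diag['epsilon']` in another way is not stated here and is left as an assumption.

```latex
\begin{aligned}
V &= -\varepsilon \ln\varphi
  \;\Rightarrow\;
  \partial_t V = -\varepsilon\,\frac{\partial_t \varphi}{\varphi},\quad
  \nabla V = -\varepsilon\,\frac{\nabla\varphi}{\varphi},\quad
  \nabla^2 V = -\varepsilon\,\frac{\nabla^2\varphi}{\varphi}
               + \varepsilon\,\frac{|\nabla\varphi|^2}{\varphi^2} \\[4pt]
\partial_t V + \tfrac12 |\nabla V|^2 = \nu \nabla^2 V
  &\;\Longleftrightarrow\;
  -\varepsilon\,\frac{\partial_t \varphi}{\varphi}
  + \frac{\varepsilon^2}{2}\,\frac{|\nabla\varphi|^2}{\varphi^2}
  = -\nu\varepsilon\,\frac{\nabla^2\varphi}{\varphi}
    + \nu\varepsilon\,\frac{|\nabla\varphi|^2}{\varphi^2} \\[4pt]
\text{quadratic terms cancel when } \tfrac{\varepsilon^2}{2} = \nu\varepsilon
  &\;\Longleftrightarrow\; \varepsilon = 2\nu
  \quad\Rightarrow\quad \partial_t \varphi = \nu \nabla^2 \varphi
\end{aligned}
```

For any other ε the nonlinear term |∇φ|²/φ² survives and the equation for φ is no longer a pure diffusion equation.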

| Function | Description |
|---|---|
| `tau_from_nu(nu)` | HJB viscosity ν → LBM relaxation time τ = 3ν + 0.5 |
| `default_tau_lti(tau_sti, timescale_ratio)` | Compute LTI τ for time-scale separation |
| `value_to_density(V, epsilon)` | Cole-Hopf forward: V → ρ = exp(−V/ε) |
| `density_to_value(rho, epsilon)` | Cole-Hopf inverse: ρ → V = −ε·ln(ρ) |
| `init_value_field(ny, nx, V0, nu)` | Initialize LBM field from value function V₀ |
| `add_goal(f, cy, cx, goal_value, epsilon)` | Inject density at a goal node |
| `solve_hjb(V0, nu, n_steps, goals, LTI0=, tau_lti=)` | Full pipeline with optional LTI advection |
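The forward and inverse transforms in the table are simple enough to sketch in NumPy. The `_sketch` functions below follow the two formulas listed above; the `floor` argument that guards against log(0) is an added assumption and not a documented parameter of the package.

```python
import numpy as np


def value_to_density_sketch(V, epsilon):
    """Cole-Hopf forward: rho = exp(-V / epsilon)."""
    return np.exp(-np.asarray(V, dtype=float) / epsilon)


def density_to_value_sketch(rho, epsilon, floor=1e-300):
    """Cole-Hopf inverse: V = -epsilon * ln(rho), with rho clipped away from 0."""
    return -epsilon * np.log(np.maximum(rho, floor))


epsilon = 0.4
V = np.linspace(0.0, 3.0, 7)
rho = value_to_density_sketch(V, epsilon)
V_back = density_to_value_sketch(rho, epsilon)
print(np.max(np.abs(V_back - V)))      # roundtrip error
```

Low cost maps to high density, so a goal at V = 0 becomes the density maximum ρ = 1.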

Results

59 tests, all passing (~20s on CPU).

The comparison tests check the LBM steady state against upstream metta-attention discrete diffusion on a 4-node ring (A0=400, A1=A2=A3=200, α=0.4, 20 steps, run via PeTTa):

| Test | Tolerance | Status |
|---|---|---|
| Ring steady state | < 10% relative error | Pass |
| Mass conservation (both systems) | < 0.01 | Pass |
| Symmetry (A1 ≈ A3) | < 1.0 STI | Pass |
| Convergence direction (all nodes → mean) | < 50 STI from mean | Pass |
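To make the checks in the table concrete, here is a pure-Python stand-in for the discrete baseline. It assumes a simple update rule in which each node keeps a fraction 1 − α of its STI and receives α/2 from each of its two ring neighbors; metta-attention's actual rule (run via PeTTa in the test suite) may differ, and `ring_diffusion_step` is a hypothetical name.

```python
def ring_diffusion_step(sti, alpha):
    """One assumed diffusion step on a ring: keep (1 - alpha), share alpha with 2 neighbors."""
    n = len(sti)
    return [(1.0 - alpha) * sti[i]
            + 0.5 * alpha * (sti[i - 1] + sti[(i + 1) % n])
            for i in range(n)]


sti = [400.0, 200.0, 200.0, 200.0]      # A0..A3 from the test setup
mean = sum(sti) / len(sti)
for _ in range(20):
    sti = ring_diffusion_step(sti, alpha=0.4)

print(sti)
print("mass:", sum(sti), "A1 - A3:", sti[1] - sti[3])
```

Under this rule the total STI is conserved exactly, A1 and A3 stay equal by symmetry, and every node moves toward the mean of 250.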

Hyperon integration outlook

See PLN-THRML README for the full heterogeneous pipeline design (Control → Compile → Sample).

This project contributes the ECAN tier: LBM collision + streaming compiled to TSU pbit relaxation, handling attention allocation (STI diffusion) alongside PLN's Boltzmann factor-graph sampling. Both workloads are TSU-native — co-location on one chip via time-multiplexing or spatial partitioning is an open question (depends on lattice size, graph partitioning, and mixing time).

Project structure

ecan_thrml/                Main package
  __init__.py              Public API re-exports (lbm + hjb)
  lbm.py                   D2Q9 LBM engine (~300 lines)
  bridge.py                LBM ↔ STI mapping + parameter alignment (align_tau)
  hjb.py                   Cole-Hopf transform layer: HJB ↔ LBM density mapping
vendor/
  metta-attention/         Upstream ECAN baseline (git submodule, run via PeTTa)
tests/
  conftest.py              Fixtures, PeTTa integration, tolerance constants
  test_lbm.py              27 engine unit tests (mass conservation, symmetry, convergence)
  test_ecan.py             4 comparison tests (LBM vs metta-attention ring diffusion)
  test_hjb.py              14 HJB solver tests (Cole-Hopf roundtrip, goals, conservation)
  test_lti.py              14 LTI advection tests (passive scalar, mass conservation, advection)
docs/
  interactive_overview.html  Interactive visualization (open in browser)
  ecan_navier_stokes.html  ECAN ↔ Navier-Stokes deep mapping visualization
  internal/                Local visualizations (not committed)
  references/              Shared research papers (git submodule)

Contributing

See CONTRIBUTING.md for development setup, testing, code conventions, and pull request guidelines.

Sister Projects

Five projects compiling Hyperon's cognitive architecture to thermodynamic hardware:

| Project | What it compiles |
|---|---|
| PLN-THRML | Probabilistic inference → Boltzmann energy tables |
| ECAN-THRML | Attention diffusion → Lattice Boltzmann simulation |
| MOSES-THRML | Program evolution → Boltzmann sampling |
| QuantiMORK-THRML | Predictive coding → wavelet-sparse factor graphs |
| Geodesic-THRML | Unified geodesic scheduler for all above |

Acknowledgements

License

MIT — Copyright (c) 2026 Xiaohan Ma
