This repository contains the empirical validation of the Centaurian Hybrid Architecture (CHA), a multi-layered AI system that pairs a symbolic/quantum cognitive core with a lightweight neural periphery (small language models, neural TTS, procedural animation). The architecture encodes personality via a Quantum Personality Model (QPM) running on classical hardware and confines neural networks to bounded I/O transduction roles, preserving end-to-end traceability of every behavioral decision.
Architecture specifications:
- Centaurian_Hybrid_Architecture_v3.md — current spec
- Centaurian_Hybrid_Architecture_v2.md — previous version (kept for reference)
- Interpretable_Architectures_Revised_v1.md — position paper on interpretable architectures
The empirical work in this repository validates the Self-Model Component (SMC) sub-architecture — i.e., whether a 7B SLM can serve as the linguistic transducer and hold an Aria-grade Structured Cognitive Identity (SCI) reliably across long conversations.
Goal: Measure how long a small language model can maintain a consistent persona when given a Structured Cognitive Identity (SCI), find the degradation inflection point T*, and test six prompt-time strategies for closing the gap to the 3.5 PersonaScore threshold.
Method:
- 30 scripted dialogues (22 naturalistic + 8 adversarial), each 40 turns
- Side-channel probe questions at turns 5, 10, 15, …, 40 across 4 dimensions (Trait, Episodic, Capability, Style)
- Primary judge: Claude Sonnet 4.5; secondary judge: an independent Sonnet 4.5 pass (intra-model consistency measured via quadratic-weighted Cohen's kappa; see the sketch after this list)
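The agreement check is straightforward to reproduce with plain numpy. A minimal sketch, assuming each judge pass emits one integer 1-5 score per probe question (the scores below are hypothetical; interrater_check.py is the authoritative implementation):

```python
# Quadratic-weighted Cohen's kappa between two judge passes, numpy-only.
# Hypothetical scores; interrater_check.py is the authoritative version.
import numpy as np

def quadratic_weighted_kappa(a, b, k=5):
    """Cohen's kappa with quadratic weights over integer scores 1..k."""
    a, b = np.asarray(a) - 1, np.asarray(b) - 1            # shift to 0..k-1
    observed = np.zeros((k, k))
    for i, j in zip(a, b):
        observed[i, j] += 1
    observed /= observed.sum()                             # joint score distribution
    expected = np.outer(observed.sum(1), observed.sum(0))  # chance-agreement baseline
    weights = np.subtract.outer(np.arange(k), np.arange(k)) ** 2 / (k - 1) ** 2
    return 1 - (weights * observed).sum() / (weights * expected).sum()

primary   = [4, 3, 3, 5, 2, 4, 3, 3]   # hypothetical primary-pass scores
secondary = [4, 3, 2, 5, 2, 4, 4, 3]   # hypothetical second-pass scores
print(f"quadratic-weighted kappa = {quadratic_weighted_kappa(primary, secondary):.3f}")
```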
| Model | Params | T* | Mean PersonaScore | Outcome |
|---|---|---|---|---|
| Phi-4-mini | 3.8B | 5 (immediate) | 1.08 / 5.0 | Capability failure — gibberish in 93% of scripts |
| Qwen2.5-7B | 7B | 5 | 3.06 (3.16 → 2.96 over 40 turns) | Coherent but below threshold; piecewise degradation from turn 15 |
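The "piecewise degradation from turn 15" entry refers to a flat-then-declining trend in mean PersonaScore across probe turns. A minimal sketch of estimating the inflection T* and the per-turn decay rate β (the β column in the Phase 2 strategy table below) on illustrative data; analyse_results.py may fit differently:

```python
# Fit a two-segment (flat, then linearly decaying) model to mean PersonaScore
# per probe turn to locate the inflection T* and decay rate beta.
# Illustrative data consistent with the table above; not the real logs.
import numpy as np
from scipy.optimize import curve_fit

turns  = np.array([5, 10, 15, 20, 25, 30, 35, 40], dtype=float)
scores = np.array([3.16, 3.15, 3.15, 3.11, 3.07, 3.03, 2.99, 2.95])

def piecewise(t, level, t_star, beta):
    # Flat at `level` until t_star, then decaying at `beta` points per turn.
    return np.where(t <= t_star, level, level - beta * (t - t_star))

(level, t_star, beta), _ = curve_fit(piecewise, turns, scores, p0=[3.15, 14.0, 0.01])
print(f"T* ≈ {t_star:.1f}, beta ≈ {beta:.4f} points/turn")
```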
| Strategy | Mean PersonaScore | β (decay rate/turn) | Episodic (E) | Notes |
|---|---|---|---|---|
| Baseline | 3.08 | 0.008 | 2.37 | Reference |
| SCI Refresh (turn 13) | 3.15 | 0.008 | 2.37 | Eliminates inflection; fades by turn 35 |
| Episodic RAG | 3.15 | 0.002 | 2.69 | 4× slower decay, but trait destabilization |
| Hybrid RAG | 3.17 | ~0 | 2.83 | Best E result; trait/style regression |
| Combined (Refresh + RAG) | 3.20 | ~0 | 2.76 | Late-conversation peak; emergent t40 > t5 |
| Multi-Refresh (turns 13 + 28) | 3.20 | ~0 | 2.43 | Best trait/style; lowest total failures |
Phase 2 takeaway: all six strategies converge in [3.08, 3.20], a 0.12-point band that remains far below the 3.5 threshold, and no condition crosses E = 3.0. The closing question for Phase 3: is the residual gap architectural (a 14B model is required) or capability-shaped (LoRA fine-tuning can close it)?
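For context, a minimal sketch of the Episodic RAG mechanism from the table above, assuming a sentence-transformers embedder (listed in the Experiment 2 dependencies) and top-k cosine retrieval over prior turns; the repository's actual RAG helpers (see cha_assets.py) may differ:

```python
# Retrieve the past turns most relevant to the current message and format
# them as episodic memory for the context window. Hypothetical helper;
# the repo's RAG implementation (cha_assets.py) may differ.
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedder choice

def episodic_context(history: list[str], user_msg: str, k: int = 3) -> str:
    """Return the top-k past turns most similar to the current message."""
    if not history:
        return ""
    corpus = embedder.encode(history, convert_to_tensor=True)
    query = embedder.encode(user_msg, convert_to_tensor=True)
    hits = util.semantic_search(query, corpus, top_k=k)[0]
    memories = [history[h["corpus_id"]] for h in hits]
    return "Relevant earlier turns:\n" + "\n".join(f"- {m}" for m in memories)
```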
Full Phase 1 + 2 report: CHA_Experiment_1/EXPERIMENT_REPORT.md
Goal: Test whether LoRA fine-tuning on persona-consistent dialogue closes the gap that survived all six SCI strategies in Experiment 1, and specifically whether it resolves episodic fabrication at 7B.
Method:
- 4-condition design (A: FT, no SCI; B: FT, baseline SCI; C: FT + Combined SCI; D: base + Combined SCI as Exp 1 replication) on the same 30 scripts
- Three LoRA adapters, LoRA-2K / LoRA-5K / LoRA-10K, trained on 2,000 / 5,000 / 10,000-row subsets of a synthetic 10K-example dataset (eval losses 0.91 / 0.77 / 0.69)
- H5 sub-runs swap the adapter under the Condition C configuration to characterize the data-scaling curve
- QLoRA: 4-bit NF4 base + BF16 adapters, r=16, α=32, targeting the q/k/v/o attention and gate/up MLP projections, trained on an A100 80GB (config sketched below)
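A minimal sketch of that QLoRA configuration using transformers and PEFT; the base-model variant and dropout are assumptions, and train_lora_sci.py is the authoritative setup:

```python
# QLoRA setup as described above: 4-bit NF4 base weights, BF16 compute,
# r=16, alpha=32 adapters on the q/k/v/o and gate/up projections.
# Model variant and dropout are assumptions; see train_lora_sci.py.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",          # assumed base-model variant
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj"],
    lora_dropout=0.05,                    # assumption, not specified above
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
```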
| Condition | Description | Mean PersonaScore |
|---|---|---|
| A | FT, no SCI | 4.020 |
| B | FT, baseline SCI | 4.293 |
| C | FT + Combined SCI | 4.415 |
| D | Base + Combined SCI (replication) | 3.224 |
- H1 PASSED ✓ — C exceeds the 3.5 threshold by +0.92 points
- H2 PASSED ✓ — ΔE = +0.579 (vs +0.30 threshold for "fine-tuning meaningfully addresses fabrication")
- Paired test (C vs D): Cohen's d = 7.51, p ≈ 1.4 × 10⁻²³ on 30 scripts (see the sketch after this list)
- Replication check: D = 3.224 vs Exp 1's 3.20, |Δ| = 0.024 (within ±0.10 tolerance) → judge stable
- Decision Rule Outcome A triggered: SMC sub-architecture complete at 7B; the planned 14B model test is retired from the critical path
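The paired statistics are straightforward to reproduce. A minimal sketch, with synthetic per-script means standing in for the real condition C and D logs (analyse_results.py computes this over the actual data):

```python
# Paired t-test and paired Cohen's d over per-script PersonaScores for
# conditions C and D. Synthetic stand-in data; not the experiment's logs.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
c_scores = rng.normal(4.415, 0.08, 30)   # illustrative condition C per-script means
d_scores = rng.normal(3.224, 0.18, 30)   # illustrative condition D per-script means

t_stat, p_value = stats.ttest_rel(c_scores, d_scores)
diff = c_scores - d_scores
cohens_d = diff.mean() / diff.std(ddof=1)  # d on the paired differences
print(f"t = {t_stat:.2f}, p = {p_value:.2e}, d = {cohens_d:.2f}")
```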
Full Experiment 2 report: CHA_Experiment_2/EXPERIMENT_REPORT.md
```
Centaurian_Hybrid_Architecture_v3.md # Current architecture spec
Centaurian_Hybrid_Architecture_v2.md # Previous architecture spec
Interpretable_Architectures_Revised_v1.md # Position paper
CHA_Experiment_1/
├── EXPERIMENT_REPORT.md # Full Phase 1 + 2 report
├── experiment_runner.py # Main experiment pipeline
├── generate_scripts.py # Template-based script generator
├── analyse_results.py # Analysis and visualization
├── interrater_check.py # Inter-rater reliability checker
├── CHA_Experiment1_Colab.ipynb # Google Colab notebook
├── logs_qwen2.5_7b{,_refresh13,_refresh13_28,
│ _episodic_rag,_episodic_rag_hybrid,
│ _refresh13_episodic_rag}/ # Per-condition score & context logs
└── results_qwen2.5_7b*/ # Charts, fits, summary reports
CHA_Experiment_2/
├── EXPERIMENT_REPORT.md # Full Experiment 2 report
├── CHA_Experiment2_Plan.md # Pre-registered plan
├── cha_assets.py # Shared persona, probes, rubrics, RAG helpers
├── generate_lora_dataset.py # Sonnet 4.5 dataset generator with QC
├── train_lora_sci.py # QLoRA training (transformers + PEFT + TRL)
├── experiment_runner.py # 4-condition + H5 evaluator (HF + PEFT)
├── analyse_results.py # Multi-condition analysis + plots
├── make_slides.py # Generates the conference deck (python-pptx)
├── CHA_Experiment2_Colab.ipynb # Google Colab notebook
├── h4_probes.json # H4 base-capability probe set (100 prompts × 5 categories)
├── run_h4.py # H4 runner — base vs LoRA-10K on out-of-domain probes
├── analyse_h4.py # H4 analysis — paired t-test + per-category degradation
├── data/full.jsonl # 10K training examples (QC-passed)
├── adapters/lora_{2k,5k,10k}/ # LoRA adapter weights (gitignored)
├── logs/condition_{A,B,C,D,
│ C_lora_2k,C_lora_5k}/ # Per-condition score & context logs
├── logs/h4_{base,lora}/ # H4 base-capability test logs
└── results/ # Plots, analysis_data.json, summary report
```
Prerequisites: Python 3.10+, an Anthropic API key, and access to GPU compute (Colab Pro recommended for training; T4 sufficient for Experiment 1 evaluation; A100 80GB needed for Experiment 2).
```bash
cd CHA_Experiment_1
pip install ollama anthropic python-dotenv numpy scipy matplotlib
echo "CHA_EXPERIMENT_SONNET_KEY=sk-..." > .env
python generate_scripts.py
python experiment_runner.py --model qwen2.5:7b
python analyse_results.py --model qwen2.5:7b
# Phase 2 interventions
python experiment_runner.py --model qwen2.5:7b --refresh-turn 13
python experiment_runner.py --model qwen2.5:7b --episodic-rag
python experiment_runner.py --model qwen2.5:7b --refresh-turn 13 --episodic-rag
python experiment_runner.py --model qwen2.5:7b --refresh-turns 13,28
```

Or use CHA_Experiment_1/CHA_Experiment1_Colab.ipynb for GPU-accelerated runs.
```bash
cd CHA_Experiment_2
pip install transformers peft trl bitsandbytes accelerate datasets \
anthropic python-dotenv sentence-transformers numpy scipy matplotlib
# Dataset generation (~12 hrs, ~$80 in API calls)
python generate_lora_dataset.py --target 10000
# Train adapters (Colab A100 80GB)
python train_lora_sci.py --train-rows 2000 --output adapters/lora_2k
python train_lora_sci.py --train-rows 5000 --output adapters/lora_5k
python train_lora_sci.py --train-rows 10000 --output adapters/lora_10k
# 4-condition evaluation + H5 sub-runs
python experiment_runner.py --condition A --adapter adapters/lora_10k
python experiment_runner.py --condition B --adapter adapters/lora_10k
python experiment_runner.py --condition C --adapter adapters/lora_10k
python experiment_runner.py --condition D --adapter ""
python experiment_runner.py --condition C --adapter adapters/lora_2k --logs-suffix lora_2k
python experiment_runner.py --condition C --adapter adapters/lora_5k --logs-suffix lora_5k
# Analysis
python analyse_results.py
```

Or use CHA_Experiment_2/CHA_Experiment2_Colab.ipynb for the full pipeline end-to-end.
Out-of-domain probe battery (100 prompts × 5 categories: general knowledge, code reasoning, math, instruction following, structured intent JSON) verifying that the LoRA-10K adapter does not cause catastrophic forgetting. Pass criterion: under 5% mean degradation on Sonnet 4.5's 1-5 scoring.
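A minimal sketch of how that pass criterion can be computed, with illustrative per-category means in place of the real H4 logs; analyse_h4.py is the authoritative implementation:

```python
# H4 pass criterion: mean judge-score degradation from base to LoRA-10K
# must stay under 5%. Illustrative per-category means; analyse_h4.py
# runs the paired test over per-prompt scores from the real logs.
import numpy as np
from scipy import stats

base = np.array([4.2, 3.8, 4.5, 4.0, 3.9])   # illustrative base-model means
lora = np.array([4.1, 3.8, 4.4, 3.9, 3.9])   # illustrative LoRA-10K means

degradation = (base.mean() - lora.mean()) / base.mean()
t_stat, p_value = stats.ttest_rel(base, lora)
verdict = "PASS" if degradation < 0.05 else "FAIL"
print(f"mean degradation = {degradation:.1%} ({verdict}); paired t p = {p_value:.3f}")
```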
```bash
cd CHA_Experiment_2
# Generate responses from both conditions (A100 80GB, ~30 min each)
python run_h4.py --condition base
python run_h4.py --condition lora --adapter lora_10k
# Analysis (paired t-test, per-category degradation, verdict)
python analyse_h4.py
```

Outputs results/h4_summary_report.md, results/h4_analysis_data.json, and two comparison plots.
See repository for license details.