It does not seem to work #1

@aliasGitee

root@ly2026050700229-8474747479-rv525:~/TIDE# python examples/quickstart.py --model /root/Qwen3-8B
[TIDE INFO] CUDA kernels loaded via torch.ops.load_library
Loading /root/Qwen3-8B...
torch_dtype is deprecated! Use dtype instead!
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:09<00:00, 1.97s/it]
Calibrating routers (200 samples)...
[TIDE INFO] === TIDE Calibration ===
[TIDE INFO] Config: interval=4, threshold=0.98
[TIDE INFO] Step 1/3: Collecting hidden states...
[TIDE INFO] No registered adapter for 'Qwen3ForCausalLM', trying UniversalAdapter
[TIDE INFO] UniversalAdapter probed Qwen3ForCausalLM: 36 layers, hidden_dim=4096
README.md: 10.5kB [00:00, 3.40MB/s]
[TIDE INFO] Collected 200 calibration texts
[TIDE INFO] Collected hidden states at 9 checkpoints, 33671 total tokens
[TIDE INFO] Step 2/3: Computing convergence labels...
[TIDE INFO] Layer 3: 0.0% tokens converged (cosine > 0.98)
[TIDE INFO] Layer 7: 0.0% tokens converged (cosine > 0.98)
[TIDE INFO] Layer 11: 0.0% tokens converged (cosine > 0.98)
[TIDE INFO] Layer 15: 0.0% tokens converged (cosine > 0.98)
[TIDE INFO] Layer 19: 0.0% tokens converged (cosine > 0.98)
[TIDE INFO] Layer 23: 0.0% tokens converged (cosine > 0.98)
[TIDE INFO] Layer 27: 0.0% tokens converged (cosine > 0.98)
[TIDE INFO] Layer 31: 0.0% tokens converged (cosine > 0.98)
[TIDE INFO] Layer 35: 100.0% tokens converged (cosine > 0.98)
[TIDE INFO] Step 3/3: Training routers...
[TIDE INFO] Layer 3 epoch 25: loss=0.0001 acc=1.000
[TIDE INFO] Layer 3 epoch 50: loss=0.0000 acc=1.000
[TIDE INFO] Layer 3 epoch 75: loss=0.0000 acc=1.000
[TIDE INFO] Layer 3 epoch 100: loss=0.0000 acc=1.000
[TIDE INFO] Layer 3 final loss: 0.0000
[TIDE INFO] Layer 7 epoch 25: loss=0.0000 acc=1.000
[TIDE INFO] Layer 7 epoch 50: loss=0.0000 acc=1.000
[TIDE INFO] Layer 7 epoch 75: loss=0.0000 acc=1.000
[TIDE INFO] Layer 7 epoch 100: loss=0.0000 acc=1.000
[TIDE INFO] Layer 7 final loss: 0.0000
[TIDE INFO] Layer 11 epoch 25: loss=0.0000 acc=1.000
[TIDE INFO] Layer 11 epoch 50: loss=0.0000 acc=1.000
[TIDE INFO] Layer 11 epoch 75: loss=0.0000 acc=1.000
[TIDE INFO] Layer 11 epoch 100: loss=0.0000 acc=1.000
[TIDE INFO] Layer 11 final loss: 0.0000
[TIDE INFO] Layer 15 epoch 25: loss=0.0000 acc=1.000
[TIDE INFO] Layer 15 epoch 50: loss=0.0000 acc=1.000
[TIDE INFO] Layer 15 epoch 75: loss=0.0000 acc=1.000
[TIDE INFO] Layer 15 epoch 100: loss=0.0000 acc=1.000
[TIDE INFO] Layer 15 final loss: 0.0000
[TIDE INFO] Layer 19 epoch 25: loss=0.0000 acc=1.000
[TIDE INFO] Layer 19 epoch 50: loss=0.0000 acc=1.000
[TIDE INFO] Layer 19 epoch 75: loss=0.0000 acc=1.000
[TIDE INFO] Layer 19 epoch 100: loss=0.0000 acc=1.000
[TIDE INFO] Layer 19 final loss: 0.0000
[TIDE INFO] Layer 23 epoch 25: loss=0.0000 acc=1.000
[TIDE INFO] Layer 23 epoch 50: loss=0.0000 acc=1.000
[TIDE INFO] Layer 23 epoch 75: loss=0.0000 acc=1.000
[TIDE INFO] Layer 23 epoch 100: loss=0.0000 acc=1.000
[TIDE INFO] Layer 23 final loss: 0.0000
[TIDE INFO] Layer 27 epoch 25: loss=0.0000 acc=1.000
[TIDE INFO] Layer 27 epoch 50: loss=0.0000 acc=1.000
[TIDE INFO] Layer 27 epoch 75: loss=0.0000 acc=1.000
[TIDE INFO] Layer 27 epoch 100: loss=0.0000 acc=1.000
[TIDE INFO] Layer 27 final loss: 0.0000
[TIDE INFO] Layer 31 epoch 25: loss=0.0000 acc=1.000
[TIDE INFO] Layer 31 epoch 50: loss=0.0000 acc=1.000
[TIDE INFO] Layer 31 epoch 75: loss=0.0000 acc=1.000
[TIDE INFO] Layer 31 epoch 100: loss=0.0000 acc=1.000
[TIDE INFO] Layer 31 final loss: 0.0000
[TIDE INFO] Layer 35 epoch 25: loss=0.0000 acc=1.000
[TIDE INFO] Layer 35 epoch 50: loss=0.0000 acc=1.000
[TIDE INFO] Layer 35 epoch 75: loss=0.0000 acc=1.000
[TIDE INFO] Layer 35 epoch 100: loss=0.0000 acc=1.000
[TIDE INFO] Layer 35 final loss: 0.0000
[TIDE INFO] Saved router checkpoint to router.pt
Saved to router.pt
[TIDE INFO] No registered adapter for 'Qwen3ForCausalLM', trying UniversalAdapter
[TIDE INFO] UniversalAdapter probed Qwen3ForCausalLM: 36 layers, hidden_dim=4096
[TIDE INFO] TIDERuntime initialized: 36 layers, 9 routers, CUDA=on
[TIDE INFO] Generation: 128 tokens, 128 exits (100.0%), estimated 1.00x equivalent speedup

============================================================
Explain how transformers work in simple terms: the basic concept of the inductance process in the primary and the secondary coiled wire parts the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the process the

Total tokens: 128, Exited: 128 (100.0%)
Layer 35: 128 exits (100.0%)
Ran all layers: 0
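
The Step 2/3 lines above report 0.0% converged tokens at every checkpoint except layer 35, which reports 100.0%. A minimal sketch of why that combination looks degenerate, assuming the convergence label is a cosine-similarity comparison between a checkpoint's hidden state and the final layer's hidden state (this is inferred from the "cosine > 0.98" log text, not from TIDE's source; the function names below are hypothetical and not part of TIDE):

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def converged_fraction(checkpoint_states, final_states, threshold=0.98):
    """Fraction of tokens whose checkpoint state is within `threshold` of the final state.

    ASSUMPTION: this mirrors what the log calls "tokens converged (cosine > 0.98)".
    """
    labels = [cosine(h, f) > threshold for h, f in zip(checkpoint_states, final_states)]
    return sum(labels) / len(labels) if labels else 0.0


def is_degenerate(fraction):
    """True when every token carries the same label, so a router has nothing to learn."""
    return fraction == 0.0 or fraction == 1.0


if __name__ == "__main__":
    # Toy per-token hidden states, not real model activations.
    final = [[1.0, 0.0], [0.0, 1.0]]
    early = [[0.0, 1.0], [1.0, 0.0]]
    for name, states in (("early checkpoint", early), ("final layer", final)):
        frac = converged_fraction(states, final)
        print(f"{name}: {frac:.1%} converged, degenerate={is_degenerate(frac)}")
```

Under that assumption, a checkpoint compared against itself trivially scores 100%, and a checkpoint where every label is identical would let a classifier reach loss=0.0000 and acc=1.000 by predicting a constant. That would be consistent with the training lines in Step 3/3 and with all 128 exits landing on layer 35, but it is only a possible explanation.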
