
# RefTrace

**Referee Decision Tracing via Agentic Graph-of-Thought Reasoning over Sports Broadcast Video**

Python 3.10+ · License: MIT · arXiv

This repository contains the reference implementation for the RefTrace research paper. It provides the complete agentic reasoning architecture, training pipeline, evaluation framework, and data preparation tools described in the paper.

RefTrace is the second paper in the sports adjudication trilogy:

1. **RuleGround** — perception to predicates: grounds raw video into structured game-state predicates
2. **RefTrace** (this repo) — evidence to traces: generates verifiable reasoning traces over a rule knowledge graph
3. **StateTrace** — traces to state transitions: adds an explicit state bottleneck for per-step verification

## Overview

RefTrace is a sport-agnostic agentic framework for automated referee decision analysis. The system combines a Game-State Trace Hypergraph (GSTH), a structured knowledge representation encoding sport rules, game state, and temporal play context, with an Agentic Graph-of-Thought (AGoT) reasoning process. A vision-language policy (Qwen3-VL plus a heterogeneous GAT) navigates the GSTH through a typed action space, selecting relevant subgraphs and applying rule logic to reach justified penalty/no-penalty decisions. Training uses a two-stage pipeline: supervised fine-tuning (SFT) on expert traces, followed by Grounded Reinforcement Policy Optimization (GRPO) with a 7-component Normalized Trace Reward (NTR). We validate on NFL broadcast video using the NFL-MH benchmark.
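The AGoT loop described above can be pictured as a typed action space plus a retrieve-until-STOP rollout. Everything below is illustrative: the `Action` names match the repository's action space, but `run_trace`, the callable policy/GSTH stand-ins, and the step cap are assumptions for exposition, not the actual API.

```python
from enum import Enum, auto

class Action(Enum):
    """The six typed actions in RefTrace's action space."""
    GET_RULE = auto()
    GET_STATE = auto()
    GET_EVENTS = auto()
    GET_VIDEO = auto()
    VERIFY_TEMPORAL = auto()
    STOP = auto()

def run_trace(policy, gsth, query, max_steps=8):
    """Roll out one reasoning trace: query the GSTH until STOP.

    `policy` maps (query, history) -> Action; `gsth` maps an Action
    to an evidence string. Both are stand-ins for the real modules.
    """
    history = []
    for _ in range(max_steps):
        action = policy(query, history)
        if action is Action.STOP:
            break
        evidence = gsth(action)          # retrieve from the hypergraph
        history.append((action, evidence))
    return history

# Toy scripted policy: look up the rule, check game state, then stop.
script = iter([Action.GET_RULE, Action.GET_STATE, Action.STOP])
trace = run_trace(lambda q, h: next(script),
                  lambda a: f"evidence for {a.name}",
                  "Was the DPI call correct?")
```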

## Key Results on NFL-MH-Core

| Configuration       | VTA  | DA   | RC-F1 |
|---------------------|------|------|-------|
| RefTrace-7B         | 72.1 | 89.2 | --    |
| w/o NTR (ablation)  | 61.7 | --   | --    |
| GRPO-only           | 52.3 | --   | --    |
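The NTR driving the ablation above combines seven reward components into a single scalar. The actual components and weights are defined in the paper; the sketch below illustrates only the general shape (clipped, weighted, normalized aggregation), and every component name in the example is a placeholder, not the paper's term.

```python
def normalized_trace_reward(components, weights=None):
    """Weighted mean of per-component scores, each clipped to [0, 1].

    `components` maps component name -> raw score. With no weights
    given, all components count equally. Illustrative only.
    """
    if weights is None:
        weights = {k: 1.0 for k in components}        # equal weighting
    total_w = sum(weights[k] for k in components)
    clipped = {k: min(max(v, 0.0), 1.0) for k, v in components.items()}
    return sum(weights[k] * clipped[k] for k in components) / total_w

# Placeholder component names -- NOT the paper's seven NTR terms.
r = normalized_trace_reward({
    "decision_correct": 1.0, "rule_grounding": 0.8,
    "temporal_consistency": 0.6, "evidence_coverage": 0.5,
    "trace_length": 0.9, "action_validity": 1.0, "format": 1.0,
})
```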

## Architecture

```mermaid
---
config:
  layout: elk
  look: neo
  theme: neo
---
flowchart TB
    Q["<b>Query</b><br>(play description)"] --> QE
    Video["<b>Video Frames</b><br>[B, T, C, H, W]"] --> VLM

    subgraph Policy ["RefTracePolicy"]
        VLM["<b>QwenVLBackbone</b><br>Qwen3-VL-8B"] --> GE["<b>GraphEncoder</b><br>Heterogeneous GAT"]
        QE["<b>QueryEncoder</b><br>Sentence-T5"]
        HE["<b>HistoryEncoder</b><br>Transformer"]
        GE & QE & HE --> F["<b>FusionLayer</b>"]
        F --> AH["<b>ActionHead</b>"]
    end

    subgraph Act ["Action Space"]
        direction LR
        A1["GET_RULE"] & A2["GET_STATE"] & A3["GET_EVENTS"]
        A4["GET_VIDEO"] & A5["VERIFY_TEMPORAL"] & A6["STOP"]
    end

    AH --> Act
    Act -->|"execute"| GSTH["<b>GSTH</b><br>Game-State Trace Hypergraph<br>4 node types · 6 edge types"]
    GSTH -->|"result"| HE
    A6 -->|"decision"| D["<b>CALL_CORRECT · CALL_INCORRECT · NO_FOUL</b>"]
```
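The GSTH node in the diagram advertises 4 node types and 6 edge types. As a mental model, a typed-graph container might look like the sketch below; the type names (mirroring the GET_RULE/GET_STATE/GET_EVENTS/GET_VIDEO actions), the relation name, and the dict-based storage are all assumptions for illustration — the real GSTH exports a PyG heterogeneous graph via `to_pyg()`.

```python
from collections import defaultdict

# Assumed node types, inferred from the action space; not the repo's names.
NODE_TYPES = ("rule", "state", "event", "video")

class TypedGraph:
    """Minimal typed-graph container: nodes keyed by (type, id),
    edges grouped by relation name. Illustrative only."""
    def __init__(self):
        self.nodes = {}                   # (ntype, nid) -> attribute dict
        self.edges = defaultdict(list)    # relation -> [(src_key, dst_key)]

    def add_node(self, ntype, nid, **attrs):
        assert ntype in NODE_TYPES, f"unknown node type: {ntype}"
        self.nodes[(ntype, nid)] = attrs

    def add_edge(self, relation, src, dst):
        self.edges[relation].append((src, dst))

g = TypedGraph()
g.add_node("rule", "dpi", text="Defensive pass interference ...")
g.add_node("event", "contact_t12", frame=312)
g.add_edge("event_cites_rule", ("event", "contact_t12"), ("rule", "dpi"))
```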

## Quick Start

```bash
# Clone and install
git clone https://github.com/sreevadde/reftrace.git && cd reftrace
pip install -e ".[dev]"

# Build the Game-State Trace Hypergraph
reftrace build-gsth --config configs/base.yaml

# Train Stage 1: Supervised Fine-Tuning
reftrace train --config configs/training/sft.yaml

# Train Stage 2: GRPO with NTR
reftrace train --config configs/training/grpo.yaml

# Evaluate on NFL-MH-Core
reftrace eval --config configs/nfl/core.yaml \
    --checkpoint outputs/grpo/grpo_checkpoint_final.pt
```

## Python API

```python
from omegaconf import OmegaConf

from reftrace.models import RefTracePolicy
from reftrace.graph import GSTH

# Load a pre-built GSTH
gsth = GSTH.load("data/gsth/gsth.pkl")
pyg_data = gsth.to_pyg()

# Build the policy from config
cfg = OmegaConf.load("configs/base.yaml")
policy = RefTracePolicy.from_config(cfg)

# Run a single reasoning step (returns an action distribution)
dist = policy(
    query="Was the defensive pass interference call correct?",
    gsth_data=pyg_data,
    video_frames=video_tensor,       # (B, T, C, H, W)
    history_pairs=history_tensor,    # (B, T_hist, 2*D)
)

# Sample an action
action = policy.select_action(
    query="Was the defensive pass interference call correct?",
    gsth_data=pyg_data,
    video_frames=video_tensor,
)
```
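`policy(...)` returns a distribution over the six actions and `select_action` draws from it. A dependency-free sketch of that final sampling step, assuming the action head produces one raw score per action (the logit values below are made up, and this helper is not the repository's `select_action`):

```python
import math
import random

ACTIONS = ["GET_RULE", "GET_STATE", "GET_EVENTS",
           "GET_VIDEO", "VERIFY_TEMPORAL", "STOP"]

def softmax(logits):
    """Numerically stable softmax over raw action scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pick_action(logits, greedy=False, rng=random):
    """Greedy argmax, or a sample from the softmax distribution."""
    if greedy:
        return ACTIONS[max(range(len(logits)), key=logits.__getitem__)]
    return rng.choices(ACTIONS, weights=softmax(logits), k=1)[0]

logits = [1.2, 0.3, -0.5, 0.1, -1.0, 2.4]      # illustrative scores
greedy_action = pick_action(logits, greedy=True)  # highest logit wins
sampled_action = pick_action(logits)              # stochastic draw
```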

## Configuration

RefTrace uses OmegaConf for hierarchical YAML configuration with CLI overrides.

```
configs/
├── base.yaml                # Default hyperparameters for all components
├── model/
│   ├── base.yaml            # Qwen3-VL-8B (default) + LoRA rank 64
│   ├── large.yaml           # Qwen3-VL-32B
│   ├── small.yaml           # Qwen3-VL-2B
│   ├── qwen25.yaml          # Qwen2.5-VL-7B (paper baseline, for reproducibility)
│   └── qwen35.yaml          # Qwen3.5-9B (latest, unified VL)
├── training/
│   ├── sft.yaml             # Stage 1: supervised fine-tuning
│   └── grpo.yaml            # Stage 2: GRPO with NTR
└── nfl/
    ├── core.yaml            # NFL-MH-Core (frame-accurate, expert-labeled)
    ├── auto.yaml            # NFL-MH-Auto (broadcast-scale, auto-extracted)
    └── combined.yaml        # Combined Core + Auto
```

Override any parameter from the command line:

```bash
reftrace train --config configs/training/sft.yaml \
    --set training.lr=1e-5 \
    --set training.batch_size=8
```
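Mechanically, each `--set key=value` override is a dotted-path assignment applied on top of the loaded YAML (OmegaConf provides this natively via `OmegaConf.from_dotlist` and `OmegaConf.merge`). A dependency-free sketch of that mechanic, using a plain dict in place of an OmegaConf config:

```python
def apply_override(cfg: dict, dotted_key: str, value):
    """Set cfg['a']['b']['c'] = value for dotted_key 'a.b.c',
    creating intermediate dicts as needed."""
    *parents, leaf = dotted_key.split(".")
    node = cfg
    for p in parents:
        node = node.setdefault(p, {})
    node[leaf] = value
    return cfg

# Defaults as they might come out of sft.yaml (values are illustrative).
cfg = {"training": {"lr": 2e-5, "batch_size": 4}}
apply_override(cfg, "training.lr", 1e-5)
apply_override(cfg, "training.batch_size", 8)
```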

## Reproduction

### Hardware Requirements

- **Training:** 4× NVIDIA A100 80GB (SFT ~6 hours, GRPO ~10 hours)
- **Inference:** 1× A100 40GB (or 2× A6000)
- **GSTH construction:** CPU-only, ~15 minutes

### Reproducing Paper Results

```bash
# Build GSTH from play-by-play data
reftrace build-gsth --config configs/base.yaml

# Stage 1: SFT on expert reasoning traces
reftrace train --config configs/training/sft.yaml

# Stage 2: GRPO with Normalized Trace Reward
reftrace train --config configs/training/grpo.yaml

# Evaluate on NFL-MH-Core test set
reftrace eval --config configs/nfl/core.yaml \
    --checkpoint outputs/grpo/grpo_checkpoint_final.pt

# Evaluate on NFL-MH-Auto test set
reftrace eval --config configs/nfl/auto.yaml \
    --checkpoint outputs/grpo/grpo_checkpoint_final.pt
```

Results are deterministic given a fixed seed (`training.seed=42`). Rerun with `--set training.seed=43` and `--set training.seed=44` for the additional seeds reported in the paper.


## Citation

```bibtex
@article{vadde2026reftrace,
  title   = {RefTrace: Referee Decision Tracing via Agentic Graph-of-Thought
             Reasoning over Sports Broadcast Video},
  author  = {Vadde, Sree Krishna},
  journal = {arXiv preprint},
  year    = {2026}
}
```

## License

MIT

## About

Reference implementation for "RefTrace: Grounded Normative Reasoning via Retrieval-Augmented Agentic Graph-of-Thought for Multimodal Video Adjudication." Agentic retrieval-loop architecture with Game-State Temporal Hypergraph (GSTH), a 6-action typed grammar, Normalized Trace Reward (NTR), and two-stage SFT→GRPO training for NFL penalty adjudication.
