
ast-grammar-induction-prototype

Research Note: This is a single-file recursive self-improvement engine. It is highly fragile and tends to converge on trivial solutions or collapse into syntax errors. It is an exploration of whether statistical grammar learning can rescue a random search process.

AST-Guided Grammar Induction Prototype (Experimental)

This repository contains a prototype for an Estimation of Distribution Algorithm (EDA) that operates directly on Python Abstract Syntax Trees (ASTs). The objective is to see if a system can autonomously refine its own mutation distributions based on limited "success" in a simulator (ARC Gymnasium).

Core Components (L0-L5 Scaling)

  • L0-L2 Hyperparameter Tuning: Relatively stable. Adjusts population sizes and mutation rates.
  • L3-L5 Operator Evolution: Highly unstable. Attempts to rewrite the actual mutation logic (ssm_mutate). Frequently results in non-terminating code or segmentation faults if not strictly sandboxed.
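As a rough illustration of the lowest-level search, a constant-perturbation mutation over Python ASTs might look like the following. This is a minimal sketch using only the standard `ast` module; the function name and perturbation scheme are illustrative and are not the repository's actual `ssm_mutate` operator:

```python
import ast
import random

def mutate_constants(source: str, rng: random.Random) -> str:
    """L0-style mutation: pick one numeric constant and nudge it by +/-1."""
    tree = ast.parse(source)
    consts = [
        n for n in ast.walk(tree)
        if isinstance(n, ast.Constant) and isinstance(n.value, (int, float))
    ]
    if consts:
        target = rng.choice(consts)
        target.value = target.value + rng.choice([-1, 1])
    return ast.unparse(tree)

mutant = mutate_constants("def f(x):\n    return x + 2\n", random.Random(0))
```

The L3-L5 layers apply the same parse-mutate-unparse cycle to the mutation operator itself, which is where the instability described above comes from.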

Experimental Environment: ARC Gymnasium

  • Loads local JSON files to provide a fitness landscape.
  • Scans ARC_GYM directory for tasks.
  • Status: Most tasks remain unsolvable by this approach due to the search space being too large for random AST perturbations.
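A fitness landscape of this kind can be sketched with plain JSON loading. The directory layout and `train`/`input`/`output` field names follow the public ARC task format; `load_tasks` and `fitness` are hypothetical helpers, not this repository's API:

```python
import json
from pathlib import Path

def load_tasks(gym_dir: str) -> list[dict]:
    """Scan an ARC_GYM-style directory for JSON task files."""
    return [json.loads(p.read_text()) for p in sorted(Path(gym_dir).glob("*.json"))]

def fitness(solver, task: dict) -> float:
    """Fraction of train pairs the candidate solver reproduces exactly."""
    pairs = task.get("train", [])
    if not pairs:
        return 0.0
    hits = sum(1 for p in pairs if solver(p["input"]) == p["output"])
    return hits / len(pairs)
```

An exact-match score like this is nearly flat almost everywhere, which is one reason random AST perturbations rarely find a gradient to follow.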

Safety & Fragility Architecture

Since the system generates and executes its own code at runtime, a multi-layer safety harness is required:

  • Loop Injection: Forcibly inserts step_count budget checks into every for/while loop to prevent infinite loops.
  • Whitelisted Built-ins: The self-improvement logic is restricted to a whitelisted subset of Python built-ins, so generated code cannot touch the filesystem or the host operating system.
  • Rollback System: Necessary as most "improvements" generated by the L5 logic result in immediate syntax or logic failures.
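The loop-injection idea can be sketched as an `ast.NodeTransformer` that prepends a budget check to every loop body. This is a minimal illustration assuming a global `step_count`; the real harness presumably manages its counter and budget differently:

```python
import ast

class LoopGuard(ast.NodeTransformer):
    """Prepend a step_count budget check to every for/while body."""
    GUARD_SRC = (
        "step_count = step_count + 1\n"
        "if step_count > 10000:\n"
        "    raise RuntimeError('step budget exceeded')"
    )

    def _guard(self):
        return ast.parse(self.GUARD_SRC).body

    def visit_For(self, node):
        self.generic_visit(node)
        node.body = self._guard() + node.body
        return node

    visit_While = visit_For  # identical treatment for while loops

# Demonstrate on a deliberately infinite loop.
src = "total = 0\nwhile True:\n    total += 1\n"
tree = ast.fix_missing_locations(LoopGuard().visit(ast.parse(src)))
ns = {"step_count": 0}
try:
    exec(compile(tree, "<guarded>", "exec"), ns)
except RuntimeError:
    pass  # the injected check cut the infinite loop off
```

Because the guard is injected at the AST level after every mutation, even a mutated operator that emits `while True:` remains bounded.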

Experimental Observations

  • Grammar Bias: We observed that the EDA can learn to favor certain AST nodes (e.g., swapping constants for variables). However, this bias often becomes too strong (overfitting to a single task), preventing generalization.
  • Omega Point Convergence: The "Omega Point" (a state of stable recursive self-improvement) has not been reached. The system usually hits a complexity wall where further mutation only increases entropy.
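The grammar-bias observation corresponds to the EDA's core update: estimating a categorical distribution over AST node types from "winning" programs, which can then steer where mutations land. A minimal sketch (illustrative only, not the repository's estimator):

```python
import ast
from collections import Counter

def node_distribution(sources: list[str]) -> dict[str, float]:
    """Estimate a categorical distribution over AST node types
    from a set of successful programs."""
    counts = Counter()
    for src in sources:
        for node in ast.walk(ast.parse(src)):
            counts[type(node).__name__] += 1
    total = sum(counts.values())
    return {name: c / total for name, c in counts.items()}

winners = ["x = 1", "y = x + 2"]
dist = node_distribution(winners)
```

Without smoothing or a probability floor, a few wins concentrate nearly all mass on a handful of node types, which matches the overfitting behavior noted above.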

Usage (Experimental)

Warning: Do not run this on sensitive systems. Even with whitelisting, the engine is designed to mutate its own code path.

```shell
# Run the self-check (verify the sandbox is active)
python omega_point.py --self-check

# Run a limited evolution round
python -m hybrid_bridge.run_round --max_generations 100
```

Citation

If referencing this experiment on recursive self-improvement fragility:

```bibtex
@software{rsi_experiment_2025,
  author = {Kwag, Sunghun},
  title = {AST-Guided Grammar Induction: An Experiment in Fragile RSI},
  year = {2025},
  url = {https://github.com/sunghunkwag/ast-grammar-induction-prototype}
}
```

License

MIT License
