Releases: ai-2070/l0-python
🏁 L0 (Python) v0.21.0 - Streaming Performance Overhaul, Guardrail Optimizations, and Drift Efficiency
This release is a major internal performance upgrade for the Python runtime.
No API changes — but substantial improvements to:
- streaming efficiency (O(n²) → O(n))
- guardrail execution cost
- drift detection memory + speed
- event + callback overhead
Net result: faster, more scalable streaming with lower overhead across the entire pipeline.
✨ Highlights
1. O(n) Token Accumulation (Major Performance Fix)
String concatenation during streaming has been replaced with a buffered approach.
Before:
state.content += token # O(n²) over time
Now:
- Tokens appended to `_content_buffer`
- Joined lazily via a descriptor (`_ContentDescriptor`)
- Flushed only when `state.content` is read
Result:
- O(n) total complexity
- Dramatically better performance for long streams
- Reduced memory churn
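In spirit, the buffered approach looks like this minimal sketch — the `State` class and everything beyond the names `_content_buffer` and `_ContentDescriptor` is illustrative, not L0's actual internals:

```python
class _ContentDescriptor:
    """Joins buffered tokens only when .content is actually read."""
    def __get__(self, obj, objtype=None):
        if obj._content_buffer:
            # One O(n) join replaces repeated O(n) string concatenations
            obj._content += "".join(obj._content_buffer)
            obj._content_buffer.clear()
        return obj._content

class State:
    content = _ContentDescriptor()

    def __init__(self):
        self._content = ""
        self._content_buffer = []

    def append(self, token: str) -> None:
        self._content_buffer.append(token)  # O(1) amortized per token

state = State()
for tok in ["Hel", "lo, ", "world"]:
    state.append(tok)
print(state.content)  # buffer flushed lazily here → Hello, world
```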
2. Drift Detection: Sliding Window + Bounded Memory
Drift detection has been rewritten to avoid unbounded growth.
Changes:
- `list` → `deque(maxlen=N)` for:
  - entropy tracking
  - token history
- Only stores a window, not full content
- Uses `last_window` instead of full `last_content`
Impact:
- Stable memory usage
- Faster drift checks
- Better scalability on long-running streams
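The bounded-window idea can be sketched with the standard library's `deque`; the window size below is illustrative (the release only specifies `maxlen=N`):

```python
from collections import deque

WINDOW = 500  # illustrative size

token_history = deque(maxlen=WINDOW)    # oldest entries evicted automatically
entropy_samples = deque(maxlen=WINDOW)

for i in range(10_000):
    token_history.append(f"tok{i}")
    entropy_samples.append(i % 7)

print(len(token_history))  # bounded at 500, no matter how long the stream runs
```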
3. Guardrails: Significant Runtime Optimizations
JSON Guardrail
- Adds `is_json_content` caching
- Avoids repeated `looks_like_json()` calls
- Resets cache correctly on stream resets
Markdown Guardrail
- Skips all analysis during streaming
- Only runs on completion
Pattern Guardrail (Major Change)
- Precompiles all patterns into a single regex
- Uses incremental scanning:
- scans only new content (+ small overlap)
- full scan only on completion
Result:
- From repeated full scans → near O(delta)
- Much lower overhead on large streams
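The incremental scan can be sketched as follows — the banned patterns and overlap size are hypothetical stand-ins (real guardrail patterns are user-supplied):

```python
import re

# Hypothetical patterns; real guardrail patterns are user-supplied.
PATTERNS = [r"password\s*=", r"BEGIN PRIVATE KEY"]
COMBINED = re.compile("|".join(f"(?:{p})" for p in PATTERNS), re.IGNORECASE)

OVERLAP = 32  # re-scan a small tail so a match spanning two chunks isn't missed

def scan_incremental(prev_len: int, content: str):
    # Scan only the new delta plus a small overlap, not the whole content
    start = max(0, prev_len - OVERLAP)
    return COMBINED.search(content, start), len(content)

content, prev, match = "", 0, None
for chunk in ["some text pass", "word = hunter2"]:
    content += chunk
    found, prev = scan_incremental(prev, content)
    match = match or found

print(bool(match))  # match spanning the chunk boundary is still caught → True
```

Combining all patterns into one compiled regex means a single pass over the delta instead of one pass per pattern.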
4. Runtime Hot Path Optimizations
Callback Execution
- Skips function calls when callbacks are `None`
- Reduces overhead per token
Observability Events
- Guardrail observability only runs if handlers exist
- Avoids unnecessary timing + event construction
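The handlers-exist guard amounts to checking for listeners before doing any work; a sketch (class and event names are illustrative, not L0's API):

```python
import time

class Emitter:
    def __init__(self):
        self.handlers = []

    def emit(self, name: str, payload: dict) -> None:
        if not self.handlers:      # cheap check first: no handlers, no work
            return
        # Timing and event construction happen only when someone is listening
        event = {"name": name, "ts": time.time(), **payload}
        for handler in self.handlers:
            handler(event)

emitter = Emitter()
emitter.emit("GUARDRAIL_CHECK", {"tokens": 10})  # no handlers → returns immediately

seen = []
emitter.handlers.append(seen.append)
emitter.emit("GUARDRAIL_CHECK", {"tokens": 11})
print(seen[0]["name"])  # → GUARDRAIL_CHECK
```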
Buffer Reset Fixes
- `_content_buffer` now cleared correctly on:
  - retries
  - checkpoint resets
5. Improved Checkpoint + State Handling
- Ensures buffer and content stay in sync during:
- retries
- invalid checkpoint recovery
- Prevents subtle duplication or stale state issues
6. Updated Benchmarks (Python 3.13)
Performance improvements reflected in benchmarks:
- L0 Core: ~596K tokens/sec
- Full Stack: ~114K tokens/sec
- Lower overhead percentages across most scenarios
Still comfortably above real-world model throughput.
7. Documentation Updates
- README now includes Python performance section
- BENCHMARKS.md updated with latest numbers
- WHITEPAPER.md significantly expanded
🧭 Upgrade Notes
- No breaking changes
- Fully backward compatible
- Strongly recommended if you:
- stream large outputs
- use guardrails heavily
- rely on drift detection
- run long-lived pipelines
🔧 L0-Python v0.20.0 - Structured Retry Fixes + Canonical Runtime Alignment
L0 Python v0.20.0 fixes structured retry behavior for stream factories and improves canonical lifecycle and observability alignment with the TypeScript runtime.
⚙️ 1. Structured Stream Factory Retry Fix
- Fixed structured retry behavior so stream factory functions are called fresh on each retry attempt.
- This resolves cases where structured retries could accidentally reuse an already-consumed stream, leading to failures like:
- locked/consumed stream errors
- invalid retry behavior in `structured()`
- invalid retry behavior in `structured_array()`
- fallback retry reuse issues
- Factory-based structured flows now retry correctly across:
- normal retries
- fallback retries
- sync and async stream factories
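The core of the fix — call the factory for a fresh stream on every attempt, never reuse a consumed one — can be sketched like this (`consume_with_retries` and the fake stream are hypothetical helpers, not L0 APIs):

```python
import asyncio

async def fake_stream(fail: bool):
    # Stands in for a provider stream; fails on demand to trigger a retry
    if fail:
        raise ConnectionError("dropped")
    for tok in ["{", '"ok": true', "}"]:
        yield tok

async def consume_with_retries(factory, attempts: int = 3) -> str:
    last_err = None
    for attempt in range(attempts):
        stream = factory(attempt)      # fresh, unconsumed stream each attempt
        try:
            return "".join([tok async for tok in stream])
        except ConnectionError as err:
            last_err = err             # never re-iterate a consumed stream
    raise last_err

result = asyncio.run(
    consume_with_retries(lambda attempt: fake_stream(fail=attempt == 0))
)
print(result)  # → {"ok": true}
```

Passing the stream object itself (instead of a factory) would re-iterate an already-consumed generator on retry, which is exactly the locked/consumed-stream failure described above.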
📈 2. Canonical Lifecycle + Observability Alignment
- Improved runtime parity with the canonical lifecycle and TypeScript event model.
- Updates include:
- `FALLBACK_START` now uses `fromIndex`/`toIndex`
- retry attempts include `isRetry`
- fallback attempt numbering resets correctly per fallback stream
- error events now include richer recovery metadata
- failure classification and recovery strategy mapping are now emitted more explicitly
- Added new canonical lifecycle and network classification tests to lock this behavior in.
🛟 3. Network Error Classification Coverage
- Added broader canonical tests for network error detection and classification.
- Improves confidence around handling of:
- connection drops
- DNS failures
- fetch/network request failures
- timeout conditions
- SSL-related failures
- This strengthens parity between documented behavior and tested runtime behavior.
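An illustrative classifier in the spirit of the categories above (L0's actual detection covers more exception types and provider-specific patterns):

```python
NETWORK_HINTS = {
    "connection_drop": ("connection reset", "broken pipe", "econnreset"),
    "dns": ("name or service not known", "getaddrinfo failed", "nodename"),
    "timeout": ("timed out", "timeout"),
    "ssl": ("ssl", "certificate verify"),
}

def classify_network_error(err: Exception) -> str:
    # Match the error message against known substrings per category
    msg = str(err).lower()
    for category, hints in NETWORK_HINTS.items():
        if any(hint in msg for hint in hints):
            return category
    return "unknown"

print(classify_network_error(TimeoutError("read timed out")))  # → timeout
print(classify_network_error(ConnectionResetError("Connection reset by peer")))  # → connection_drop
```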
🗃️ 4. Documentation Updates
- Updated docs across the project for correctness and consistency, including:
- API documentation
- lifecycle docs
- custom adapter docs
- multimodal docs
- consensus docs
- document window docs
- guardrails docs
- README usage fixes
- Added a new `WHITEPAPER.md` describing L0 as a deterministic streaming execution substrate for AI.
🏎️ L0-Python 0.19.0 - Performance Improvements
This release introduces optimizations to our core drift detection logic and updates our event tracing system for better performance.
🚀 Performance Improvements
Drift detection has been significantly optimized by pre-compiling all regex patterns and removing repeated per-check compilation. This reduces overhead across tone, format, repetition, markdown, and hedging detection while preserving identical behavior. The changes are entirely internal but materially improve throughput under high-token streaming workloads.
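The optimization is the standard compile-once pattern — patterns move to module scope so the hot loop never calls `re.compile()`. The patterns below are illustrative stand-ins for L0's detectors:

```python
import re

# Compiled once at import time, reused on every check
HEDGING_RE = re.compile(r"\b(perhaps|maybe|it seems|possibly)\b", re.IGNORECASE)
MARKDOWN_HEADER_RE = re.compile(r"^#{1,6}\s", re.MULTILINE)

def check_hedging(text: str) -> bool:
    # No per-call re.compile(): the hot path reuses the compiled pattern
    return HEDGING_RE.search(text) is not None

print(check_hedging("It seems the answer is 42."))  # → True
print(check_hedging("The answer is 42."))           # → False
```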
🧭 Deterministic Callback IDs (UUIDv7)
Guardrail and observability callbacks now use UUIDv7-based IDs instead of UUIDv4. UUIDv7 is time-ordered and faster to generate, improving traceability and event ordering in high-concurrency and distributed systems while maintaining global uniqueness.
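For illustration, a UUIDv7 is built per RFC 9562 from a 48-bit millisecond timestamp followed by version, variant, and random bits — so later IDs sort after earlier ones. This is a sketch only; L0's generator may differ, and Python 3.14+ ships `uuid.uuid7` in the stdlib:

```python
import os
import time
import uuid

def uuid7() -> uuid.UUID:
    """Sketch of RFC 9562 UUIDv7; illustrative, not L0's actual generator."""
    ts_ms = time.time_ns() // 1_000_000
    rand_a = int.from_bytes(os.urandom(2), "big") & 0xFFF             # 12 bits
    rand_b = int.from_bytes(os.urandom(8), "big") & ((1 << 62) - 1)   # 62 bits
    value = (ts_ms & ((1 << 48) - 1)) << 80   # timestamp in the top 48 bits
    value |= 0x7 << 76                        # version 7
    value |= rand_a << 64
    value |= 0b10 << 62                       # RFC 4122 variant
    value |= rand_b
    return uuid.UUID(int=value)

a = uuid7()
time.sleep(0.002)
b = uuid7()
print(a < b)  # time-ordered: the later ID sorts after the earlier one → True
```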
🔥 Benchmark Results
Test Environment
- CPU: Apple M1 Max (10 cores)
- Runtime: Python 3.13, pytest 9 with pytest-asyncio 1.3.0
- Methodology: Mock token streams with zero inter-token delay to measure pure L0 overhead
| Scenario | Tokens/s | Avg Duration | TTFT | Overhead |
|---|---|---|---|---|
| Baseline (raw streaming) | 1,518,271 | 1.32 ms | 0.02 ms | - |
| L0 Core (no features) | 551,696 | 3.63 ms | 0.08 ms | 175% |
| L0 + JSON Guardrail | 469,922 | 4.26 ms | 0.07 ms | 223% |
| L0 + All Guardrails | 367,328 | 5.44 ms | 0.08 ms | 313% |
| L0 + Drift Detection | 119,758 | 16.70 ms | 0.08 ms | 1166% |
| L0 Full Stack | 108,257 | 18.48 ms | 0.07 ms | 1301% |
📦 Installation
pip install ai2070-l0
# or
pip install ai2070-l0[openai]
pip install ai2070-l0[litellm]
🙀 L0-Python 0.18.0 - Full Pydantic Model Suite
This release delivers a complete Pydantic model export layer for every major L0 type.
✨ New: Full Pydantic Model Suite (l0.pydantic)
L0 now provides a complete Pydantic BaseModel mirror of every major internal dataclass.
You can now import Pydantic equivalents for:
- Core types (`StateModel`, `RetryModel`, `TimeoutModel`, `TelemetryModel`, etc.)
- Consensus models
- Drift detection
- Guardrails
- Metrics snapshots
- Parallel/race operations
- Pipeline execution
- Pool operations
- Event sourcing + replay
- Observability events
- Windowing/document chunking
Example:
from l0.pydantic import StateModel, RetryModel, DriftResultModel
state = StateModel(content="hello", token_count=5)
json_data = state.model_dump_json()
schema = StateModel.model_json_schema()
This enables:
- Typed JSON schemas for OpenAPI/SDKs
- Runtime-safe structured logging
- Interop with FastAPI / Litestar
- Persisting structured observability events
- Easier debugging & replay
📦 The new module contains over 1,500 lines of typed models, covering all L0 dataclasses.
📈 Benchmark Improvements
BENCHMARKS.md received several updates:
- Updated environment to Python 3.13, pytest 9, and pytest-asyncio 1.3.0
- Clarified methodology
- Updated Nvidia Blackwell section
- Added Python 3.14 performance note: Pydantic import overhead currently impacts async iteration speed by ~30% in Python 3.14; this appears to be a Pydantic compatibility issue, not a Python regression
- Updated instructions for running benchmarks (now explicitly using Python 3.13)
🧩 Summary of Changes
| Area | Change |
|---|---|
| Pydantic Export Layer | Full Pydantic BaseModel suite for all L0 types |
| README | New Pydantic section + improvements |
| Benchmarks | Updated environment, performance notes, 3.14 caveats, commands |
| Events | Updated/expanded Pydantic event definitions |
| Testing | New comprehensive Pydantic model tests |
🎯 Why This Matters
This release lays the foundation for:
- Strong typing across every L0 subsystem
- First-class OpenAPI / schema-driven integrations
- Richer tooling: dashboards, telemetry pipelines, logging processors
- Fully typed observability + replay pipelines
- Easier internal and external adapter development
L0 now provides one of the most complete type-model sets in the Python AI ecosystem.
🐍 Python v0.17.0 - High-Throughput Upgrade
The Python runtime for L0 receives the same performance-focused overhaul as the TypeScript version targeting Nvidia Blackwell support. This release introduces incremental JSON guardrails, sliding-window drift detection, new high-throughput defaults, and a brand-new benchmark suite demonstrating Python’s ability to sustain 120K+ tokens/sec.
This update includes major internal upgrades across guardrails and drift detection.
✨ Highlights
1. ⚡ Incremental JSON Guardrails (O(delta) cost)
json_rule() has been rewritten to match the new TS architecture:
- New `IncrementalJsonState` dataclass
- Tracks braces, brackets, string/escape state incrementally
- Only processes delta (new characters), not full content
- Full `analyze_json_structure()` executed only at stream completion
- Automatic state reset on new/shortened streams
Result: ~5–10× faster per-token guardrail checks under streaming load.
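A simplified sketch of the incremental tracking idea — L0's `IncrementalJsonState` handles more (resets, richer structure analysis), but the O(delta) core looks like this:

```python
from dataclasses import dataclass

@dataclass
class JsonState:
    depth: int = 0          # net open braces/brackets
    in_string: bool = False
    escaped: bool = False
    seen: int = 0           # characters already processed

    def feed(self, content: str) -> None:
        for ch in content[self.seen:]:   # only the new delta, not full content
            if self.escaped:
                self.escaped = False
            elif ch == "\\" and self.in_string:
                self.escaped = True
            elif ch == '"':
                self.in_string = not self.in_string
            elif not self.in_string and ch in "{[":
                self.depth += 1
            elif not self.in_string and ch in "}]":
                self.depth -= 1
        self.seen = len(content)

state = JsonState()
state.feed('{"a": [1, ')            # partial stream: still open
state.feed('{"a": [1, 2]}')         # same stream, grown; only "2]}" is scanned
print(state.depth)  # balanced → 0
```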
2. 🎯 Sliding Window Drift Detection
DriftConfig now includes:
sliding_window_size: int = 500
Drift detection now:
- Analyzes only the last N characters
- Meta commentary, repetition, markdown collapse, tone shift all run on the window
- Reduces drift-detection cost by O(content_length) → O(window_size)
- Matches the TS implementation for cross-platform parity
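The windowed check reduces to slicing the tail before analysis; the meta-commentary pattern below is an illustrative stand-in for L0's detectors:

```python
import re

SLIDING_WINDOW_SIZE = 500  # the documented DriftConfig default

META_RE = re.compile(r"\bas an ai\b", re.IGNORECASE)  # illustrative drift signal

def check_drift(content: str, size: int = SLIDING_WINDOW_SIZE) -> bool:
    window = content[-size:]   # O(size) work, regardless of total content length
    return META_RE.search(window) is not None

long_output = "data " * 50_000 + "As an AI model, I cannot continue."
print(check_drift(long_output))       # → True
print(check_drift("data " * 50_000))  # → False
```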
3. 🚀 New High-Throughput Default Intervals
Python now uses the same optimized defaults as TS:
| Interval | Old | New |
|---|---|---|
| Guardrails | 5 tokens | 15 |
| Drift | 10 tokens | 25 |
| Checkpoint | 10 tokens | 20 |
Updated in ADVANCED.md and CheckIntervals (src/l0/types.py).
4. 🧪 New Benchmark Suite (BENCHMARKS.md)
Full benchmarking added (99 additions):
- Baseline vs core vs guardrails vs drift vs full-stack
- Measured on Apple M1 Max with Python 3.13
- Python achieves 1.5M tokens/sec raw iteration and 120K TPS full-stack with all guardrails enabled
- Ready for 1000+ TPS Nvidia Blackwell inference loads
Benchmarks include reproducible pytest commands.
🗑️ Targeted Deletions / Optimization Removals
- Removed old full-content drift detection paths
- Removed malformed-pattern reporting in streaming phase (now done incrementally)
- Removed obsolete default interval values (5/10/10)
- Removed non-window-based drift comparisons to last full content
L0 for Python - Initial Release (Full Lifecycle + Event Compatibility)
This is the first release of L0 for Python, the deterministic execution substrate for reliable AI streaming - now with full lifecycle parity and event-type compatibility with the TypeScript implementation.
L0 provides the missing reliability layer for all AI streams: deterministic token delivery, retries, fallbacks, guardrails, drift detection, checkpoint resumption, network protection, and full observability - all transparently wrapped around any LLM provider stream.
This release is built for production workloads and ships with 1,800+ tests, real adapter integrations for OpenAI and LiteLLM (100+ providers), and a fully instrumented streaming runtime covering 25+ structured lifecycle events.
🔥 Key Highlights
✅ Full Lifecycle Compatibility
The Python version now includes the complete deterministic lifecycle flow - retries, fallbacks, checkpoints, resume logic, guardrail phases, drift detection, tool-call phases, and completion flow identical in semantics to the TypeScript implementation.
All lifecycle callbacks (on_start, on_event, on_violation, on_retry, on_fallback, on_resume, on_timeout, etc.) are implemented and follow the same event order and guarantees.
🎛️ Central Event Bus with 25+ Structured Event Types
This release introduces the full observability and event-sourcing infrastructure:
- `SESSION_START`, `STREAM_INIT`, `ADAPTER_DETECTED`
- `TIMEOUT_*`, `RETRY_*`, `FALLBACK_*`
- `GUARDRAIL_*`, `DRIFT_*`, `CHECKPOINT_SAVED`
- `TOOL_REQUESTED`, `TOOL_RESULT`, `TOOL_ERROR`
- `SESSION_SUMMARY` & `SESSION_END`
These events enable complete introspection, replay, debugging, supervision, and telemetry in production systems.
⚡ Deterministic Streaming Runtime
- Token-by-token normalization
- Timeout enforcement (initial + inter-token)
- Checkpointing and last-known-good-token resumption
- Drift detection & pattern-based guardrails
- Network protection across 12+ failure patterns
🔁 Smart Retries & Fallbacks
- Distinguishes model errors from network/transient errors
- Sequential fallback chain with `on_fallback` telemetry
- AWS-style fixed-jitter backoff by default
- Full retry/fallback reasoning surfaced through lifecycle events
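One common AWS-documented jitter variant ("equal jitter": half the delay fixed, half random) can be sketched as follows — L0's exact formula, base, and cap defaults may differ:

```python
import random

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    # Exponential growth, capped, with half the delay fixed and half jittered
    exp = min(cap, base * (2 ** attempt))
    return exp / 2 + random.uniform(0, exp / 2)

for attempt in range(4):
    print(f"attempt {attempt}: sleep ~{backoff_delay(attempt):.2f}s")
```

The fixed component guarantees a minimum spacing between attempts, while the jitter spreads retries out so concurrent clients don't retry in lockstep.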
🧱 Structured Output with Automatic Repair
- Native Pydantic integration
- Corrects malformed JSON (missing braces, broken fences, trailing commas)
- Guaranteed schema validity
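A naive sketch of repairs for the failure modes listed above (markdown fences, trailing commas, missing closing braces) — L0's repair logic is more thorough; this only illustrates the shape of the problem:

```python
import json
import re

def repair_json(text: str) -> str:
    text = text.strip()
    text = re.sub(r"^```(?:json)?\s*|\s*```$", "", text)  # drop markdown fences
    text = re.sub(r",\s*([}\]])", r"\1", text)            # remove trailing commas
    # Naive balancing: counts ignore braces inside strings
    text += "]" * max(0, text.count("[") - text.count("]"))
    text += "}" * max(0, text.count("{") - text.count("}"))
    return text

broken = '```json\n{"name": "Ada", "tags": ["math",]\n```'
print(json.loads(repair_json(broken)))
```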
🔌 Adapters
- OpenAI adapter (auto-detected)
- LiteLLM adapter (100+ providers)
- Full API-compatible adapter protocol for custom providers
🧪 Battle-Tested
- 1,800+ unit tests
- 100+ integration tests simulating real streaming conditions
📦 Installation
pip install ai2070-l0
# or
pip install ai2070-l0[openai]
pip install ai2070-l0[litellm]
🏁 Quick Example
import asyncio
from openai import AsyncOpenAI
import l0

async def main():
    client = l0.wrap(AsyncOpenAI())
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}],
        stream=True,
    )
    async for event in response:
        if event.is_token:
            print(event.text, end="", flush=True)

asyncio.run(main())