11 Mar 15:32

github-actions

bcf7496

v2.3.0 Latest

2.3.0 (2026-03-11)

Bug Fixes

add --break-system-packages for pip installs + pip.conf bypass PEP 668 (14430c4)
allow clippy too_many_arguments for run_task_pipeline (6eb69c2)
auto-install deps, python3 symlink, detect full commands in fail_to_pass, language-aware test scripts (a38497f)
config test race condition with env var mutex (2963325)
correct Basilica API types and SSH key support (63d8174)
enable apt/sudo in Basilica containers (d83cb8c)
expose agent_output and agent_patch in TaskResult and API responses (348c251)
extract_agent_only for /evaluate - no tasks/ dir required (2b90ee1)
filter out apt-get/system commands from install (Basilica blocks syscalls), keep project-level installs (e5365da)
full clone for commit checkout, explicit pip/pytest symlinks (a0c1d6f)
handle null test_patch from HuggingFace API (deserialize null as empty string) (492d068)
increase clone/install timeout from 180s to 600s (95cecc3)
install base tools, runtimes, and filter redundant deps for Basilica (80a3a0c)
install corepack/yarn/pnpm globally via npm in Dockerfile (b7183e8)
move workspace to /home/agent/sessions, fix node_modules permissions, improve agent code error handling (1ced355)
normalize repo URL in parse_task (add github.com prefix) (398a6fd)
pip 22 compatibility for base tools and install commands (68bb93f)
remove redundant into_iter() for clippy (eaf2a7c)
report task status incrementally during batch execution (4440fd8)
resolve all clippy warnings for CI (2b3ae9d)
revert Dockerfile git-lfs changes, add GIT_LFS_SKIP_SMUDGE to snapshot clone (7130823)
run agent from repo_dir CWD, use absolute path to agent.py (cc6bcde)
run as root (Basilica blocks sudo), remove sudo prefix logic (477a433)
sudo for apt-get in install commands, add golang/corepack/sudo to Dockerfile (1aceb88)
upgrade Go to 1.23 and Node to 20 LTS in Dockerfile (67ca713)
use :id path params for Axum 0.7 (not {id} which is 0.8) (5dfa0c1)
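
As an illustration of the repo URL normalization fix (398a6fd), here is a minimal sketch in Rust using only the standard library. The function name `normalize_repo_url` and its handling of inputs that already carry a scheme or host are assumptions for illustration. The release notes only state that a github.com prefix is added in parse_task.

```rust
/// Hypothetical helper (not taken from the project source): turn a bare
/// "owner/repo" value into a full GitHub URL.
fn normalize_repo_url(repo: &str) -> String {
    let repo = repo.trim();
    if repo.starts_with("https://") || repo.starts_with("http://") || repo.starts_with("git@") {
        // Already a complete URL or SSH remote: leave it untouched (assumed behaviour).
        repo.to_string()
    } else if repo.starts_with("github.com/") {
        // Host is present but the scheme is missing.
        format!("https://{repo}")
    } else {
        // Bare "owner/repo": add the github.com prefix.
        format!("https://github.com/{repo}")
    }
}
```

With this sketch, both `owner/repo` and `github.com/owner/repo` would resolve to the same clone URL.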

Features

/evaluate endpoint using stored agent + TRUSTED_VALIDATORS whitelist (b6aee7a)
add /code-hash endpoint for code integrity verification (0a8e01b)
add /upload-agent-json endpoint for JSON-based agent upload (9cfa1da)
add Basilica API client for container provisioning (8a0afca)
add install field from swe-forge dataset, fix default split to train, add openssh-client (737ab1f)
add POST /submit_tasks endpoint + fix HuggingFace dataset compat (d92444c)
agent user with sudo for apt-install, run all commands as non-root agent (e3f574a)
agent ZIP upload frontend with env vars + SUDO_PASSWORD auth (3aa5184)
auto-install language runtimes from install_config version fields (25b2e94)
change default max_concurrent_tasks from 8 to 6, support CONCURRENTLY_TASKS env var (eaba581)
extract full agent project instead of concatenating files (3ac1023)
fat Docker image with all language runtimes (java, rust, pnpm, unzip, etc.) (3855f2d)
fetch task definitions from HF repo (workspace.yaml + tests/), remove auto_install hack (7162a39)
propagate agent_env to run_agent and pass --instruction arg to Python agents (d922264)
replace per-file HF downloads with bulk git clone snapshot (6036b78)
run each task in its own Basilica container via SSH (432107b)
swe-bench/swe-forge integration (814259e)

extend WorkspaceConfig with fail_to_pass/pass_to_pass/install_config/difficulty fields
parse swe-forge workspace.yaml native fields as test script fallback
capture git diff (agent patch) after agent execution
add /dataset endpoint to fetch from HuggingFace CortexLM/swe-forge
wire fail_to_pass/pass_to_pass in dataset entry conversion
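
The new concurrency default (eaba581) could be read along these lines. This is a sketch only: the function name, the constant name, and the fallback to the default on a missing, unparsable, or zero value are assumptions. The release notes confirm only the default of 6 and the CONCURRENTLY_TASKS variable name.

```rust
/// Default stated in the release notes (changed from 8 to 6).
const DEFAULT_MAX_CONCURRENT_TASKS: usize = 6;

/// Hypothetical parser: takes the raw value of CONCURRENTLY_TASKS, if any.
/// Falling back to the default on invalid or zero input is an assumption.
fn max_concurrent_tasks(raw: Option<&str>) -> usize {
    raw.and_then(|v| v.trim().parse::<usize>().ok())
        .filter(|n| *n > 0)
        .unwrap_or(DEFAULT_MAX_CONCURRENT_TASKS)
}
```

A caller would typically pass `std::env::var("CONCURRENTLY_TASKS").ok().as_deref()` as the argument.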

20 Feb 22:07

github-actions

v2.2.0

dac8c4a

2.2.0 (2026-02-20)

Features

evaluation: add evaluation module using platform-challenge-sdk types (#6) (78a369e)

20 Feb 21:39

github-actions

v2.1.0

bfceaeb

2.1.0 (2026-02-20)

Features

integrate HuggingFace dataset handler with task/evaluation system (db3ba95)

18 Feb 17:23

github-actions

v2.0.0

f6d165c

2.0.0 (2026-02-18)

Features

auth: replace static hotkey/API-key auth with Bittensor validator whitelisting and 50% consensus (#5) (a573ad0)

BREAKING CHANGES

auth: the WORKER_API_KEY env var and X-Api-Key header are no longer required. All validators on Bittensor netuid 100 with sufficient stake are auto-whitelisted.
ci: trigger CI run
fix(security): address auth bypass, input validation, and config issues

Move nonce consumption AFTER signature verification in verify_request() to prevent attackers from burning legitimate nonces via invalid signatures
Fix TOCTOU race in NonceStore::check_and_insert() using atomic DashMap entry API instead of separate contains_key + insert
Add input length limits for auth headers (hotkey 128B, nonce 256B, signature 256B) to prevent memory exhaustion via oversized values
Add consensus_threshold validation in Config::from_env(): must be in range (0.0, 1.0], panics at startup if invalid
Add saturating conversion for consensus required calculation to prevent integer overflow on f64→usize cast
Add tests for all security fixes
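
The two nonce fixes in this list can be sketched as follows. This is not the project's code: it swaps the DashMap entry API for a std `Mutex<HashSet>` so the example has no external dependencies, and the `signature_ok` flag stands in for the real sr25519 check. The names `NonceStore`, `check_and_insert`, and `verify_request` come from the notes; their signatures here are assumptions.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Nonce length limit stated in the release notes (256 bytes).
const MAX_NONCE_LEN: usize = 256;

/// Stand-in for the real store, which the notes say is backed by DashMap.
struct NonceStore {
    seen: Mutex<HashSet<String>>,
}

impl NonceStore {
    fn new() -> Self {
        Self { seen: Mutex::new(HashSet::new()) }
    }

    /// Checks and records the nonce in one locked step, so no other thread
    /// can slip in between a separate "contains" and "insert".
    /// Returns true if the nonce had not been seen before.
    fn check_and_insert(&self, nonce: &str) -> bool {
        let mut seen = self.seen.lock().unwrap_or_else(|e| e.into_inner());
        seen.insert(nonce.to_string())
    }
}

/// Hypothetical request check: the nonce is consumed only after the
/// signature has been accepted, so a bad signature cannot burn a nonce.
fn verify_request(store: &NonceStore, nonce: &str, signature_ok: bool) -> Result<(), &'static str> {
    if nonce.is_empty() || nonce.len() > MAX_NONCE_LEN {
        return Err("invalid nonce length");
    }
    if !signature_ok {
        return Err("invalid signature");
    }
    if !store.check_and_insert(nonce) {
        return Err("nonce already used");
    }
    Ok(())
}
```

In this ordering, a request with an invalid signature leaves the nonce unused, and a later valid request with the same nonce still succeeds once.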

fix(dead-code): remove orphaned default_concurrent fn and unnecessary allow(dead_code)
fix: code quality issues in bittensor validator consensus

Extract magic number 100 to configurable MAX_PENDING_CONSENSUS
Restore #[allow(dead_code)] on DEFAULT_MAX_OUTPUT_BYTES constant
Use anyhow::Context instead of map_err(anyhow::anyhow!) in validator_whitelist

fix(security): address race condition, config panic, SS58 checksum, and container security

consensus.rs: Fix TOCTOU race condition in record_vote by using DashMap entry API (remove_entry) to atomically check votes and remove entry while holding the shard lock, preventing concurrent threads from inserting votes between drop and remove
config.rs: Replace assert! with proper Result<Self, String> return from Config::from_env() to avoid panicking in production on invalid CONSENSUS_THRESHOLD values
main.rs: Update Config::from_env() call to handle Result with expect
auth.rs: Add SS58 checksum verification using Blake2b-512 (correct Substrate algorithm) in ss58_to_public_key_bytes to reject addresses with corrupted checksums; previously only decoded base58 without validating the 2-byte checksum suffix
Dockerfile: Add non-root executor user for container runtime security
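
The config.rs change, together with the earlier saturating-conversion fix, might look roughly like the sketch below. The function names, the error messages, rounding up, and the minimum of one vote are all assumptions. The notes confirm only the (0.0, 1.0] range, the `Result<_, String>` style return, and the concern about the f64→usize cast.

```rust
/// Hypothetical validator for the CONSENSUS_THRESHOLD value: returns an
/// error string instead of panicking when the value is out of range.
fn parse_consensus_threshold(raw: &str) -> Result<f64, String> {
    let value: f64 = raw
        .trim()
        .parse()
        .map_err(|e| format!("CONSENSUS_THRESHOLD is not a number: {e}"))?;
    // NaN fails both comparisons, so it is rejected here as well.
    if value > 0.0 && value <= 1.0 {
        Ok(value)
    } else {
        Err(format!("CONSENSUS_THRESHOLD must be in (0.0, 1.0], got {value}"))
    }
}

/// Hypothetical vote count: rounds up and keeps the result in
/// 1..=validators (with a floor of 1 when there are no validators).
fn required_votes(validators: usize, threshold: f64) -> usize {
    let needed = (validators as f64 * threshold).ceil();
    // Float-to-integer `as` casts saturate in Rust (and map NaN to 0),
    // so this cast cannot overflow.
    (needed as usize).clamp(1, validators.max(1))
}
```

The caller can then log the error and exit cleanly, which matches the later main.rs change described in these notes.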

fix(dead-code): remove unused max_output_bytes config field and constant

Remove DEFAULT_MAX_OUTPUT_BYTES constant and max_output_bytes Config field
that were defined and populated from env but never read anywhere outside
config.rs. Both had #[allow(dead_code)] annotations suppressing warnings.

fix(quality): replace expect/unwrap with proper error handling, extract magic numbers to constants

main.rs: Replace .expect() on Config::from_env() with match + tracing::error! + process::exit(1)
validator_whitelist.rs: Extract retry count (3) and backoff base (2) to named constants
validator_whitelist.rs: Replace unwrap_or_else on Option with if-let pattern
consensus.rs: Extract reaper interval (30s) to REAPER_INTERVAL_SECS constant

fix(security): address multiple security vulnerabilities in PR files

consensus.rs: Remove archive_data storage from PendingConsensus to prevent memory exhaustion (up to 50GB with 100 pending × 500MB each). Callers now use their own archive bytes since all votes for the same hash have identical data.
handlers.rs: Stream multipart upload with per-chunk size enforcement instead of buffering entire archive before checking size limit. Sanitize error messages to not leak internal details (file paths, extraction errors) to clients; log details server-side instead.
auth.rs: Add nonce format validation requiring non-empty printable ASCII characters (defense-in-depth against log injection and empty nonce edge cases).
main.rs: Replace .unwrap() on TcpListener::bind and axum::serve with proper error logging and process::exit per AGENTS.md rules.
ws.rs: Replace .unwrap() on serde_json::to_string with unwrap_or_default() to comply with AGENTS.md no-unwrap rule.
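
The per-chunk size enforcement described for handlers.rs can be illustrated with a synchronous, std-only sketch. The real handler streams an Axum multipart body asynchronously. The function name `read_with_limit`, the 8 KiB chunk size, and the error text are assumptions made for this example.

```rust
use std::io::{self, Read};

/// Hypothetical helper: reads `reader` chunk by chunk and stops as soon as
/// the running total would exceed `max_bytes`, instead of buffering the
/// whole upload first and checking its size afterwards.
fn read_with_limit<R: Read>(mut reader: R, max_bytes: usize) -> io::Result<Vec<u8>> {
    let mut out = Vec::new();
    let mut chunk = [0u8; 8192];
    loop {
        let n = match reader.read(&mut chunk) {
            Ok(n) => n,
            // Retry reads that were interrupted by a signal.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        if n == 0 {
            // End of stream: everything fit within the limit.
            return Ok(out);
        }
        if out.len() + n > max_bytes {
            // Generic message only; details would be logged server-side.
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                "archive exceeds size limit",
            ));
        }
        out.extend_from_slice(&chunk[..n]);
    }
}
```

Because the check runs on every chunk, memory use stays bounded by the limit plus one chunk rather than by the size of whatever the client sends.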

fix(dead-code): rename misleading underscore-prefixed variable in consensus
fix(quality): replace unwrap/expect with proper error handling in production code

main.rs:21: Replace .parse().unwrap() on tracing directive with unwrap_or_else fallback to INFO level directive
main.rs:36: Replace .expect() on workspace dir creation with error log + process::exit(1) pattern
main.rs:110: Replace .expect() on ctrl_c handler with if-let-Err that logs and returns gracefully
executor.rs:189: Replace semaphore.acquire().unwrap() with match that handles closed semaphore by creating a failed TaskResult

All changes follow AGENTS.md rule: no .unwrap()/.expect() in production code paths. Test code is unchanged.

docs: refresh AGENTS.md

17 Feb 16:46

github-actions

v1.2.0

3e45f6e

1.2.0 (2026-02-17)

Features

auth: add sr25519 signature + nonce verification (dc8d8d4)
auth: require API key alongside whitelisted hotkey (#3) (887f72b)

17 Feb 16:01

github-actions

v1.1.0

4cebf27

1.1.0 (2026-02-17)

Features

executor: add SWE-bench batch evaluation with hotkey auth and WebSocket streaming (#2) (8bfa8ee)

17 Feb 15:54

github-actions

v1.0.0

0c403e8

1.0.0 (2026-02-17)

Bug Fixes

bump Rust Docker image to 1.85 for edition2024 support (209f460)
lowercase GHCR image tags for Docker push (89449f9)
remove target-cpu=native to avoid SIGILL on Blacksmith runners (22bcb85)
use rust:1.93-bookworm Docker image (ddd1a24)

Features

initial term-executor — remote evaluation server for Basilica (18f4f67)
production-ready implementation with Basilica integration (5797025)

Performance Improvements

minimal Docker image - remove all language runtimes from executor (38058e8)

Releases: PlatformNetwork/term-executor
