Pull requests: UKGovernmentBEIS/inspect_ai

Pull requests list

DRAFT: control channel
#4082 opened May 29, 2026 by ransomr Collaborator Draft
5 tasks
scoring-phase resume via solver_done marker
#4078 opened May 28, 2026 by epatey Collaborator Draft
3 of 5 tasks
docs(scorer): clarify chat_history comment for include_history=True
#4073 opened May 28, 2026 by RecreationalMath Contributor
1 of 5 tasks
docs: clarify include_history grading behavior
#4072 opened May 28, 2026 by he-yufeng Contributor
Bound transcript memory for long-running samples qualified
#4062 opened May 27, 2026 by rasmusfaber Contributor
3 of 5 tasks
Add OrcaRouter model provider
#4057 opened May 27, 2026 by xilema2
2 of 5 tasks
vllm-completions: accept pre-tokenized prompts
#4055 opened May 26, 2026 by Butanium Contributor
test: cover eval_set bundling after retries
#4052 opened May 26, 2026 by deepujain Contributor Draft
1 of 6 tasks
fix: recreate sandbox context during scoring
#4051 opened May 26, 2026 by deepujain Contributor Draft
1 of 5 tasks
fix: warn on unbounded message conversations
#4049 opened May 26, 2026 by deepujain Contributor Draft
2 of 5 tasks
fix: align vLLM logging with Inspect log level
#4047 opened May 26, 2026 by deepujain Contributor Draft
2 of 5 tasks
feat: add OpenRouter app attribution headers
#4046 opened May 26, 2026 by deepujain Contributor Draft
fix: resolve registry task sandbox paths from module dir
#4045 opened May 26, 2026 by deepujain Contributor Draft
1 of 5 tasks
fix: support explicit Azure OpenAI AD token auth
#4044 opened May 26, 2026 by deepujain Contributor Draft
2 of 5 tasks
fix: map metrics to solver score names
#4043 opened May 26, 2026 by deepujain Contributor Draft
2 of 5 tasks
docs: clarify ReAct loop depth controls
#4042 opened May 26, 2026 by deepujain Contributor Draft
1 of 5 tasks
docs: add LLM-to-LLM conversation eval example
#4041 opened May 26, 2026 by deepujain Contributor Draft
1 of 5 tasks
Add Krippendorff's α metric for multi-judge agreement
#4035 opened May 25, 2026 by joesposito8 Contributor
2 of 5 tasks