
fix: relabel qwen3.5:9b 2026-04-07 as BrowseComp (was mislabeled SimpleQA) #12

Merged
LearningCircuit merged 1 commit into main from fix-browsecomp-mislabel
Apr 9, 2026

Conversation

@LearningCircuit
Owner

Summary

The qwen3.5:9b 2026-04-07 submission (merged as PR #10) was actually a BrowseComp run, not SimpleQA. The LDR exporter had a second bug: `dataset: SimpleQA` was hard-coded regardless of the actual benchmark.

This PR:

  • Changes `dataset: SimpleQA` → `dataset: BrowseComp`
  • Moves from `results/simpleqa/` to `results/browsecomp/langgraph-agent/serper/`

The exporter fix is in LearningCircuit/local-deep-research#3442.
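
The two changes listed above could be scripted roughly as follows. This is a minimal sketch, not the commands actually used for this PR: the helper name `relabel_result` is hypothetical, and it assumes the result file is a YAML document with exactly one top-level `dataset:` key.

```python
import re
from pathlib import Path


def relabel_result(src: Path, dest_dir: Path, new_dataset: str) -> Path:
    """Rewrite the top-level `dataset:` field of src and move it into dest_dir.

    Hypothetical helper: assumes exactly one unindented `dataset:` line.
    """
    text = src.read_text(encoding="utf-8")
    # Match only an unindented `dataset:` key so nested keys are left alone.
    # A callable replacement avoids backslash escapes in new_dataset.
    new_text, count = re.subn(
        r"(?m)^dataset:.*$", lambda _: f"dataset: {new_dataset}", text
    )
    if count != 1:
        raise ValueError(f"expected one top-level dataset field, found {count}")
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / src.name
    dest.write_text(new_text, encoding="utf-8")
    src.unlink()  # remove the file from its old location
    return dest
```

For this PR the call would look like `relabel_result(Path("results/simpleqa/<file>.yaml"), Path("results/browsecomp/langgraph-agent/serper"), "BrowseComp")`, where `<file>` stands for the submission's actual file name, which is not shown here.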

…ed SimpleQA)

The LDR YAML exporter hard-coded dataset as "SimpleQA". This run was
actually xbench_deepsearch. Move to results/xbench-deepsearch/ and fix
the dataset field.

LDR exporter fix: LearningCircuit/local-deep-research#3442
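
For illustration only, the class of bug described here (a hard-coded dataset name) and one way to avoid it might look like the sketch below. It is not the code from LearningCircuit/local-deep-research#3442; the function name `export_result_yaml` and the record keys `model`, `dataset`, and `date` are assumptions.

```python
def export_result_yaml(run: dict) -> str:
    """Render a minimal result file from a run record.

    Hypothetical sketch: the keys 'model', 'dataset' and 'date' are assumed,
    not taken from the real LDR exporter.
    """
    dataset = run.get("dataset")
    if not dataset:
        # Fail loudly instead of falling back to a default such as "SimpleQA",
        # which is how a mislabeled submission could slip through.
        raise ValueError("run record has no dataset name; refusing to guess")
    lines = [
        f"model: {run['model']}",
        f"dataset: {dataset}",  # taken from the run, never hard-coded
        f"date: {run['date']}",
    ]
    return "\n".join(lines) + "\n"
```

Raising on a missing dataset name, rather than substituting a default, means a mislabeled file cannot be produced silently.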
LearningCircuit force-pushed the fix-browsecomp-mislabel branch from c9df301 to 00aef39 on April 9, 2026 18:40
LearningCircuit merged commit b522a98 into main on Apr 9, 2026
9 checks passed
LearningCircuit added a commit that referenced this pull request Apr 10, 2026
peter-evans/create-pull-request restores the workspace to the base
branch HEAD after creating its PR, reverting the freshly rebuilt
leaderboards/CONTRIBUTORS files in the working tree. Running HF sync
after this step uploaded the OLD main state, causing the HF dataset
to silently lag behind the actual repo state.

Symptom: after PR #12 (xbench relabel) merged, the publish workflow
reported "Sync CSVs + README to Hugging Face: success" with the HF
API responding "No files have been modified since last commit" — the
workspace files at sync time matched HF because peter-evans had
already restored them to main HEAD (which still had the pre-rebuild
SimpleQA mislabel).

Fix: move HF sync BEFORE the create-pull-request step so it operates
on the freshly rebuilt files.
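
The ordering constraint described in this commit message could be guarded with a small check like the one below. This is a sketch under assumptions: the workflow's steps are taken to be already parsed into a list of dicts (as a YAML loader would produce), the function name `sync_runs_before_pr` is hypothetical, and the step names in the usage note are illustrative.

```python
def sync_runs_before_pr(steps: list[dict], sync_name: str, pr_action: str) -> bool:
    """Return True if the step named sync_name precedes the step using pr_action.

    `steps` mirrors a workflow job's `steps:` list after YAML parsing.
    """
    sync_idx = next(
        (i for i, s in enumerate(steps) if s.get("name") == sync_name), None
    )
    pr_idx = next(
        (i for i, s in enumerate(steps)
         if str(s.get("uses", "")).startswith(pr_action)),
        None,
    )
    if sync_idx is None or pr_idx is None:
        raise ValueError("could not find both the sync step and the PR step")
    # The PR action resets the workspace to the base branch, so the sync
    # must already have uploaded the rebuilt files by then.
    return sync_idx < pr_idx
```

With steps such as `[{"name": "Rebuild leaderboards"}, {"name": "Sync CSVs + README to Hugging Face"}, {"uses": "peter-evans/create-pull-request@v6"}]` (the first name and the version pin are assumed), the check returns True; swapping the last two steps reproduces the broken order and returns False.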