GroundingDINO Object Detection: Open-vocabulary detector (grounding-dino-tiny) on RTX 5080. Per-object bounding boxes for accurate positioning.
D435 Camera Mounted on Go2 Head: MJCF named cameras (d435_rgb, d435_depth) fixed to base_link. No more free-camera approximation.
Per-Object Depth Projection: Each detected object gets independent world (x,y,z) from bbox center depth + camera intrinsics + robot pose. ~0.8m accuracy.
VLM Timeout Fix: Resize to 160px + quality 50, switch to gpt-4o-mini. 1-2s response (was timing out at 45s).
Vector Nav Stack: localPlanner, pathFollower, terrainAnalysis, FAR planner
TARE autonomous exploration with seed walk + nav flag handoff
GroundingDINO object detection → D435 depth → world coordinate mapping
VLM room identification + scene description (GPT-4o-mini)
Multi-room patrol with spatial memory recording
Agent SDK: natural language → Go2 skills (12 skills, Chinese + English)
SceneGraph persists across sessions
RViz: color-coded rooms, FOV cones, trajectory, objects at detected positions
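The per-object depth projection listed above (bbox center depth + camera intrinsics + robot pose) can be sketched roughly as follows. This is a minimal illustration, not the project's code: the function name, the argument layout, the assumption of a level forward-facing camera, and the yaw-only robot pose are all assumptions made for the example.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def bbox_center_to_world(
    bbox: Tuple[float, float, float, float],        # (x_min, y_min, x_max, y_max) in pixels
    depth_m: float,                                 # depth at the bbox center, in meters
    intrinsics: Tuple[float, float, float, float],  # (fx, fy, cx, cy) pinhole parameters
    robot_pos: Vec3,                                # robot position in the world frame
    heading_rad: float,                             # robot yaw in the world frame
    cam_offset: Vec3 = (0.0, 0.0, 0.0),             # hypothetical camera mount offset in the body frame
) -> Vec3:
    """Project the center of a detection box to a world (x, y, z) point.

    Assumes a level, forward-facing camera and a planar (yaw-only) robot pose.
    """
    x_min, y_min, x_max, y_max = bbox
    fx, fy, cx, cy = intrinsics

    # Pixel at the center of the bounding box.
    u = 0.5 * (x_min + x_max)
    v = 0.5 * (y_min + y_max)

    # Pinhole back-projection into the optical frame (x right, y down, z forward).
    x_c = (u - cx) * depth_m / fx
    y_c = (v - cy) * depth_m / fy
    z_c = depth_m

    # Optical frame -> body frame (x forward, y left, z up), plus mount offset.
    x_b = z_c + cam_offset[0]
    y_b = -x_c + cam_offset[1]
    z_b = -y_c + cam_offset[2]

    # Body frame -> world frame using the robot's yaw and position.
    c, s = math.cos(heading_rad), math.sin(heading_rad)
    x_w = robot_pos[0] + c * x_b - s * y_b
    y_w = robot_pos[1] + s * x_b + c * y_b
    z_w = robot_pos[2] + z_b
    return (x_w, y_w, z_w)
```

Because only the bbox center pixel is sampled, a single noisy depth value moves the whole object, which is consistent with the edge-of-image issue noted under Known Issues.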
Known Issues
Some false detections in MuJoCo (low texture quality → floor detected as desk)
Objects at image edges may have inaccurate depth (D435 depth noise at boundaries)
FAR planner publishes /way_point, but /global_path is sometimes incomplete
BUG-3 FIXED: TARE viewpoints were 0.18m from walls, but the Go2 half-width is 0.25m, so they were unreachable.
Collision margins updated. Exploration now places reachable viewpoints 0.30m+ from walls.
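The arithmetic behind BUG-3 can be illustrated with a small clearance check. This is a sketch only: the constant and function names are hypothetical, and the actual fix lives in the planner's collision margins rather than in a filter like this.

```python
from typing import List, Tuple

GO2_HALF_WIDTH_M = 0.25      # from the note above
MIN_WALL_CLEARANCE_M = 0.30  # clearance now used for viewpoints, per the note above


def is_reachable(wall_dist_m: float, min_clearance_m: float = MIN_WALL_CLEARANCE_M) -> bool:
    """A viewpoint is treated as reachable only if it is far enough from the nearest wall."""
    return wall_dist_m >= min_clearance_m


def filter_viewpoints(viewpoints: List[Tuple[float, float, float]]) -> List[Tuple[float, float, float]]:
    """Keep (x, y, wall_dist_m) viewpoints that satisfy the clearance check."""
    return [vp for vp in viewpoints if is_reachable(vp[2])]
```

With these values, a viewpoint 0.18m from a wall is rejected (it is closer than the 0.25m half-width), while one at 0.30m is kept.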
BUG-1 FIXED: RGB/depth timing mismatch. explore.py and look.py now capture all sensor
data (frame, depth, cam_pose, pos, heading) atomically before slow VLM calls (2-20s).
go2_vnav_bridge.py now uses named d435_rgb/d435_depth cameras instead of free camera.
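The atomic capture described for BUG-1 might look roughly like the sketch below. The class names, the lock-based design, and the field types are assumptions for illustration, not the contents of explore.py or look.py; only the field names (frame, depth, cam_pose, pos, heading) come from the note above.

```python
import copy
import threading
import time
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class SensorSnapshot:
    """All sensor data taken at one instant, so later steps cannot mix timestamps."""
    frame: Any
    depth: Any
    cam_pose: Any
    pos: Tuple[float, float, float]
    heading: float
    stamp: float


class SnapshotSource:
    """Hypothetical holder for the latest sensor state, updated by a bridge thread."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._state: Dict[str, Any] = {}

    def update(self, **fields: Any) -> None:
        # Called whenever new sensor data arrives.
        with self._lock:
            self._state.update(fields)

    def capture(self) -> SensorSnapshot:
        # Copy everything under one lock so RGB, depth and pose belong together.
        with self._lock:
            s = copy.deepcopy(self._state)
        return SensorSnapshot(
            frame=s["frame"],
            depth=s["depth"],
            cam_pose=s["cam_pose"],
            pos=s["pos"],
            heading=s["heading"],
            stamp=time.monotonic(),
        )
```

The caller takes one snapshot, runs the slow VLM call on snapshot.frame, and then projects detections using snapshot.depth, snapshot.pos and snapshot.heading rather than whatever the robot state is by the time the call returns.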
BUG-4 FIXED: Path follower hard-dropped to 0.1 m/s when dist<0.5m. With TARE replanning
at ~1Hz with short paths, the robot was permanently slow. Now: adaptive lookahead (0.8-2.0m,
scaled to path length), gradual decel only at dist<0.3m, speed floor max(0.15, dist*0.5).
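One way to read the BUG-4 speed law is sketched below. The note gives the lookahead range, the 0.3m decel threshold and the floor expression; the scale factor, the cruise speed, and the linear ramp shape are assumptions added for the example.

```python
def adaptive_lookahead(path_length_m: float, scale: float = 0.5,
                       lo: float = 0.8, hi: float = 2.0) -> float:
    """Lookahead distance scaled to path length and clamped to [0.8, 2.0] m.

    `scale` is a hypothetical factor; the note only states the clamp range.
    """
    return min(hi, max(lo, scale * path_length_m))


def target_speed(dist_to_goal_m: float, max_speed: float = 1.0,
                 decel_radius_m: float = 0.3) -> float:
    """Commanded speed: full speed outside the decel radius, ramped inside it.

    `max_speed` and the linear ramp are assumptions; the floor
    max(0.15, dist*0.5) is taken from the note above. Stopping at the goal
    is assumed to be handled elsewhere, since the floor never reaches zero.
    """
    if dist_to_goal_m >= decel_radius_m:
        return max_speed
    ramp = max_speed * dist_to_goal_m / decel_radius_m
    floor = max(0.15, dist_to_goal_m * 0.5)
    return min(max_speed, max(ramp, floor))
```

Inside the 0.3m radius the floor expression always evaluates to 0.15, so in this reading the robot never commands less than 0.15 while following a path, compared with the old hard drop to 0.1 m/s that started at 0.5m.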
TODO
Filter false detections: cross-reference GroundingDINO with VLM objects
Multi-view object fusion: merge detections from different viewpoints
Real D435 driver integration (rs2 → /camera/image + /camera/depth)
Confidence-weighted object position averaging across observations
Object tracking across frames (re-identification)
BUG-2 (wall clipping): Confirm via RViz that /terrain_map marks walls as obstacles post BUG-3 fix.
obstacleHeightThre=0.08 in local_planner_go2.yaml should catch wall points (h>0.08m).
Likely resolves after BUG-3 since TARE no longer requests near-wall paths.
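For the TODO item on confidence-weighted object position averaging, one possible shape is a plain weighted mean over repeated observations of the same object. This is a sketch of an unimplemented idea: the function name and the input layout are hypothetical, and it assumes the observations have already been associated with a single object.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def fuse_positions(observations: Sequence[Tuple[Vec3, float]]) -> Vec3:
    """Confidence-weighted mean of ((x, y, z), confidence) observations."""
    total = sum(conf for _, conf in observations)
    if total <= 0:
        raise ValueError("need at least one observation with positive confidence")
    x = sum(p[0] * conf for p, conf in observations) / total
    y = sum(p[1] * conf for p, conf in observations) / total
    z = sum(p[2] * conf for p, conf in observations) / total
    return (x, y, z)
```

For example, observations at x=0 with confidence 0.25 and x=4 with confidence 0.75 fuse to x=3, so the estimate is pulled toward the more confident detection.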