Description
Agent Diagnostic
- Investigated gateway startup failure on Jetson Orin (tegra-ubuntu, L4T).
- Context discovered: On Orin, veth is missing or unusable, so Docker cannot bring up the gateway with default bridge/NAT. This repo uses host network (`network_mode: host`) for the gateway to work around that. After using host network, the iptables incompatibility appears.
- Root cause: Host kernel uses iptables-legacy tables (L4T / Ubuntu 20.04). The gateway cluster image (Ubuntu 24.04–based) ships iptables-nft as default. kube-proxy inside the container shares the host's network namespace (host network) and invokes the container’s iptables (nft); it cannot read the host’s legacy nat table → "Could not fetch rule set generation id: Invalid argument", kube-proxy exits, k3s shuts down.
- Skills / checks performed:
  - Confirmed failure pattern in container logs: `kube-proxy exited: iptables is not available on this host`, `# Warning: iptables-legacy tables present, use iptables-legacy to see them`, `iptables v1.8.10 (nf_tables): Could not fetch rule set generation id`.
  - Verified on host: `update-alternatives --display iptables` shows link currently points to `/usr/sbin/iptables-legacy`.
  - Tried switching host to iptables-nft to align with container: Docker then fails when creating the gateway network with the same iptables error (Docker daemon calls iptables for NAT; kernel nat table remains legacy on L4T, so nft still fails). Conclusion: on L4T the host must stay on iptables-legacy; the container is the side that has to switch to legacy.
- Mitigations implemented in our fork: (1) Add `--prefer-bundled-bin` to k3s server args in CLI so k3s uses bundled iptables. (2) In cluster image entrypoint, before `exec k3s`, run `update-alternatives --set iptables /usr/sbin/iptables-legacy` (and ip6tables) when available. (3) Add iptables-specific failure diagnosis so the error is not misreported as "Network connectivity issue".
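Mitigation (2) could look roughly like the sketch below. It is not the fork's actual entrypoint: the function name `use_legacy_iptables` and the `SBIN_DIR` override are hypothetical, and it assumes a Debian/Ubuntu-style image where `update-alternatives` manages the `iptables` and `ip6tables` links.

```shell
#!/bin/sh
# Hypothetical entrypoint helper (names are illustrative, not from the repo).
# Switches iptables and ip6tables to the legacy backend when the legacy
# binaries and update-alternatives are both present; otherwise it only logs.
# SBIN_DIR can be overridden so the logic can be exercised outside the image.
use_legacy_iptables() {
    sbin_dir="${SBIN_DIR:-/usr/sbin}"
    for tool in iptables ip6tables; do
        legacy="${sbin_dir}/${tool}-legacy"
        if [ -x "$legacy" ] && command -v update-alternatives >/dev/null 2>&1; then
            update-alternatives --set "$tool" "$legacy" \
                || echo "warning: could not switch $tool to legacy" >&2
        else
            echo "skip: $legacy or update-alternatives not available" >&2
        fi
    done
    # Never fail the entrypoint because of this step.
    return 0
}

# In the entrypoint this would run just before handing off to k3s, e.g.:
#   use_legacy_iptables
#   exec k3s "$@"
```

The helper always returns success so that images without the legacy binaries keep their current behavior.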
Description
What happened:
On Jetson Orin (tegra-ubuntu, L4T), openshell gateway start --name nemoclaw --port 30051 fails. The gateway container starts then exits. Container logs show kube-proxy exiting with: iptables is not available on this host : error listing chain "POSTROUTING" in table "nat": exit status 4: # Warning: iptables-legacy tables present, use iptables-legacy to see them and iptables v1.8.10 (nf_tables): Could not fetch rule set generation id: Invalid argument. The CLI then reports "K8s namespace not ready" and "kube-proxy failed: iptables incompatible with host (e.g. Jetson L4T)".
What we expected:
Gateway should start and remain running on Jetson L4T when using host network (as required there due to missing veth / Docker NAT limitations). kube-proxy should either use a compatible iptables backend or the default cluster image should switch to iptables-legacy inside the container on such hosts.
Why it matters:
Jetson Orin / L4T is a supported deployment target; many users need to run the gateway with host network. Without a fix in upstream OpenShell (e.g. entrypoint switching to iptables-legacy when present, or --prefer-bundled-bin for k3s), users must rebuild the cluster image or the CLI from a fork.
Reproduction Steps
- Set up a Jetson Orin (or similar L4T device) with Docker, e.g. `nvidia@tegra-ubuntu`.
- Ensure host uses iptables-legacy (default on L4T): `sudo update-alternatives --display iptables` → link currently points to `/usr/sbin/iptables-legacy`.
- Install OpenShell CLI (e.g. from offline bundle or build).
- Run: `openshell gateway start --name nemoclaw --port 30051`
- Observe: "✓ Checking Docker", "✓ Downloading gateway", "x Initializing environment", then "Gateway failed: nemoclaw" with message "kube-proxy failed: iptables incompatible with host (e.g. Jetson L4T)" and container logs containing the iptables-legacy / nf_tables error above.
- (Optional) If host is switched to iptables-nft, reproduce Docker network failure: "failed to create Docker network" with "Failed to Setup IP tables ... Could not fetch rule set generation id", so host must remain on legacy.
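To compare the backend on the host with the one inside the cluster image, a small helper along these lines may be handy. It is a sketch only: `iptables_backend` is a hypothetical name, and it relies on `iptables --version` printing the backend in parentheses, as in the `iptables v1.8.10 (nf_tables)` line quoted in this report.

```shell
#!/bin/sh
# Hypothetical helper: map one `iptables --version` line to a backend name.
iptables_backend() {
    case "$1" in
        *"(nf_tables)"*) echo "nft" ;;
        *"(legacy)"*)    echo "legacy" ;;
        *)               echo "unknown" ;;
    esac
}

# Example on the host:
#   iptables_backend "$(iptables --version)"
# The same call can be run inside the cluster image to see which backend
# the container would use.
```

On the setup described here, the host would be expected to report `legacy` and the cluster image `nft`, which is the mismatch behind the failure.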
Environment
- OS: drive Orin, tegra-ubuntu (L4T; Ubuntu 20.04–based)
- Docker: Docker Engine on L4T (version as reported by `docker version` on device)
- OpenShell: v0.x.x (output of `openshell --version` from the build or offline bundle used)
- Network: Gateway configured to use host network (port 30051) due to missing veth / Docker NAT limitations on this platform.
- iptables (host): `update-alternatives` → iptables currently points to `/usr/sbin/iptables-legacy`; kernel nat table is legacy. Container image (cluster) provides iptables-nft by default.
- Relevant logs: Container stderr shows kube-proxy exit with `iptables v1.8.10 (nf_tables): Could not fetch rule set generation id: Invalid argument` and "iptables-legacy tables present".
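The log signature above is specific enough to tell this failure apart from a registry or DNS problem. The following is a minimal sketch of such a check, not the diagnosis code in the fork; `classify_gateway_failure` and its output labels are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: classify gateway container logs read from stdin.
# Prints "iptables-backend-mismatch" when either message quoted in this
# report is present, otherwise "unknown".
classify_gateway_failure() {
    logs=$(cat)
    case "$logs" in
        *"Could not fetch rule set generation id"*|*"iptables-legacy tables present"*)
            echo "iptables-backend-mismatch" ;;
        *)
            echo "unknown" ;;
    esac
}

# Example (the container name is a placeholder):
#   docker logs <gateway-container> 2>&1 | classify_gateway_failure
```

A check of this kind, run before the generic connectivity hint, would keep the error from being reported as "Network connectivity issue".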
Logs
nvidia@tegra-ubuntu:~/nemoclaw-offline-20260319$ ~/.local/bin/openshell gateway start --name nemoclaw --port 30051
✓ Checking Docker
✓ Downloading gateway
x Initializing environment
x Gateway failed: nemoclaw
Network connectivity issue
Could not reach the container registry. This could be a DNS resolution failure, firewall blocking the connection, or general internet connectivity issue.
To fix:
1. Check your internet connection
2. Test DNS resolution
nslookup ghcr.io
3. Test registry connectivity
curl -I https://ghcr.io/v2/
4. If behind a corporate firewall/proxy, ensure Docker is configured to use it
5. Restart Docker and try again
Error: × K8s namespace not ready
╰─▶ gateway container is not running while waiting for namespace 'openshell': container exited (status=EXITED, exit_code=0)
Agent-First Checklist
- I pointed my agent at the repo and had it investigate this issue
- I loaded relevant skills (e.g., `debug-openshell-cluster`, `debug-inference`, `openshell-cli`)
- My agent could not resolve this; the diagnostic above explains why