[BUG] Unknown boot script serve attempts crash #64

@rainest

Describe the bug
Attempts to serve the boot script for an unknown machine try to access etcd, even in SQL mode. If etcd is not available, this (and likely most other etcd access attempts) results in an API handler goroutine panic and an empty API response.

To Reproduce

Add a machine that has no ComponentEndpoint, then set its boot parameters and image:

ochami default smd iface add x1000c0s0b0n0 bc:24:11:40:03:a3 internal,192.168.8.128
ochami default smd component add x1000c0s0b0n0 1

ochami bss boot params set --mac bc:24:11:40:03:a3 --initrd=http://192.168.8.12/images/alma/initramfs --kernel=http://192.168.8.12/images/alma/vmlinuz
ochami bss boot image set --mac bc:24:11:40:03:a3 live:http://192.168.8.12/images/alma/alma.sqsh

Attempt to retrieve this machine's boot script with some architecture:

ochami bss boot script get --mac bc:24:11:40:03:a3 --arch i386

You can probably also just use an entirely unrecognized MAC and trigger the same result, but I didn't try.

Observed behavior

The BSS lookup code finds a matching bootscript for this machine by MAC, but no matching ComponentEndpoint. The missing ComponentEndpoint satisfies the unknown-machine condition, so unknownBootScript() runs and fails when it attempts to retrieve state:

Panic stack trace
2025/04/25 21:21:21 http: panic serving 10.42.1.185:57042: runtime error: slice bounds out of range [-1:]
goroutine 434 [running]:
net/http.(*conn).serve.func1()
	/opt/hostedtoolcache/go/1.23.4/x64/src/net/http/server.go:1947 +0xbe
panic({0x110df20?, 0xc000644318?})
	/opt/hostedtoolcache/go/1.23.4/x64/src/runtime/panic.go:785 +0x132
github.com/go-chi/chi/middleware.prettyStack.decorateFuncCallLine({}, {0xc0005e6283, 0x1f}, 0x1, 0x8)
	/home/runner/go/pkg/mod/github.com/go-chi/chi@v1.5.5/middleware/recoverer.go:130 +0x525
github.com/go-chi/chi/middleware.prettyStack.decorateLine({}, {0xc0005e6283?, 0x12b5?}, 0x1, 0x8)
	/home/runner/go/pkg/mod/github.com/go-chi/chi@v1.5.5/middleware/recoverer.go:106 +0x154
github.com/go-chi/chi/middleware.prettyStack.parse({}, {0xc000516000, 0x12b5, 0xc000778518?}, {0x102bec0, 0x1c836b0})
	/home/runner/go/pkg/mod/github.com/go-chi/chi@v1.5.5/middleware/recoverer.go:89 +0x3bc
github.com/go-chi/chi/middleware.PrintPrettyStack({0x102bec0, 0x1c836b0})
	/home/runner/go/pkg/mod/github.com/go-chi/chi@v1.5.5/middleware/recoverer.go:46 +0x3b
github.com/go-chi/chi/middleware.(*defaultLogEntry).Panic(0x439c05?, {0x102bec0?, 0x1c836b0?}, {0x439c40?, 0xc0004a96c0?, 0xc000778660?})
	/home/runner/go/pkg/mod/github.com/go-chi/chi@v1.5.5/middleware/logger.go:165 +0x25
github.com/go-chi/chi/middleware.Recoverer.func1.1()
	/home/runner/go/pkg/mod/github.com/go-chi/chi@v1.5.5/middleware/recoverer.go:28 +0xc8
panic({0x102bec0?, 0x1c836b0?})
	/opt/hostedtoolcache/go/1.23.4/x64/src/runtime/panic.go:785 +0x132
main.checkState(0x0)
	/home/runner/work/bss/bss/cmd/boot-script-service/scn.go:229 +0x50
main.unknownBootScript({0xc0000ea1bd, 0x4}, {0xc000039248, 0x11}, {0xc0000f0130, 0xd}, 0xfa4000, 0x680bfcc7, {0xc0000f0149, 0x7}, ...)
	/home/runner/work/bss/bss/cmd/boot-script-service/default_api.go:732 +0x3c5
main.BootscriptGet({0x7fcc31b3cc48, 0xc000018880}, 0xc0008df7c0)
	/home/runner/work/bss/bss/cmd/boot-script-service/default_api.go:864 +0xdd3
main.bootScript({0x7fcc31b3cc48?, 0xc000018880?}, 0x0?)
	/home/runner/work/bss/bss/cmd/boot-script-service/routers.go:152 +0xf0

I don't fully understand this panic. I think the error reported at the top of the trace actually comes from the Chi recover logic, and the original BSS panic is masked. I care less about the Chi part: it's a Chi bug, but realistically there's not much Chi can do to help if our API code dies.
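To illustrate the masking I suspect, here is a minimal Go sketch, not taken from BSS or Chi. All names (failingHandler, buggyRecoverer, serve) are hypothetical, and the bug inside the recovery wrapper is made up; it only happens to raise the same kind of runtime error. It shows that when a deferred recover handler panics itself, the outer layer sees only the second panic:

```go
package main

import (
	"fmt"
	"strings"
)

// failingHandler stands in for API code that panics; the message is made up.
func failingHandler() {
	panic("original handler failure")
}

// buggyRecoverer is a hypothetical recovery wrapper whose reporting step
// has a bug of its own. It is not Chi's implementation.
func buggyRecoverer(next func()) {
	defer func() {
		if r := recover(); r != nil {
			line := fmt.Sprint(r)
			// The message contains no "/", so idx is -1 and the slice
			// expression below panics while the first panic is being handled.
			idx := strings.LastIndex(line, "/")
			fmt.Println("recovered:", line[idx:])
		}
	}()
	next()
}

// serve plays the role of the outermost recover (the HTTP server in the
// trace above) and returns the panic value it would end up logging.
func serve(next func()) (reported string) {
	defer func() {
		if r := recover(); r != nil {
			reported = fmt.Sprint(r)
		}
	}()
	buggyRecoverer(next)
	return "no panic"
}

func main() {
	// The value printed here is the second panic, not the handler's own.
	fmt.Println("outer layer saw:", serve(failingHandler))
}
```

If that is what happens here, the first line of the log describes the recovery code's failure, and the frames further down (checkState, unknownBootScript) are the ones that point at the real problem.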

Expected behavior
BSS serves some boot script appropriate to unknown machines, without crashing.

Additional context

The access attempt at https://github.com/OpenCHAMI/bss/blob/main/cmd/boot-script-service/scn.go#L229 is also the line that panics.

Best (strong) guess is that kvstore is uninitialized and the panic occurs because we effectively call <null>.method() anywhere we try to use it.

kvstore is a package-global struct. It's assigned to an actual struct in the kvOpen() function, which only runs in the useSQL == false case.
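A minimal sketch of that failure mode and one possible guard, assuming kvstore behaves like a nil interface value when kvOpen() never runs. The kvStore interface, memStore type, errNoStore error, and both lookup functions are hypothetical and only illustrate the idea; the real kvstore type and its methods are not shown in this issue:

```go
package main

import (
	"errors"
	"fmt"
)

// kvStore is a hypothetical stand-in for whatever type backs the kvstore global.
type kvStore interface {
	Get(key string) (string, bool, error)
}

// memStore is a toy implementation used only to show the initialized case.
type memStore struct{ data map[string]string }

func (m *memStore) Get(key string) (string, bool, error) {
	v, ok := m.data[key]
	return v, ok, nil
}

// kvstore mirrors the package-global described above: it stays nil unless
// an open function assigns it.
var kvstore kvStore

// errNoStore is a made-up sentinel for "the key-value store was never opened".
var errNoStore = errors.New("kvstore not initialized")

// unsafeLookup calls through the global without checking it. With a nil
// kvstore the method call panics; the panic is recovered here only so the
// demo can keep running.
func unsafeLookup(key string) (val string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	val, _, err = kvstore.Get(key)
	return val, err
}

// safeLookup returns an ordinary error instead of panicking when the store
// was never opened, so a caller could fall back to another code path.
func safeLookup(key string) (string, error) {
	if kvstore == nil {
		return "", errNoStore
	}
	v, _, err := kvstore.Get(key)
	return v, err
}

func main() {
	_, err := unsafeLookup("state")
	fmt.Println("unguarded:", err)

	_, err = safeLookup("state")
	fmt.Println("guarded:", err)

	kvstore = &memStore{data: map[string]string{"state": "ready"}}
	v, err := safeLookup("state")
	fmt.Println("after open:", v, err)
}
```

A guard like this would only turn the panic into an error; whether the unknown-machine path should consult etcd at all in SQL mode is a separate question.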
