@cgwalters
Collaborator

Instead of using systemd.volatile=overlay, which overlaid all of / with a single tmpfs-backed overlayfs, set up /etc and /var separately:

  • /etc: overlayfs with tmpfs upper (transient changes, lost on reboot)
  • /var: real tmpfs with content copied from image (not overlayfs)

The key benefit is that /var is now a real tmpfs, allowing podman to use overlayfs for container storage inside /var/lib/containers. With the old approach, the nested overlayfs caused "too many levels of symbolic links" errors.

Implementation uses systemd credentials to inject units that run in the initramfs before switch-root:

  • sysroot-etc.mount: overlay on /sysroot/etc
  • bcvk-var-ephemeral.service: copies /sysroot/var to a tmpfs and bind mounts it over /sysroot/var

Both units use ConditionPathExists=/etc/initrd-release to only run in the initramfs context.

This is Phase 1 of issue #22, making ephemeral VMs more bootc-like. SELinux is still disabled (selinux=0); Phase 2 will add composefs support to enable proper SELinux labeling.

xref: #22 (Phase 1)
Assisted-by: OpenCode (Sonnet 4)

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request replaces the broad systemd.volatile=overlay with separate tmpfs and overlayfs mounts for /var and /etc, respectively. This is a well-reasoned change that solves a real-world issue with nested overlayfs when running podman inside the ephemeral VM. The implementation, which uses systemd credentials injected via SMBIOS, is clean and effective. The addition of an integration test to verify the new mount layout is excellent. My feedback is minor, focusing on improving code consistency and fixing a small typo in a comment.

CentOS Stream 9 has systemd 252, but the systemd.extra-unit.* and
systemd.unit-dropin.* credential types require systemd 256+.

Add a generator shim script that runs in the initramfs and processes
these credential types for older systemd versions. The shim:
- Reads credentials from $CREDENTIALS_DIRECTORY
- Writes systemd.extra-unit.* as unit files
- Writes systemd.unit-dropin.* as drop-in configs
- No-ops on systemd 256+ where systemd-debug-generator handles this
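
The mapping from credential name to output file can be sketched as follows. This is only an illustration, not the shim itself (which is a script): the function names are hypothetical, and the drop-in file name `50-credential.conf` is an assumption modeled on what systemd-debug-generator is documented to use by default.

```rust
use std::path::PathBuf;

/// Reject names that could escape the output directory.
fn valid_unit_name(name: &str) -> bool {
    !name.is_empty() && !name.contains('/') && name != "." && name != ".."
}

/// Map a credential file name to the path a generator would write under
/// `out_dir`. Returns None for credentials the shim should ignore.
/// (Hypothetical helper; the drop-in file name is an assumption.)
fn target_for_credential(out_dir: &str, cred: &str) -> Option<PathBuf> {
    if let Some(unit) = cred.strip_prefix("systemd.extra-unit.") {
        // The credential body becomes the unit file itself.
        return valid_unit_name(unit).then(|| PathBuf::from(out_dir).join(unit));
    }
    if let Some(unit) = cred.strip_prefix("systemd.unit-dropin.") {
        // The credential body becomes a drop-in under <unit>.d/.
        return valid_unit_name(unit).then(|| {
            PathBuf::from(out_dir)
                .join(format!("{unit}.d"))
                .join("50-credential.conf")
        });
    }
    None
}
```

A caller would iterate over the entries of $CREDENTIALS_DIRECTORY, pass each file name through this function, and copy the credential contents to the returned path.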

The shim is injected into the initramfs by appending a CPIO archive.
This relies on the Linux kernel's ability to unpack an initramfs made
of multiple concatenated CPIO archives, which it processes sequentially.

Implementation:
- New cpio.rs module for minimal CPIO newc format writing
- bcvk-credential-generator script included via include_str!
- run_ephemeral.rs copies the initramfs (instead of bind mounting it)
  and appends the generator CPIO
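
For readers unfamiliar with the newc format, a minimal writer might look like the sketch below. It is not the contents of cpio.rs; the function names are hypothetical, and it assumes the parent directories of the file already exist in the earlier archive, so only a single regular-file entry and the trailer are emitted.

```rust
use std::io::{self, Write};

/// Write zero bytes so that `len` is rounded up to a multiple of 4.
fn pad4<W: Write>(out: &mut W, len: usize) -> io::Result<()> {
    let pad = (4 - len % 4) % 4;
    out.write_all(&[0u8; 3][..pad])
}

/// Append one entry in CPIO "newc" format. `mode` must include the file
/// type bits (for example 0o100755 for an executable regular file).
fn write_newc_entry<W: Write>(
    out: &mut W,
    ino: u32,
    name: &str,
    mode: u32,
    data: &[u8],
) -> io::Result<()> {
    // c_namesize counts the trailing NUL byte.
    let namesize = name.len() + 1;
    // Magic "070701" followed by 13 fields of 8 hex digits: 110 bytes.
    let header = format!(
        "070701{:08x}{:08x}{:08x}{:08x}{:08x}{:08x}{:08x}{:08x}{:08x}{:08x}{:08x}{:08x}{:08x}",
        ino,        // c_ino
        mode,       // c_mode
        0,          // c_uid
        0,          // c_gid
        1,          // c_nlink
        0,          // c_mtime
        data.len(), // c_filesize
        0,          // c_devmajor
        0,          // c_devminor
        0,          // c_rdevmajor
        0,          // c_rdevminor
        namesize,   // c_namesize
        0,          // c_check (always zero for magic 070701)
    );
    out.write_all(header.as_bytes())?;
    out.write_all(name.as_bytes())?;
    out.write_all(&[0u8])?;
    // Header plus name is padded to a 4-byte boundary, then so is the data.
    pad4(out, header.len() + namesize)?;
    out.write_all(data)?;
    pad4(out, data.len())
}

/// Build a complete archive holding one executable file and the trailer.
fn build_archive(path: &str, contents: &[u8]) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    write_newc_entry(&mut buf, 1, path, 0o100755, contents)?;
    // The archive ends with a zero-length entry named "TRAILER!!!".
    write_newc_entry(&mut buf, 0, "TRAILER!!!", 0, &[])?;
    Ok(buf)
}
```

The bytes returned by `build_archive` would then be appended to the copied initramfs file. Because every entry is padded to a 4-byte boundary, the resulting archive length is itself a multiple of 4.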

Assisted-by: OpenCode (Sonnet 4)
Signed-off-by: Colin Walters <walters@verbum.org>

Instead of using systemd.volatile=overlay, which overlaid all of / with
a single tmpfs-backed overlayfs, set up /etc and /var separately:

- /etc: overlayfs with tmpfs upper (transient changes, lost on reboot)
- /var: real tmpfs with content copied from image (not overlayfs)

The key benefit is that /var is now a real tmpfs, allowing podman to
use overlayfs for container storage inside /var/lib/containers. With
the old approach, the nested overlayfs caused "too many levels of
symbolic links" errors.

Implementation uses systemd credentials to inject units that run in the
initramfs before switch-root:
- bcvk-etc-overlay.service: overlay on /sysroot/etc with index=off,metacopy=off
  to avoid virtiofs contention; ordered after initrd-parse-etc.service
- bcvk-var-ephemeral.service: copies /sysroot/var to a tmpfs and bind
  mounts it over /sysroot/var

Both units use ConditionPathExists=/etc/initrd-release to only run in
the initramfs context.
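
As a rough illustration of the /etc unit described above, the sketch below renders a unit file as a string that could then be passed in as a credential. It is not the unit shipped in this change: the function name, the Description text, the oneshot service layout, and the upper/work directory arguments are all assumptions. Only the mount target, the index=off,metacopy=off options, the ordering after initrd-parse-etc.service, and the ConditionPathExists line come from the description.

```rust
/// Render a hypothetical bcvk-etc-overlay.service.
/// `upper` and `work` are assumed to be directories on the same tmpfs,
/// as overlayfs requires them to share a filesystem.
fn etc_overlay_unit(upper: &str, work: &str) -> String {
    format!(
        "[Unit]\n\
         Description=Transient overlay on /sysroot/etc\n\
         ConditionPathExists=/etc/initrd-release\n\
         After=initrd-parse-etc.service\n\
         \n\
         [Service]\n\
         Type=oneshot\n\
         RemainAfterExit=yes\n\
         ExecStart=/bin/mkdir -p {upper} {work}\n\
         ExecStart=/bin/mount -t overlay overlay \
         -o lowerdir=/sysroot/etc,upperdir={upper},workdir={work},index=off,metacopy=off \
         /sysroot/etc\n"
    )
}
```

For example, `etc_overlay_unit("/run/bcvk-etc/upper", "/run/bcvk-etc/work")` yields a unit that only runs in the initramfs, because /etc/initrd-release does not exist after switch-root.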

The execute service target is changed from default.target to
multi-user.target with ConditionPathExists=!/etc/initrd-release to
ensure it runs after switch-root, not in the initramfs.

This is Phase 1 of issue bootc-dev#22, making ephemeral VMs more bootc-like.
SELinux is still disabled (selinux=0); Phase 2 will add composefs
support to enable proper SELinux labeling.

Closes: bootc-dev#22 (Phase 1)
Assisted-by: OpenCode (Sonnet 4)
Signed-off-by: Colin Walters <walters@verbum.org>