3 changes: 2 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
@@ -10,6 +10,8 @@ Chart.lock
**/secrets.yml
values-local.yaml
values-local.yml
values-*.yaml
values-*.yml

# Helm output and temporary files
*.tmp
@@ -31,4 +33,3 @@ test-output/
manifests/
rendered/
debug/

27 changes: 27 additions & 0 deletions braintrust/README.md
@@ -90,6 +90,17 @@ brainstore:

**Supported machine families:** c4, c4d

If the ephemeral-storage request must cover more than the cache volume alone, set an explicit total pod-local storage budget with `ephemeralStorage.request`:

```yaml
brainstore:
  reader:
    volume:
      size: "900Gi"
    ephemeralStorage:
      request: "905Gi" # cache + /tmp (if enabled) + logs/writable-layer overhead
```

### GKE Standard Mode

For Standard mode clusters, create node pools with local SSDs, then deploy:
@@ -147,6 +158,18 @@ For Standard mode clusters, create node pools with local SSDs, then deploy:
- Local SSDs are automatically available via emptyDir volumes
- Pod anti-affinity ensures readers and writers don't share nodes (each pod gets dedicated node access)

## AWS EKS Local Storage

On EKS, Brainstore uses Kubernetes-managed `emptyDir` volumes for cache storage. To make scheduling reflect the real local-disk budget, set `brainstore.<role>.ephemeralStorage.request` for each Brainstore role.

Size the request for the pod's full local-storage usage:
- cache `emptyDir`
- optional `/tmp` `emptyDir`
- container logs
- writable layer overhead

When you enable `tmpVolume`, make sure the `ephemeralStorage.request` still covers that extra space.
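
As an illustration of the sizing rule above, here is a minimal values sketch for one role. The key names come from this chart, but the numbers are assumptions chosen only to show the arithmetic; in particular, the 4Gi allowance for logs and the writable layer is a placeholder, not a measured figure.

```yaml
# Illustrative only: sizes are assumed, not recommendations.
cloud: "aws"

brainstore:
  reader:
    volume:
      sizeLimit: "900Gi"   # cap on the cache emptyDir
    tmpVolume:
      enabled: true
      sizeLimit: "1Gi"     # cap on the /tmp emptyDir
    ephemeralStorage:
      # 900Gi cache + 1Gi /tmp + 4Gi assumed for logs and writable layer
      request: "905Gi"
```

Repeat the same pattern for the other Brainstore roles, adjusting each request to that role's own cache and `/tmp` sizes.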

## Testing

This Helm chart includes comprehensive automated unit tests.
@@ -192,3 +215,7 @@ This version also adds first-class `brainstoreWalFooterVersion` support and auto
## Example Values Files

Example values files for different cloud providers and configurations are located in the `examples/` folder.

- `examples/google-autopilot/values.yaml`: GKE Autopilot deployment.
- `examples/google-autopilot-cel/values.yaml`: GKE Autopilot deployment with CEL-friendly security settings.
- `examples/google-standard/values.yaml`: GKE Standard deployment.
163 changes: 163 additions & 0 deletions braintrust/examples/google-autopilot-cel/values.yaml
@@ -0,0 +1,163 @@
# Sample values for GKE Autopilot deployment with CEL policy compliance

global:
orgName: "<your Braintrust org name>"
namespace: "braintrust"

cloud: "google"

google:
mode: "autopilot"
autopilotMachineFamily: "c4"

objectStorage:
google:
brainstoreBucket: "<your brainstore bucket name>"
apiBucket: "<your api bucket name>"

api:
name: "braintrust-api"
# Uncomment the following section to use a different image or tag from the version in the Helm release
#image:
#repository: public.ecr.aws/braintrust/standalone-api
#tag: "<your image tag>"
annotations:
service:
networking.gke.io/load-balancer-type: "Internal"
replicas: 4
service:
type: LoadBalancer
port: 8000
portName: http
serviceAccount:
name: "braintrust-api"
googleServiceAccount: "<your Braintrust API Google service account>"
enableGcsAuth: false
resources:
requests:
cpu: "4"
memory: "4Gi"
limits:
cpu: "4"
memory: "8Gi"
securityContext:
readOnlyRootFilesystem: true
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
tmpVolume:
enabled: true
sizeLimit: "1Gi"
extraEnvVars:
- name: AWS_REGION
value: "us-central1"

brainstore:
serviceAccount:
name: "brainstore"
googleServiceAccount: "<your Braintrust Brainstore Google service account>"
# Uncomment the following section to use a different image or tag from the version in the Helm release
#image:
#repository: public.ecr.aws/braintrust/brainstore
#tag: "<your image tag>"
locksBackend: "objectStorage"

reader:
name: "brainstore-reader"
replicas: 2
service:
name: ""
type: ClusterIP
port: 4000
portName: http
resources:
requests:
cpu: "16"
memory: "32Gi"
limits:
cpu: "16"
memory: "32Gi"
cacheDir: "/mnt/tmp/brainstore"
objectStoreCacheMemoryLimit: "1Gi"
objectStoreCacheFileSize: "900Gi"
verbose: true
securityContext:
readOnlyRootFilesystem: true
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
volume:
size: "1000Gi"
sizeLimit: "900Gi"
tmpVolume:
enabled: true
sizeLimit: "1Gi"
extraEnvVars:

fastreader:
name: "brainstore-fastreader"
replicas: 2
service:
name: ""
type: ClusterIP
port: 4000
portName: http
resources:
requests:
cpu: "16"
memory: "32Gi"
limits:
cpu: "16"
memory: "32Gi"
cacheDir: "/mnt/tmp/brainstore"
objectStoreCacheMemoryLimit: "1Gi"
objectStoreCacheFileSize: "900Gi"
verbose: true
securityContext:
readOnlyRootFilesystem: true
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
volume:
size: "1000Gi"
sizeLimit: "900Gi"
tmpVolume:
enabled: true
sizeLimit: "1Gi"
extraEnvVars:

writer:
name: "brainstore-writer"
replicas: 1
service:
name: ""
type: ClusterIP
port: 4000
portName: http
resources:
requests:
cpu: "32"
memory: "64Gi"
limits:
cpu: "32"
memory: "64Gi"
cacheDir: "/mnt/tmp/brainstore"
objectStoreCacheMemoryLimit: "1Gi"
objectStoreCacheFileSize: "900Gi"
verbose: true
securityContext:
readOnlyRootFilesystem: true
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
volume:
size: "1000Gi"
sizeLimit: "900Gi"
tmpVolume:
enabled: true
sizeLimit: "1Gi"
extraEnvVars:
31 changes: 31 additions & 0 deletions braintrust/templates/_helpers.tpl
@@ -24,3 +24,34 @@ Static fast reader query sources used by API.
-}}
{{- join "," $sources -}}
{{- end -}}

{{/*
Render Brainstore container resources with provider-specific ephemeral storage.

Google Autopilot keeps the legacy behavior of defaulting the ephemeral-storage
request to volume.size when no explicit total request is set. AWS EKS requires
an explicit total pod-local storage budget that includes cache, optional /tmp,
and normal writable-layer/log overhead.
*/}}
{{- define "braintrust.brainstoreResources" -}}
{{- $root := .root -}}
{{- $resources := deepCopy .resources -}}
{{- $supportsEphemeralStorage := or (eq $root.Values.cloud "aws") (and (eq $root.Values.cloud "google") (eq $root.Values.google.mode "autopilot")) -}}
{{- $request := "" -}}
{{- if and .ephemeralStorage .ephemeralStorage.request -}}
{{- $request = .ephemeralStorage.request -}}
{{- else if and (eq $root.Values.cloud "google") (eq $root.Values.google.mode "autopilot") .volumeSize -}}
{{- $request = .volumeSize -}}
{{- end -}}
{{- if and $supportsEphemeralStorage $request -}}
{{- $requests := deepCopy (default (dict) $resources.requests) -}}
{{- $_ := set $requests "ephemeral-storage" $request -}}
{{- $_ := set $resources "requests" $requests -}}
{{- end -}}
{{- if and $supportsEphemeralStorage .ephemeralStorage .ephemeralStorage.limit -}}
{{- $limits := deepCopy (default (dict) $resources.limits) -}}
{{- $_ := set $limits "ephemeral-storage" .ephemeralStorage.limit -}}
{{- $_ := set $resources "limits" $limits -}}
{{- end -}}
{{- toYaml $resources -}}
{{- end -}}
28 changes: 25 additions & 3 deletions braintrust/templates/api-deployment.yaml
@@ -44,6 +44,10 @@ spec:
{{- end }}
spec:
serviceAccountName: {{ .Values.api.serviceAccount.name }}
{{- with .Values.api.podSecurityContext }}
securityContext:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.api.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
@@ -60,6 +64,10 @@ spec:
- name: api
image: "{{ .Values.api.image.repository }}:{{ .Values.api.image.tag }}"
imagePullPolicy: {{ .Values.api.image.pullPolicy }}
{{- with .Values.api.securityContext }}
securityContext:
{{- toYaml . | nindent 12 }}
{{- end }}
ports:
- containerPort: {{ .Values.api.service.port }}
resources:
@@ -122,17 +130,32 @@ spec:
{{- if .Values.api.extraEnvVars }}
{{- toYaml .Values.api.extraEnvVars | nindent 12 }}
{{- end }}
{{- if and (eq .Values.cloud "azure") .Values.azure.enableAzureKeyVaultDriver }}
{{- if or .Values.api.tmpVolume.enabled (and (eq .Values.cloud "azure") .Values.azure.enableAzureKeyVaultDriver) }}
volumeMounts:
{{- if .Values.api.tmpVolume.enabled }}
- name: tmp-volume
mountPath: /tmp
{{- end }}
{{- if and (eq .Values.cloud "azure") .Values.azure.enableAzureKeyVaultDriver }}
- name: secrets-store-inline
mountPath: "/mnt/secrets-store"
readOnly: true
{{- end }}
{{- end }}
{{- with .Values.api.extraContainers }}
{{- toYaml . | nindent 8 }}
{{- end }}
volumes:
{{- if or (and (eq .Values.cloud "azure") .Values.azure.enableAzureKeyVaultDriver) .Values.api.extraVolumes }}
{{- if or .Values.api.tmpVolume.enabled (and (eq .Values.cloud "azure") .Values.azure.enableAzureKeyVaultDriver) .Values.api.extraVolumes }}
{{- if .Values.api.tmpVolume.enabled }}
- name: tmp-volume
emptyDir:
{{- if .Values.api.tmpVolume.sizeLimit }}
sizeLimit: {{ .Values.api.tmpVolume.sizeLimit | quote }}
{{- else }}
{}
{{- end }}
{{- end }}
{{- if and (eq .Values.cloud "azure") .Values.azure.enableAzureKeyVaultDriver }}
- name: secrets-store-inline
csi:
@@ -147,4 +170,3 @@ spec:
{{- else }}
[]
{{- end }}

39 changes: 33 additions & 6 deletions braintrust/templates/brainstore-fastreader-deployment.yaml
@@ -44,6 +44,10 @@ spec:
{{- end }}
spec:
serviceAccountName: {{ .Values.brainstore.serviceAccount.name }}
{{- with .Values.brainstore.fastreader.podSecurityContext }}
securityContext:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- if or .Values.brainstore.fastreader.nodeSelector (and (eq .Values.cloud "google") (eq .Values.google.mode "autopilot")) }}
nodeSelector:
{{- with .Values.brainstore.fastreader.nodeSelector }}
@@ -67,16 +71,21 @@ spec:
- name: brainstore-fastreader
image: "{{ .Values.brainstore.image.repository }}:{{ .Values.brainstore.image.tag }}"
imagePullPolicy: {{ .Values.brainstore.image.pullPolicy }}
{{- with .Values.brainstore.fastreader.securityContext }}
securityContext:
{{- toYaml . | nindent 12 }}
{{- end }}
command: ["brainstore"]
args: ["web"]
ports:
- containerPort: {{ .Values.brainstore.fastreader.service.port }}
resources:
{{- $resources := .Values.brainstore.fastreader.resources }}
{{- if and (eq .Values.cloud "google") (eq .Values.google.mode "autopilot") .Values.brainstore.fastreader.volume.size }}
{{- $resources = merge (dict "requests" (merge $resources.requests (dict "ephemeral-storage" .Values.brainstore.fastreader.volume.size))) $resources }}
{{- end }}
{{- toYaml $resources | nindent 12 }}
{{- include "braintrust.brainstoreResources" (dict
"root" .
"resources" .Values.brainstore.fastreader.resources
"volumeSize" .Values.brainstore.fastreader.volume.size
"ephemeralStorage" .Values.brainstore.fastreader.ephemeralStorage
) | nindent 12 }}
{{- with .Values.brainstore.livenessProbe }}
livenessProbe:
{{- toYaml . | nindent 12 }}
@@ -134,6 +143,10 @@ spec:
volumeMounts:
- name: cache-volume
mountPath: {{ .Values.brainstore.fastreader.cacheDir }}
{{- if .Values.brainstore.fastreader.tmpVolume.enabled }}
- name: tmp-volume
mountPath: /tmp
{{- end }}
{{- if and (eq .Values.cloud "azure") .Values.azure.enableAzureKeyVaultDriver }}
- name: secrets-store-inline
mountPath: "/mnt/secrets-store"
@@ -155,8 +168,22 @@ spec:
requests:
storage: {{ required "brainstore.fastreader.volume.size must be set" .Values.brainstore.fastreader.volume.size | quote }}
{{- else }}
emptyDir: {}
emptyDir:
{{- if .Values.brainstore.fastreader.volume.sizeLimit }}
sizeLimit: {{ .Values.brainstore.fastreader.volume.sizeLimit | quote }}
{{- else }}
{}
{{- end }}
{{- end }}
{{- if .Values.brainstore.fastreader.tmpVolume.enabled }}
- name: tmp-volume
emptyDir:
{{- if .Values.brainstore.fastreader.tmpVolume.sizeLimit }}
sizeLimit: {{ .Values.brainstore.fastreader.tmpVolume.sizeLimit | quote }}
{{- else }}
{}
{{- end }}
{{- end }}
{{- if and (eq .Values.cloud "azure") .Values.azure.enableAzureKeyVaultDriver }}
- name: secrets-store-inline
csi: