Add support for collection export #325

Merged: jfrancoa merged 3 commits into main from jose/collection-export on Apr 16, 2026

@jfrancoa jfrancoa commented Apr 8, 2026

What's being changed

Adds Helm chart support for Weaviate's collection export feature, introduced in weaviate/weaviate#10958.

values.yaml — new top-level collectionExport section (mirrors the pattern of backups and offload):

  • collectionExport.enabled — gates the feature; when true, injects EXPORT_ENABLED=true into the pod
  • collectionExport.envconfig.EXPORT_DEFAULT_BUCKET — required bucket name (defaults to weaviate-export); the bucket must exist before enabling, otherwise exports will fail
  • collectionExport.envconfig.EXPORT_DEFAULT_PATH — optional path prefix inside the bucket (commented out, defaults to empty)
  • collectionExport.envconfig.EXPORT_PARALLELISM — optional number of concurrent scan workers per export (commented out, defaults to 0 = GOMAXPROCS at runtime)
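
As a rough sketch, the section described above might look like this in values.yaml. The key names come from this description; the enabled: false default is inferred from the tests stating that the variables are absent by default, and the comments are illustrative rather than copied from the chart. The review thread on this PR shows EXPORT_DEFAULT_PATH set to an empty string instead of commented out, so the merged file may differ on that key.

```yaml
# Sketch of the collectionExport section (assumed layout, not the chart's exact text)
collectionExport:
  # Gates the feature; when true the chart injects EXPORT_ENABLED=true into the pod
  enabled: false
  envconfig:
    # Bucket that receives exports; it must exist before the feature is enabled
    EXPORT_DEFAULT_BUCKET: weaviate-export
    # Optional path prefix inside the bucket
    # EXPORT_DEFAULT_PATH: ""
    # Optional number of concurrent scan workers per export (0 = GOMAXPROCS at runtime)
    # EXPORT_PARALLELISM: 0
```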

templates/weaviateStatefulset.yaml — renders the env vars when collectionExport.enabled=true; all other collectionExport.envconfig keys are forwarded as-is, so future env vars require no template changes.
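
With collectionExport.enabled=true and otherwise default values, the rendered container spec would be expected to carry entries along these lines. This is a sketch of the rendered output, not the template itself; the ordering, quoting, and surrounding fields are assumptions.

```yaml
# Excerpt of the Weaviate container in the rendered StatefulSet (illustrative)
env:
  # Injected because collectionExport.enabled is true
  - name: EXPORT_ENABLED
    value: "true"
  # Forwarded from collectionExport.envconfig
  - name: EXPORT_DEFAULT_BUCKET
    value: "weaviate-export"
```

Any further key placed under collectionExport.envconfig would appear as an additional name/value pair in the same list.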

.cicd/test.sh — test cases covering:

  • EXPORT_ENABLED and EXPORT_DEFAULT_BUCKET are absent by default
  • Both are injected when the feature is enabled
  • Custom bucket name overrides the default
  • EXPORT_DEFAULT_PATH and EXPORT_PARALLELISM are absent by default but can be set via envconfig
  • The namespace is now passed explicitly in certain tests; otherwise CI would start failing once the GitHub runners upgrade from Helm v3 to v4.
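
The enabled cases above could be exercised with an override file like the following. The file name export-values.yaml, the bucket name my-exports, and the path and parallelism values are hypothetical examples, not values used by the test script.

```yaml
# export-values.yaml (hypothetical override file)
collectionExport:
  enabled: true
  envconfig:
    # Overrides the weaviate-export default; the bucket must already exist
    EXPORT_DEFAULT_BUCKET: my-exports
    # Optional path prefix inside the bucket
    EXPORT_DEFAULT_PATH: exports/nightly
    # Optional number of concurrent scan workers per export; quoted so it stays a string
    EXPORT_PARALLELISM: "4"
```

Rendering the chart with something like helm template weaviate ./weaviate -f export-values.yaml --namespace default (chart path assumed) and searching the output for the EXPORT_ variables would mirror what the test cases check.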

@jfrancoa jfrancoa requested a review from a team as a code owner April 8, 2026 10:52

@orca-security-eu orca-security-eu Bot left a comment

Orca Security Scan Summary

Status | Check | Issues by priority
Passed | Secrets | high 0, medium 0, low 0, info 0

@jfrancoa jfrancoa requested a review from antas-marcin April 9, 2026 09:30
jfrancoa added a commit to weaviate/weaviate-local-k8s that referenced this pull request Apr 10, 2026
Introduces a dedicated COLLECTION_EXPORT env var (default false) that
enables collection export via the new collectionExport Helm values added
in weaviate/weaviate-helm#325. When enabled:

- MinIO is started and the weaviate-export bucket is created automatically
- collectionExport.enabled=true and EXPORT_DEFAULT_BUCKET=weaviate-export
  are set in Helm
- If ENABLE_BACKUP is not also set, the backup-s3 module is automatically
  configured to point to MinIO (collection export uses it as its S3
  backend); backups.s3.secrets are omitted when S3_OFFLOAD is active to
  avoid the awsSecret.yaml multi-source credential guard in weaviate-helm
- action.yml gains a collection-export input
- CI all-params job enables COLLECTION_EXPORT=true and verifies a collection
  export can be created via POST /v1/export/s3
- Operating skill updated with COLLECTION_EXPORT=true usage, deployment
  pattern, and env-var table entry

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
jfrancoa added a commit to weaviate/weaviate-local-k8s that referenced this pull request Apr 10, 2026
- Backup job: use latest Weaviate version instead of hardcoded 1.26.3
  (collection export requires a recent version)
- Both backup and all-params jobs: set HELM_BRANCH to
  jose/collection-export since the collectionExport Helm values are
  not yet released (weaviate/weaviate-helm#325)
- Added TODO comments to revert HELM_BRANCH to 'main' once the helm
  chart is released

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds a new option, collectionExport, to the values, which
allows enabling collection export. Also exposes the env
var EXPORT_DEFAULT_BUCKET, which names the bucket that
collections are exported to.
A new env var was added to specify the path.
This adds support for it in the Helm chart.
@jfrancoa jfrancoa force-pushed the jose/collection-export branch 2 times, most recently from 0ca17a0 to 2f3be9e Compare April 15, 2026 09:39
Comment thread weaviate/values.yaml Outdated
@jfrancoa jfrancoa force-pushed the jose/collection-export branch from 2f3be9e to 80e0e82 Compare April 16, 2026 10:51
- Add EXPORT_PARALLELISM commented-out env var to collectionExport config
- Add tests for EXPORT_PARALLELISM (absent by default, settable via envconfig)
- Fix pre-existing test failures for TRANSFORMERS_PASSAGE/QUERY_INFERENCE_API
  by passing --namespace default so .Release.Namespace resolves to "default"
  instead of a random UUID generated by helm template
- Make EXPORT_DEFAULT_PATH optional and adapt tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@jfrancoa jfrancoa force-pushed the jose/collection-export branch from 80e0e82 to c6d90d3 Compare April 16, 2026 11:00
Comment thread weaviate/values.yaml

# Required setting. Bucket path in which to save exports. Defaults to empty string.
# Set this option if you want to save exports to a given path inside the bucket. Must be a valid bucket path.
EXPORT_DEFAULT_PATH: ""
Collaborator

By default we will be setting EXPORT_DEFAULT_PATH to empty? Shouldn't this be commented out, like EXPORT_PARALLELISM?

Contributor Author

I think the env var is required now. I tried enabling the feature the other day without the env var set and it was failing with this error.
That's what Dirk meant when he said the env var is now required, although it can default to "". If we don't pass it like that, the cluster will error and won't enable the feature.

@jfrancoa jfrancoa merged commit fa52795 into main Apr 16, 2026
3 of 4 checks passed
@jfrancoa jfrancoa deleted the jose/collection-export branch April 16, 2026 13:11
jfrancoa added a commit to weaviate/weaviate-local-k8s that referenced this pull request Apr 17, 2026
- Backup job: use latest Weaviate version instead of hardcoded 1.26.3
  (collection export requires a recent version)
- Both backup and all-params jobs: set HELM_BRANCH to
  jose/collection-export since the collectionExport Helm values are
  not yet released (weaviate/weaviate-helm#325)
- Added TODO comments to revert HELM_BRANCH to 'main' once the helm
  chart is released

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
jfrancoa added a commit to weaviate/weaviate-local-k8s that referenced this pull request Apr 20, 2026
- Backup job: use latest Weaviate version instead of hardcoded 1.26.3
  (collection export requires a recent version)
- Both backup and all-params jobs: set HELM_BRANCH to
  jose/collection-export since the collectionExport Helm values are
  not yet released (weaviate/weaviate-helm#325)
- Added TODO comments to revert HELM_BRANCH to 'main' once the helm
  chart is released

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
jfrancoa added a commit to weaviate/weaviate-local-k8s that referenced this pull request Apr 20, 2026
Introduces a dedicated COLLECTION_EXPORT env var (default false) that
enables collection export via the new collectionExport Helm values added
in weaviate/weaviate-helm#325. When enabled:

- MinIO is started and the weaviate-export bucket is created automatically
- collectionExport.enabled=true and EXPORT_DEFAULT_BUCKET=weaviate-export
  are set in Helm
- If ENABLE_BACKUP is not also set, the backup-s3 module is automatically
  configured to point to MinIO (collection export uses it as its S3
  backend); backups.s3.secrets are omitted when S3_OFFLOAD is active to
  avoid the awsSecret.yaml multi-source credential guard in weaviate-helm
- action.yml gains a collection-export input
- CI all-params job enables COLLECTION_EXPORT=true and verifies a collection
  export can be created via POST /v1/export/s3
- Operating skill updated with COLLECTION_EXPORT=true usage, deployment
  pattern, and env-var table entry

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>