feat: Separate the storage-users and graph to handle vault storage #559
Pull request overview
This PR introduces storage-id–aware routing and event handling to better support a dedicated “vault” storage provider, while also simplifying/removing some gateway caching layers.
Changes:
- Add storage_id propagation/filtering in gateway + spaces registry, and introduce a vault storage provider ID constant.
- Make decomposedfs async event consumption configurable via events.consumer_group, and ignore postprocessing events that target a different storage.
- Remove the gateway "create home / create personal space" caching types and client wrappers, and refactor some client acquisition to use pool directly.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| pkg/utils/utils.go | Adds VaultStorageProviderID constant. |
| pkg/storage/utils/decomposedfs/options/options.go | Adds events.consumer_group option with default dcfs. |
| pkg/storage/utils/decomposedfs/decomposedfs.go | Uses configurable consumer group and filters postprocessing events by storage id. |
| pkg/storage/registry/spaces/spaces.go | Adds vault filtering and threads storage_id into provider selection/filtering. |
| pkg/storage/cache/cache.go | Removes CreateHome/CreatePersonalSpace cache interfaces and registries. |
| pkg/storage/cache/createhome.go | Deletes legacy create-home cache implementation. |
| pkg/storage/cache/createpersonalspace.go | Deletes legacy create-personal-space cache implementation. |
| pkg/events/postprocessing.go | Adds ResourceID to postprocessing events. |
| internal/http/services/owncloud/ocs/handlers/apps/sharing/shares/spaces.go | Simplifies provider client acquisition via pool. |
| internal/grpc/services/storageprovider/storageprovider.go | Ensures storage id is added to RootInfo.Id in Stat responses. |
| internal/grpc/services/gateway/storageprovidercache.go | Updates cache keying to include storage_id (though wrapper usage changed elsewhere). |
| internal/grpc/services/gateway/storageprovider.go | Propagates storage_id and switches to direct pool clients (removing cached wrappers). |
| internal/grpc/services/gateway/gateway.go | Removes create-personal-space cache config/field wiring. |
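The consumer-group defaulting described for options.go could look roughly like this. This is a sketch: only the config key events.consumer_group and the default dcfs come from the summary above; the struct layout and the applyDefaults helper are assumptions.

```go
package main

import "fmt"

// Options models the new decomposedfs event options. Only the mapstructure
// key "consumer_group" and the default "dcfs" come from the PR summary; the
// rest of the layout is illustrative.
type Options struct {
	ConsumerGroup string `mapstructure:"consumer_group"`
}

// applyDefaults fills in the documented default when no group is configured,
// so separate storages (e.g. storage-users vs. vault) can opt into distinct
// consumer groups and stop stealing each other's events.
func applyDefaults(o Options) Options {
	if o.ConsumerGroup == "" {
		o.ConsumerGroup = "dcfs"
	}
	return o
}

func main() {
	fmt.Println(applyDefaults(Options{}).ConsumerGroup)                       // default group
	fmt.Println(applyDefaults(Options{ConsumerGroup: "vault"}).ConsumerGroup) // explicit group kept
}
```

With two decomposedfs instances subscribing under different group names, each instance receives its own copy of the event stream instead of competing for deliveries within one group.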
```go
// when a specific space type is requested we may skip this provider altogether if it is not configured for that type
// we have to ignore a space type filter with +grant or +mountpoint type because they can live on any provider
if requestedSpaceType != "" && !strings.HasPrefix(requestedSpaceType, "+") {
	found := false
	if storageId != "" && storageId != provider.ProviderID {
		// skip mismatching storageproviders
		continue
	}
	for spaceType := range provider.Spaces {
```
```go
switch ev := event.Event.(type) {
case events.PostprocessingFinished:
	sublog := log.With().Str("event", "PostprocessingFinished").Str("uploadid", ev.UploadID).Logger()
	if ev.ResourceID != nil && ev.ResourceID.GetStorageId() != "" && ev.ResourceID.GetStorageId() != fs.o.MountID {
		sublog.Debug().Msg("ignoring event for different storage")
		continue
	}
```
Pull request overview
This PR introduces storage-id–aware routing and event handling to support a dedicated “vault” storage alongside existing storage providers (related to owncloud/ocis#12108). It propagates a storage_id signal through gateway/registry flows and adds event-consumer separation to avoid cross-storage interference.
Changes:
- Add a vault storage provider ID constant and propagate storage_id through gateway requests.
- Make the decomposedfs async postprocessing consumer group configurable and ignore postprocessing events from other storages.
- Update spaces registry/provider discovery and gateway provider-cache keying to consider storage_id.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| pkg/utils/utils.go | Adds a Vault storage provider ID constant. |
| pkg/storage/utils/decomposedfs/options/options.go | Adds events.consumer_group config + defaulting. |
| pkg/storage/utils/decomposedfs/decomposedfs.go | Uses configurable consumer group; filters postprocessing events by storage. |
| pkg/storage/registry/spaces/spaces.go | Filters vault spaces when storage_id absent; threads storage_id into provider selection. |
| pkg/events/postprocessing.go | Extends postprocessing events with ResourceID. |
| internal/http/services/owncloud/ocs/handlers/apps/sharing/shares/spaces.go | Simplifies provider client acquisition (direct pool call). |
| internal/grpc/services/storageprovider/storageprovider.go | Ensures storage provider IDs are filled in additional metadata IDs. |
| internal/grpc/services/gateway/storageprovidercache.go | Makes provider-cache key storage-aware via storage_id. |
| internal/grpc/services/gateway/storageprovider.go | Passes storage_id through CreateHome/CreateStorageSpace and ListStorageSpaces. |
```go
// Filter out vault spaces if no storageId is provided
if space.GetRoot().GetStorageId() != "" {
	if space.GetRoot().GetStorageId() != provider.ProviderID {
		continue
	}
} else {
	if strings.HasPrefix(sc.MountPoint, "/vault/") {
		continue
	}
}
```
This vault-space filter checks strings.HasPrefix(sc.MountPoint, "/vault/"), but MountPoint is documented as potentially being a regex (and it may also be configured as /vault without the trailing slash). In those cases this filter won't behave as intended. Consider filtering based on the resolved spacePath (or introducing an explicit config flag for vault spaces) instead of using a prefix check on the mountpoint pattern.
```go
// Filter out vault spaces if no storageId is provided
if storageId == "" && strings.HasPrefix(sc.MountPoint, "/vault/") {
	continue
}
```
Same as above: vault-space filtering uses strings.HasPrefix(sc.MountPoint, "/vault/") even though MountPoint can be a regex / may not have a trailing slash. This can lead to vault spaces leaking into results when storageId == "". Consider basing the check on the resolved spacePath (or an explicit vault flag) instead of the mountpoint pattern string.
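The reviewer's concern can be demonstrated in isolation. Assuming a few plausible mount-point configurations (the three values below are illustrative, not taken from any real config), the raw prefix check only classifies the first one as a vault mount:

```go
package main

import (
	"fmt"
	"strings"
)

// isVaultByPrefix mirrors the check used in the PR. MountPoint is documented
// as possibly being a regex, and the trailing slash is not guaranteed, so
// only the canonical "/vault/..." form matches.
func isVaultByPrefix(mountPoint string) bool {
	return strings.HasPrefix(mountPoint, "/vault/")
}

func main() {
	for _, mp := range []string{"/vault/", "/vault", "^/vault(/.*)?$"} {
		fmt.Printf("%q -> %v\n", mp, isVaultByPrefix(mp))
	}
}
```

Only the first mount point prints true; the no-slash and regex forms slip through the filter, which is why matching on provider.ProviderID (or the resolved spacePath) is the more robust check.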
- update ListStorageProviders cache
- update findProvidersForFilter
- fix the finishUpload event
- restrict webdav copy/move from the vault
Pull request overview
Copilot reviewed 19 out of 19 changed files in this pull request and generated 10 comments.
```go
		continue
	}
} else {
	if strings.HasPrefix(sc.MountPoint, "/vault/") {
```
spaceConfig.MountPoint is documented as "Can be a regex", but this new vault filtering uses strings.HasPrefix(sc.MountPoint, "/vault/"). If mount points are configured as regexes (e.g. with ^), this filter won't match and vault spaces may leak into results. Prefer filtering by provider/storage ID (e.g. provider.ProviderID == utils.VaultStorageProviderID) or by matching against the resolved spacePath instead of the raw MountPoint string.
```diff
-	if strings.HasPrefix(sc.MountPoint, "/vault/") {
+	if provider.ProviderID == utils.VaultStorageProviderID {
```
```go
	continue
}
// Filter out vault spaces if no storageId is provided
if storageId == "" && strings.HasPrefix(sc.MountPoint, "/vault/") {
```
spaceConfig.MountPoint is documented as "Can be a regex", but this new vault filtering uses strings.HasPrefix(sc.MountPoint, "/vault/"). If mount points are configured as regexes, this check can fail and vault spaces may appear unexpectedly. Prefer filtering by provider/storage ID (e.g. provider.ProviderID == utils.VaultStorageProviderID) or by matching against the resolved spacePath.
```diff
-	if storageId == "" && strings.HasPrefix(sc.MountPoint, "/vault/") {
+	if storageId == "" && provider.ProviderID == utils.VaultStorageProviderID {
```
```go
package auth

import (
```
This new file is missing the standard Apache license header that other files in this package include (e.g. internal/grpc/interceptors/auth/scope.go:1-17). Please add the repository’s usual header block at the top for consistency/compliance.
```go
package ctx

// MFAOutgoingHeader is the gRPC metadata key used to propagate MFA status across
```
This new file is missing the standard Apache license header that other files in pkg/ctx include (e.g. pkg/ctx/userctx.go:1-17). Please add the usual header block at the top for consistency/compliance.
```go
}

func (s *svc) prepareCopy(ctx context.Context, w http.ResponseWriter, r *http.Request, srcRef, dstRef *provider.Reference, log *zerolog.Logger, destInShareJail bool) *copy {
	// restict copy from the vault
```
This comment has a typo: "restict" → "restrict".
```diff
-	// restict copy from the vault
+	// restrict copy from the vault
```
```go
case errtypes.PermissionDenied:
	if strings.Contains(err.Error(), "MFA required") {
		rw.Header().Set("X-Ocis-Mfa-Required", "true")
		rw.WriteHeader(http.StatusForbidden)
	} else {
		rw.WriteHeader(http.StatusNotFound)
	}
```
Detecting the MFA condition via strings.Contains(err.Error(), "MFA required") is brittle (message wording changes will silently break the behavior). Consider using a dedicated error type/sentinel (e.g., errors.Is), or checking a structured status/detail that the MFA interceptor can set, so this remains stable across refactors/localization.
```go
// service boundaries. The "autoprop-" prefix causes the metadata interceptor
// (internal/grpc/interceptors/metadata) to forward it automatically at every
// gRPC hop, so no manual re-forwarding is required.
// The const rgrpc.AutoPropPrefix causes the cycle import
```
Grammar nit: "causes the cycle import" should be "would cause a cyclic import" (or similar).
```diff
-// The const rgrpc.AutoPropPrefix causes the cycle import
+// Using rgrpc.AutoPropPrefix here would cause a cyclic import.
```
```go
ctx = grantMFAForServiceAccount(ctx, u)
if conf.MFAEnabled {
	if mfav := metadata.ValueFromIncomingContext(ctx, ctxpkg.MFAOutgoingHeader); !slices.Contains(mfav, "true") {
		log.Warn().Err(err).Msg("access token is invalid: MFA is required")
		return mfaResponse(ctx, req, info)
	}
}
```
grantMFAForServiceAccount appends the MFA flag to the outgoing context, but the MFA gate checks metadata.ValueFromIncomingContext(...). As a result, service accounts will still fail the MFA check even though they should be implicitly allowed. Fix by bypassing the MFA check for USER_TYPE_SERVICE, or by injecting the value into the incoming metadata (e.g., wrap the context with a new incoming metadata.MD).
```go
if conf.MFAEnabled {
	if mfav := metadata.ValueFromIncomingContext(ctx, ctxpkg.MFAOutgoingHeader); !slices.Contains(mfav, "true") {
		log.Warn().Err(err).Msg("access token is invalid: MFA is required")
		return mfaResponse(ctx, req, info)
	}
}
```
I think this should be handled by the service. Some services might allow anyone to access some parts, while access to other parts might require additional privileges.
You are right, but it is out of the vault storage scope in my opinion.
I guess we can leave it for now... all the services should have the MFA disabled except the vault, so it should be fine in the short term. Maybe a TODO comment as a reminder to move this piece is good enough.
```go
// therefore never carry an acr/MFA claim. Granting them implicit MFA allows
// them to access MFA-gated resources such as vault storage without
// compromising the MFA requirement for regular users.
func grantMFAForServiceAccount(ctx context.Context, u *userpb.User) context.Context {
```
I think this is the wrong place to set this up.
The same way we get the MFA claim from Keycloak, each command should set the MFA claim itself. I mean, if the command needs access to an MFA-protected resource, the command should get a "token" (in this case, just setting the context variable), and the system should propagate that information the same way it does for a regular user.
```go
w.WriteHeader(http.StatusConflict)
b, err := errors.Marshal(http.StatusBadRequest, "destination is not allowed", "", "")
```
I think we should use the same error code, and StatusConflict doesn't seem a good one to use here. If there is anything else in place (HandleWebdavError might do things), we probably need some comment explaining why the errors are different.
```go
// Downstream services (e.g. the mfa gRPC interceptor) can use this
// as a secondary MFA proof independent of the X-Multi-Factor-Authentication
// HTTP header propagation path.
if acr, ok := claims["acr"].(string); ok && acr != "" {
```
Is this needed? In theory, the acr should be handled by the oCIS proxy, so unless there is some direct access to a reva resource this shouldn't be needed. I'm fine with keeping it, although I'd like to know the use case.
Separate the storage-users and graph to handle vault storage
Related to owncloud/ocis#12108