feat: KEEP-177 add Safe wallet integration#923
Conversation
Adds three tables to persist Safe smart-account state per organization: - safe_wallets: one row per (org, chain) Safe deployment, owned by the org's Turnkey EOA at threshold 1. Stores address, salt nonce, deploy tx hash, factory/singleton pointers. - safe_module_installations: tracks which Safe modules are enabled per Safe. module_type is free-form text (seeded with "allowance") so future modules (Zodiac Roles, Transaction Guard) can coexist without schema churn. - safe_token_limits: per (safe, delegate, token) spending caps enforced by the Allowance Module. amount_wei is text-encoded uint256, period_minutes maps to the module's resetTimeMin field. All three are covered by the new 0052_typical_zarek migration chained after staging's 0051_rename_aave_slug_to_v3.
Introduces lib/safe/contracts.ts as the single source of truth for Safe's deployed contracts. All addresses are Nick's deterministic deployer addresses, identical across every EVM chain we support (Ethereum, Optimism, Base, Arbitrum), so one canonical map covers every target chain. - Registers Safe v1.4.1 singleton (L2 variant for event emission), proxy factory, compatibility fallback handler, MultiSend, MultiSendCallOnly - Registers the Safe Allowance Module v0.1.0 for later use in the spending-limits flow - Exposes getSafeContracts(chainId), getSafeSingletonForDeploy, and getAllowanceModuleAddress helpers with a CHAIN_OVERRIDES hook for the rare chain that ships non-canonical deployments - SUPPORTED_SAFE_CHAIN_IDS + isSafeSupportedChain() narrow chain IDs to the four we gate Safe operations to today Source: github.com/safe-global/safe-deployments
Adds the deploy flow for organization Safes: one Safe per (org, chain), owned by the org's Turnkey EOA at threshold 1. Same CREATE2 salt produces the same Safe address on every supported chain. - lib/safe/address.ts: pure helpers for setup() calldata, salt nonce derivation (keccak256 of org + chain), and ProxyCreation event parsing - lib/safe/deployment.ts: deployOrgSafe orchestrator that reuses the existing transaction-manager pipeline (nonce session, adaptive gas, RPC failover). Idempotent via (org, chain) uniqueness - app/api/user/safe/route.ts: admin-gated GET list + POST deploy - components/safe/deploy-safe-card.tsx: chain picker + deploy button + list of deployed Safes, rendered in the wallet overlay - tests/unit/safe-address.test.ts: 12 tests covering calldata encoding, salt determinism, and event parsing
Layers on-chain spending caps on top of deployed Safes via Safe's canonical Allowance Module. Limits are per (org, chain, token), enforced by the module contract, and show natively under the Spending Limits tab on app.safe.global. - lib/safe/allowance-module.ts: pure calldata helpers for enableModule, addDelegate, setAllowance, deleteAllowance, and the pre-validated-signature execTransaction wrapper for threshold-1 Safes - lib/safe/modules.ts: install + set + revoke orchestrators, plus listSafeAllowancesWithChainState which reads the module on-chain on every list call and drops DB rows that were externally revoked (reconcile-on-read) - lib/safe/auth.ts: validateSafeAdmin + getSafeForOrg helpers shared across the allowance routes - Three admin-gated API routes: POST /modules/allowance (install), GET+POST /allowances (list+set), DELETE /allowances/:token (revoke) - components/safe/spending-limits-card.tsx: renders per-safe limits with live "spent / cap", period label, and reset countdown; Add + Edit dialog share the same POST endpoint; Revoke calls DELETE - Wires the card into DeploySafeCard for every deployed Safe - tests/unit/safe-allowance.test.ts: 12 tests covering calldata round-trips, uint96/uint16 bounds, pre-validated sig format, and getTokenAllowance decoding
logSystemError forwards its context object into a Prometheus counter whose label set is fixed at init time: error_category, error_context, is_user_error, error_type, plugin_name, action_name, service, chain_id, table, endpoint, component, workflow_id, execution_id, integration_id, status_code. The Safe code was passing organizationId and operation, neither of which is in that label set, which caused prom-client to throw: Added label "organization_id" is not included in initial labelset Moves the org/safe/token identifiers into the log message string so they still appear in Sentry/Loki output, and replaces the operation label with the existing component convention. Applies to all four Safe API routes and the three orchestrator catch blocks.
Adds every EVM mainnet + testnet where (a) KeeperHub already has RPC and chain metadata seeded and (b) Safe v1.4.1 + Allowance Module are deployed at the canonical Nick's-deployer addresses. Mainnets added: BNB Smart Chain (56), Polygon (137), Avalanche (43114). Testnets added: Sepolia (11155111), Base Sepolia (84532), Arbitrum Sepolia (421614), BSC testnet (97), Polygon Amoy (80002), Avalanche Fuji (43113). Optimism (10) removed: it was in the previous SUPPORTED list but no KeeperHub chain seed row exists for it, so chainIsEnabled() would always reject it. - SUPPORTED_SAFE_CHAIN_IDS reworked with inline comments per chain - DeploySafeCard CHAIN_LABELS covers all new chains so the UI shows proper names instead of "Chain 56" - spending-limits-card COMMON_TOKENS pre-populates USDC/USDT/DAI for each mainnet and Circle testnet USDC for Sepolia / Base Sepolia / Arbitrum Sepolia. Remaining testnets (BSC / Amoy / Fuji) ship with empty arrays; admins can still deploy + enable module, just no preset token picker until Circle canonicalizes their testnets or we add chain-specific defaults
Commit 36e87a5 narrowed SUPPORTED_SAFE_CHAIN_IDS and also dropped Optimism from the CHAIN_LABELS map. If anything ever surfaces a chain-10 row (legacy Safe rows, stale caches, or the list is later reopened) the UI fell back to "Chain 10". Broadens CHAIN_LABELS to cover every chain we might render -- not just the deploy-supported subset -- so the label is always correct. Adds Optimism (10) and Optimism Sepolia (11155420) while I'm in here.
Wires Optimism (10) and Optimism Sepolia (11155420) end-to-end so chainIsEnabled() no longer rejects Safe deploys on these chains. - lib/rpc/rpc-config.ts: adds OP_MAINNET / OP_MAINNET_FALLBACK / OP_SEPOLIA / OP_SEPOLIA_FALLBACK public RPCs and the two CHAIN_CONFIG entries mapping chain IDs to the op-mainnet / op-sepolia keys - scripts/seed/seed-chains.ts: adds chain rows for both networks plus matching Etherscan V2 explorer configs and the canonical name map - lib/safe/contracts.ts: re-adds 10 and 11155420 to SUPPORTED_SAFE_CHAIN_IDS now that the backend resolves them; drops the "Optimism intentionally excluded" comment that is no longer true Seeding is idempotent; running pnpm seed-chains on any env picks up the new rows without impact on existing orgs.
logUserError forwards its context object into a Prometheus counter whose label set is fixed at init time: error_category, error_context, is_user_error, error_type, plugin_name, action_name, service, chain_id, table, endpoint, component, workflow_id, execution_id, integration_id, status_code. executeTransaction and executeContractTransaction were passing `nonce` and `method` which blow up with: Added label "nonce" is not included in initial labelset Moves those identifiers into the log message string so they still appear in Sentry/Loki output, and keeps only approved metric labels. Affects any write-tx error path, not just Safe; surfaced while testing the Safe deploy flow.
Three related fixes to the per-token spending-limits flow: Decimals: lib/contracts/tokens.ts had no Sepolia/Base-Sepolia/ Arbitrum-Sepolia/OP-Sepolia entries. When an admin set a limit on testnet USDC the server fell back to decimals=18, so "200 USDC" got stored as 200e6 wei paired with decimals=18 and the edit dialog later showed 0.0000000002. Adds the four Circle testnet USDC rows so new limits resolve correctly. Reconcile on read: the allowances GET now overlays fresh getTokenInfo output on each row so previously-miscoded rows (from before the fix above) display with correct symbol/decimals without a DB repair. Known tokens in the server registry always win over the DB cache. Case-insensitive catalog match: the DB stores lowercase addresses while the UI catalog uses EIP-55 checksummed ones, which broke the Edit dialog's <Select /> pre-fill. Normalizes both sides when matching and when pre-filling the selection. Custom token support: adds a "Custom token..." entry at the bottom of the picker that reveals address + symbol + decimals inputs. Validated client-side (address format, decimals range, symbol required); the POST body forwards the client-supplied metadata and the server uses it only when the token is not in its registry. Also adds USDS (Sky Protocol) to the Ethereum mainnet and Base catalogs.
# Conflicts: # components/overlays/wallet-overlay.tsx
Adds the per-(org, chain) signing switch infrastructure so workflow
writes can execute from the Safe instead of the Turnkey EOA. This
commit lands only the backend plumbing; UI toggle + balance display +
transfer step wiring follow in separate commits.
Schema:
- safe_wallets.is_signing_active boolean, default false. Additive,
zero-risk migration; every existing row defaults to eoa mode.
Resolver (lib/safe/signer-resolver.ts):
- resolveSignerMode(orgId, chainId) returns either "eoa" (current
behavior) or "safe" (Safe address + owner address) based on the
new flag plus status = "deployed".
Execution helper (lib/safe/execute-as-safe.ts):
- executeContractCallAsSafe and executeNativeTransferAsSafe wrap
an inner call in safe.execTransaction using the existing
buildExecTransactionCalldata + pre-validated signature plumbing
from the Allowance Module helpers. Turnkey EOA signs the outer
tx and still pays gas; on-chain msg.sender at the target becomes
the Safe.
API:
- PATCH /api/user/safe/[safeId] admin-gated toggle endpoint with
body { isSigningActive: boolean }. Rejects activation when the
Safe is not yet status = "deployed".
- GET /api/user/safe now includes is_signing_active per Safe.
Write-contract-core wiring:
- Resolves signer mode early. In safe mode we skip the ERC-4337
sponsored path (would change msg.sender away from the Safe) and
route through executeContractCallAsSafe inside the existing
nonce session. EOA path is unchanged.
Transfer-funds / transfer-token / approve-token cores still execute
from the EOA and will be wired in subsequent commits.
Adds the same EOA vs Safe branching landed for write-contract-core to the three remaining web3 write steps. In safe mode each step wraps the inner call in safe.execTransaction via executeContractCallAsSafe (or executeNativeTransferAsSafe for native value), skips the ERC-4337 sponsored path (the bundler would change msg.sender away from the Safe), and submits the outer tx with the Turnkey EOA as signer. - transfer-funds-core.ts: native transfer from the Safe's balance - transfer-token-core.ts: ERC-20 transfer from the Safe's token balance - approve-token-core.ts: ERC-20 approval where the Safe is the owner whose allowance the spender gets All four web3 writes now respect the per-(org, chain) signing toggle.
Ships the Phase 3 UI on top of the backend routing committed earlier.
SafeSigningToggle component:
- Per (org, chain) switch in each deployed Safe card
- PATCH /api/user/safe/[safeId] { isSigningActive }
- Admin-gated; disabled while in-flight; toasts on success/failure
- Explanatory copy about where funds need to live and who pays gas
DeploySafeCard rework:
- Each deployed Safe is an inner DeployedSafeRow component
- Safe address row has copy button, explorer link, and "View on Safe"
deep-link (app.safe.global/home?safe=<prefix>:<addr>)
- "Active signer" badge appears next to chain label when the toggle
is on
- Status chip uses the same muted badge style as other wallet cards
- "Deploy on new chain" is now a single button that opens a Dialog
with the chain picker, matching the Add-Token flow on the balances
tab
chain-prefixes module:
- getSafeAppUrl(chainId, address) maps to app.safe.global EIP-3770
slugs (eth, base, arb1, oeth, sep, basesep, etc.)
- getExplorerAddressUrl / getExplorerTxUrl return the canonical
Etherscan V2 / Snowtrace / BscScan URL; unknown chains return null
and the UI hides the link
SpendingLimitsCard polish:
- Limit rows are flex-col with three zones: amount line, period +
reset, token address + copy/explorer links
- Period labels capitalized (Daily / Weekly / Monthly)
- Token address renders as font-mono code with copy + explorer-link
icons, matching the wallet-address-card pattern
Tests:
- tests/unit/safe-signer-resolver.test.ts covers eoa/safe branching
based on row presence, status, and flag; plus chain-prefix helpers
- tests/unit/approve-token.test.ts mocks the two new Safe modules so
the existing tests continue to exercise the eoa path
- Filter the Add dialog's token dropdown to exclude tokens that already have a limit. Allowance Module stores one row per (delegate, token), so picking an already-limited token silently overwrote the existing one -- users should Edit an existing row instead. Custom token option still appears regardless. - Dropdown alignment: both Token and Period selects open with their start edge anchored to the trigger's left edge (align="start") rather than the default center-ish popper alignment.
Same fix applied to approve-token.test.ts -- the core now imports resolveSignerMode and executeContractCallAsSafe which transitively load ethers.Interface() at module init, and the test's lightweight ethers mock doesn't support the constructor shape. Add shallow mocks for the two Safe modules so the core's eoa path exercises normally and the CI test-unit job goes green.
Drops the ceremonial Allowance Module (it did not enforce anything on-chain beyond a per-token cap) in favour of the Zodiac Roles Modifier v2, which scopes every workflow call through a pre-audited permission tree: per-target contract allowlist, per-function selector allowlist, per-parameter conditions (recipient pinned to the Safe, token in the allowed set, amount within the role allowance). Adds: - lib/safe/roles-orchestrator.ts: installs Roles proxy, assigns role to Turnkey EOA, applies per-protocol permissions and per-token allowances in a single MultiSend. Flattens the new per-protocol TokenLimitInput[] payload with max-amount / min-period conflict resolution. - lib/safe/condition-templates.ts: per-protocol preset builders backed by karpatkey's defi-kit, plus a target-only fallback and a trivial WETH template. - lib/safe/protocol-registry.ts: 18-entry catalog with enforcement level (full / target-only) and per-chain availability. - lib/safe/protocol-targets.ts: per-chain target contract addresses consumed by the target-only fallback. - lib/safe/zodiac-contracts.ts / zodiac-roles.ts: canonical address registry, calldata builders, MultiSend packing. - lib/safe/simulate.ts: desired-role builder used by the SDK's diffing path. - lib/safe/chain-state.ts: on-chain reconciliation helpers for role and allowance state. - lib/safe/price-oracle.ts: Chainlink native/USD conversion for the simulate endpoint gas breakdown. - app/api/user/safe/[safeId]/role: GET / POST / allowances CRUD / simulate routes. Accept both the new ProtocolInput[] shape and the legacy shape during the wizard migration. Routes workflow steps through the Safe when signing is active: transfer-funds, transfer-token, approve-token, write-contract now delegate to the orchestrator path when the Safe has an active role. execute-as-safe and signer-resolver updated accordingly. Removes the superseded surface: - app/api/user/safe/[safeId]/allowances/* and modules/allowance - components/safe/spending-limits-card.tsx - lib/safe/modules.ts - tests/unit/safe-allowance.test.ts
…ement badges
Consolidates the deploy-with-policies flow and the post-deploy install
dialog behind one component, PolicyWizard. Both render the same UI for
the 18-protocol catalog and emit the same payload shape:
{ protocols: [{ slug, tokens: [{ tokenAddress, tokenSymbol,
tokenDecimals, amountHuman, periodSeconds }] }] }
Per-protocol cards show:
- Full policy / Target-only enforcement badge sourced from the catalog
- Target-contract list with explorer Verify links (built via
buildAddressUrl + the chain's explorer config)
- A structured token row editor: symbol badge, truncated address with
explorer link, amount input, Daily / Weekly / Monthly period select,
remove. Replaces the CSV-of-symbols textbox.
Token selection goes through a shared TokenPicker that merges the
system supportedTokens list, the org's organizationTokens for that
chain, and offers a Paste-custom-address fallback wired to
POST /api/user/wallet/tokens (reads symbol / name / decimals on-chain
and persists).
Deploy wizard deletes the duplicated policies step and renders the
shared PolicyWizard inside step 2, so deploy + install behave
identically. Review step surfaces applied / skipped / conflictedTokens
returned by the simulate and install endpoints.
Files:
- New: policy-wizard.tsx, policy-protocol-card.tsx, policy-token-row.tsx,
token-picker.tsx, role-permissions-card.tsx
- Modified: deploy-safe-card.tsx
Replace role-wide-per-token allowance buckets with per-(protocol, token) buckets so each enabled protocol owns its own spending cap on chain. Removes the cross-protocol conflict-resolution path; each protocol card in the wizard now persists independent allowances. - tokenAllowanceKey now hashes (roleKey, protocolSlug, tokenAddress) - safe_role_allowances gains a protocol_slug column; unique index swaps to (role_id, protocol_slug, token_address) - flattenInstallInput emits one allowance per (protocol, token) with no cross-protocol dedup; ConflictedToken removed from the install result and every API response - setRoleTokenAllowance + revokeRoleTokenAllowance now require a protocolSlug; revoke endpoint takes it as a ?protocolSlug= query - role + simulate route schemas updated; simulate's enforcement-level comparison repaired against the renamed "contract-allowlist" value Migration drops existing safe_role_allowances rows and the old unique index; greenfield drop authorised since no production rows exist.
…onflict UI - Extract `ProtocolTokenAllowances` so the Add-token + token-rows pattern can be reused outside the wizard. Fully controlled, all state lives in the parent. - Refactor `PolicyProtocolCard` so the chevron collapses the entire body (token rows + scoped-contracts list). Tokens persist through any combination of expand/collapse/enable/disable. Newly enabling a protocol auto-expands its card so the seeded tokens are visible. - Cap the deploy and install dialogs at 85vh with vertical scroll so the modal does not overflow the viewport. - Drop the conflict UI now that allowances are per-(protocol, token): the "Token conflicts resolved" review block, the SimulationPlan conflictedTokens field, the install-toast warning, and the obsolete bullet in the wizard's "How this works" panel. Replaces the bullet with copy explaining per-protocol independence.
…g merge Staging shipped its own 0052_agentic_wallets and 0053_sleepy_franklin_richards under the same numbers we used for Safe role tables and per-protocol allowances. Those numbers are about to collide on merge. Drop our 0052_military_photon and 0053_futuristic_living_lightning, fix the broken journal entry for idx 51 (was pointing at a tag whose .sql had been renamed to 0051_rename_aave_slug_to_v3), and let staging land cleanly. The Safe schema will be re-emitted as a single consolidated migration after the merge.
…-wallet-integration # Conflicts: # .gitignore # components/overlays/wallet/manage-tab.tsx # plugins/web3/steps/approve-token-core.ts # plugins/web3/steps/transfer-funds-core.ts # plugins/web3/steps/transfer-token-core.ts # plugins/web3/steps/write-contract-core.ts # pnpm-lock.yaml
Replaces the two branch-only migrations dropped pre-merge (military_photon + futuristic_living_lightning) with one migration that creates safe_wallets, safe_roles, safe_role_protocols, safe_role_allowances against staging's post-0064 baseline. The per-(role, protocol_slug, token_address) unique index on safe_role_allowances is part of this migration; no separate ALTER step needed since the column lands at table-creation time. Also two post-merge fixups: - lib/db/schema-agentic-wallets.ts: bigint default switched from BigInt(0) to sql\`0\`. Identical runtime semantics, but JSON-serialisable so drizzle-kit generate can emit a snapshot. Pre-existing issue surfaced while generating this migration; the staging code path that uses this column never hit it because no new generate ran. - lib/safe/execute-as-safe.ts: RpcProviderManager import path moved to @/lib/rpc/providers (canonical export site after staging's RPC refactor).
Adds 12 unit tests covering the two highest-signal pure / lightly-coupled surfaces in lib/safe/roles-orchestrator.ts (which had no dedicated tests until now): flattenInstallInput (9 tests) - empty input -> empty result - one allowance row per (protocol, token) with slug + symbol surfaced - same token across two protocols -> independent buckets, no merge - ERC-20 direct rule -> bucket with directRule populated, slug "direct" - native direct rule -> tokenAddress = NATIVE_TOKEN_SENTINEL - ERC-20 direct rule with null tokenAddress -> silently skipped - token addresses normalized to lowercase storage form - humanToWei sub-decimal precision (0.000001 USDC == 1 wei-equivalent) - symbol set semantics across protocols + direct rules getSafeRole (3 tests) - DB hit returns the role row - DB miss returns null - revoked roles surface unchanged (status filtering is the caller's job) Out of scope (deliberately) and filed as a follow-up: - installRolesWithInitialConfig - updateRolesConfig - setRoleTokenAllowance / revokeRoleTokenAllowance - reconcileSafeRoleFromChain - getSafeRoleWithBackfill These are contract-write or chain-read paths whose real risks (calldata correctness, on-chain state interpretation, probe-set coverage) are not catchable with mocks. Anvil-backed integration tests are the right tool for them.
…odifier branch
Adds a Sepolia-fork integration test surface for lib/safe/roles-orchestrator
that exercises real Safe v1.4.1 + Zodiac Roles bytecode on chain. The unit
tests cover pure transformations; these prove the orchestrator's calldata
and chain-state interpretation agree with the actual contracts.
Infrastructure:
- New docker-compose service `test-anvil-fork` (port 8547) running
`anvil --fork-url <public Sepolia RPC>` --chain-id 11155111. Fork URL is
ANVIL_FORK_URL-overridable for private archive nodes.
- tests/e2e/vitest/safe-fork-helpers.ts: fork connection helper, anvil
EOA, and `deployFreshSafe(wallet)` raw deployer using the production
buildSetupCalldata + buildCreateProxyCalldata builders (so a successful
deploy proves those builders agree with Safe v1.4.1 bytecode on chain).
Test (a) bare-Safe reconcile:
- Deploys a fresh Safe (owner = anvil EOA, threshold 1) on the fork.
- Mocks DB at the function-call seam (orchestrator's findRoleForSafe etc.
return controlled fixtures) so the test focuses on chain interactions.
- Mocks initializeWalletSigner to return an ethers.Wallet bound to the
fork (bypassing Turnkey) -- the orchestrator treats the signer opaquely.
- Mocks getRpcUrlByChainId so the orchestrator's internal RPC manager
resolution targets our fork rather than the public Sepolia RPC.
- Asserts reconcileSafeRoleFromChain returns
{ success: true, installed: false, reason: "no-roles-modifier" }.
Tests skip when SKIP_INFRA_TESTS=true or the fork is unreachable,
matching the existing pattern in tests/e2e/vitest/*. Local runs require
`docker compose --profile test up -d test-anvil-fork`.
Follow-up commits on this branch will add:
- (b) install + reconcile round-trip with a real Roles modifier deployment
- (c) update / setAllowance / revoke flows
- CI: bring up test-anvil-fork in e2e-tests-ephemeral.yml
Adds test (b): full install + reconcile round-trip against the Sepolia fork. The orchestrator's `installRolesWithInitialConfig` now runs end-to-end against real Safe v1.4.1 and Zodiac Roles v2.1 bytecode: - deploys a Roles modifier proxy via moduleProxyFactory - enables it on the Safe via owner-signed execTransaction - scopes the modifier with defi-kit's permission planner - sets initial allowances This proves the orchestrator's calldata builders agree with the real contracts on chain. Mocks don't simulate the permission planner or the modifier's scope/allowance setters; both have to actually accept the bytes we send. Additional infrastructure on the test file: - Stub nonce manager (the real one needs DB locks via SELECT FOR UPDATE and tracks pending_transactions rows; we feed monotonically-increasing nonces from chain state instead). The stub's startSession reads the current pending nonce from the fork so concurrent writes from this test don't collide with each other. - Sequenced dbSelectMock fixtures for the install flow's read order: findSafeForOrg -> safe row, findRoleForSafe -> [] (no existing). - dbInsertReturning is fed a synthetic safe_roles row for the upsert return value, since the install path reads role.id downstream. Test duration: ~30-45s for both (a) and (b) including Safe deploy, modifier proxy deploy, enableModule tx, scope tx, and allowance txs on the fork. Skipped when SKIP_INFRA_TESTS or fork unreachable.
The orchestrator integration tests (safe-roles-orchestrator-fork.test.ts) need a Sepolia-forked anvil reachable at localhost:8547. Adds a step that brings up the docker-compose `test-anvil-fork` service before `pnpm test:e2e:vitest` runs, with a 60s health poll on eth_chainId so the suite doesn't race a half-warm fork. The test file skips when the fork is unreachable (matching the existing SKIP_INFRA_TESTS pattern), so workflows without this step still pass the overall e2e job -- they just don't exercise the orchestrator-on-fork tests. With this CI step the tests run on every PR that hits the e2e-vitest-ephemeral job. Also pins the missing `chains` schema export in the test file's @/lib/db/schema mock (the gas strategy reads chain config at test time) and turns the where-clause mock into a thenable so list-shaped queries (no .limit()) resolve correctly. biome-ignore on the thenable is load-bearing -- drizzle's query builder is intentionally thenable and this mock matches that shape.
…it-tests test: KEEP-552 add unit + fork-integration tests for safe-roles-orchestrator
…-wallet-integration # Conflicts: # drizzle/meta/_journal.json
suisuss
left a comment
There was a problem hiding this comment.
Review: Zodiac Roles Modifier integration
Focused pass on the role install / scope / allowance surface. Verified against the merged code; one HIGH finding worth resolving before this lands on staging, plus a few medium/low items and follow-up suggestions.
Verified
| Claim | Where |
|---|---|
| Canonical ModuleProxyFactory (CREATE2) | lib/safe/zodiac-contracts.ts:27 |
| Singleton-via-deployModule | roles-orchestrator.ts:684-714 |
| Per-parameter presets (Aave V3, Compound V3, CoW, Uniswap V3, Spark, Lido, Rocket Pool, Balancer V2, Wrapped Native) | condition-templates.ts:255-313 via defi-kit |
| Contract-allowlist presets | protocol-registry.ts entries with templateSlug: \"target-only\" |
| ERC-20 direct rule condition | roles-orchestrator.ts:246-255 — calldataMatches([eq(counterparty), withinAllowance(allowanceKey)], [\"address\", \"uint256\"]) |
| Per-rule allowance key | zodiac-contracts.ts:136-148 — keccak256(roleKey ‖ keccak256(\"direct:<kind>:<lowercase-counterparty>\") ‖ getAddress(token)) |
HIGH — native-transfer caps are silently disabled
buildDirectRulePermission (roles-orchestrator.ts:218-234) has a conditional etherWithinAllowance: allowanceKey branch for native rules, but the install site at line 739-747 hard-codes allowanceKey = null for native:
const allowanceKey =
rule.kind === \"native-transfer\" || !rule.tokenAddress
? null
: directRuleAllowanceKey(...);So the etherWithinAllowance branch is unreachable in production. Meanwhile:
flattenInstallInputstill emits an allowance row for native rules (usingNATIVE_TOKEN_SENTINELas the token)- The per-allowance loop at line 819-858 calls
setAllowanceon chain for that bucket - But no condition reads from it
Net effect: a native-transfer direct rule with "10 ETH / week" cap will store the cap on chain as a (key, refill, period) bucket and enforce only target-only at the role level — i.e., the role can send any amount as long as the recipient matches. The wizard's per-rule cap for native rules is collected and persisted but not enforced.
This is either a code bug (line 739 should compute the native key and pass it through) or a UI bug (wizard shouldn't collect a cap until the EtherWithinAllowance path is fully wired). Pick one and fix the dead branch accordingly.
MEDIUM — defi-kit is caret-pinned
\"defi-kit\": \"^2.26.9\",
\"zodiac-roles-deployments\": \"3.3.0\",
\"zodiac-roles-sdk\": \"3.4.1\"zodiac-roles-* are exact (good). defi-kit is ^2.26.9 — a minor release can change the audited presets (e.g., aave_v3.deposit(...)) silently. The entire security boundary of per-parameter mode IS what those presets emit. Suggest exact pinning so upgrades become explicit code review with diff against new preset outputs.
- \"defi-kit\": \"^2.26.9\",
+ \"defi-kit\": \"2.26.9\",MEDIUM — "contract-allowlist" mode is coarser than the name suggests
buildTargetOnlyTemplate (condition-templates.ts:221) lets the role call any function on the protocol's target addresses. Concretely under Morpho Blue's target-only mode the role can call:
supply,borrow,withdraw— expectedcreateMarket,setFeeRecipient,enableLltv— likely not expected by the admin who toggled "Morpho Blue" on
The code is honest about this ("Target-level allowlist for protocols without per-parameter templates"), but the wizard's UX should make the asymmetry explicit. Suggest a different badge / explainer for target-only protocols, e.g. "allows all functions on listed contracts; per-function scoping ships in a future release."
LOW — roleKey derivation packs orgId as raw string
ethers.solidityPacked(
[\"string\", \"string\", \"string\", \"uint256\", \"string\"],
[\"kh-role:\", organizationId, \":\", BigInt(chainId), \":automation\"]
)solidityPacked does not length-prefix strings. UUID v4 orgIds are safe (no :), but if a future serializer change introduces orgIds with : in them, two different (orgId, chainId) pairs could pack to the same bytes. Cheap defensive fix:
ethers.keccak256(
ethers.solidityPacked(
[\"string\", \"bytes32\", \"uint256\"],
[\"kh-role:\", ethers.id(organizationId), BigInt(chainId)]
)
)LOW — counterparty case-folding inconsistency
directRuleAllowanceKey lowercases the counterparty before hashing into the string digest; the token in the same function is checksum-normalized via getAddress. Internally consistent today, but if another caller reconstructs the key using ethers.getAddress(counterparty) they'd get a different digest. Worth a docstring on directRuleAllowanceKey noting that counterparty is lowercased before hashing.
Worth surfacing (intentional, not a bug)
The role-holding EOA can bypass the modifier. ownerAddress (the Turnkey EOA) is assigned the role AND is the Safe's threshold-1 owner. It can:
- Call
rolesModifier.execTransactionWithRole(...)— gated by the role's scope + allowances - Call
safe.execTransaction(...)directly — bypasses the modifier entirely
This is the intended design (the EOA needs to be able to recover the Safe), but it means the role is policy, not boundary. A compromised Turnkey EOA can drain the Safe via path (2). Worth being explicit about this in user-facing language so readers don't assume the modifier is enforcing absolute limits.
The hardening path (if/when desired): multi-owner Safe with threshold>1. The schema (safe_wallets.owners jsonb, threshold integer) already supports this; the install wizard sets threshold=1.
What's well-designed
- Modifier ownership = avatar = target = Safe address. Puts the Safe in control of its own modifier configuration. The EOA can use the role but cannot widen it. Clean threat model.
- Single MultiSend for enableModule + assignRoles + setDefaultRole + scope + setAllowance. Atomic install — either everything lands or nothing does.
- Per-rule allowance keys.
(roleKey, kind, counterparty, token)quadruple isolates buckets correctly. "approve to A for USDC" and "transfer to B for USDC" do not share a cap. Verified by reading bothdirectRuleAllowanceKeyand the per-allowance setAllowance loop. - Reliance on defi-kit's audited presets for per-parameter mode. These are reviewed code shared across Karpatkey, Gnosis Guild, and others.
parseModuleProxyCreationEventfilters byfactoryAddressbefore parsing — cannot be fooled by a spoofed event from a malicious contract in the same tx.
Suggested follow-up tickets
- Decide what to do with native-transfer caps (HIGH finding) — wire the EtherWithinAllowance path through end-to-end, or remove the dead branch + DB bucket + wizard control.
- Pin
defi-kitexactly (one-linepackage.jsonchange). - Wizard UX clarity for target-only protocols (different badge / language).
- Document the EOA-bypass property in user-facing Safe wizard help text.
Items 1 and 2 are small. 3 and 4 are UX/comms work.
suisuss
left a comment
There was a problem hiding this comment.
Review: Recovery + sync
Focused pass on reconcileSafeRoleFromChain, the GET-path auto-backfill, and the "Refresh from chain" UI affordance.
Verified
| Claim | Where |
|---|---|
reconcileSafeRoleFromChain rebuilds DB cache from on-chain state |
roles-orchestrator.ts:2564+ |
getModulesPaginated enumerates Safe modules |
zodiac-roles.ts:377-405 |
| Zodiac subgraph for direct-rule details | roles-orchestrator.ts:2832 via zodiac-roles-sdk fetchRole |
| Probes per-(token, kind, counterparty) for direct rules | buildReconcileProbeSet + extractDirectRulesFromChain (roles-orchestrator.ts:2455+ and :2332+) |
| Marks role revoked when no modifier enabled | roles-orchestrator.ts:2587-2594 — update(safeRoles).set({status: \"revoked\"}) when findRolesModifierForSafe returns null |
| "Refresh from chain" button | components/safe/role-permissions-card.tsx:195 (syncFromChain) wires to POST /api/user/safe/[safeId]/role/reconcile |
| Auto-backfill on GET when Safe deployed + signing active but no DB role | getSafeRoleWithBackfill (roles-orchestrator.ts:2173+) and the signer-resolver probe (signer-resolver.ts:60-100, the workflow-write twin path) |
Findings
HIGH — subgraph outage during update can silently delete direct rules
The reconcile path's direct-rule discovery has two distinct sources of truth depending on what's being rebuilt:
- allowance buckets (caps, period, refill): probed directly on chain via
readRoleAllowance(modifier, key)for each candidate key. Chain is source of truth ✓ - direct rule details (
kind,counterparty): only the Zodiac subgraph viafetchRole(...). On subgraph outage the comment atroles-orchestrator.ts:2829-2832says it's swallowed and the step is skipped
The skip path returns [] from extractDirectRulesFromChain. That on its own would just mean "no direct rules surfaced in the UI this reconcile" — degraded but not destructive.
The compound risk is updateRolesConfig: that function calls reconcileSafeRoleFromChain internally to read "current" state before computing the diff to apply on chain. If the subgraph is unreachable during an update:
- Reconcile completes, but
safeRoleDirectRulesis empty (subgraph skipped) updateRolesConfigreadscurrentDirectRuleRows = listRoleDirectRules(role.id)→ empty- The desired-role build does not include those direct rules
- The diff against the on-chain scope removes their target/function permissions
- MultiSend lands, modifier loses the direct-rule scope
Net: a subgraph outage during update silently removes direct rules from the modifier's on-chain scope. The cap buckets stay (set on chain by the next setAllowance pass), but the conditions that gate them are gone.
Fix shape options:
- (a)
updateRolesConfigrefuses to proceed if subgraph is unreachable / returns empty when the modifier on chain has functions matching the ERC-20 selectors. Force user to retry when subgraph is back. - (b) Decode the modifier's scope tree from chain (via the modifier's view methods, not the subgraph) and reconstruct direct rules from there. Heavier but removes the subgraph as a write-path dependency.
- (c) Persist direct-rule rows in a way that survives reconcile-from-empty-subgraph — i.e., only delete direct rules from DB when chain explicitly confirms they're gone, not when the subgraph couldn't be reached.
I'd lean (a) for KEEP-177's timeline (smallest change) with a follow-up ticket for (b) or (c) as a hardening pass.
MEDIUM — reconcile's chain reads bypass executeWithFailover
reconcileSafeRoleFromChain constructs a single-armed RpcProviderManager at roles-orchestrator.ts:2568-2572 then reads chain state via the raw provider:
const rpcManager = await getRpcProviderFromUrls(rpcUrl, undefined, safe.chainId);
const provider = rpcManager.getProvider();
const enabledModules = await readEnabledSafeModules(provider, safe.safeAddress);
const rolesModifierAddress = await findRolesModifierForSafe(provider, ...);This is inconsistent with the rest of the post-KEEP-551 read paths and inconsistent with the signer-resolver's probeRolesModifierFromChain (which is routed through executeWithFailover — see the local commit aa040b67 "route Safe role-modifier chain probe through RPC failover").
Practical effect: a transient primary-RPC outage during "Refresh from chain" or during workflow-write GET auto-backfill returns a failure / null where a failover-aware probe would have succeeded.
Fix: route the two probe reads (readEnabledSafeModules, findRolesModifierForSafe) and the per-allowance readRoleAllowance calls through rpcManager.executeWithFailover(p => p.X), and pass a fallbackRpcUrl to getRpcProviderFromUrls (the chain row has it; signer-resolver does this already). Same pattern, two functions.
MEDIUM — delegateAddress is rebuilt from the current org wallet each reconcile
const ownerWallet = await getOrganizationWallet(safe.organizationId);
const ownerAddress = normalizeAddressForStorage(ownerWallet.walletAddress);
// ...
{
delegateAddress: ownerAddress,
// ...
}If the org rotates its EOA between install and reconcile, the DB row writes the new wallet address as the delegate. On chain, the role's assignRoles was bound to the old delegate, so:
- DB says: "delegate = X"
- Chain enforces: "delegate = Y"
- Workflow writes use
initializeWalletSigner(resolves to current EOA = X) → modifier rejects because X doesn't hold the role
Wallet rotation is rare but not impossible (Turnkey wallet recovery, org migration). Worth fixing by reading the delegate from chain — either from a RolesModifier.AssignRoles event log or from the modifier's members(roleKey, address) mapping — rather than reconstructing it from the current DB org wallet.
LOW — probe set is bounded by KeeperHub's catalog + existing DB rows
buildReconcileProbeSet builds candidates from:
- Catalog protocols × per-protocol default tokens
- Synthetic
\"direct\"slug × well-known tokens on this chain - Existing DB rows (preserves custom tokens)
A modifier configured outside the wizard (someone enabled a Roles modifier via safe.global or a script and gave it scope for a custom protocol or a non-well-known token) would have on-chain allowance buckets the probe never tries to read. Those buckets stay on chain, the DB sees nothing.
This is probably acceptable — KeeperHub only manages modifiers it installed itself. But the in-code comment says reconcile "closes the silent-bypass hole that previously downgraded these workflows to unscoped safe.execTransaction", which is true for the routing decision in signer-resolver but NOT for the DB cache itself. Worth being explicit that the cache is rebuilt from our catalog's view of the world, not from on-chain ground truth. Users will only ever see what we thought to probe for.
LOW — MAX_PAGES = 10 × PAGE_SIZE = 100 caps modules at 1000
readEnabledSafeModules (zodiac-roles.ts:377-405) hard-caps at 1000 enabled modules per Safe. That's plenty for normal Safes; only weird power-user configurations would hit it. Worth raising the cap or removing it (the iteration is bounded by the next == 0x0 sentinel anyway) so we don't silently miss modules on a Safe that has many enabled. Cheap defense.
NOTE — "Refresh from chain" / install race window is small but non-zero
If a user clicks "Refresh from chain" while an install or update's MultiSend has just confirmed on chain but the orchestrator's DB write is still in flight, the reconcile observes chain-installed + DB-not-installed, runs upsert via (safe_wallet_id, role_type) unique index. The install's subsequent transaction then upserts again. Order-of-writes determines which row "wins." No duplicate rows (good), but field values from the slower-to-commit transaction overwrite the faster one. Likely benign in practice; worth knowing exists.
What's well-designed
- Revocation on no-modifier-enabled correctly distinguishes "never had a role" from "role was revoked." Only flips status to
revokedif there's an existing non-revoked row — no spurious DB writes on first-time probes of a vanilla Safe. - API route authorization is layered correctly:
validateSafeAdminchecks the caller's role, thengetSafeForOrg(safeId, organizationId)confirms the Safe belongs to the caller's org. No IDOR. - Subgraph integration is best-effort with logged failures — outage doesn't break the rest of reconcile. The cap-vs-detail separation (chain for caps, subgraph for kind/counterparty) is the right line — finding #1 is about what happens downstream of the failure, not the failure handling itself.
- Pagination is correct for ordinary scale. Cursor sentinel + zero-check is the documented Zodiac pattern.
- GET-path backfill is idempotent via
(safe_wallet_id, role_type)unique index — concurrent backfills cannot produce duplicate role rows.
Suggested follow-up tickets
- Subgraph-outage destructive-update (HIGH) — refuse to update if direct rules can't be re-fetched, or decode from chain.
- Route reconcile reads through
executeWithFailover(MEDIUM) — consistency with the rest of post-KEEP-551 read paths. - Source
delegateAddressfrom chain (MEDIUM) — read frommembers()mapping orAssignRolesevent, not from current org wallet. - Raise / remove
MAX_PAGEScap (LOW) — sentinel-based termination is already correct.
suisuss
left a comment
There was a problem hiding this comment.
Review: Plugin coverage
Focused pass on the four write-path step files and verification that read-only steps don't accidentally route through the resolver.
Verified
| Claim | Where |
|---|---|
Four write steps route through resolveSignerMode |
transfer-token-core.ts:308, write-contract-core.ts:245, transfer-funds-core.ts:168, approve-token-core.ts:209 |
Three-mode dispatch (eoa / safe / safe-role) |
Each of the four files has the same if (safe-role) … else if (safe) … else (eoa) block around lines 274/376/394 of the respective files |
ERC-20 balance preflight reads balanceOf(safeAddress) in safe/safe-role |
transfer-token-core.ts:436-439: tokenHolderAddress = signerMode.kind === \"safe-role\" || signerMode.kind === \"safe\" ? signerMode.safeAddress : signerAddress |
| Sponsorship (ERC-4337) gated to EOA mode | All four files have signerMode.kind === \"eoa\" in the sponsorship eligibility check (the comment explicitly cites the 4337 bundler's msg.sender swap as the reason) |
| Read-only / transformation steps bypass the resolver | grep of read-contract*, check-balance*, check-allowance*, check-token-balance*, query-events*, query-transactions*, get-transaction*, decode-calldata*, batch-read-contract*, assess-risk* returns no resolveSignerMode or Safe-executor imports ✓ |
The balance-check fix in transfer-token-core is real and load-bearing — the previous code checked the EOA's balance, which is always near-zero in Safe mode and would have produced false-positive "insufficient balance" failures before the tx ever simulated against the Safe.
Findings
MEDIUM — preflight balance check exists only for transfer-token, not transfer-funds
transfer-token-core.ts:436-449 reads balanceOf(tokenHolder) and short-circuits with a clean error if insufficient. transfer-funds-core.ts (native ETH) has no analogous provider.getBalance(safeAddress) check. The orchestrator's estimateGas will still fail downstream, but the error message a user sees is much further from the cause ("transaction reverted" with cryptic revert data rather than "Safe has 0.4 ETH, you tried to send 10").
This is a pre-existing gap, not a regression from this PR — flagging because it's the natural completion of the same fix the description highlights. Suggest a follow-up to add a provider.getBalance(holderAddress) preflight in transfer-funds-core mirroring the transfer-token-core pattern (holder = signerMode.safeAddress in safe/safe-role, else EOA).
MEDIUM — Safe-routing block is duplicated across four files
Each of the four cores hand-codes the same three-arm dispatch:
if (signerMode.kind === \"safe-role\") {
receipt = await executeContractCallAsRole(signer, {
safeAddress, delegateAddress, rolesModifierAddress, roleKey,
contractAddress, abi, functionKey, args,
}, session, { chainId, workflowId, rpcManager });
} else if (signerMode.kind === \"safe\") {
receipt = await executeContractCallAsSafe(signer, {
safeAddress, ownerAddress, contractAddress, abi, functionKey, args,
}, session, { chainId, workflowId, rpcManager });
} else {
receipt = await adapter.executeContractCall(...);
}The boilerplate is ~25-30 lines per file × 4 files. Concerns:
- Drift risk: a change to one site (e.g. plumbing a new option through ExecuteAsSafeOptions) needs to be replicated in three more, with no compile-time check that it was.
- New-plugin onboarding: a fifth write plugin (e.g. a swap step) needs to copy this same block. Easy to forget a field, easy to forget
workflowId, easy to forget to threadrpcManager.
Suggest extracting a thin helper, e.g.:
// lib/safe/route-write.ts (or similar)
export async function executeWriteRoutedBySigner(
signer: ethers.Signer,
signerMode: SignerMode,
call: { contractAddress; abi; functionKey; args; value? },
session: NonceSession,
options: { chainId; workflowId?; rpcManager: RpcProviderManager;
adapter?: ChainAdapter; gasOverrides?; ... },
): Promise<TransactionReceipt>Each core file's call shrinks to ~5 lines. Not blocking — file as a refactor follow-up.
LOW — read-only steps have their own RPC failover story; worth a one-line note in plugins/CLAUDE.md
The read-only steps correctly bypass resolveSignerMode, but they do their own RPC reads — whether each routes through rpcManager.executeWithFailover vs raw provider was outside my review scope and is potentially uneven. Not flagging individual files here; just noting that the convention "writes go through the resolver, reads stay raw" is enforced by code-pattern rather than by lint or type. A one-liner in plugins/CLAUDE.md ("writes that may run from a Safe must route through resolveSignerMode; reads don't, but they should still go through rpcManager.executeWithFailover not the raw provider") would make the convention durable for future plugins.
LOW — NonceConflictError from the helper surfaces as a generic step error at the plugin layer
After KEEP-551, the Safe executors throw NonceConflictError when broadcast failed AND on-chain has no trace of the signed hash. The four core files catch only via withStepLogging's generic handler. So:
- The workflow executor sees "step failed"
- It retries the step
- Retry resolves: new nonce acquired, fresh sign + broadcast, usually succeeds
- The user-facing log says nothing about why the first attempt failed
The system recovers, but the cause is hidden. Worth catching NonceConflictError specifically at the plugin layer and surfacing a more informative message ("transaction was lost due to nonce conflict — retrying"). Tiny UX win, no correctness gain. File as a follow-up.
NOTE — signing toggle off → EOA fallback shows EOA balance in error messages
When isSigningActive = false on a Safe row, resolveSignerMode returns kind: \"eoa\". The balance preflight then correctly uses signerAddress (EOA) at line 436-439. If the user toggles signing off while their Safe is funded but their EOA is empty, they'll see "Insufficient X balance: have 0, need …" referring to the EOA, not the Safe. That's technically correct (the EOA is what's being charged when signing is off) but could surprise users who expected "signing off" to mean "sign locally, don't use Safe" rather than "use EOA's balance." Not a code issue; a UX clarification worth making in the wizard's signing-toggle copy.
What's well-designed
- Three-arm dispatch is explicit and consistent. Each branch builds the right request shape for its executor; no silent fallthrough between modes. The compiler enforces that all three
signerMode.kindvalues are handled because the type union is\"eoa\" | \"safe\" | \"safe-role\". - ERC-4337 sponsorship correctly skipped in Safe modes. The comment cites the right reason (bundler swaps
msg.senderto its own smart account); the explicitkind === \"eoa\"gate prevents this from regressing silently if someone re-orders code. - Token holder for balance check correctly follows mode. The fix from EOA to Safe address makes the preflight error actually match the on-chain reality.
- Read-only / transformation steps don't import
resolveSignerMode. Verified across 10 read-side files; clean separation between read and write surfaces, and crucially no DB read fromsafe_walletson hot read paths.
Suggested follow-ups
- Add
provider.getBalance(holderAddress)preflight totransfer-funds-core(mirrors thetransfer-token-corepattern, ~15 lines). - Extract
executeWriteRoutedBySigner(signer, signerMode, call, session, options)to dissolve the duplicated three-arm dispatch. - Add
NonceConflictError-specific catch at the plugin layer for better error messages on retry. - One-line convention note in
plugins/CLAUDE.md: writes must route throughresolveSignerMode; reads stay raw but should userpcManager.executeWithFailover.
None blocking — these are refactor / UX polish items, not correctness gaps.
Addresses the review findings on #923 across the three pass review. HIGH — native-transfer caps wired through (review #923-r1) Native rules were collecting a per-period value cap in the wizard, emitting an on-chain allowance bucket, AND building a target-only Permission whose EtherWithinAllowance branch never executed because every install/update call site hardcoded `allowanceKey = null` for native. The bucket existed; no condition pointed at it. Add a buildDirectRuleAllowanceKey helper that uses NATIVE_TOKEN_SENTINEL for native rules (matching what flattenInstallInput already stores in the allowance loop) and replace the three call sites. HIGH — refuse role updates on subgraph outage (review #923-r2) updateRolesConfig runs reconcileSafeRoleFromChain before computing the SDK diff. Direct-rule details (kind, counterparty) come from the Zodiac subgraph; on outage the original code returned [] which the diff interpreted as "those rules no longer exist on chain", building a MultiSend that stripped their scope from the modifier. Change extractDirectRulesFromChain to a discriminated result, treat BOTH `subgraph_unreachable` and `subgraph_returned_null` (indexer lag) as "cannot tell what the modifier holds", carry the flag on ReconcileSafeRoleResult, and refuse the update at the orchestrator level when the diff would touch any direct rule. The benign no-direct-rules-anywhere case still proceeds. Bail predicate extracted to shouldBailOnSubgraphOutage + a stable SUBGRAPH_OUTAGE_UPDATE_ERROR constant so the bail path is unit- tested without a heavy mock harness. MEDIUM — defi-kit exact-pin defi-kit emits the audited per-parameter presets that ARE the security boundary in per-parameter mode. A minor release can change preset shapes silently; pin to 2.26.9 so upgrades become explicit code review against the new outputs. MEDIUM — route reconcile + signer-resolver probe through executeWithFailover Read-only chain probes in reconcileSafeRoleFromChain and probeRolesModifierFromChain were using a single-armed raw provider. A flaky primary RPC during workflow-write GET backfill or "Refresh from chain" returned a failure where a failover-aware probe would have succeeded. Wrap readEnabledSafeModules, findRolesModifierForSafe, and readRoleAllowance in rpcManager.executeWithFailover and pass the chain's fallback URL through getRpcProviderFromUrls. MEDIUM — preserve delegateAddress on reconcile Reconcile was rebuilding delegateAddress from the current org wallet on every upsert; an EOA rotation between install and a later reconcile silently rewrote the DB column away from the chain-bound delegate. Keep ownerAddress for the first-install insert; drop delegateAddress from the onConflictDoUpdate set so existing rows keep what assignRoles actually granted on chain. MEDIUM — native balance preflight in transfer-funds + chain-symbol Mirror the transfer-token-core preflight: read provider.getBalance on the funding-holder address (Safe in safe/safe-role modes, EOA otherwise) before invoking the orchestrator. Surface the chain's native symbol on the insufficient-balance error message via a lazy chains-table lookup on the unhappy path. RPC failure during the preflight read surfaces as a step error to stay consistent with transfer-token-core's pattern. LOW — robustness notes + small hardening - Counterparty case-folding docstring on directRuleAllowanceKey so external callers reconstructing the key know to lowercase first. - orgAutomationRoleKey: speculative UUID-v4 collision risk noted in docs but NOT fixed; changing the derivation orphans every Safe's on-chain assignRoles since they were bound to keys derived under the current packing. Migration would need a coordinated re-assign. - readEnabledSafeModules MAX_PAGES raised from 10 to 1000 and the overflow branch now throws instead of silently truncating, so a malformed Safe with a non-terminating module linked list cannot silently miss modules. Tests Added 6 unit tests in safe-direct-rule-permissions.test.ts: the etherWithinAllowance wiring assertion for native rules, and 5 cases for shouldBailOnSubgraphOutage covering existing-only / desired-only / both / neither / error-message-shape. 16/16 in the file pass.
Addresses the user-facing language gaps Jacob flagged on PR #923: - Sender toggle (compact tooltip in safe-signing-toggle.tsx): the off-state line now says "funds come from the EOA's balance and msg.sender is the EOA" so a user toggling off doesn't get surprised by an EOA-balance error when the Safe is funded. The on-state line mirrors the same explicit symmetry. - Policy wizard tooltip (policy-wizard.tsx): contract-allowlist description tightened to "allows any function on the listed contracts; the cap is the only protection" so admins enabling a contract-allowlist protocol know it doesn't pin function arguments. Added an owner-bypass caveat bullet: at threshold 1 the Turnkey EOA can still sign safe.execTransaction directly and bypass the modifier; treat policies as workflow-scoping, not an absolute spending boundary.
…ention - docs/wallet-management/safe.md (new): user-facing page covering the KeeperHub-managed Safe stack — deploy flow, Sender toggle (workflow signer routing), Zodiac Roles policies (per-parameter vs contract- allowlist, direct rules, per-rule allowance buckets), the threshold-1 owner-bypass caveat, refresh-from-chain semantics, withdrawals, and per-Safe token tracking. Registered in wallet-management/_meta.ts and linked from wallet-management/index.md. - lib/metrics/METRICS_REFERENCE.md: added the Safe Wallet Metrics section (deploy / role_install / tx / withdraw counters + histograms) documenting the series emitted from lib/metrics/instrumentation/ safe.ts and consumed by the Grafana alerts. - plugins/CLAUDE.md: short "Signer Routing + RPC Failover" note so the next person writing a 5th web3 write plugin doesn't forget the three-arm dispatch (resolveSignerMode -> safe-role / safe / eoa) and silently sign from the EOA when the user has routed writes through the Safe. Reads stay raw but should still go through executeWithFailover.
Review: DB schema + migration (
|
| Table | token_address |
Native-rule representation |
|---|---|---|
safe_role_allowances |
text NOT NULL |
NATIVE_TOKEN_SENTINEL (snapshot confirms notNull=true) |
safe_role_direct_rules |
text (nullable) |
NULL (snapshot confirms notNull=false) |
The tokenAddress ?? "" coalescing on roles-orchestrator.ts:1710,1749 is a tell — readers of one table need to remember which convention applies. Picking the sentinel for both is consistent with the HIGH fix above and removes the asymmetry.
MEDIUM — status text columns have no CHECK / enum constraint
Four columns with text defaults but unconstrained values:
| Column | Default | Code-side valid values |
|---|---|---|
safe_wallets.status |
'pending' |
pending, deploying, deployed, failed |
safe_roles.status |
'active' |
active, revoked |
safe_role_protocols.status |
'allowed' |
allowed, revoked |
safe_role_direct_rules.status |
'allowed' |
allowed, revoked |
A typo at any write site ("REVOKED", "revoked ") inserts silently and breaks the WHERE status = 'revoked' predicate reconcileSafeRoleFromChain uses to flip the row. Either a PG enum type per column or a CHECK (status IN (...)) per table catches this at insert time. Hygiene, not a current bug — schedule for a follow-up migration.
MEDIUM — wei columns are text with no numeric validation
max_refill_wei, refill_wei, last_chain_balance_wei, amount_human are all text. Text is the right choice over int8 (uint256 overflows int64), but text permits any string. A serialization regression elsewhere could insert "undefined" or "1e18" (which BigInt() rejects) and you'd discover it at read.
Two cheap defenses, in increasing strictness:
CHECK (max_refill_wei ~ '^[0-9]+$')per column.- Drizzle
numeric→ PGnumeric(78,0). Arbitrary precision, validated on insert. Mature ETH-adjacent ORMs (Ponder, Rindexer) default to this.
Hygiene, follow-up.
LOW — owner_wallet_id ... ON DELETE set null silently loses delegate context
safe_wallets.owner_wallet_id references organization_wallets (legacy table name para_wallets) with ON DELETE set null. If the org's wallet row is replaced by delete-and-recreate rather than update (e.g., a provider migration), the Safe's owner_wallet_id is NULLed silently. On-chain ownership is unchanged, so the chain still knows the original owner — but DB and chain now disagree, with no read-time reconciliation in safe_wallets (unlike safe_roles.delegate_address, which the prior recovery review already flagged).
Either tighten to ON DELETE restrict (rare event; better visible) or resolve from chain when NULL. Not blocking.
What's well-designed
- Cascade chain bottoms out cleanly —
safe_walletsdelete →safe_roles→{protocols, allowances, direct_rules}. No orphaned children possible from below. (safe_wallet_id, role_type)unique correctly bounds the schema to one-role-per-Safe-per-type while leaving multi-role-type as a purely additive future change.- Indexes match access patterns — every
WHERE roleId = ?/WHERE safeWalletId = ?/WHERE organizationId = ?has a covering single-column index alongside the composite unique. ALTER TABLE organization_tokens ADD COLUMN safe_wallet_idis metadata-only on PG 11+ (nullable, no default). Safe on a large existing table.
Suggested follow-up tickets
nullsNotDistinct()or native-sentinel onsafe_role_direct_rules(HIGH — small fix, in this PR). Add a unit test that re-Applies the same native-transfer rule and asserts row count stays 1.- Unify
tokenAddressnullability acrosssafe_role_*(MEDIUM — implied by Bump better-auth from 1.3.34 to 1.4.2 #1 if you take the sentinel route). - CHECK / enum on
status(MEDIUM — follow-up). numeric(78,0)for wei columns (MEDIUM — follow-up).owner_wallet_idcascade behavior (LOW — follow-up).
Items 1 and 2 are the only ones I'd block on; the rest are scheduling-before-the-tables-grow-large hygiene.
Review: API route / authorization surface (
|
| Claim | Where |
|---|---|
getSafeForOrg filters (safeId, organizationId) and returns null → 404, not "exists but wrong org" → 403 |
lib/safe/auth.ts:67-83 |
| Membership + admin/owner gate before any DB read on mutational routes | 7 of 9 routes ([safeId]/route.ts, [safeId]/role/*) |
| Body type assertions only — no Zod / Valibot | All routes ((await request.json()) as XxxBody) |
MEDIUM — three different admin-auth implementations across the Safe surface
lib/safe/auth.ts::validateSafeAdmin— admin/owner gate, used by 7 routes.app/api/user/safe/route.ts:23-58::validateAdmin(local) — admin/owner gate, duplicates the shared logic. Maintenance drift risk; the only delta is the rejection-message string.app/api/user/safe/simulate-deploy/route.ts:142-161— accepts any member, hand-rolled inline. No admin gate.
Item 3 is the substantive one: simulate-deploy is a read-only preview (no DB writes, no on-chain tx), so allowing non-admin members to see "what would happen if we deployed" is defensible. But:
- The decision is unstated. A reviewer can't tell whether non-admin access is intentional or oversight.
- It hits RPC (
getMaxFeePerGas) and a price oracle (getNativeUsdPrice,weiToUsd) per request. With no rate-limit primitive anywhere in the codebase, this is the most-abusable endpoint in the Safe surface.
Suggested: route 1 and 2 should both call the shared helper (DRY); route 3 should either also call the shared helper (closing the preview to non-admins, simplest), or get an explicit comment justifying the looser gate.
MEDIUM — destructive /role/update accepts empty body and wipes everything
app/api/user/safe/[safeId]/role/update/route.ts:106-110 documents the contract: "FULL desired state (protocols + direct rules), not a patch". Verified at roles-orchestrator.ts:1418 — flattenInstallInput(protocolInputs, directRules) with empty arrays produces an empty desired state, and the SDK diff at line 1511 (callsPlannedForApplyRole(currentRole, desiredRole)) emits revoke calls for every current protocol scope, plus allowance-zero calls for every current bucket.
The only existing guard is the subgraph-outage check at line 1378 (shouldBailOnSubgraphOutage requires existingDirectRulesCount > 0 || desiredDirectRulesCount > 0). When subgraph is healthy, there is no guard at all — an empty body wipes the modifier's entire scope.
By-design (per the doc comment), but the failure modes are:
- UI bug serializes
{}instead of{ protocols: [...current], directRules: [...current] } - Mid-request truncation
- Misinterpreted client contract during partner integration
Suggested defense-in-depth: refuse an empty desired state when the existing role is non-empty, unless the request body sets an explicit confirm: "remove-all-policies" flag. The legitimate "wipe everything" path becomes opt-in; an accidental empty body returns 400 instead of executing a destructive MultiSend.
MEDIUM — /role/allowances POST accepts arbitrary protocolSlug without catalog validation
app/api/user/safe/[safeId]/role/allowances/route.ts:96-101 checks protocolSlug is a non-empty string but doesn't validate it against PROTOCOL_CATALOG. Sibling routes (/role install, /role/update, /role/simulate, /role/simulate-deploy) all gate slugs through if (!(slug in PROTOCOL_CATALOG)). The [tokenAddress] DELETE handler also accepts arbitrary slugs.
Probably benign — setRoleTokenAllowance looks up by (roleId, allowanceKey) where allowanceKey is computed from (roleKey, protocolSlug, token), so an unknown slug just produces a non-matching key. But:
- It's a footgun: a caller passing
protocolSlug="aaveV3"(camelCase typo) silently writes a phantom allowance keyed to a slug nothing else can reach. - The asymmetry across sibling routes is a maintainability hazard — anyone copying the pattern from this file gets less validation than from the others.
Add the same slug in PROTOCOL_CATALOG check both routes are missing.
MEDIUM — silent malformed-input dropping in normalisers
normaliseDirectRules, normaliseProtocols, and the inline flatMap token normalisers all continue / return [] on malformed entries. The route response includes skipped: [] and skippedDirectRules: [], but those arrays only carry unknown protocol slugs. Token-level entries dropped due to missing tokenDecimals/periodSeconds/amountHuman, and rules dropped due to missing counterparty/kind, are filtered with no trace.
User submits 5 rules in the wizard, 3 install (because 2 had a typo somewhere upstream), response says success: true, skippedDirectRules: []. User has no signal that anything was dropped, and on the next Apply the "missing" rules silently re-disappear.
Suggested: extend skipped (or add a droppedInputs) array that surfaces each filtered entry with a reason. Failing that, reject the whole request on any malformed entry (400) so the client sees a clear validation failure rather than a partial success.
LOW — input validation precedes auth in DELETE /role/allowances/[tokenAddress]
[tokenAddress]/route.ts:17-32 validates the address and presence of protocolSlug before validateSafeAdmin at line 34. An unauthenticated probe gets a 400 with a specific message ("Invalid token address" / "protocolSlug query parameter is required"), confirming the route shape. Auth-first would return 401 and give no shape information.
Minor — doesn't grant access, only reveals API surface a determined attacker could discover anyway from JS bundles. But the pattern is inconsistent: every other route does auth first.
LOW — 404 message inconsistency
Three of the routes return "Safe not found for this organization" (good — explicit about org-scoping); three return just "Safe not found". Both are 404, neither leaks existence, but the inconsistency is cosmetic noise. Pick one.
LOW — duplicated body normalisers across install / update / simulate / simulate-deploy
TokenLimitBody, DirectRuleBody, normaliseProtocols, normaliseDirectRules, and the shape-switching logic appear in 3-4 places. The shapes are subtly different per route (install accepts both legacy and new shapes, update only new). Drift risk: a future field addition will have to be hand-applied in N places.
Extracting to lib/safe/route-input.ts would consolidate and let the normalisers carry their own validation reports for the MEDIUM "silent dropping" issue above.
What's well-designed
getSafeForOrgreturns null → 404 — no403 cross-orgleak. Pattern documented in the helper itself.validateSafeAdmingates by(session.user, activeOrgId, member.role)in that order — short-circuits cleanly on each missing piece with distinct status codes (401, 400, 403, 403).- Mutation routes log to
logSystemErrorwith structured fields (endpoint,component,chain_id) — consistent observability surface for Sentry triage. - PATCH on
/api/user/safe/[safeId]re-assertsorganizationIdin the UPDATE WHERE clause (line 62-66) — defense-in-depth against thegetSafeForOrgcheck accidentally letting through stale state. Good belt-and-braces pattern. - GET
/rolereturns{ installed: false, role: null, ... }rather than 404 when the role doesn't exist yet — correct, the Safe exists, just no role on it. UI can render the "Install" CTA without ambiguity.
Suggested follow-up tickets
- Unify admin auth — either route both deploy and simulate-deploy through
validateSafeAdmin, or document the intentional asymmetry (MEDIUM). - Empty-body guard on
/role/update— requireconfirm: "remove-all-policies"to wipe (MEDIUM). protocolSlugcatalog validation on/role/allowancesPOST + DELETE (MEDIUM).- Surface dropped-input reasons in normaliser responses (MEDIUM).
- Consolidate normalisers to a shared module (LOW — follow-up).
Items 1 and 3 are the smallest. Item 2 is the highest-value defense-in-depth.
Migration index collision: staging landed 0073_keep545_workflow_executions_error_classification at the same time this branch held 0073_safe_wallet_integration. Resolved by: - Keeping staging's idx=73 (keep545) - Renumbering this branch's migration to idx=74 (0074_safe_wallet_integration.sql) - Renaming meta/0073_snapshot.json to meta/0074_snapshot.json and patching it to include the error_category (text) and is_user_error (boolean) columns staging's 0073 adds to workflow_executions, so the 0074 snapshot reflects post-0073 schema state - Journal when timestamps remain monotonic (1777760000007 -> 1778630077869) Verified clean: pnpm drizzle-kit generate reports "No schema changes, nothing to migrate" against the merged schema. Other auto-merges (lib/db/schema.ts, lib/metrics/types.ts, package.json, pnpm-lock.yaml) were content-clean; both branches' additions survived.
PR Environment DeployedYour PR environment has been deployed! Environment Details:
Components:
The environment will be automatically cleaned up when this PR is closed or merged. |
|
Works really well @joelorzet I added a comment on the Linear ticket advising some UX changes:
|
Summary
Per-org, per-chain Gnosis Safe v1.4.1 smart accounts, owned by the org's Turnkey EOA at threshold 1, with Zodiac Roles v2.0.0 enforcing per-call policies on every workflow write. Workflow
msg.senderbecomes the Safe; the EOA still pays gas and signs but holds no funds.Three-mode signer routing
lib/safe/signer-resolver.tschooses per (org, chain):eoa— direct Turnkey EOA write (default; signing toggle off or no Safe)safe—safe.execTransactionsigned by the EOA owner.msg.sender = Safe. No policy gating.safe-role—rolesModifier.execTransactionWithRolefrom the EOA delegate. Modifier validates target + selector + parameter scope + per-period allowance bucket. The policy-enforcing path.Hot-path is DB-first; recovers from cache drift via a one-shot chain probe + background reconcile when a Safe has Roles enabled on chain but no
safe_rolesrow.Zodiac Roles Modifier integration
Per-Safe Roles proxy deployed via the canonical ModuleProxyFactory + singleton (CREATE2 deterministic). Two policy mechanisms:
Per-protocol presets — wizard offers a catalog with per-parameter or contract-allowlist enforcement:
Direct rules — per-recipient ERC-20 transfers/approves and native sends, scoped via
c.calldataMatches([c.eq(counterparty), c.withinAllowance(allowanceKey)])for ERC-20 (selector + spender/recipient + amount all bound) and target-only for native transfers.Per-rule allowance buckets: the on-chain bucket key is
keccak256(roleKey, "direct:<kind>:<counterparty>", token), so two direct rules on the same token (e.g. an approve and a transfer) get independent caps that match what the wizard's per-rule cap implies. Two transfer rules to different recipients are also independent.Wallet overlay UX
Recovery + sync
reconcileSafeRoleFromChainrebuilds the DB cache from on-chain state viagetModulesPaginated+ the Zodiac subgraph. Probes per-(token, kind, counterparty) for direct rules. Marks the rolerevokedif no modifier is enabled on chain.Chain support
Verified by direct
eth_getCodeprobe on 2026-05-06 (seelib/safe/zodiac-contracts.ts):Avalanche Fuji (43113) is excluded from
ROLES_SUPPORTED_CHAIN_IDS: the ModuleProxyFactory is deployed there but the singleton at the canonical address returns0x(no bytecode).installRolesWithInitialConfigrejects unsupported chains up front with a structured error. Plain Safe execution still works on every chain inSUPPORTED_SAFE_CHAIN_IDS(14 chains total).Plugin coverage
Four web3 step files route through the signer-resolver:
transfer-token,transfer-funds,approve-token,write-contract. The pre-flight ERC-20 balance check intransfer-token-corereadsbalanceOf(safeAddress)when insafe-role/safemode (was hitting the EOA, which never holds tokens). Read-only / transformation steps (read-contract,check-balance,query-events, etc.) bypass the resolver.Out of scope (follow-ups)
skippedDirectRules(the API now returns dropped rules, the wizard doesn't render them yet)directRuleAllowanceKeyseparation across kind/counterparty)Migrations
One migration:
0069_safe_wallet_integration. Creates 5 tables:safe_wallets,safe_roles,safe_role_protocols,safe_role_allowances,safe_role_direct_ruleswith FKs and indexes. Generated against a synthetic 0068 baseline so the diff is safe-only and lands cleanly on top of staging's 0065-0068 without re-applying their DDL.