Summary
Proposal to upgrade finance/wallet_screening from a demo-grade compliance tool into a robust, auditable Ethereum wallet screening skill suitable for agent-driven due diligence.
Scope covers four workstreams:
- Knowledge base expansion and refresh: ingest and normalize public sanctions/malicious-address datasets (OpenSanctions FTM, OFAC SDN crypto exports, FBI/NBCTF, Uniswap TRM, MEW darklist, research phishing datasets).
- Parsing and matching fixes: correct FTM publicKey handling, unified address normalization, chain-aware matching, and use of expanded lists for both direct wallet hits and transaction counterparty screening.
- Forensic coverage: ERC-20 / internal tx support, Etherscan pagination or explicit truncation warnings, optional hybrid live checks (Chainalysis / TRM free tiers) with offline fallback.
- Report schema v2 and agent contract: stable JSON output, explicit risk tiers, evidence-backed risk_factors[], aligned instructions.md / card.json / docs, and expanded test coverage.
This RFC follows a prior internal audit that identified schema drift, ineffective FTM matching (~0 ETH hits from 3,527 FTM entities), and thin malicious-contract tx coverage (6 contracts vs 543+ TRM entries).
Motivation
finance/wallet_screening is the flagship Skillware compliance skill and the most referenced example across README, usage guides, and agent-loop examples. Today it has the right architecture (manifest + instructions + deterministic Python + bundled data + maintenance scripts) but reliability gaps limit trust:
Data / coverage
- Bundled data: ~4,413 records, ~275 unique ETH addresses across 5 JSON files, far below docs claiming "880+ bundled lists."
- entities.ftm.json (3,527 FTM entities) is loaded but does not match ETH addresses because matching logic expects addresses / properties.address, while FTM stores wallet IDs in properties.publicKey.
- Transaction malicious-interaction detection uses only malicious_scs_2025.json (6 contracts). normalized_uniswap_trm.json (543 entries) is used for direct wallet sanctions only, not tx flow analysis.
- Data freshness: last normalized snapshots dated 2025-07-22; no documented refresh cadence or CI validation.
Correctness
- Schema drift: instructions.md and card.json reference fields that do not exist in skill.py output (summary.sanctioned vs summary.sanctioned_entity_match, etc.), causing agents to misread reports.
- Data quality: duplicate address assigned to two mixers in malicious_scs_2025.json; zero-width Unicode (\u200b) in 3 Israel NBCTF addresses breaks matching.
- Silent API failures: Etherscan/CoinGecko errors return empty/zero values without warnings in the report.
Industry gap
- Enterprise tools (Chainalysis Address Screening, TRM) provide direct + indirect exposure, structured identifications with source URLs, configurable severities, and continuous re-screening. Skillware should not replicate enterprise graph ML, but should adopt the reporting patterns and minimum viable exposure logic (e.g., flag interactions with known bad counterparties) within its offline-first, deterministic model.
Goal
Make the skill a credible open-source starting point for AML/sanctions due diligence agents — accurate enough to demo, extensible enough for enterprise customization, and honest about limitations.
Detailed Design
Phase 1 — Canonical data model and ingest pipeline
1.1 Unified record schema
Define skills/finance/wallet_screening/data/schema.json (or a documented Python dataclass) for all list entries.
{
  "address": "0x...",
  "chain": "ethereum",
  "category": "sanctions|mixer|scam|phishing|stolen|market|other",
  "severity": "low|medium|high|critical",
  "label": "Entity or contract name",
  "reason": "Human-readable reason",
  "jurisdiction": "US|EU|IL|...",
  "source": "OFAC|OpenSanctions|FBI|NBCTF|Uniswap-TRM|MEW|...",
  "source_url": "https://...",
  "tags": [],
  "last_updated": "ISO-8601"
}
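As a rough illustration of the "documented Python dataclass" alternative mentioned in 1.1, the record could be modeled as below. This is only a sketch: the class name `ListRecord`, the choice of which fields get defaults, and the validation in `__post_init__` are assumptions, not part of the RFC.

```python
from dataclasses import dataclass, field, asdict
from typing import List

# Enumerations copied from the "category" and "severity" fields of the schema.
CATEGORIES = {"sanctions", "mixer", "scam", "phishing", "stolen", "market", "other"}
SEVERITIES = {"low", "medium", "high", "critical"}


@dataclass(frozen=True)
class ListRecord:
    """One canonical list entry; field names mirror the JSON schema in 1.1."""
    address: str
    chain: str
    category: str
    severity: str
    label: str = ""
    reason: str = ""
    jurisdiction: str = ""
    source: str = ""
    source_url: str = ""
    tags: List[str] = field(default_factory=list)
    last_updated: str = ""  # ISO-8601 string

    def __post_init__(self):
        # Reject values outside the enumerations listed in the schema.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")


# Placeholder address, used only to show the shape of a record.
record = ListRecord(
    address="0x" + "ab" * 20,
    chain="ethereum",
    category="mixer",
    severity="high",
    source="OFAC",
)
json_ready = asdict(record)  # plain dict, ready for json.dump
```

Rejecting unknown categories at construction time would surface importer bugs during refresh rather than at screening time.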
1.2 Normalization layer
- Centralize address normalization: lowercase, strip whitespace, remove zero-width chars (e.g. \u200b), validate 0x + 40 hex, optional EIP-55 checksum warning.
- Refactor maintenance/normalization_tool.py and normalize_uniswap_trm.py to emit the canonical schema above.
- Add new importers:
  - OpenSanctions FTM: extract CryptoWallet entities from properties.publicKey; split comma-separated multi-address strings; tag chain from currency / caption heuristics.
  - OFAC SDN crypto-only: official XML/CSV or vile/ofac-sdn-list daily JSON releases.
  - MEW darklist: addresses-darklist.json from MyEtherWallet/ethereum-lists.
  - Poison-Hunter / PTXPhish: research phishing addresses (ETH-only filter).
- Deduplicate by (chain, address, category) with merge rules for conflicting labels.
- Exclude known false positives (e.g. burn address 0x000...dead unless explicitly sanctioned).
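The address normalization rules in the first bullet could look roughly like this. It is a minimal sketch: the function name `normalize_address` and the exact set of invisible characters are assumptions, and the optional EIP-55 checksum warning is left out because it needs Keccak-256, which is not in the Python standard library.

```python
import re
from typing import Optional

# Zero-width and BOM characters that survive a plain .strip() call.
_INVISIBLE = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))
_ETH_ADDRESS = re.compile(r"0x[0-9a-f]{40}")


def normalize_address(raw: str) -> Optional[str]:
    """Return the lowercase 0x-prefixed address, or None if it is not valid."""
    cleaned = raw.translate(_INVISIBLE)   # drop zero-width characters
    cleaned = "".join(cleaned.split())    # drop all whitespace, including inner
    cleaned = cleaned.lower()
    return cleaned if _ETH_ADDRESS.fullmatch(cleaned) else None
```

Returning `None` rather than raising lets an importer count and report rejected rows without aborting a whole refresh run.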
1.3 Storage layout
data/
  canonical/
    sanctions_ethereum.json    # merged direct-hit list
    malicious_contracts.json   # tx-interaction screening list
  sources/                     # raw/normalized per-source snapshots (optional)
  manifest.json                # counts, last_refresh, source versions
1.4 Refresh runbook
- Document in maintenance/README.md: download URLs, commands, expected counts, license notes (OpenSanctions non-commercial).
- Optional GitHub Action (weekly): run normalizers, fail if record count drops more than 10% without review.
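The 10% record-count guard could be a small check that the weekly job runs after the normalizers. The sketch below assumes per-source counts are available as plain dicts (for example read from manifest.json); the function names are hypothetical.

```python
def count_drop_exceeds(previous: int, current: int, max_drop: float = 0.10) -> bool:
    """True when `current` is more than `max_drop` below `previous`."""
    if previous <= 0:
        return False  # nothing to compare against on the first run
    return (previous - current) / previous > max_drop


def sources_needing_review(previous_counts: dict, current_counts: dict) -> list:
    """Return names of sources whose record count dropped too far.

    A source missing from `current_counts` is treated as a drop to zero.
    """
    return [
        name
        for name, prev in previous_counts.items()
        if count_drop_exceeds(prev, current_counts.get(name, 0))
    ]
```

A CI step could exit non-zero whenever `sources_needing_review` returns a non-empty list, leaving the override to a human reviewer.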
Phase 2 — Matching and forensic engine
2.1 Sanctions matching
- Fix _check_against_sanctions to parse FTM properties.publicKey (list values and comma-separated strings).
- Filter by chain == ethereum before match.
- Build an in-memory index at init: address_lower -> List[Record] for O(1) lookup.
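A minimal sketch of the publicKey parsing and the in-memory index follows. The helper names are hypothetical, and the address-shape filter is only a stand-in for the chain == ethereum check: a 0x + 40 hex string is shared by other EVM chains, so the real filter would still need the chain tag.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

_ETH_ADDRESS = re.compile(r"0x[0-9a-fA-F]{40}")


def ftm_wallet_addresses(entity: dict) -> List[str]:
    """Extract lowercase ETH-shaped addresses from an entity's publicKey."""
    values = entity.get("properties", {}).get("publicKey", [])
    if isinstance(values, str):           # tolerate a bare string value
        values = [values]
    found = []
    for value in values:
        for part in value.split(","):     # comma-separated multi-address strings
            part = part.strip()
            if _ETH_ADDRESS.fullmatch(part):
                found.append(part.lower())
    return found


def build_index(entities: Iterable[dict]) -> Dict[str, List[dict]]:
    """Map address_lower -> list of entities for constant-time lookup."""
    index: Dict[str, List[dict]] = defaultdict(list)
    for entity in entities:
        for address in ftm_wallet_addresses(entity):
            index[address].append(entity)
    return dict(index)
```

With the index built once at init, a direct-hit check reduces to `index.get(wallet.lower(), [])`.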
2.2 Transaction analysis
- Merge malicious_contracts.json plus relevant TRM/scam entries into a single risk_address_index for tx screening.
- Extend Etherscan calls:
  - txlist (existing) with pagination
  - tokentx for ERC-20 flows
  - txlistinternal for internal transfers
- Flag indirect exposure: top counterparties that appear in the sanctions/malicious index (lightweight Chainalysis-style exposure).
- Surface coverage metadata: tx_analyzed, tx_total_reported_by_etherscan, truncated (boolean).
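The indirect-exposure step can be sketched as a counterparty count checked against the merged risk index. The function name, the `page_limit_hit` parameter, and the shape of the returned dict are assumptions for illustration; the code only relies on each transaction dict carrying "from" and "to" address strings and makes no network calls.

```python
from collections import Counter
from typing import Dict, List


def screen_counterparties(wallet: str, txs: List[dict],
                          risk_address_index: Dict[str, dict],
                          page_limit_hit: bool = False) -> dict:
    """Count counterparties and flag those present in the risk index."""
    wallet = wallet.lower()
    counts: Counter = Counter()
    for tx in txs:
        sender = (tx.get("from") or "").lower()
        receiver = (tx.get("to") or "").lower()  # may be empty, e.g. contract creation
        other = receiver if sender == wallet else sender
        if other and other != wallet:
            counts[other] += 1
    risky = [
        {"address": addr, "tx_count": n, "record": risk_address_index[addr]}
        for addr, n in counts.most_common()
        if addr in risk_address_index
    ]
    return {
        "tx_analyzed": len(txs),
        "truncated": page_limit_hit,  # caller sets this when pagination was cut short
        "top_counterparties": counts.most_common(5),
        "risky_counterparties": risky,
    }
```

Keeping the fetch layer separate from this function means the exposure logic can be tested with fixture transactions and no API key.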
2.3 Optional hybrid live checks (manifest opt-in)
Add optional env vars:
- CHAINALYSIS_API_KEY: GET https://public.chainalysis.com/api/v1/address/{addr}
- TRM_SANCTIONS_API_KEY: POST TRM public screening endpoint
Behavior: run offline checks first; if keys are present, cross-check and merge into risk_details.live_verification[]. Never block offline mode when APIs are unavailable.
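One way to keep offline mode unblocked is to treat each live provider as an injected callable and convert any failure into a status entry. The sketch below makes no HTTP calls: `run_live_checks`, the `LiveCheck` signature, and the entry shape are hypothetical, and the real provider clients would be supplied by the skill.

```python
import os
from typing import Callable, Dict, List, Optional

LiveCheck = Callable[[str, str], dict]  # (address, api_key) -> provider result


def run_live_checks(address: str,
                    providers: Dict[str, LiveCheck],
                    env: Optional[dict] = None) -> List[dict]:
    """Call each provider whose key env var is set; never raise on failure."""
    env = os.environ if env is None else env
    results = []
    for key_env, check in providers.items():
        api_key = env.get(key_env)
        if not api_key:
            continue  # provider not opted in; offline result stands alone
        try:
            results.append({"provider": key_env, "status": "ok",
                            "result": check(address, api_key)})
        except Exception as exc:  # network errors, rate limits, bad payloads
            results.append({"provider": key_env, "status": "unavailable",
                            "error": str(exc)})
    return results
```

The returned list could be stored as risk_details.live_verification[], with an "unavailable" entry also surfaced through metadata.warnings.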
Phase 3 — Report schema v2
3.1 Top-level structure
{
  "schema_version": "2.0",
  "metadata": {
    "screening_time": "...",
    "wallet_address": "0x...",
    "chain": "ethereum",
    "data_sources": [
      {"name": "...", "record_count": 0, "last_updated": "..."}
    ],
    "warnings": ["etherscan_txlist_truncated", "coingecko_price_unavailable"]
  },
  "summary": {
    "risk_level": "low|medium|high|critical",
    "sanctioned": false,
    "sanctioned_entity_match": false,
    "malicious_interaction_count": 0,
    "indirect_exposure_count": 0,
    "balance_eth": 0.0,
    "balance_usd": 0.0,
    "total_transactions_analyzed": 0
  },
  "financial_analysis": {},
  "risk_details": {
    "sanctions_hits": [],
    "malicious_interactions": [],
    "indirect_exposures": [],
    "live_verification": []
  },
  "network_analysis": {
    "most_interacted_wallet": ["0x...", 45],
    "top_counterparties": [],
    "risky_counterparties": []
  },
  "risk_factors": [
    {
      "code": "SANCTIONS_DIRECT",
      "severity": "critical",
      "evidence": "...",
      "source": "OFAC"
    }
  ]
}
Note: financial_analysis keeps existing v1 fields (value in/out, gas, PnL in ETH/USD/EUR) plus optional token-flow fields added in Phase 2.
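To make the evidence-backed risk_factors[] entries concrete, here is a sketch of how raw hits might be turned into them. Only the SANCTIONS_DIRECT code appears in the schema; the SANCTIONS_INDIRECT code, the evidence wording, and the input hit fields (address, label, tx_count, source) are assumptions.

```python
from typing import List


def build_risk_factors(sanctions_hits: List[dict],
                       indirect_exposures: List[dict]) -> List[dict]:
    """Turn raw hits into risk_factors[] entries with evidence and source."""
    factors = []
    for hit in sanctions_hits:
        factors.append({
            "code": "SANCTIONS_DIRECT",
            "severity": "critical",
            "evidence": f"{hit['address']} listed as {hit.get('label', 'unknown')}",
            "source": hit.get("source", "unknown"),
        })
    for exposure in indirect_exposures:
        factors.append({
            "code": "SANCTIONS_INDIRECT",  # hypothetical code, not defined in the RFC
            "severity": "high",
            "evidence": f"{exposure['tx_count']} tx with {exposure['address']}",
            "source": exposure.get("source", "unknown"),
        })
    return factors
```

Each factor carries its own source, so an agent can cite where a flag came from instead of restating the summary.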
3.2 Risk level rules (deterministic)
| Level | Conditions |
| --- | --- |
| critical | Direct sanctions hit on the screened wallet |
| high | Malicious contract interaction OR indirect exposure to a sanctioned entity |
| medium | Interaction with scam/phishing-labeled address; no direct sanctions |
| low | No hits; optionally note high-volume mixer-adjacent patterns in risk_factors |
Implementation sketch in skill.py:
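def compute_risk_level(sanctions_hits, malicious_interactions, indirect_exposures):
    # A direct sanctions hit on the screened wallet always wins.
    if sanctions_hits:
        return "critical"
    # Scam/phishing-labeled interactions map to "medium"; any other malicious
    # interaction, or indirect exposure to a sanctioned entity, maps to "high".
    soft = ("scam", "phishing")
    hard_hits = [i for i in malicious_interactions if i.get("category") not in soft]
    if hard_hits or indirect_exposures:
        return "high"
    if malicious_interactions:
        return "medium"
    return "low"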
3.3 Agent contract alignment
- Update instructions.md to reference v2 fields only.
- Update card.json UI schema keys to match v2 summary fields.
- Update docs/skills/wallet_screening.md with accurate data counts and limitations.
- Optionally emit v1 aliases for one release (summary.sanctioned mirrors summary.sanctioned_entity_match) then remove in v2.1.
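The one-release alias could be applied as a final pass over the report. This is only a sketch: the function name `add_v1_aliases` is hypothetical, and it covers just the summary.sanctioned mirror named above.

```python
import copy


def add_v1_aliases(report: dict) -> dict:
    """Return a copy of the report with summary.sanctioned mirrored in."""
    out = copy.deepcopy(report)
    summary = out.setdefault("summary", {})
    if "sanctioned_entity_match" in summary:
        # Mirror key emitted for one release, then removed in v2.1.
        summary["sanctioned"] = summary["sanctioned_entity_match"]
    return out
```

Working on a deep copy keeps the alias out of the canonical report object, so removing the pass in v2.1 touches one call site.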
Phase 4 — Tests and acceptance criteria
Tests to add
- FTM publicKey ETH match using a known sanctioned test vector from OFAC export.
- Unicode / zero-width address normalization (NBCTF \u200b case).
- Malicious interaction detected for Tornado Cash router address.
- Indirect exposure when a counterparty is in the sanctions index.
- Truncated tx history emits a warning in metadata.warnings.
- Schema v2 required keys present in every successful execute() response.
- Optional: mocked Chainalysis API merge into live_verification.
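The schema v2 required-keys test could lean on a small helper like the one below. Which keys count as required is an assumption here (the RFC does not list them); the sets are drawn from the top-level structure in 3.1.

```python
# Assumed required keys, taken from the v2 top-level structure in 3.1.
REQUIRED_TOP_LEVEL = {
    "schema_version", "metadata", "summary", "financial_analysis",
    "risk_details", "network_analysis", "risk_factors",
}
REQUIRED_SUMMARY = {
    "risk_level", "sanctioned_entity_match",
    "malicious_interaction_count", "total_transactions_analyzed",
}


def missing_v2_keys(report: dict) -> list:
    """Return sorted dotted paths of required v2 keys absent from a report."""
    missing = [k for k in REQUIRED_TOP_LEVEL if k not in report]
    summary = report.get("summary", {})
    missing += [f"summary.{k}" for k in REQUIRED_SUMMARY if k not in summary]
    if "warnings" not in report.get("metadata", {}):
        missing.append("metadata.warnings")
    return sorted(missing)
```

A test would then assert `missing_v2_keys(skill.execute(...)) == []` for each successful fixture, which gives a readable failure message listing exactly which keys drifted.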
Definition of done
- instructions.md, card.json, and docs match skill.py output.
- ... warnings[] and risk_level.
- maintenance/README.md documents refresh instructions.
Out of scope (for this RFC)
- Multi-chain screening (Solana, Bitcoin, Tron): separate RFC.
- Full graph ML / Elliptic-style models.
- Enterprise SLA, licensing, or hosted screening service.
- Chainlink Proof of Reserve integration.
Drawbacks
- Maintenance burden: More data sources mean ongoing refresh work, license tracking (OpenSanctions requires a commercial license for business use), and false-positive triage (e.g. burn address flagged in TRM lists).
- Bundle size: Full OpenSanctions FTM is roughly 50MB. The repo may need an ETH-filtered subset committed locally, or a download-on-first-run step, which adds complexity for offline/air-gapped users.
- API dependency risk: Optional Chainalysis/TRM keys improve freshness but introduce rate limits, key management, and external failure modes. Offline fallback must stay the default path, not a degraded afterthought.
- Scope creep: ERC-20 flows, internal txs, and Etherscan pagination significantly increase API usage, implementation time, and test surface. It would be easy to turn a skill-hardening task into a multi-month forensic platform.
- Legal / liability perception: Richer reports with risk tiers and exposure flags may read as authoritative compliance clearance. Constitution and agent instructions must keep reinforcing that output is informational, not legal advice.
- Schema breaking change: Report schema v2 may break existing agent prompts, examples, and any downstream integrations keyed on v1 field names. A deprecation period or dual-key emission adds maintenance cost.
- Data quality variance: Community lists (MEW darklist, research datasets) differ in rigor from official OFAC/NBCTF sources. Merging them raises false-positive risk unless severity tiers and source attribution stay explicit.
- Chain-specific limits: Staying Ethereum-only keeps scope manageable but leaves a gap vs multi-chain enterprise tools. Users screening cross-chain addresses may assume broader coverage than the skill provides.
- Etherscan cost / limits: Paginated tx + token + internal calls can burn through free-tier API quotas quickly on high-activity wallets. May need caching, caps, or paid-tier documentation.
- No continuous monitoring: Unlike Chainalysis/TRM enterprise products, this remains point-in-time screening. A clean result today does not detect a future sanctions listing or new malicious interactions unless the screening is re-run manually.
How to contribute to this RFC
This RFC is design-only, so don't ship one mega-PR. Pick one row in the tracker below (or a smaller slice of it), open a sub-issue linked to this RFC (Parent: #115), and reference it in your PR (Refs #<sub-issue>).
Suggested order: quick wins → Ph. 1.x → Ph. 2.2b → Ph. 2.3 → Ph. 3.x → Ph. 4.x (Ph. 2.1 and Ph. 2.2a are done or in flight; see tracker).
Progress tracker
instructions.md / card.json / docs with current skill.py output
data/schema.json or documented datacla…