feat: CustomerSupportScenario — 8-fact support-ticket domain benchmark #23

@Neal006

Background

MemoryLens scenarios simulate realistic multi-turn conversations to benchmark memory recall. The customer support domain (a user reporting a billing issue that escalates in severity and updates their account email) is highly representative of production chatbot workloads. It tests whether a memory system correctly propagates account updates mid-conversation — a common real-world failure mode.

Suggested prerequisite: Issue #22 (BaseScenario) should be merged first, but you can develop in parallel and add the inheritance once it's available.

Key files to read first:

  • simulator/scenarios/edtech.py: the exact pattern to follow
  • simulator/facts.py: the Fact dataclass (key, value, injected_at, updated_at, updated_value)
  • main.py: how the --scenario edtech import block works (duplicate this for customer_support)

What to build

simulator/scenarios/customer_support.py — new file

8 facts (2 with updates):

from simulator.facts import Fact

CUSTOMER_SUPPORT_FACTS = [
    Fact("name",              "Emma Davis",             injected_at=0),
    Fact("account_email",     "emma@example.com",       injected_at=1,
         updated_at=35, updated_value="emma.davis@gmail.com"),
    Fact("subscription_plan", "Pro",                    injected_at=2),
    Fact("issue_type",        "billing error",          injected_at=4),
    Fact("ticket_id",         "TKT-4821",               injected_at=6),
    Fact("severity",          "medium",                 injected_at=8,
         updated_at=50, updated_value="high"),
    Fact("product_version",   "3.2.1",                  injected_at=10),
    Fact("account_id",        "ACC-98765",              injected_at=12),
]

5-persona pool — different account holders (names, emails, plans, ticket IDs) covering diverse regions.

20+ domain filler turns — customer support questions like:

  • "Can you explain the difference between the Pro and Enterprise plans?"
  • "How do I download my invoice for last month?"
  • "What is your refund policy for annual subscriptions?"
  • (etc — keep them realistic and varied)

CustomerSupportScenario(BaseScenario) class wrapping the constants (add this once Issue #22 is merged).
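One possible shape for the persona pool and filler constants is sketched below. The Fact class here is only a stand-in built from the field names listed above, not the real simulator.facts.Fact, and make_persona is a hypothetical helper that does not exist in the repo. Only the Emma Davis entry comes from this issue; the remaining four personas and the rest of the filler turns are left to the implementer.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Fact:
    """Stand-in for simulator.facts.Fact (field names taken from this issue)."""
    key: str
    value: str
    injected_at: int = 0
    updated_at: Optional[int] = None
    updated_value: Optional[str] = None


def make_persona(name: str, email: str, new_email: str, plan: str,
                 ticket_id: str, account_id: str) -> List[Fact]:
    """Hypothetical helper: build one 8-fact persona on the schedule used above."""
    return [
        Fact("name", name, injected_at=0),
        Fact("account_email", email, injected_at=1,
             updated_at=35, updated_value=new_email),
        Fact("subscription_plan", plan, injected_at=2),
        Fact("issue_type", "billing error", injected_at=4),
        Fact("ticket_id", ticket_id, injected_at=6),
        Fact("severity", "medium", injected_at=8,
             updated_at=50, updated_value="high"),
        Fact("product_version", "3.2.1", injected_at=10),
        Fact("account_id", account_id, injected_at=12),
    ]


# The first entry reproduces the Emma Davis facts; four more personas are still needed.
CUSTOMER_SUPPORT_PERSONA_POOL = [
    make_persona("Emma Davis", "emma@example.com", "emma.davis@gmail.com",
                 "Pro", "TKT-4821", "ACC-98765"),
]

# Only the three example questions from this issue; extend to 20 or more.
CUSTOMER_SUPPORT_FILLER_TURNS = [
    "Can you explain the difference between the Pro and Enterprise plans?",
    "How do I download my invoice for last month?",
    "What is your refund policy for annual subscriptions?",
]
```

Building every persona through one helper keeps the injection and update turns identical across the pool, so personas differ only in their values.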

main.py changes

Add "customer_support" to --scenario choices and add the import block:

elif args.scenario == "customer_support":
    from simulator.scenarios.customer_support import (
        CUSTOMER_SUPPORT_FACTS, CUSTOMER_SUPPORT_FILLER_TURNS,
        CUSTOMER_SUPPORT_PERSONA_POOL,
    )
    scenario_facts  = CUSTOMER_SUPPORT_FACTS
    scenario_filler = CUSTOMER_SUPPORT_FILLER_TURNS
    scenario_pool   = CUSTOMER_SUPPORT_PERSONA_POOL
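For the --scenario choices part, a minimal argparse sketch is shown below. The flag names --scenario, --turns, and --seeds come from this issue; the default values, the description string, and the assumption that "edtech" is the only other choice are guesses, so check them against the real main.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Sketch of the CLI surface; defaults here are assumptions, not the repo's values."""
    parser = argparse.ArgumentParser(description="MemoryLens simulator (sketch)")
    parser.add_argument(
        "--scenario",
        choices=["edtech", "customer_support"],  # new choice added alongside edtech
        default="edtech",
    )
    parser.add_argument("--turns", type=int, default=60)
    parser.add_argument("--seeds", type=int, default=1)
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["--scenario", "customer_support", "--turns", "60"])
print(args.scenario, args.turns, args.seeds)
```

Because choices is enforced by argparse, an unregistered scenario name fails at parse time instead of falling through the if/elif chain.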

Acceptance criteria

  • simulator/scenarios/customer_support.py exists
  • CUSTOMER_SUPPORT_FACTS has exactly 8 Fact objects
  • At least 2 facts have updated_at set, each with an updated_value that differs from its original value
  • CUSTOMER_SUPPORT_PERSONA_POOL has 5 persona fact-lists
  • CUSTOMER_SUPPORT_FILLER_TURNS has ≥ 20 domain-specific questions
  • python main.py --scenario customer_support --turns 60 runs end-to-end without error
  • python main.py --scenario customer_support --seeds 3 runs multi-seed
  • One structural test added to tests/test_pipeline.py:
    • test_customer_support_scenario_structure() — 8 facts, 5 personas, ≥ 20 fillers
  • All existing tests still pass
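The structural criteria above could be checked with something like the following. structure_problems is a hypothetical helper name, and the demo data uses SimpleNamespace placeholders rather than real Fact objects so the sketch runs on its own; it relies only on the value, updated_at, and updated_value attributes.

```python
from types import SimpleNamespace


def structure_problems(facts, persona_pool, filler_turns):
    """Return a list of violations of the structural acceptance criteria."""
    problems = []
    if len(facts) != 8:
        problems.append(f"expected 8 facts, got {len(facts)}")
    # An update only counts if it actually changes the value.
    updated = [f for f in facts
               if f.updated_at is not None and f.updated_value != f.value]
    if len(updated) < 2:
        problems.append(f"expected at least 2 updated facts, got {len(updated)}")
    if len(persona_pool) != 5:
        problems.append(f"expected 5 personas, got {len(persona_pool)}")
    if len(filler_turns) < 20:
        problems.append(f"expected at least 20 fillers, got {len(filler_turns)}")
    return problems


def _fact(key, value, updated_at=None, updated_value=None):
    """Placeholder object exposing the attributes the check reads."""
    return SimpleNamespace(key=key, value=value,
                           updated_at=updated_at, updated_value=updated_value)


demo_facts = [_fact(f"k{i}", f"v{i}") for i in range(6)] + [
    _fact("account_email", "emma@example.com", 35, "emma.davis@gmail.com"),
    _fact("severity", "medium", 50, "high"),
]
demo_pool = [demo_facts] * 5
demo_fillers = [f"placeholder question {i}" for i in range(20)]

print(structure_problems(demo_facts, demo_pool, demo_fillers))  # prints []
```

In tests/test_pipeline.py, test_customer_support_scenario_structure() could import the three CUSTOMER_SUPPORT_* constants and assert that this kind of check returns an empty list.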

Technical hints

  • injected_at values should be spread across early turns (0–15) so facts are injected before the first checkpoint (T=10 or T=25)
  • updated_at should be between T=25 and T=75 so drift is measurable at multiple checkpoints
  • Keep fact values as simple strings without special characters — they're used as substring search targets in recall_at_t()
  • The Fact.injection_text() method returns "My <key> is <value>." — your key/value choices affect how natural this sounds in the simulated conversation
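The hints above can be turned into a quick lint pass while drafting facts. Everything in this sketch is hypothetical: hint_warnings is not part of the repo, injection_text only mirrors the documented "My <key> is <value>." format and assumes the key is inserted verbatim, and the SIMPLE_VALUE character set is one guess at what "simple strings" should allow, widened so that emails, version numbers, and ticket IDs pass.

```python
import re
from typing import List, Optional

# Assumption: letters, digits, spaces, and the characters . @ - count as "simple".
SIMPLE_VALUE = re.compile(r"^[A-Za-z0-9 .@-]+$")


def injection_text(key: str, value: str) -> str:
    """Mirror the documented format so phrasing can be previewed."""
    return f"My {key} is {value}."


def hint_warnings(key: str, value: str, injected_at: int,
                  updated_at: Optional[int] = None) -> List[str]:
    """Flag facts that fall outside the ranges suggested in the hints."""
    warnings = []
    if not 0 <= injected_at <= 15:
        warnings.append(f"{key}: injected_at={injected_at} is outside 0-15")
    if updated_at is not None and not 25 <= updated_at <= 75:
        warnings.append(f"{key}: updated_at={updated_at} is outside 25-75")
    if not SIMPLE_VALUE.match(value):
        warnings.append(f"{key}: value {value!r} contains special characters")
    return warnings


print(injection_text("subscription_plan", "Pro"))
print(hint_warnings("severity", "medium", injected_at=8, updated_at=50))
print(hint_warnings("severity", "medium", injected_at=8, updated_at=20))
```

Previewing the injection text is a cheap way to spot keys such as subscription_plan that read awkwardly in a simulated user turn.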

Getting started

# Copy edtech.py as a starting template:
cp simulator/scenarios/edtech.py simulator/scenarios/customer_support.py
# Adapt the facts, personas, and filler turns
python main.py --scenario customer_support --turns 60
pytest tests/ -v
