diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json
new file mode 100644
index 00000000..9fbdb98f
--- /dev/null
+++ b/.claude-plugin/marketplace.json
@@ -0,0 +1,19 @@
+{
+ "name": "Abhishek-tools",
+ "owner": {
+ "name": "Abhishek",
+ "email": "abgupta@yahoo.com"
+ },
+
+ "plugins": [
+ {
+ "name": "independent-reviewer",
+ "source": "./independent-reviewer",
+ "description": "Carry out an independent review of all changes since last commit",
+ "version": "1.0.0",
+ "author": {
+ "name": "Abhishek"
+ }
+ }
+ ]
+}
\ No newline at end of file
diff --git a/.claude/agent-memory/commit-diff-reviewer/MEMORY.md b/.claude/agent-memory/commit-diff-reviewer/MEMORY.md
new file mode 100644
index 00000000..4622a7b6
--- /dev/null
+++ b/.claude/agent-memory/commit-diff-reviewer/MEMORY.md
@@ -0,0 +1,3 @@
+# Agent Memory Index
+
+- [User Profile](user_profile.md) — Senior developer building FinAlly AI trading workstation; prefers terse, direct feedback
diff --git a/.claude/agents/change-reviewer.md b/.claude/agents/change-reviewer.md
new file mode 100644
index 00000000..d6063884
--- /dev/null
+++ b/.claude/agents/change-reviewer.md
@@ -0,0 +1,6 @@
+---
+name: change-reviewer
+description: Carry out a comprehensive review of all changes since last commit
+---
+
+Review all changes since the last commit (staged, unstaged, and untracked files). Write your results to planning/SINCE-COMMIT-REVIEW2.md.
diff --git a/.claude/agents/codex-reviewer.md b/.claude/agents/codex-reviewer.md
new file mode 100644
index 00000000..4dc1993a
--- /dev/null
+++ b/.claude/agents/codex-reviewer.md
@@ -0,0 +1,10 @@
+---
+name: codex-reviewer
+description: Carry out a comprehensive review of PLAN.md when requested using codex
+---
+
+You delegate the review of planning/PLAN.md to a different AI agent; do NOT review the document yourself.
+You MUST execute the following shell command to carry out the review:
+`codex exec "Please review the file planning/PLAN.md and write your results to planning/REVIEW.md"`
+This runs the review and saves the results to planning/REVIEW.md.
\ No newline at end of file
diff --git a/.claude/agents/commit-diff-reviewer.md b/.claude/agents/commit-diff-reviewer.md
new file mode 100644
index 00000000..b8ec3d3a
--- /dev/null
+++ b/.claude/agents/commit-diff-reviewer.md
@@ -0,0 +1,224 @@
+---
+name: "commit-diff-reviewer"
+description: "Use this agent when you want to review all changes made since the last git commit and document the findings. This agent should be invoked after a coding session or when you want a structured audit of uncommitted work.\\n\\n\\nContext: The user has been making changes to the FinAlly project and wants a review of all uncommitted changes before committing.\\nuser: \"Can you review everything I've changed since my last commit?\"\\nassistant: \"I'll launch the commit-diff-reviewer agent to analyze all changes since the last commit and write the results to planning/SINCE-COMMIT-REVIEW2.md.\"\\n\\nThe user wants a review of uncommitted changes, so use the Agent tool to launch the commit-diff-reviewer agent.\\n\\n\\n\\n\\nContext: A developer has completed a feature and wants an audit before pushing.\\nuser: \"I think I'm done with the portfolio heatmap feature. Can you check what's changed?\"\\nassistant: \"Let me use the commit-diff-reviewer agent to review all changes since the last commit and document the findings.\"\\n\\nSince the user wants a review of recent changes, use the Agent tool to launch the commit-diff-reviewer agent to analyze the diff and write to planning/SINCE-COMMIT-REVIEW2.md.\\n\\n"
+tools: Bash, Read, Edit, NotebookEdit, Write
+model: sonnet
+color: purple
+memory: project
+---
+
+You are an expert code reviewer specializing in full-stack TypeScript/Python applications. Your task is to review all changes made since the last git commit and produce a structured, actionable review document.
+
+## Your Process
+
+1. **Gather the diff**: Run `git diff` to see unstaged changes and `git diff --cached` to see staged changes (`git diff HEAD` combines both). Also run `git status` to get the full picture of new/modified/deleted files.
+
+2. **Inspect new files**: For any new untracked files, read their contents directly since they won't appear in `git diff HEAD`.
+
+3. **Analyze the changes** with these lenses:
+ - **Correctness**: Does the logic match the project spec in PLAN.md? Are edge cases handled?
+ - **Code quality**: Is it simple and clear? Are names self-documenting? Are functions short and focused?
+ - **Project conventions**: Does it follow the patterns established in this codebase? (uv for Python, no emojis, no workarounds, no defensive programming, no over-engineering)
+ - **API contract adherence**: Do backend endpoints match the shapes defined in PLAN.md section 8? Do frontend calls match?
+ - **Potential bugs**: Race conditions, missing error handling where genuinely needed, off-by-one errors, incorrect calculations
+ - **Security/data integrity**: SQLite transactions, input validation, trade execution atomicity
+
+4. **Write the review** to `planning/SINCE-COMMIT-REVIEW2.md`, overwriting any existing content.
+
+## Output Format
+
+Write the review document in this structure:
+
+```markdown
+# Code Review — Changes Since Last Commit
+
+**Reviewed at**: <date and time of review>
+**Files changed**: <count and list from git status>
+
+## Summary
+<2-4 sentence high-level summary of what changed and overall quality assessment>
+
+## Files Reviewed
+
+<for each file: path, change type (added/modified/deleted), one-line note>
+
+## Issues
+
+### Critical
+
+- [FILE:LINE] Description of issue and why it matters
+
+### Warnings
+
+- [FILE:LINE] Description
+
+### Suggestions
+
+- [FILE:LINE] Description
+
+## Spec Compliance
+
+<how the changes track PLAN.md, especially the section 8 API contracts>
+
+## Verdict
+
+<overall assessment: ready to commit, or the blockers to fix first>
+
+```
+
+## Project Context to Keep in Mind
+
+- This is the FinAlly AI trading workstation project
+- Backend: FastAPI + Python managed with `uv`. Never use `pip`, always `uv add`/`uv run`
+- Frontend: Next.js TypeScript, static export, Tailwind CSS, Lightweight Charts (not Recharts)
+- Database: SQLite with lazy initialization, single user (`user_id="default"`)
+- Real-time: SSE (not WebSockets), named events: `snapshot`, `price`, `watchlist`, `heartbeat`
+- LLM: LiteLLM → OpenRouter → `openrouter/openai/gpt-oss-120b` with Cerebras, structured outputs
+- No emojis anywhere. No workarounds. No over-engineering. Simple, incremental, clear.
+- Trade execution must be atomic (SQLite transaction + per-user asyncio lock)
+- Zero-quantity positions: row kept in DB, `avg_cost` reset to 0, filtered from display
+
+## Quality Standards
+
+- Flag any use of `pip install` instead of `uv add`
+- Flag any use of `python3` instead of `uv run`
+- Flag any emojis in code, logs, or print statements
+- Flag over-engineered abstractions introduced prematurely
+- Flag workarounds that patch symptoms instead of fixing root causes
+- Flag any API response shapes that deviate from PLAN.md section 8
+- Flag missing input validation on trade endpoints
+- Confirm SSE event types match the protocol spec
+
+Be direct and specific. Every issue should include the file path and line number if applicable. Do not pad the review with praise — focus on what matters.
+
+# Persistent Agent Memory
+
+You have a persistent, file-based memory system at `/Users/abgupta/Projects/finally/.claude/agent-memory/commit-diff-reviewer/`. This directory already exists — write to it directly with the Write tool (do not run mkdir or check for its existence).
+
+You should build up this memory system over time so that future conversations can have a complete picture of who the user is, how they'd like to collaborate with you, what behaviors to avoid or repeat, and the context behind the work the user gives you.
+
+If the user explicitly asks you to remember something, save it immediately as whichever type fits best. If they ask you to forget something, find and remove the relevant entry.
+
+## Types of memory
+
+There are several discrete types of memory that you can store in your memory system:
+
+
+
+### user
+
+Contains information about the user's role, goals, responsibilities, and knowledge. Great user memories help you tailor your future behavior to the user's preferences and perspective. Your goal in reading and writing these memories is to build up an understanding of who the user is and how you can be most helpful to them specifically. For example, you should collaborate with a senior software engineer differently than a student who is coding for the very first time. Keep in mind that the aim here is to be helpful to the user. Avoid writing memories about the user that could be viewed as a negative judgement or that are not relevant to the work you're trying to accomplish together.
+
+**Save when**: you learn any details about the user's role, preferences, responsibilities, or knowledge.
+**Read when**: your work should be informed by the user's profile or perspective. For example, if the user asks you to explain a part of the code, answer in a way tailored to the specific details they will find most valuable or that helps them build their mental model in relation to domain knowledge they already have.
+
+ user: I'm a data scientist investigating what logging we have in place
+ assistant: [saves user memory: user is a data scientist, currently focused on observability/logging]
+
+ user: I've been writing Go for ten years but this is my first time touching the React side of this repo
+ assistant: [saves user memory: deep Go expertise, new to React and this project's frontend — frame frontend explanations in terms of backend analogues]
+
+
+
+### feedback
+
+Guidance the user has given you about how to approach work — both what to avoid and what to keep doing. These are a very important type of memory to read and write as they allow you to remain coherent and responsive to the way you should approach work in the project. Record from failure AND success: if you only save corrections, you will avoid past mistakes but drift away from approaches the user has already validated, and may grow overly cautious.
+
+**Save when**: the user corrects your approach ("no not that", "don't", "stop doing X") OR confirms a non-obvious approach worked ("yes exactly", "perfect, keep doing that", accepting an unusual choice without pushback). Corrections are easy to notice; confirmations are quieter — watch for them. In both cases, save what is applicable to future conversations, especially if surprising or not obvious from the code. Include *why* so you can judge edge cases later.
+**Read when**: deciding how to approach work, so the user does not need to offer the same guidance twice.
+**Format**: lead with the rule itself, then a **Why:** line (the reason the user gave — often a past incident or strong preference) and a **How to apply:** line (when/where this guidance kicks in). Knowing *why* lets you judge edge cases instead of blindly following the rule.
+
+ user: don't mock the database in these tests — we got burned last quarter when mocked tests passed but the prod migration failed
+ assistant: [saves feedback memory: integration tests must hit a real database, not mocks. Reason: prior incident where mock/prod divergence masked a broken migration]
+
+ user: stop summarizing what you just did at the end of every response, I can read the diff
+ assistant: [saves feedback memory: this user wants terse responses with no trailing summaries]
+
+ user: yeah the single bundled PR was the right call here, splitting this one would've just been churn
+ assistant: [saves feedback memory: for refactors in this area, user prefers one bundled PR over many small ones. Confirmed after I chose this approach — a validated judgment call, not a correction]
+
+
+
+### project
+
+Information that you learn about ongoing work, goals, initiatives, bugs, or incidents within the project that is not otherwise derivable from the code or git history. Project memories help you understand the broader context and motivation behind the work the user is doing within this working directory.
+
+**Save when**: you learn who is doing what, why, or by when. These states change relatively quickly, so keep your understanding up to date. Always convert relative dates in user messages to absolute dates when saving (e.g., "Thursday" → "2026-03-05"), so the memory remains interpretable after time passes.
+**Read when**: you need the details and nuance behind the user's request in order to make better informed suggestions.
+**Format**: lead with the fact or decision, then a **Why:** line (the motivation — often a constraint, deadline, or stakeholder ask) and a **How to apply:** line (how this should shape your suggestions). Project memories decay fast, so the why helps future-you judge whether the memory is still load-bearing.
+
+ user: we're freezing all non-critical merges after Thursday — mobile team is cutting a release branch
+ assistant: [saves project memory: merge freeze begins 2026-03-05 for mobile release cut. Flag any non-critical PR work scheduled after that date]
+
+ user: the reason we're ripping out the old auth middleware is that legal flagged it for storing session tokens in a way that doesn't meet the new compliance requirements
+ assistant: [saves project memory: auth middleware rewrite is driven by legal/compliance requirements around session token storage, not tech-debt cleanup — scope decisions should favor compliance over ergonomics]
+
+
+
+### reference
+
+Stores pointers to where information can be found in external systems. These memories allow you to remember where to look to find up-to-date information outside of the project directory.
+
+**Save when**: you learn about resources in external systems and their purpose — for example, that bugs are tracked in a specific project in Linear or that feedback can be found in a specific Slack channel.
+**Read when**: the user references an external system or information that may live in one.
+
+ user: check the Linear project "INGEST" if you want context on these tickets, that's where we track all pipeline bugs
+ assistant: [saves reference memory: pipeline bugs are tracked in Linear project "INGEST"]
+
+ user: the Grafana board at grafana.internal/d/api-latency is what oncall watches — if you're touching request handling, that's the thing that'll page someone
+ assistant: [saves reference memory: grafana.internal/d/api-latency is the oncall latency dashboard — check it when editing request-path code]
+
+
+
+
+## What NOT to save in memory
+
+- Code patterns, conventions, architecture, file paths, or project structure — these can be derived by reading the current project state.
+- Git history, recent changes, or who-changed-what — `git log` / `git blame` are authoritative.
+- Debugging solutions or fix recipes — the fix is in the code; the commit message has the context.
+- Anything already documented in CLAUDE.md files.
+- Ephemeral task details: in-progress work, temporary state, current conversation context.
+
+These exclusions apply even when the user explicitly asks you to save. If they ask you to save a PR list or activity summary, ask what was *surprising* or *non-obvious* about it — that is the part worth keeping.
+
+## How to save memories
+
+Saving a memory is a two-step process:
+
+**Step 1** — write the memory to its own file (e.g., `user_role.md`, `feedback_testing.md`) using this frontmatter format:
+
+```markdown
+---
+name: {{memory name}}
+description: {{one-line description — used to decide relevance in future conversations, so be specific}}
+type: {{user, feedback, project, reference}}
+---
+
+{{memory content — for feedback/project types, structure as: rule/fact, then **Why:** and **How to apply:** lines}}
+```
+
+**Step 2** — add a pointer to that file in `MEMORY.md`. `MEMORY.md` is an index, not a memory — each entry should be one line, under ~150 characters: `- [Title](file.md) — one-line hook`. It has no frontmatter. Never write memory content directly into `MEMORY.md`.
+
+- `MEMORY.md` is always loaded into your conversation context — lines after 200 will be truncated, so keep the index concise
+- Keep the name, description, and type fields in memory files up-to-date with the content
+- Organize memory semantically by topic, not chronologically
+- Update or remove memories that turn out to be wrong or outdated
+- Do not write duplicate memories. First check if there is an existing memory you can update before writing a new one.
+
+## When to access memories
+- When memories seem relevant, or the user references prior-conversation work.
+- You MUST access memory when the user explicitly asks you to check, recall, or remember.
+- If the user says to *ignore* or *not use* memory: do not apply, cite, compare against, or otherwise mention remembered content.
+- Memory records can become stale over time. Use memory as context for what was true at a given point in time. Before answering the user or building assumptions based solely on information in memory records, verify that the memory is still correct and up-to-date by reading the current state of the files or resources. If a recalled memory conflicts with current information, trust what you observe now — and update or remove the stale memory rather than acting on it.
+
+## Before recommending from memory
+
+A memory that names a specific function, file, or flag is a claim that it existed *when the memory was written*. It may have been renamed, removed, or never merged. Before recommending it:
+
+- If the memory names a file path: check the file exists.
+- If the memory names a function or flag: grep for it.
+- If the user is about to act on your recommendation (not just asking about history), verify first.
+
+"The memory says X exists" is not the same as "X exists now."
+
+A memory that summarizes repo state (activity logs, architecture snapshots) is frozen in time. If the user asks about *recent* or *current* state, prefer `git log` or reading the code over recalling the snapshot.
+
+## Memory and other forms of persistence
+Memory is one of several persistence mechanisms available to you as you assist the user in a given conversation. The distinction is often that memory can be recalled in future conversations and should not be used for persisting information that is only useful within the scope of the current conversation.
+- When to use or update a plan instead of memory: If you are about to start a non-trivial implementation task and would like to reach alignment with the user on your approach you should use a Plan rather than saving this information to memory. Similarly, if you already have a plan within the conversation and you have changed your approach persist that change by updating the plan rather than saving a memory.
+- When to use or update tasks instead of memory: When you need to break your work in current conversation into discrete steps or keep track of your progress use tasks instead of saving to memory. Tasks are great for persisting information about the work that needs to be done in the current conversation, but memory should be reserved for information that will be useful in future conversations.
+
+Since this memory is project-scoped and shared with your team via version control, tailor your memories to this project.
+
+## MEMORY.md
+
+Your MEMORY.md index is loaded into context at the start of each conversation. It currently contains a pointer to the user profile; keep it up to date as you save new memories.
diff --git a/.claude/commands/doc-review.md b/.claude/commands/doc-review.md
new file mode 100644
index 00000000..39c6890c
--- /dev/null
+++ b/.claude/commands/doc-review.md
@@ -0,0 +1 @@
+Review the documentation file in the planning folder called $ARGUMENTS and add questions, clarifications, or feedback to a new section at the end, along with any opportunities to simplify.
\ No newline at end of file
diff --git a/.claude/settings.json b/.claude/settings.json
index aa06f43d..cbcc653b 100644
--- a/.claude/settings.json
+++ b/.claude/settings.json
@@ -2,6 +2,7 @@
"enabledPlugins": {
"frontend-design@claude-plugins-official": true,
"context7@claude-plugins-official": true,
- "playwright@claude-plugins-official": true
+ "playwright@claude-plugins-official": true,
+ "independent-reviewer@Abhishek-tools": true
}
}
diff --git a/.claude/skills/cerebras/SKILL.md b/.claude/skills/cerebras/SKILL.md
index 9efd01a3..19d5ec71 100644
--- a/.claude/skills/cerebras/SKILL.md
+++ b/.claude/skills/cerebras/SKILL.md
@@ -1,5 +1,5 @@
---
-name: cerebras-inference
+name: cerebras
description: Use this to write code to call an LLM using LiteLLM and OpenRouter with the Cerebras inference provider
---
diff --git a/.github/workflows/claude.yml b/.github/workflows/claude.yml
index d300267f..6b15fac7 100644
--- a/.github/workflows/claude.yml
+++ b/.github/workflows/claude.yml
@@ -46,5 +46,5 @@ jobs:
# Optional: Add claude_args to customize behavior and configuration
# See https://github.com/anthropics/claude-code-action/blob/main/docs/usage.md
# or https://code.claude.com/docs/en/cli-reference for available options
- # claude_args: '--allowed-tools Bash(gh pr:*)'
+ # claude_args: '--allowed-tools Bash(gh pr *)'
diff --git a/.gitignore b/.gitignore
index b7faf403..8df6319f 100644
--- a/.gitignore
+++ b/.gitignore
@@ -205,3 +205,6 @@ cython_debug/
marimo/_static/
marimo/_lsp/
__marimo__/
+
+# Claude Code local settings
+.claude/settings.local.json
diff --git a/README.md b/README.md
index 3f2582ae..5323fd16 100644
--- a/README.md
+++ b/README.md
@@ -1,62 +1,54 @@
# FinAlly — AI Trading Workstation
-A visually stunning AI-powered trading workstation that streams live market data, simulates portfolio trading, and integrates an LLM chat assistant that can analyze positions and execute trades via natural language.
+A Bloomberg-terminal-inspired trading simulator with live market data and an AI assistant that can analyze your portfolio and execute trades via natural language.
-Built entirely by coding agents as a capstone project for an agentic AI coding course.
-
-## Features
-
-- **Live price streaming** via SSE with green/red flash animations
-- **Simulated portfolio** — $10k virtual cash, market orders, instant fills
-- **Portfolio visualizations** — heatmap (treemap), P&L chart, positions table
-- **AI chat assistant** — analyzes holdings, suggests and auto-executes trades
-- **Watchlist management** — track tickers manually or via AI
-- **Dark terminal aesthetic** — Bloomberg-inspired, data-dense layout
+## Quick Start
-## Architecture
+```bash
+cp .env.example .env
+# Edit .env: add OPENROUTER_API_KEY (required), MASSIVE_API_KEY (optional)
+./scripts/start_mac.sh
+```
-Single Docker container serving everything on port 8000:
+Open [http://localhost:8000](http://localhost:8000).
-- **Frontend**: Next.js (static export) with TypeScript and Tailwind CSS
-- **Backend**: FastAPI (Python/uv) with SSE streaming
-- **Database**: SQLite with lazy initialization
-- **AI**: LiteLLM → OpenRouter (Cerebras inference) with structured outputs
-- **Market data**: Built-in GBM simulator (default) or Massive API (optional)
+## Features
-## Quick Start
+- **Live price streaming** via SSE — prices flash green/red on change
+- **Simulated portfolio** — $10k virtual cash, market orders, instant fill
+- **Sparklines & charts** — per-ticker mini-charts and a detailed main chart
+- **Portfolio heatmap** — treemap sized by weight, colored by P&L
+- **AI chat** — ask questions, get analysis, execute trades in plain English
-```bash
-# Clone and configure
-cp .env.example .env
-# Add your OPENROUTER_API_KEY to .env
+## Architecture
-# Run with Docker
-docker build -t finally .
-docker run -v finally-data:/app/db -p 8000:8000 --env-file .env finally
+Single Docker container, single port (8000):
-# Open http://localhost:8000
-```
+- **Frontend**: Next.js static export, served by FastAPI
+- **Backend**: FastAPI + Python (uv), SQLite database
+- **Market data**: GBM simulator by default; Massive/Polygon.io API if `MASSIVE_API_KEY` is set
+- **AI**: LiteLLM → OpenRouter (Cerebras), structured outputs for trade execution
## Environment Variables
| Variable | Required | Description |
|---|---|---|
-| `OPENROUTER_API_KEY` | Yes | OpenRouter API key for AI chat |
-| `MASSIVE_API_KEY` | No | Massive (Polygon.io) key for real market data; omit to use simulator |
-| `LLM_MOCK` | No | Set `true` for deterministic mock LLM responses (testing) |
+| `OPENROUTER_API_KEY` | Yes | LLM inference via OpenRouter |
+| `MASSIVE_API_KEY` | No | Real market data; simulator used if unset |
+| `LLM_MOCK` | No | Set `true` for deterministic mock responses (testing) |
-## Project Structure
+## Development
-```
-finally/
-├── frontend/ # Next.js static export
-├── backend/ # FastAPI uv project
-├── planning/ # Project documentation and agent contracts
-├── test/ # Playwright E2E tests
-├── db/ # SQLite volume mount (runtime)
-└── scripts/ # Start/stop helpers
+```bash
+# Backend tests
+cd backend && uv run pytest -v
+
+# Frontend
+cd frontend && npm install && npm run dev
```
-## License
+## Running Tests (E2E)
-See [LICENSE](LICENSE).
+```bash
+cd test && docker compose -f docker-compose.test.yml up
+```
diff --git a/independent-reviewer/.DS_Store b/independent-reviewer/.DS_Store
new file mode 100644
index 00000000..5f53ed91
Binary files /dev/null and b/independent-reviewer/.DS_Store differ
diff --git a/independent-reviewer/.claude-plugin/plugin.json b/independent-reviewer/.claude-plugin/plugin.json
new file mode 100644
index 00000000..f2379ba7
--- /dev/null
+++ b/independent-reviewer/.claude-plugin/plugin.json
@@ -0,0 +1,5 @@
+{
+ "name": "independent-reviewer",
+ "description": "Carry out an independent review of all changes since last commit",
+ "version": "1.0.0"
+}
\ No newline at end of file
diff --git a/independent-reviewer/hooks/.DS_Store b/independent-reviewer/hooks/.DS_Store
new file mode 100644
index 00000000..9afa9884
Binary files /dev/null and b/independent-reviewer/hooks/.DS_Store differ
diff --git a/independent-reviewer/hooks/hooks.json b/independent-reviewer/hooks/hooks.json
new file mode 100644
index 00000000..10aeb7a2
--- /dev/null
+++ b/independent-reviewer/hooks/hooks.json
@@ -0,0 +1,15 @@
+{
+ "hooks": {
+ "Stop": [
+ {
+ "matcher": "",
+ "hooks": [
+ {
+ "type": "command",
+ "command": "git status"
+ }
+ ]
+ }
+ ]
+ }
+}
diff --git a/planning/DECISIONS.md b/planning/DECISIONS.md
new file mode 100644
index 00000000..894b23a3
--- /dev/null
+++ b/planning/DECISIONS.md
@@ -0,0 +1,54 @@
+# Planning Decisions
+
+This file records concrete decisions made to resolve open questions and contract gaps in `planning/PLAN.md`.
+
+## 2026-04-13
+
+### LLM configuration
+
+- `OPENROUTER_API_KEY` is required only when `LLM_MOCK=false`.
+- `LLM_MOCK=true` is a supported no-key development and test mode.
+- The backend should fail fast at startup if mock mode is off and the API key is missing.
+
+Reasoning: this preserves the intended low-friction local and CI workflow while keeping production configuration errors obvious.
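
The fail-fast rule above can be sketched as a startup check. This is a minimal sketch of the intended behavior; the function name and error text are illustrative, not the actual backend code.

```python
import os

def check_llm_config() -> None:
    """Fail fast at startup when real LLM mode is configured without a key."""
    mock = os.getenv("LLM_MOCK", "false").strip().lower() == "true"
    key = os.getenv("OPENROUTER_API_KEY", "").strip()
    if not mock and not key:
        # Mock mode needs no key; real mode must refuse to start without one.
        raise RuntimeError(
            "OPENROUTER_API_KEY is required when LLM_MOCK=false; "
            "set the key or enable LLM_MOCK=true for no-key development"
        )
```

Called once during application startup, this turns a silent misconfiguration into an immediate, clear error.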
+
+### Docker persistence
+
+- Local development and test runs use a bind mount from repo `db/` to `/app/db`.
+- The plan no longer mixes bind mounts with a named Docker volume.
+- `db/finally.db` is intentionally visible on the host for inspection and persistence.
+
+Reasoning: one persistence model is easier for agents to implement consistently, and host-visible SQLite data is useful for a course project.
+
+### Chat API contract
+
+- `/api/chat` returns persisted `user_message` and `assistant_message` objects, including message IDs and timestamps.
+- Action results are returned only after execution, under `assistant_message.actions`.
+- Partial failures are represented per action with `status` plus `error`, while the overall response remains `200` for valid requests.
+- The response also includes post-execution `portfolio` and `watchlist` state so the frontend can reconcile immediately.
+
+Reasoning: this removes frontend/backend ambiguity around inline confirmations, persisted message identity, and partial trade failures.
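
As an illustration, the decisions above imply a response shape roughly like the following. Only `user_message`, `assistant_message`, `actions`, per-action `status` plus `error`, `portfolio`, and `watchlist` are specified here; every other field name and value is hypothetical.

```python
# Hypothetical /api/chat response, shown as a Python dict.
chat_response = {
    "user_message": {
        "id": 41, "role": "user",
        "content": "buy 2 AAPL and 1 ZZZZ",
        "created_at": "2026-04-13T10:00:00Z",
    },
    "assistant_message": {
        "id": 42, "role": "assistant",
        "content": "Bought 2 AAPL; ZZZZ is not a supported symbol.",
        "created_at": "2026-04-13T10:00:01Z",
        # action results appear only after execution, one status per action
        "actions": [
            {"type": "trade", "ticker": "AAPL", "side": "buy",
             "quantity": 2, "status": "executed", "error": None},
            {"type": "trade", "ticker": "ZZZZ", "side": "buy",
             "quantity": 1, "status": "failed", "error": "unsupported symbol"},
        ],
    },
    # post-execution state so the frontend can reconcile immediately
    "portfolio": {"cash": 9620.0, "positions": [{"ticker": "AAPL", "quantity": 2}]},
    "watchlist": ["AAPL", "GOOGL"],
}
```

Note that the HTTP status stays 200 even though one action failed; the per-action `status`/`error` fields carry the partial failure.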
+
+### Position lifecycle
+
+- Selling a position to zero keeps the row but resets `avg_cost` to `0`.
+- A later buy in the same ticker establishes a brand-new cost basis.
+
+Reasoning: unrealized P&L after re-entry should reflect only the new position, not stale historical basis.
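
A minimal sketch of these lifecycle rules, assuming a weighted-average cost basis for buys into an existing position; the function name and float-based arithmetic are illustrative, not the actual backend implementation.

```python
def apply_fill(qty: float, avg_cost: float, fill_qty: float, price: float) -> tuple[float, float]:
    """Return (new_qty, new_avg_cost) after a fill.

    fill_qty > 0 is a buy, fill_qty < 0 is a sell. The position row is never
    deleted; selling to zero resets avg_cost to 0 so a later buy in the same
    ticker starts a brand-new cost basis.
    """
    new_qty = qty + fill_qty
    if new_qty == 0:
        return 0.0, 0.0                # position closed: keep row, reset basis
    if fill_qty > 0:
        if qty == 0:
            return new_qty, price      # re-entry after a flat period: fresh basis
        # buy into an existing position: weighted-average cost basis
        return new_qty, (qty * avg_cost + fill_qty * price) / new_qty
    return new_qty, avg_cost           # partial sell: basis unchanged
```

With these semantics, unrealized P&L after a re-entry reflects only the new fill price, never the stale historical basis.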
+
+### Trade validation
+
+- Tickers are normalized to uppercase and trimmed.
+- Unsupported symbols are rejected.
+- Quantity must be finite, positive, and no more than 4 decimal places.
+- Manual and LLM-originated trades use the exact same validation rules.
+
+Reasoning: shared validation rules prevent drift between frontend behavior, direct API usage, and AI-triggered actions.
+
+### SSE protocol
+
+- `/api/stream/prices` uses named SSE events: `snapshot`, `price`, `watchlist`, and `heartbeat`.
+- The server sends an initial `snapshot` immediately on connect.
+- Watchlist add/remove operations emit a `watchlist` event on existing connections rather than requiring reconnect.
+
+Reasoning: explicit event types give frontend and backend a stable contract for initial render, live updates, and reconnect behavior.
diff --git a/planning/PLAN.md b/planning/PLAN.md
index bc1811b3..19c9f4a1 100644
--- a/planning/PLAN.md
+++ b/planning/PLAN.md
@@ -14,7 +14,7 @@ This is the capstone project for an agentic AI coding course. It is built entire
The user runs a single Docker command (or a provided start script). A browser opens to `http://localhost:8000`. No login, no signup. They immediately see:
-- A watchlist of 10 default tickers with live-updating prices in a grid
+- A watchlist of 10 default tickers (picks that legendary investor Warren Buffett would plausibly approve of today) with live-updating prices in a grid
- $10,000 in virtual cash
- A dark, data-rich trading terminal aesthetic
- An AI chat panel ready to assist
@@ -23,7 +23,7 @@ The user runs a single Docker command (or a provided start script). A browser op
- **Watch prices stream** — prices flash green (uptick) or red (downtick) with subtle CSS animations that fade
- **View sparkline mini-charts** — price action beside each ticker in the watchlist, accumulated on the frontend from the SSE stream since page load (sparklines fill in progressively)
-- **Click a ticker** to see a larger detailed chart in the main chart area
+- **Click a ticker** to see a larger detailed chart in the main chart area, populated from SSE price history accumulated since page load
- **Buy and sell shares** — market orders only, instant fill at current price, no fees, no confirmation dialog
- **Monitor their portfolio** — a heatmap (treemap) showing positions sized by weight and colored by P&L, plus a P&L chart tracking total portfolio value over time
- **View a positions table** — ticker, quantity, average cost, current price, unrealized P&L, % change
@@ -121,7 +121,7 @@ finally/
## 5. Environment Variables
```bash
-# Required: OpenRouter API key for LLM chat functionality
+# Required only when LLM_MOCK=false
OPENROUTER_API_KEY=your-openrouter-api-key-here
# Optional: Massive (Polygon.io) API key for real market data
@@ -136,7 +136,8 @@ LLM_MOCK=false
- If `MASSIVE_API_KEY` is set and non-empty → backend uses Massive REST API for market data
- If `MASSIVE_API_KEY` is absent or empty → backend uses the built-in market simulator
-- If `LLM_MOCK=true` → backend returns deterministic mock LLM responses (for E2E tests)
+- If `LLM_MOCK=true` → backend returns deterministic mock LLM responses (for E2E tests), and `OPENROUTER_API_KEY` is not required
+- If `LLM_MOCK=false` and `OPENROUTER_API_KEY` is absent or empty → backend startup should fail fast with a clear configuration error
- The backend reads `.env` from the project root (mounted into the container or read via docker `--env-file`)
---
@@ -155,6 +156,7 @@ Both the simulator and the Massive client implement the same abstract interface.
- Occasional random "events" — sudden 2-5% moves on a ticker for drama
- Starts from realistic seed prices (e.g., AAPL ~$190, GOOGL ~$175, etc.)
- Runs as an in-process background task — no external dependencies
+- The seed price at startup is stored as the synthetic **previous close** for each ticker, used to compute daily change %
### Massive API (Optional)
@@ -167,7 +169,7 @@ Both the simulator and the Massive client implement the same abstract interface.
### Shared Price Cache
- A single background task (simulator or Massive poller) writes to an in-memory price cache
-- The cache holds the latest price, previous price, and timestamp for each ticker
+- The cache holds the following per ticker: latest price, previous price (last tick, for flash direction), previous close (session-start seed or API-provided, for daily change %), and timestamp
- SSE streams read from this cache and push updates to connected clients
- This architecture supports future multi-user scenarios without changes to the data layer
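One way to model a per-ticker cache entry with those four fields plus the two derived values (field names are illustrative):

```python
from dataclasses import dataclass

@dataclass
class CacheEntry:
    """One ticker's entry in the shared in-memory price cache."""
    price: float            # latest price
    previous_price: float   # prior tick, drives the up/down flash
    previous_close: float   # session-start seed or API value, drives daily change %
    timestamp: str          # ISO timestamp of the latest tick

    @property
    def change_pct(self) -> float:
        """Daily change %, computed against previous close, not the last tick."""
        return (self.price - self.previous_close) / self.previous_close * 100.0

    @property
    def direction(self) -> str:
        # Events fire only on price changes, so equal prices should not occur here.
        return "up" if self.price >= self.previous_price else "down"
```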
@@ -175,8 +177,19 @@ Both the simulator and the Massive client implement the same abstract interface.
- Endpoint: `GET /api/stream/prices`
- Long-lived SSE connection; client uses native `EventSource` API
-- Server pushes price updates for all tickers known to the system at a regular cadence (~500ms) — in the single-user model this is equivalent to the user's watchlist
-- Each SSE event contains ticker, price, previous price, timestamp, and change direction
+- Server pushes price updates **only when a price changes**, for **watchlist tickers only** — events are not repeated if prices are unchanged (e.g., between Massive API polls)
+- The protocol uses named SSE events so frontend and backend share one contract:
+ - `snapshot` — sent immediately after connection opens, containing the full current watchlist payload so the UI does not render empty prices/charts while waiting for the next tick
+ - `price` — sent whenever a watched ticker changes price
+ - `watchlist` — sent after add/remove actions so the existing connection becomes watchlist-aware without reconnecting
+ - `heartbeat` — sent every ~15 seconds when no other events are emitted, allowing the client to distinguish an idle stream from a dead one
+- `snapshot` payload:
+ - `tickers`: array of `{ticker, price, previous_price, previous_close, change_pct, timestamp, direction}`
+- `price` payload:
+ - `ticker`, `price`, `previous_price`, `previous_close`, `change_pct` (daily, vs previous close), `timestamp`, and `direction` (`"up"` | `"down"`)
+- `watchlist` payload:
+ - `action` (`"added"` | `"removed"`), `ticker`, and `watchlist` (the full updated ticker list)
+- The SSE stream is **watchlist-aware**: when the user adds or removes a ticker, the backend dynamically updates which tickers are streamed on the existing connection and emits a `watchlist` event — no reconnect needed
- Client handles reconnection automatically (EventSource has built-in retry)
---
@@ -215,6 +228,8 @@ All tables include a `user_id` column defaulting to `"default"`. This is hardcod
- `avg_cost` REAL
- `updated_at` TEXT (ISO timestamp)
- UNIQUE constraint on `(user_id, ticker)`
+- Note: selling all shares sets `quantity` to 0 — the row is **not deleted**. The frontend filters out zero-quantity rows from the positions table and heatmap display.
+- When a position is sold down to zero, `avg_cost` is reset to `0`. A later buy in the same ticker creates a fresh cost basis from scratch rather than reusing historical average cost.
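The zero-out-and-reset behavior can be sketched as follows (a hypothetical helper, not the plan's actual code):

```python
def apply_sell(quantity: float, avg_cost: float, sell_qty: float) -> tuple[float, float]:
    """Apply a sell to a position row. A full exit keeps the row but zeroes
    both quantity and avg_cost, so a later buy starts a fresh cost basis."""
    if sell_qty > quantity:
        raise ValueError("insufficient shares")
    remaining = quantity - sell_qty
    if remaining == 0:
        return (0.0, 0.0)   # row retained; avg_cost reset for re-entry
    return (remaining, avg_cost)  # partial sell keeps the existing cost basis
```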
**trades** — Trade history (append-only log)
- `id` TEXT PRIMARY KEY (UUID)
@@ -236,7 +251,7 @@ All tables include a `user_id` column defaulting to `"default"`. This is hardcod
- `user_id` TEXT (default: `"default"`)
- `role` TEXT (`"user"` or `"assistant"`)
- `content` TEXT
-- `actions` TEXT (JSON — trades executed, watchlist changes made; null for user messages)
+- `actions` TEXT (JSON — trades executed, watchlist changes made; null for user messages). The frontend reads this field to render inline confirmations (e.g. "Bought 5 AAPL @ $191.20") directly in the chat bubble for the assistant's message.
- `created_at` TEXT (ISO timestamp)
### Default Seed Data
@@ -260,6 +275,37 @@ All tables include a `user_id` column defaulting to `"default"`. This is hardcod
| POST | `/api/portfolio/trade` | Execute a trade: `{ticker, quantity, side}` |
| GET | `/api/portfolio/history` | Portfolio value snapshots over time (for P&L chart) |
+**`GET /api/portfolio` response shape:**
+```json
+{
+ "cash_balance": 8432.50,
+ "total_value": 12847.30,
+ "positions": [
+ {
+ "ticker": "AAPL",
+ "quantity": 10,
+ "avg_cost": 189.50,
+ "current_price": 193.20,
+ "market_value": 1932.00,
+ "unrealized_pnl": 37.00,
+ "pnl_pct": 1.95
+ }
+ ]
+}
+```
+Only positions with `quantity > 0` are included. `total_value` = `cash_balance` + sum of all `market_value`.
+
+**Trade validation rules** for manual and LLM-driven orders:
+- `ticker` is normalized to uppercase and trimmed before validation and persistence
+- `ticker` must be in the supported market data universe; unknown symbols are rejected with `400`
+- `side` must be exactly `"buy"` or `"sell"`
+- `quantity` must parse as a finite number strictly greater than `0`
+- Fractional shares are supported to at most 4 decimal places; values with higher precision are rejected rather than rounded implicitly
+- Buy orders require sufficient cash at the current cached market price
+- Sell orders require sufficient owned quantity in the current position
+- Validation errors return structured error payloads that the frontend can render inline
+- Trade execution is wrapped in a SQLite transaction: cash debit/credit and position update are atomic. Concurrent requests for the same user are serialized via a per-user asyncio lock to prevent double-spend.
+
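The quantity rules above (finite, positive, at most 4 decimal places, no implicit rounding) can be sketched with `Decimal`:

```python
from decimal import Decimal, InvalidOperation

def validate_quantity(raw) -> Decimal:
    """Reject anything that is not a finite number > 0 with at most
    4 decimal places; higher precision is rejected, never rounded."""
    try:
        qty = Decimal(str(raw))
    except InvalidOperation:
        raise ValueError("quantity is not a number")
    if not qty.is_finite():
        raise ValueError("quantity must be finite")
    if qty <= 0:
        raise ValueError("quantity must be greater than 0")
    # normalize() strips trailing zeros so "1.50000" counts as one decimal place
    if -qty.normalize().as_tuple().exponent > 4:
        raise ValueError("quantity supports at most 4 decimal places")
    return qty
```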
### Watchlist
| Method | Path | Description |
|--------|------|-------------|
@@ -270,8 +316,68 @@ All tables include a `user_id` column defaulting to `"default"`. This is hardcod
### Chat
| Method | Path | Description |
|--------|------|-------------|
+| GET | `/api/chat/history` | Recent chat messages (last 50, chronological order) |
| POST | `/api/chat` | Send a message, receive complete JSON response (message + executed actions) |
+**`POST /api/chat` request shape:**
+```json
+{
+ "message": "Buy 5 shares of AAPL and add AMD to my watchlist"
+}
+```
+
+**`POST /api/chat` response shape:**
+```json
+{
+ "user_message": {
+ "id": "uuid-user-message",
+ "role": "user",
+ "content": "Buy 5 shares of AAPL and add AMD to my watchlist",
+ "created_at": "2026-04-13T18:00:00Z"
+ },
+ "assistant_message": {
+ "id": "uuid-assistant-message",
+ "role": "assistant",
+ "content": "Bought 5 shares of AAPL and added AMD to your watchlist.",
+ "actions": {
+ "trades": [
+ {
+ "ticker": "AAPL",
+ "side": "buy",
+ "requested_quantity": 5,
+ "status": "executed",
+ "executed_quantity": 5,
+ "executed_price": 191.2,
+ "trade_id": "uuid-trade",
+ "error": null
+ }
+ ],
+ "watchlist_changes": [
+ {
+ "ticker": "AMD",
+ "action": "add",
+ "status": "executed",
+ "error": null
+ }
+ ]
+ },
+ "created_at": "2026-04-13T18:00:01Z"
+ },
+ "portfolio": {
+ "cash_balance": 9044.0,
+ "total_value": 10000.0,
+ "positions": [
+ {
+ "ticker": "AAPL",
+ "quantity": 5,
+ "avg_cost": 191.2,
+ "current_price": 191.2,
+ "market_value": 956.0,
+ "unrealized_pnl": 0.0,
+ "pnl_pct": 0.0
+ }
+ ]
+ },
+ "watchlist": ["AAPL", "AMD", "AMZN"]
+}
+```
+
+Response contract notes:
+- The backend persists the user message first, then executes requested actions, then persists the assistant message with final action results
+- `assistant_message.actions` always reflects post-execution results, not raw LLM intent
+- Partial failure is allowed: some actions may be `executed` while others are `rejected`
+- `portfolio` and `watchlist` reflect post-execution state so the frontend can reconcile immediately without an additional fetch
+
### System
| Method | Path | Description |
|--------|------|-------------|
@@ -281,7 +387,7 @@ All tables include a `user_id` column defaulting to `"default"`. This is hardcod
## 9. LLM Integration
-When writing code to make calls to LLMs, use cerebras-inference skill to use LiteLLM via OpenRouter to the `openrouter/openai/gpt-oss-120b` model with Cerebras as the inference provider. Structured Outputs should be used to interpret the results.
+When writing code to make calls to LLMs, use the cerebras skill to call LiteLLM via OpenRouter to the `openrouter/openai/gpt-oss-120b` model with Cerebras as the inference provider. Structured Outputs should be used to interpret the results.
There is an OPENROUTER_API_KEY in the .env file in the project root.
@@ -290,9 +396,9 @@ There is an OPENROUTER_API_KEY in the .env file in the project root.
When the user sends a chat message, the backend:
1. Loads the user's current portfolio context (cash, positions with P&L, watchlist with live prices, total portfolio value)
-2. Loads recent conversation history from the `chat_messages` table
+2. Loads recent conversation history from the `chat_messages` table (truncated to the last ~1024 tokens to bound context window cost)
3. Constructs a prompt with a system message, portfolio context, conversation history, and the user's new message
-4. Calls the LLM via LiteLLM → OpenRouter, requesting structured output, using the cerebras-inference skill
+4. Calls the LLM via LiteLLM → OpenRouter, requesting structured output, using the cerebras skill
5. Parses the complete structured JSON response
6. Auto-executes any trades or watchlist changes specified in the response
7. Stores the message and executed actions in `chat_messages`
@@ -315,8 +421,8 @@ The LLM is instructed to respond with JSON matching this schema:
```
- `message` (required): The conversational text shown to the user
-- `trades` (optional): Array of trades to auto-execute. Each trade goes through the same validation as manual trades (sufficient cash for buys, sufficient shares for sells)
-- `watchlist_changes` (optional): Array of watchlist modifications
+- `trades` (optional): Array of trades to auto-execute. Each trade goes through the same validation as manual trades, including ticker normalization, supported-symbol checks, positive finite quantity, max 4 decimal places, sufficient cash for buys, and sufficient shares for sells
+- `watchlist_changes` (optional): Array of watchlist modifications. `action` must be `"add"` or `"remove"` — no other values are valid
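One way to express this schema for Structured Outputs is a Pydantic model (a sketch; the model names and defaults are choices, not mandated by the plan):

```python
from typing import List, Literal
from pydantic import BaseModel

class Trade(BaseModel):
    ticker: str
    side: Literal["buy", "sell"]
    quantity: float

class WatchlistChange(BaseModel):
    ticker: str
    action: Literal["add", "remove"]  # no other values are valid

class AssistantReply(BaseModel):
    message: str                                   # required conversational text
    trades: List[Trade] = []                       # optional; validated like manual trades
    watchlist_changes: List[WatchlistChange] = []  # optional
```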
### Auto-Execution
@@ -325,7 +431,7 @@ Trades specified by the LLM execute automatically — no confirmation dialog. Th
- It creates an impressive, fluid demo experience
- It demonstrates agentic AI capabilities — the core theme of the course
-If a trade fails validation (e.g., insufficient cash), the error is included in the chat response so the LLM can inform the user.
+If a trade or watchlist change fails validation (e.g., insufficient cash or duplicate watchlist add), the backend still returns the assistant message but marks the individual action `status="rejected"` with an `error` string. The HTTP response remains `200` unless the overall request is malformed.
### System Prompt Guidance
@@ -339,8 +445,22 @@ The LLM should be prompted as "FinAlly, an AI trading assistant" with instructio
### LLM Mock Mode
-When `LLM_MOCK=true`, the backend returns deterministic mock responses instead of calling OpenRouter. This enables:
-- Fast, free, reproducible E2E tests
+When `LLM_MOCK=true`, the backend returns the following deterministic mock response regardless of input:
+
+```json
+{
+ "message": "I've reviewed your portfolio. To get you started, I'll buy 5 shares of AAPL.",
+ "trades": [
+ {"ticker": "AAPL", "side": "buy", "quantity": 5}
+ ],
+ "watchlist_changes": []
+}
+```
+
+If the mock trade fails validation (e.g. insufficient cash), the backend still returns the mock `message` but includes an `error` field on the failed trade entry so the frontend can display it inline.
+
+This fixed response enables:
+- Fast, free, reproducible E2E tests (trade execution and message rendering are both exercised)
- Development without an API key
- CI/CD pipelines
@@ -357,15 +477,16 @@ The frontend is a single-page application with a dense, terminal-inspired layout
- **Portfolio heatmap** — treemap visualization where each rectangle is a position, sized by portfolio weight, colored by P&L (green = profit, red = loss)
- **P&L chart** — line chart showing total portfolio value over time, using data from `portfolio_snapshots`
- **Positions table** — tabular view of all positions: ticker, quantity, avg cost, current price, unrealized P&L, % change
-- **Trade bar** — simple input area: ticker field, quantity field, buy button, sell button. Market orders, instant fill.
+- **Trade bar** — simple input area: ticker field, quantity field (supports fractional shares, e.g. 0.5), buy button, sell button. Market orders, instant fill. Clicking a ticker in the watchlist auto-populates the ticker field.
- **AI chat panel** — docked/collapsible sidebar. Message input, scrolling conversation history, loading indicator while waiting for LLM response. Trade executions and watchlist changes shown inline as confirmations.
- **Header** — portfolio total value (updating live), connection status indicator, cash balance
### Technical Notes
- Use `EventSource` for SSE connection to `/api/stream/prices`
-- Canvas-based charting library preferred (Lightweight Charts or Recharts) for performance
+- Use **Lightweight Charts** (TradingView's open-source library) for all charts: sparklines, main ticker chart, and P&L chart. Do not use Recharts.
- Price flash effect: on receiving a new price, briefly apply a CSS class with background color transition, then remove it
+- When SSE is disconnected: watchlist prices freeze, text turns a muted grey, and a "stale" label is shown per row. Normal styling is restored on reconnection.
- All API calls go to the same origin (`/api/*`) — no CORS configuration needed
- Tailwind CSS for styling with a custom dark theme
@@ -393,19 +514,19 @@ FastAPI serves the static frontend files and all API routes on port 8000.
### Docker Volume
-The SQLite database persists via a named Docker volume:
+The SQLite database persists via a bind mount from the repo's top-level `db/` directory:
```bash
-docker run -v finally-data:/app/db -p 8000:8000 --env-file .env finally
+docker run -v "$(pwd)/db:/app/db" -p 8000:8000 --env-file .env finally
```
-The `db/` directory in the project root maps to `/app/db` in the container. The backend writes `finally.db` to this path.
+The `db/` directory in the project root is the single source of truth for runtime persistence in local development and test runs. It maps to `/app/db` in the container, and the backend writes `finally.db` to this path. This keeps the SQLite file inspectable from the host and avoids ambiguity between bind mounts and named volumes.
### Start/Stop Scripts
**`scripts/start_mac.sh`** (macOS/Linux):
- Builds the Docker image if not already built (or if `--build` flag passed)
-- Runs the container with the volume mount, port mapping, and `.env` file
+- Runs the container with the bind mount to `./db`, port mapping, and `.env` file
- Prints the URL to access the app
- Optionally opens the browser
@@ -430,6 +551,7 @@ The container is designed to deploy to AWS App Runner, Render, or any container
**Backend (pytest)**:
- Market data: simulator generates valid prices, GBM math is correct, Massive API response parsing works, both implementations conform to the abstract interface
- Portfolio: trade execution logic, P&L calculations, edge cases (selling more than owned, buying with insufficient cash, selling at a loss)
+- Portfolio: zero-quantity position handling and average-cost reset on re-entry
- LLM: structured output parsing handles all valid schemas, graceful handling of malformed responses, trade validation within chat flow
- API routes: correct status codes, response shapes, error handling
@@ -450,7 +572,8 @@ The container is designed to deploy to AWS App Runner, Render, or any container
- Fresh start: default watchlist appears, $10k balance shown, prices are streaming
- Add and remove a ticker from the watchlist
- Buy shares: cash decreases, position appears, portfolio updates
-- Sell shares: cash increases, position updates or disappears
+- Sell shares: cash increases, position quantity updates (zero-quantity rows are filtered from display)
- Portfolio visualization: heatmap renders with correct colors, P&L chart has data points
- AI chat (mocked): send a message, receive a response, trade execution appears inline
- SSE resilience: disconnect and verify reconnection
+- SSE protocol: initial `snapshot` arrives on connect, `watchlist` events arrive after add/remove, and the UI recovers cleanly after heartbeat gaps/reconnect
diff --git a/planning/SINCE-COMMIT-REVIEW2.md b/planning/SINCE-COMMIT-REVIEW2.md
new file mode 100644
index 00000000..df5feb46
--- /dev/null
+++ b/planning/SINCE-COMMIT-REVIEW2.md
@@ -0,0 +1,24 @@
+# Code Review — Changes Since Last Commit
+
+**Reviewed at**: 2026-04-15
+**Last commit**: `6fd1fad — added subagents to use in claude code`
+**Files changed**: 0
+
+## Summary
+
+The working tree is completely clean. There are no staged changes, no unstaged modifications, and no untracked files. The repository is in an identical state to the last commit.
+
+## Issues
+
+### Critical
+None.
+
+### Warnings
+None.
+
+### Suggestions
+None.
+
+## Verdict
+
+**APPROVED** — No uncommitted changes exist. Working tree is clean.