# BaoLife MVP Completion Design

**Date:** 2026-04-13
**Owner:** Craig Vander Galien
**Goal:** Get BaoLife to TestFlight in a state where ten friends can play and answer "is this fun?" without hitting crashes or stuck states.
**Approach:** Option A — Simulator-First

---

## 1. Goal Definition

**MVP (Minimum Viable Product) is reached when:**

1. The TypeScript backend can simulate 1000 random-seed full lifetimes (birth to death or age 100) with **zero exceptions and zero stuck states**.
2. The iOS app passes a full automated test suite covering: launch, onboarding, first hour of gameplay, save/relaunch, and the eight primary screens via snapshot tests.
3. A signed iOS build is available on TestFlight, with crash reporting wired up.
4. A short release-note doc exists telling friends what to look for and how to report issues.

**MVP is explicitly NOT:**
- A guarantee the game is fun. Fun is a human judgment that can only happen after MVP.
- A perfect bug-free experience. It's "no walls, no crashes, all paths reachable."
- An App Store public release. That's v1, after friend feedback.

---

## 2. Current State Snapshot (2026-04-13)

| Component | Status |
|-----------|--------|
| TypeScript backend | Active, primary deploy target |
| Python backend (`ws/`) | Legacy reference only, kept on disk |
| Server tests | 1087/1090 passing (3 stale contract tests in one file) |
| iOS app | 316 files, has unit + UI tests, recent WebSocketService refactor complete |
| Android app | ~85% iOS parity, **out of MVP scope, deferred to v1.1** |
| Web frontend | Deprecated (`index.html`, `main.js`) — to be deleted |
| Legacy PHP API | Deprecated (`api/`) — to be deleted |
| Production server | `lichun-master` GCE instance, runs `baolife-websocket.service` (TS confirmed) |

**Recent activity:** 167 commits in the last 90 days. The project is not dormant; it is mid-refactor and undertested.

---

## 3. Architecture Decisions

### 3.1 Simulator-First Ordering (Option A)

Backend bugs surface 10–100× faster in a headless simulator than in a UI test. We invest in the simulator before iOS test coverage so every backend bug fix happens against a fast, deterministic harness. iOS work follows once the backend is verified.

### 3.2 Single-Platform MVP

iOS only. Android stays in the repo but receives no attention until iOS ships to TestFlight. Reason: iOS already has UI test infrastructure, snapshot tooling integrates cleanly, and XcodeBuildMCP lets a sub-agent drive Xcode without human Xcode interaction.

### 3.3 Verification-Before-Completion Discipline

No phase is marked complete until its exit criteria are programmatically verified. No "should work" claims. The rules:

- Every backend change runs `npx tsc --noEmit && npx vitest run` before commit
- Every iOS change runs the iOS test suite via XcodeBuildMCP before commit
- Phase exits are commits; nothing half-done is left on the working tree
- Sub-agents that report success without verification artifacts are rejected and re-run

### 3.4 Sub-Agent Orchestration

The primary session (this one) holds the plan and never delegates synthesis. Sub-agents receive narrowly scoped tasks (one bug, one test file, one feature) with all context inline. After each sub-agent returns:
- Primary verifies actual file changes vs. claimed changes
- Primary runs the verification commands
- Primary updates TaskList state

### 3.5 Logging & Observability During Test Runs

The Monitor tool tails the dev server log when running E2E tests, so the primary session catches warnings and errors in real time without polling.

---

## 4. Phases

### Phase 0 — Cleanup & Truth (1 session)

**Tasks:**
1. SSH `lichun-master`, run `systemctl status baolife-websocket.service`, confirm it points to TS server (`server/dist/index.js` or equivalent). Document path.
2. Fix the 3 stale contract tests in `server/tests/contracts/frontend-contract.regression.test.ts`. They reference `ios/lichunWebsocket/Core/Services/MessageRouter.swift`, which was refactored away. Update them to the current path or remove the obsolete assertions.
3. Delete `index.html`, `main.js` (deprecated web frontend root files).
4. Delete `api/` directory (legacy PHP, replaced by Node push notification service).
5. Update `docs/PROJECT_STATUS.md` to 2026-04 reality: push notifications done, weather done, Sprint 1–3 events done.
6. Verify `npx tsc --noEmit && npx vitest run` exits clean.

**Exit Criteria:**
- 1090/1090 server tests passing (was 1087)
- Production backend identity documented in CLAUDE.md
- Dead code gone from working tree
- Single clean commit

### Phase 1 — Headless Lifetime Simulator (2–3 sessions)

**Goal:** Build a Vitest-based harness that can fake-play full lifetimes in milliseconds and assert correctness invariants.

**Tasks:**
1. Create `server/tests/e2e/lifetime-simulator.ts` — reusable harness:
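   - Direct `PlayerSession` instantiation with a stub WebSocket (the approach chosen in ADR-001, section 10), or an in-process WebSocket server bootstrap if wire-level coverage turns out to be needed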
   - Fake client that sends `init`, `characterSetup`, `start`, advances ticks, responds to questions
   - Configurable random seed for deterministic replays
   - Emits a structured trace (events fired, stat changes, deaths, marriages)
2. Create `server/tests/e2e/lifetime-simulator.test.ts` — scenarios:
   - **scenario:full-life** — birth to natural death, default question responses
   - **scenario:answer-all-yes** — every question answered with first option
   - **scenario:answer-all-no** — every question answered with last option
   - **scenario:fail-every-job** — bottoms-out career performance, asserts no crash
   - **scenario:marry-divorce-remarry** — exercises romance state machine
   - **scenario:age-100** — lives to age cap, asserts proper death handling
   - **scenario:random-seeds × 20** — different random seeds, all must complete
3. Invariants asserted on every tick:
   - No exceptions thrown
   - `player.gameSpeed !== SPEED_QUESTION_PAUSE` unless an unanswered question exists
   - All required `Player` and `Person` fields are non-null
   - `player.events` Set is consistent
   - Save → load → save round-trip produces equivalent state
   - Stat values stay within 0–100 bounds
   - Money stays >= 0 or has a documented debt path
4. Add `npm run e2e` script that runs simulator scenarios in <60s.
5. Wire into existing CI workflow.
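
The seed and answer-strategy ideas in the tasks above can be sketched as follows. This is a minimal illustration, not the harness itself: `mulberry32` is a small, widely used seeded generator swapped in here for illustration, and `AnswerStrategy`, `pickAnswer`, and `answerTrace` are hypothetical names that do not come from the BaoLife codebase.

```typescript
// Hypothetical sketch: these names are illustrative, not taken from server/src.

/** mulberry32-style seeded PRNG: same seed, same sequence of floats in [0, 1). */
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

/** How the fake client answers questions: first option, last option, or seeded random. */
type AnswerStrategy = "first" | "last" | "random";

/** Returns the index of the chosen option for a question with `optionCount` options. */
function pickAnswer(optionCount: number, strategy: AnswerStrategy, rand: () => number): number {
  if (optionCount <= 0) throw new Error("question has no options");
  if (strategy === "first") return 0;
  if (strategy === "last") return optionCount - 1;
  return Math.floor(rand() * optionCount);
}

/** Answers a sequence of questions (given by their option counts) and records each choice. */
function answerTrace(seed: number, optionCounts: number[], strategy: AnswerStrategy): number[] {
  const rand = mulberry32(seed);
  return optionCounts.map((count) => pickAnswer(count, strategy, rand));
}
```

The property that matters for replay is that a failing seed reproduces exactly the same answers. That only holds if every random decision in the run draws from the seeded generator; any stray `Math.random()` call in game logic would break it.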

**Exit Criteria:**
- `npm run e2e` runs 25+ simulated lifetimes in under 60 seconds
- All scenarios pass on a clean checkout
- Trace output is human-readable (for debugging in Phase 2)
- CI fails if simulator fails
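
As a rough illustration of the per-tick invariants listed in the tasks above, a checker might return a list of violations rather than throwing, so the trace can record every problem on a tick. The `PlayerSnapshot` shape, the `pendingQuestionIds` field, and the numeric value given to `SPEED_QUESTION_PAUSE` are assumptions made for this sketch; the real `Player` type will differ.

```typescript
// Hypothetical sketch: the snapshot shape and the sentinel value are assumed.
const SPEED_QUESTION_PAUSE = 0; // assumed value, for illustration only

interface PlayerSnapshot {
  gameSpeed: number;
  money: number;
  stats: Record<string, number>;
  pendingQuestionIds: string[];
}

/** Returns a human-readable violation per broken invariant; an empty array means the tick is healthy. */
function checkInvariants(p: PlayerSnapshot): string[] {
  const violations: string[] = [];

  // Stuck state: paused for a question, but nothing is waiting for input.
  if (p.gameSpeed === SPEED_QUESTION_PAUSE && p.pendingQuestionIds.length === 0) {
    violations.push("stuck: paused for a question but none is pending");
  }

  // Stat values must stay within 0 to 100.
  for (const name of Object.keys(p.stats)) {
    const value = p.stats[name];
    if (!Number.isFinite(value) || value < 0 || value > 100) {
      violations.push(`stat out of bounds: ${name}=${value}`);
    }
  }

  // Simplification: this sketch flags any negative balance and omits the documented-debt exception.
  if (p.money < 0) {
    violations.push(`negative money: ${p.money}`);
  }

  return violations;
}
```

Collecting violations instead of failing on the first one keeps the trace useful for Phase 2, where one broken tick may hide several independent bugs.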

**Risks & Mitigations:**
- *Risk:* In-process server initialization is hard. *Mitigation:* If WebSocket layer is too complex to fake, simulate at the `PlayerSession` level directly — game loop is what matters, not the wire format.
- *Risk:* Simulator passes but real game still crashes. *Mitigation:* Phase 3 iOS UI tests close that gap.

### Phase 2 — Bug Sweep Using Simulator Output (3–5 sessions)

**Goal:** Fix every bug the simulator catches, plus the known architectural issues already on the backlog.

**Tasks:**
1. Run simulator with verbose trace, collect all failures.
2. For each failure: spawn sub-agent with the failing scenario, the failing assertion, and the relevant source files. Sub-agent writes a focused failing test, fixes the bug, verifies the simulator now passes, commits.
3. Address known issue **C1**: function-based question events cannot deliver positive stat rewards (`createAnswerOption` only auto-deducts costs). Add reward application path. Test with a function-based question event that rewards happiness.
4. Audit `server/src/events/` for any event that mutates state outside the documented helpers; route it through `modifyStat()`.
5. Re-run full simulator after each fix; require it stays green.
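
One way to picture the reward path for issue C1 is an answer-effects step that sends both costs and rewards through a single clamping helper. The sketch below is an assumption-heavy illustration: `modifyStatSketch` stands in for the real `modifyStat()`, whose signature is not shown in this document, and `AnswerEffects` and `applyAnswerEffects` are hypothetical names.

```typescript
// Hypothetical sketch: stand-ins for the real modifyStat() and createAnswerOption paths.
type Stats = Record<string, number>;

/** Applies a delta to one stat and clamps the result to the 0 to 100 range. */
function modifyStatSketch(stats: Stats, name: string, delta: number): void {
  const current = stats[name] ?? 0;
  stats[name] = Math.min(100, Math.max(0, current + delta));
}

/** Effects attached to an answer option: costs are subtracted, rewards are added. */
interface AnswerEffects {
  costs?: Stats;
  rewards?: Stats;
}

/** Applies costs first, then rewards, always through the clamping helper. */
function applyAnswerEffects(stats: Stats, effects: AnswerEffects): void {
  const costs = effects.costs ?? {};
  for (const name of Object.keys(costs)) {
    modifyStatSketch(stats, name, -costs[name]);
  }
  const rewards = effects.rewards ?? {};
  for (const name of Object.keys(rewards)) {
    modifyStatSketch(stats, name, rewards[name]);
  }
}
```

Because every mutation goes through one helper, the stat-bounds invariant from Phase 1 holds by construction, and the C1 regression test only needs to assert that a rewarding answer raises the stat.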

**Exit Criteria:**
- 100 random-seed simulator runs all green
- Server test count stable or higher than start of phase (no skipped/disabled tests)
- Issue C1 closed with a regression test
- All commits include before/after evidence in commit message

### Phase 3 — iOS Coverage Sprint (3–5 sessions)

**Goal:** Build automated confidence that the iOS app launches, navigates, and renders without regressions.

**Tasks:**
1. Add `swift-snapshot-testing` (pointfreeco) as Swift Package dependency in `lichunWebsocketTests` target.
2. Create snapshot tests for primary screens (in `lichunWebsocketTests/Snapshots/`):
   - HomeView (default state, full state, dead state)
   - ActivitiesView
   - DatingSwipeView
   - MessagingListView + ConversationView
   - CharacterProfileView
   - EventModalView (message variant, question variant)
   - OnboardingFlow (all steps)
   - DeathScreen
3. Generate baseline snapshots, commit them.
4. Extend existing `OnboardingUITests.swift`: full flow → first game hour → backgrounding → relaunch → continuation.
5. Extend `PurchaseFlowUITests.swift` for sandbox IAP path (if testable without real StoreKit) or document why it's manual.
6. Add `lichunWebsocketUITests/LongSessionUITest.swift` — drive game to in-game age 25 via UI, no crashes.
7. Run full iOS test suite via XcodeBuildMCP after each change.
8. Resolve any iOS issues found during UI tests via sub-agents (one issue per agent).

**Exit Criteria:**
- All snapshot tests have committed baselines
- Full iOS test suite green via `mcp__xcode__RunAllTests`
- Long session UI test passes consistently
- Documented list of any tests intentionally disabled (with reason)

**Risks & Mitigations:**
- *Risk:* Snapshot tests are flaky due to fonts/animations/timestamps. *Mitigation:* Use record-mode controls and inject deterministic clocks; for animated views, capture final state only.
- *Risk:* WebSocket-dependent screens are hard to snapshot. *Mitigation:* Use a local mock WebSocket server in tests, or stub `WebSocketService` with a test double that exposes the same `@Published` properties.

### Phase 4 — Soak & Polish (2–3 sessions)

**Goal:** Find the long-tail bugs that only show up in unusual playthroughs.

**Tasks:**
1. Run simulator with 1000 unique random seeds. Background job. Collect all failures.
2. Triage top 10 failures by frequency × severity.
3. Fix top 10. Re-run. Repeat until 1000-seed run is green.
4. Run iOS UI long-session test on three random seeds.
5. Generate a coverage report (`vitest run --coverage`) for the backend; identify any service module under 60% coverage and add targeted tests for the riskiest gaps.
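
The frequency × severity triage in task 2 could be computed along these lines. This is a sketch under stated assumptions: the `Failure` record, the idea of a failure `signature` (for example, the violated invariant plus the top stack frame), and the numeric severity scale are all hypothetical.

```typescript
// Hypothetical sketch: the record shapes and severity scale are assumptions.
interface Failure {
  seed: number;
  signature: string; // e.g. violated invariant plus top stack frame
  severity: number;  // assumed scale: 1 = cosmetic, 2 = stuck state, 3 = crash
}

interface TriageRow {
  signature: string;
  count: number;
  severity: number;
  score: number;
  seeds: number[];
}

/** Groups failures by signature and ranks the groups by count × worst severity. */
function triage(failures: Failure[], topN: number = 10): TriageRow[] {
  const bySignature = new Map<string, TriageRow>();
  for (const f of failures) {
    const row: TriageRow =
      bySignature.get(f.signature) ??
      { signature: f.signature, count: 0, severity: 0, score: 0, seeds: [] };
    row.count += 1;
    row.severity = Math.max(row.severity, f.severity);
    row.seeds.push(f.seed);
    bySignature.set(f.signature, row);
  }

  const rows = Array.from(bySignature.values());
  for (const row of rows) {
    row.score = row.count * row.severity;
  }
  // Highest score first; ties are broken alphabetically so the report is stable.
  rows.sort((a, b) => b.score - a.score || a.signature.localeCompare(b.signature));
  return rows.slice(0, topN);
}
```

Keeping the seeds on each row means every entry in the top-10 list comes with ready-made reproduction cases for the sub-agent that picks it up.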

**Exit Criteria:**
- 1000-seed simulator run: zero exceptions, zero stuck states
- iOS long-session test: green on three seeds
- Backend coverage report committed; no critical service module under 60%

### Phase 5 — TestFlight Submission (1–2 sessions)

**Goal:** Ship a build to TestFlight that you can hand to ten friends.

**Tasks:**
1. Verify crash reporting is wired (Sentry, Crashlytics, or App Store Connect's built-in crash reports). If absent, add the lowest-friction option (Apple's built-in reporting).
2. Bump version in `ios/lichunWebsocket.xcodeproj` (build number always, version if appropriate).
3. Verify iOS archive builds via XcodeBuildMCP (`BuildProject` with archive configuration).
4. Upload to TestFlight via `xcrun altool` or Xcode Cloud, depending on existing setup.
5. Write `docs/MVP_TESTFLIGHT_NOTES.md`:
   - What's in this build
   - Known limitations (Android coming later, etc.)
   - What to look for as a tester (specific flows: onboarding, first job, first relationship, first death)
   - How to report issues (email, GitHub Issues, etc.)

**Exit Criteria:**
- Signed build uploaded to TestFlight
- Internal testers list updated (you, primarily; friends invited manually)
- Release notes doc committed
- Final commit on main branch

### Phase 6 — Feedback Loop (post-MVP, ongoing)

Not part of MVP completion. Documented here for continuity:

- Friends play, report issues
- Each issue becomes a simulator scenario or UI test
- Each test fails first, gets fixed, gets committed
- Iteration continues until "fun" is a yes

---

## 5. Best-Game-Practices Checklist

These are the gaming-specific quality bars that will be enforced as part of phase exits:

- **Determinism:** Game state is reproducible from a seed. Save → load → save is identity.
- **No silent failures:** Every error path either recovers gracefully or surfaces visibly.
- **No stuck states:** Game speed never enters paused mode without a corresponding question/event needing input.
- **Backpressure on AI calls:** Rate-limited, with fallbacks. Already in place.
- **Save frequency:** Auto-save weekly + on major events. Already in place.
- **Crash recovery:** On reconnect, player resumes within 1 game minute of last save.
- **Memory bounds:** Conversations capped (already 100 messages), activity records capped (already 50 per person), no unbounded growth.
- **First-run quality:** Onboarding completes without user friction in under 90 seconds.
- **Accessibility floor:** iOS dynamic type respected, color contrast ≥ AA on primary text.

These are checked during Phase 3 (iOS) and Phase 4 (soak).
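
The "save → load → save is identity" bar can be checked mechanically by comparing two serialized forms. The sketch below assumes a JSON save format and a player whose `events` field is a `Set`; `LivePlayer`, `serialize`, and `deserialize` are hypothetical stand-ins for the real save and load paths.

```typescript
// Hypothetical sketch: stand-ins for the real save/load code, assuming a JSON save format.
interface SavedPlayer {
  name: string;
  age: number;
  events: string[];
}

interface LivePlayer {
  name: string;
  age: number;
  events: Set<string>;
}

/** Serializes with a canonical (sorted) event order so equal states give equal strings. */
function serialize(p: LivePlayer): string {
  const saved: SavedPlayer = { name: p.name, age: p.age, events: Array.from(p.events).sort() };
  return JSON.stringify(saved);
}

/** Rebuilds the live shape, turning the saved array back into a Set. */
function deserialize(json: string): LivePlayer {
  const raw = JSON.parse(json) as SavedPlayer;
  return { name: raw.name, age: raw.age, events: new Set(raw.events) };
}

/** True when saving, loading, and saving again yields a byte-identical save. */
function roundTripIsIdentity(p: LivePlayer): boolean {
  const first = serialize(p);
  const second = serialize(deserialize(first));
  return first === second;
}
```

Comparing serialized strings avoids writing a deep-equality routine, but it only works if serialization is canonical. A `Set` written out in insertion order, for instance, could produce different strings for equivalent states.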

---

## 6. Out of Scope for MVP

Explicitly deferred:

- Android app updates (v1.1)
- Python backend changes (legacy reference only)
- New event content (200+ events is enough)
- AI conversation system refactoring (already good)
- App Store submission (TestFlight is the bar)
- Marketing site, landing page, store listing copy
- Localization
- iPad-specific layouts
- Watch / TV / Vision Pro

If any of these come up during the work, they get filed as issues, not done.

---

## 7. Risks

| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| Simulator can't fully replicate WebSocket layer | Med | Med | Use PlayerSession-direct simulation (the approach chosen in ADR-001); estimated to still catch roughly 90% of bugs |
| iOS snapshot tests are flaky | Med | Low | Deterministic clocks, final-state captures only |
| StoreKit IAP can't be tested in CI | High | Low | Document as manual test; smoke test in TestFlight build |
| Sub-agent makes silent breaking change | Med | High | Verify-before-complete discipline; primary always re-runs full suite |
| 1000-seed soak finds an unfixable architectural bug | Low | High | Triage, defer to v1.1 if not blocking happy path |
| TestFlight upload fails due to provisioning | Med | Med | Verify signing certs early in Phase 5, not at the end |

---

## 8. Definition of Done

The plan is done when, on a fresh clone:

```bash
cd server && npm install && npm run build && npm test && npm run e2e
# All green

# In iOS:
# Open Xcode, run all tests via XcodeBuildMCP
# All green

# A signed build is in TestFlight
# docs/MVP_TESTFLIGHT_NOTES.md exists
```

And Craig has the TestFlight invite link to share.

---

## 9. Approval

**Approved by Craig:** 2026-04-13 — chose Option A, instructed to write design, commit, and execute through MVP.

**Next step:** Invoke `superpowers:writing-plans` skill to convert this design into an executable, sub-agent-friendly implementation plan.

---

## 10. Decisions (ADRs)

### ADR-001: Simulator entry point = `PlayerSession` (not WebSocketServer)

**Context:** Phase 1 needs a headless harness that can run full lifetimes in milliseconds. Two options: spin up the full `WebSocketServer` and connect a fake client, or instantiate `PlayerSession` directly with a stub WebSocket.

**Decision:** Instantiate `PlayerSession` directly.

**Rationale:**
- `PlayerSession`'s only `ws` usage is `ws.readyState === WebSocket.OPEN` and `ws.send(data)` — trivially mockable with `{ readyState: 1, send: collector }`.
- An existing test pattern already does this: `server/tests/game/playerSession.regression.test.ts` (`createSession` helper at lines 105-112).
- Skipping the WebSocket layer removes ~50ms of boot per test run and eliminates dispatcher/handler coupling we don't need.
- We drive ticks by calling `(session as any).advanceMinute()` in a loop, bypassing `setInterval` entirely. This makes simulation deterministic and fast.
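
The stub socket and tick loop described above might look like the following sketch. Only the `{ readyState: 1, send: collector }` shape and the `advanceMinute()` loop come from this ADR; `FakeSession` is a hypothetical stand-in for `PlayerSession`, and the hourly `hourTick` message is invented purely so the example has something to collect.

```typescript
// Hypothetical sketch: FakeSession stands in for PlayerSession; the message format is invented.
const WS_OPEN = 1; // numeric value of WebSocket.OPEN

/** The only socket surface the session is assumed to touch. */
interface WsLike {
  readyState: number;
  send(data: string): void;
}

/** Builds a stub socket that records every payload instead of sending it. */
function createCollectorSocket(): { ws: WsLike; sent: string[] } {
  const sent: string[] = [];
  const ws: WsLike = {
    readyState: WS_OPEN,
    send: (data: string) => {
      sent.push(data);
    },
  };
  return { ws, sent };
}

/** Stand-in session: advances one game minute per call and emits a message each game hour. */
class FakeSession {
  private minute = 0;
  private readonly ws: WsLike;

  constructor(ws: WsLike) {
    this.ws = ws;
  }

  advanceMinute(): void {
    this.minute += 1;
    if (this.minute % 60 === 0 && this.ws.readyState === WS_OPEN) {
      this.ws.send(JSON.stringify({ type: "hourTick", hour: this.minute / 60 }));
    }
  }
}

/** Drives the session synchronously, with no timers involved. */
function runMinutes(session: { advanceMinute(): void }, minutes: number): void {
  for (let i = 0; i < minutes; i++) {
    session.advanceMinute();
  }
}
```

Because the loop is synchronous and timer-free, a simulated lifetime takes as long as the game logic itself, and the `sent` array doubles as raw material for the structured trace.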

**Required mocks (minimum viable):**
- `src/database/players.js` → `savePlayer`, `saveConversation` → noop
- `src/database/eventInstances.js` → in-memory store (so v2 event runtime can create/answer/resolve events)
- `src/services/notifications/notificationManager.js` → `notifyRealtimeEvent`, `queueRealtimeNotification`, `clearThrottle` → noop
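
For the in-memory event-instance store, something as small as the sketch below may be enough. The `create`, `answer`, and `resolve` lifecycle is taken from the description above, but the method names, record fields, and status values are assumptions; the real `src/database/eventInstances.js` exports are not shown in this document and would need to be matched.

```typescript
// Hypothetical sketch: method names and record shape are assumed, not the real module's API.
type EventStatus = "open" | "answered" | "resolved";

interface EventInstance {
  id: string;
  playerId: string;
  eventKey: string;
  status: EventStatus;
  answer?: string;
}

class InMemoryEventInstanceStore {
  private instances = new Map<string, EventInstance>();
  private nextId = 1;

  /** Creates an open instance and returns a copy, so callers cannot mutate stored state. */
  create(playerId: string, eventKey: string): EventInstance {
    const instance: EventInstance = { id: String(this.nextId++), playerId, eventKey, status: "open" };
    this.instances.set(instance.id, instance);
    return { ...instance };
  }

  /** Records an answer; only open instances can be answered. */
  answer(id: string, answer: string): EventInstance {
    const instance = this.mustGet(id);
    if (instance.status !== "open") throw new Error(`event ${id} is not open`);
    instance.status = "answered";
    instance.answer = answer;
    return { ...instance };
  }

  /** Marks an instance as resolved. */
  resolve(id: string): EventInstance {
    const instance = this.mustGet(id);
    instance.status = "resolved";
    return { ...instance };
  }

  /** Lists instances still waiting for input, which the stuck-state invariant can query. */
  openFor(playerId: string): EventInstance[] {
    return Array.from(this.instances.values())
      .filter((i) => i.playerId === playerId && i.status === "open")
      .map((i) => ({ ...i }));
  }

  private mustGet(id: string): EventInstance {
    const instance = this.instances.get(id);
    if (!instance) throw new Error(`unknown event instance ${id}`);
    return instance;
  }
}
```

A fresh store per simulated lifetime keeps seeds isolated from one another, which matters once the 1000-seed soak runs in one process.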

**NOT mocked:**
- `stats_manager` (age progression, peak energy, etc.)
- `health_manager` (death checks, death chance)
- `intradayActivity` (daily plan, intraday location)
- `events/v2/runtime` (real event engine)
- Event definitions themselves
- Retention integration

The whole point of the simulator is to run real game logic. Mocking it would defeat the purpose.

**Consequence:** Bugs the simulator catches are bugs in real game logic, not mock artifacts. Phase 2 fixes address actual issues.