A legacy Flutter application for tracking real estate wealth. Firebase-coupled, untyped, with a silent data model bug that had been corrupting cash flow calculations since the first version shipped. The founding team knew the numbers felt wrong. They could not prove it. Three Phoenix Runtime agents ran a full archaeological survey, rebuilt the stack from the ground up in React 18 and Dexie.js, produced 51 passing tests, and issued A-06 certification. The app is live. The numbers are right.
Pipeline via Phoenix Runtime — 7-Agent Modernization™
The application was a Flutter-based real estate wealth tracker built by a three-person founding team. It tracked investment properties, mortgages, rent income, and a portfolio overview across CAD and USD. It worked well enough to use. It did not work well enough to trust.
The core problem was structural. The original stack had no TypeScript, no unit tests, and no separation between data access and display logic. Firebase provided authentication and storage — a coupling that meant no offline access, no data portability, and no ability to run the application without an active cloud connection. The founding team had chosen Firebase for speed. The debt compounded quietly.
Additional property expenses carried a frequencyFactor field intended to normalise recurring payments to monthly equivalents. One-time expenses stored frequencyFactor: 0, and the normalisation used this value as a divisor. Division by zero returned NaN, which propagated silently through every cash flow calculation downstream. The cash flow total shown in the portfolio overview was therefore wrong. The team knew the numbers felt off. The bug raised no error, logged no warning, and had no test to catch it.
Impact: portfolio cash flow total corrupted silently, with no stack trace.
The original codebase carried no TypeScript types. Property data, borrowing records, and expense entries were passed as untyped maps. The archaeological survey found that the store targeted by cascade deletes, which the new system exposes as the typed db.additionalExpenses, had been written as the string 'additionalProperties' in the original. It was a category error invisible without types. The absence of a type surface meant every refactoring decision had to be made against runtime behaviour rather than schema guarantees.
Impact: cascade delete silently targeting the wrong store, a data integrity risk.
The original application required Firebase authentication to function. All user data lived in Firestore. There was no export mechanism, no local fallback, and no way to use the application without an active cloud session. For a personal finance tool handling investment property data, this was a structural misfit: the data is sensitive, the use is personal, and the dependency on an external service introduced both a privacy surface and a single point of failure the user could not control.
Impact: no offline access, no data portability, cloud dependency on sensitive personal data.
“The numbers feel wrong but I can’t prove it.”
Founding team, pre-modernisation
Phoenix Runtime deploys a sequential seven-agent pipeline. Each agent has a defined scope and produces structured outputs that downstream agents consume. For this engagement, three agents ran: A-04 (Archaeological Survey), A-05 (Engineering, two passes), and A-06 (Validator/Certifier). The earlier agents, A-00 through A-03, were skipped via an episode declaration that recorded the stack pivot decision as first-class pipeline state.
A-04 catalogued the full original codebase: data model, Firebase schema, screen inventory, business logic distribution, and known bugs. The survey produced the definitive finding that the frequencyFactor division-by-zero was the root cause of cash flow corruption. It identified the typed store name mismatch. It documented the LTV calculation producing NaN when market value was zero. It confirmed that no unit test surface existed anywhere in the project. The survey output became the A-05 work brief.
Output: full bug inventory, data model map, zero test surface confirmed.
A-05 ran twice. Pass 4 extracted normaliseToMonthly(amount, frequency) as the single source of truth for payment frequency normalisation, replacing the division-by-zero with an explicit 0 return for one-time expenses. It extracted calculateLtv(totalBorrowed, marketValue) as a pure function with a zero-division guard. Both functions were placed in dedicated modules, testable in isolation. Pass 5 built the test suite: 51 tests across 4 files using Vitest and fake-indexeddb, covering normalisation, LTV, export/import round-trips, and overview calculations. All 51 passed before A-06 ran.
Output: normaliseToMonthly(), calculateLtv(), 51 tests passing, 0 failures.
A-06 ran as an independent agent with no prior context from A-04 or A-05, a deliberate constraint that prevents the certifier from inheriting the builder’s assumptions. It reviewed the full codebase, ran the test suite, and issued five findings: the exchange rate direction convention needed documentation; stub data used frequencyFactor: 1 for annual expenses (should be 12); the useProperties() hook returned oldest-first; the vite.config.ts import was not from vitest/config; email validation was absent on the profile screen. All five were non-blocking. All five were fixed before deployment. Certification issued.
Output: A-06 CERTIFIED, 5 non-blocking findings, all resolved.
The rebuilt application runs on React 18, TypeScript, Vite 5, Tailwind CSS, Zustand, and Dexie.js. All data lives in IndexedDB on the user’s device. There is no authentication layer, no cloud dependency, and no data leaving the browser. The founding team’s original instinct, that a personal finance tool should be personal, is now architectural fact.
The local-first decision shaped every subsequent choice. Dexie.js provides a typed IndexedDB wrapper with reactive hooks. Data is exported as a single JSON file and imported with schema validation. The exchange rate between CAD and USD is user-controlled and stored locally. Address lookup uses Nominatim/OpenStreetMap with a 1000ms debounce and no API key. The application works offline, on a plane, without a Cloudflare edge, without a Firebase project, and without an account.
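The import-with-validation step can be sketched as follows. This is an illustration only: the real export schema is not shown here, so the ExportFile envelope, the schemaVersion field, and the properties array are hypothetical. Only the additionalExpenses store name comes from the rebuilt system. The point of the sketch is that the file is parsed and checked before anything is written to IndexedDB.

```typescript
// Hypothetical export envelope; the actual schema is not documented in this write-up.
interface ExportFile {
  schemaVersion: number;
  properties: unknown[];
  additionalExpenses: unknown[];
}

// Narrows an unknown value to a plain object (not null, not an array).
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Parses a JSON export and rejects it before any data reaches storage.
function parseExport(json: string): ExportFile {
  const data: unknown = JSON.parse(json);
  if (!isRecord(data)) {
    throw new Error("Export must be a JSON object");
  }
  const { schemaVersion, properties, additionalExpenses } = data;
  if (typeof schemaVersion !== "number") {
    throw new Error("Export is missing a numeric schemaVersion");
  }
  if (!Array.isArray(properties)) {
    throw new Error("Export is missing the properties array");
  }
  if (!Array.isArray(additionalExpenses)) {
    throw new Error("Export is missing the additionalExpenses array");
  }
  return { schemaVersion, properties, additionalExpenses };
}
```

A production validator would also check each record's fields; the sketch stops at the envelope to stay short.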
-- ep-001.sil — Stack Pivot Episode Declaration
EPISODE "stack-pivot"
DATE "2026-04"
AFFECTS A-04, A-05, A-06
SKIPS A-00, A-01, A-02, A-03
-- Original: Flutter + Firebase + Dart
-- Rebuilt: React 18 + Dexie.js + TypeScript + Vite 5
-- Reason: Local-first architecture. No auth. No cloud coupling.
-- The data is personal. The stack should reflect that.
NOTE "A-04 reads the original Flutter codebase as archaeological source."
NOTE "A-05 builds against the new stack from scratch."
NOTE "A-06 certifies the new build independently."
-- .phoenix/state.json (abridged)
{
  "project": "wealth2track",
  "agents": [
    { "id": "a-04", "name": "Archaeologist",
      "status": "complete", "confidence": "high",
      "outputCount": 3 },
    { "id": "a-05", "name": "Engineer",
      "status": "complete", "confidence": "high",
      "outputCount": 5 },
    { "id": "a-06", "name": "Validator",
      "status": "certified", "confidence": "high",
      "findings": 5, "blocking": 0 }
  ]
}
Runtime: github.com/semanticintent/phoenix-runtime · Pipeline: semanticintent.ai/pipeline · Live: w2t.semanticintent.ai
The frequencyFactor division-by-zero produced no error, no warning, and no stack trace. It produced wrong numbers — plausible-looking, slightly-off numbers that a user would notice only by feel. The founding team noticed. They had no mechanism to prove it. A-04’s archaeological mandate — read everything, document everything, trust nothing — is specifically designed to surface this class of bug. You cannot fix what you cannot locate. You cannot locate what you cannot name.
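Why this class of bug stays silent can be shown in a few lines. The sketch below is written in TypeScript rather than the original Dart, and the LegacyExpense shape, the sample amounts, and the assertFinite helper are all hypothetical. It relies only on standard IEEE 754 behaviour: dividing by zero does not throw, it yields a non-finite value (Infinity for a non-zero numerator, NaN for 0 / 0), and that value then absorbs any sum it is added to.

```typescript
// Hypothetical record shape; only the frequencyFactor field is named in this write-up.
interface LegacyExpense {
  amount: number;
  frequencyFactor: number; // used as a divisor: 12 for annual, 0 for one-time
}

// Unguarded division, mirroring the legacy behaviour described above.
function legacyMonthly(e: LegacyExpense): number {
  return e.amount / e.frequencyFactor;
}

const legacyExpenses: LegacyExpense[] = [
  { amount: 1200, frequencyFactor: 12 }, // annual: 100 per month
  { amount: 500, frequencyFactor: 0 },   // one-time: 500 / 0 is Infinity, no exception
];

// The non-finite term absorbs the running total without raising anything.
const legacyTotal = legacyExpenses.reduce((sum, e) => sum + legacyMonthly(e), 0);

// One way to give the failure a name: refuse to pass non-finite values along.
function assertFinite(label: string, value: number): number {
  if (!isFinite(value)) {
    throw new Error(`${label} is not finite: ${value}`);
  }
  return value;
}
```

Wrapping the total in a check such as assertFinite("monthly expenses", legacyTotal) turns the silent corruption into an error with a label and a stack trace.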
A-05 produced normalised functions, a corrected data model, a full React rebuild, and 51 tests. The tests are the most durable output. The functions will be refactored. The data model will evolve. The tests will catch both. A codebase with zero tests is a codebase where every change is a bet. A-05’s second pass exists specifically to convert the bet into a guarantee. The 51 tests are not a metric. They are the mechanism by which future changes can be made without fear.
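The two extracted functions might look roughly like this. It is a minimal sketch, not the shipped code: the signatures normaliseToMonthly(amount, frequency) and calculateLtv(totalBorrowed, marketValue) come from the A-05 output, while the Frequency labels, the MONTHS_PER_PERIOD table, and the choice to return null when LTV is undefined are assumptions made for illustration.

```typescript
// Frequency labels are assumed; the write-up names only the function signatures.
type Frequency = "monthly" | "annual" | "one-time";

// Number of months covered by one payment of each recurring frequency.
const MONTHS_PER_PERIOD: Record<Exclude<Frequency, "one-time">, number> = {
  monthly: 1,
  annual: 12,
};

// Converts a payment to its monthly equivalent.
function normaliseToMonthly(amount: number, frequency: Frequency): number {
  // One-time expenses return an explicit 0 instead of dividing by zero.
  if (frequency === "one-time") return 0;
  return amount / MONTHS_PER_PERIOD[frequency];
}

// Loan-to-value ratio as a fraction. Returning null for a non-positive
// market value is an assumption of this sketch.
function calculateLtv(totalBorrowed: number, marketValue: number): number | null {
  if (marketValue <= 0) return null; // zero-division guard
  return totalBorrowed / marketValue;
}
```

Because both functions are pure, a test only needs inputs and expected outputs: normaliseToMonthly(1200, "annual") evaluates to 100, a one-time expense contributes 0, and calculateLtv(300000, 0) yields null rather than a non-finite number.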
Firebase was chosen for speed. The debt was privacy, portability, and offline access. A personal finance application — tracking investment properties, mortgage balances, and net worth — handles data that users should own. Local-first with Dexie.js and IndexedDB means the data lives on the user’s device, exports to a JSON file they control, and requires no account to access. The architecture is not a technical preference. It is the correct answer to the question: whose data is this?
A-06’s five findings were not failures of A-05 — they were findings that A-05 could not structurally produce. The stub annual expense error required looking at the data model with fresh eyes. The email validation gap required asking “what is missing” rather than “what is broken.” The certifier’s independence constraint is not a formality. It is the mechanism that makes certification mean something. A builder who certifies their own work is auditing their own assumptions. That is not an audit.
Phoenix Runtime runs the same pipeline on any legacy codebase. Three agents. One engagement. A-06 certification before deployment.