Building FinTrack — Gautam Pai

16 phases, 147 API routes, a PWA, a security audit, and a blueprint refactor later — here's everything I learned building a self-hosted personal finance app I actually use every day.

What started as "I want to stop using spreadsheets for expenses" became a six-month engineering project covering budgeting, investments, income tracking, credit cards, debt management, a FIRE calculator, a portfolio stress tester, a Telegram bot, and a Progressive Web App — all in a single self-hosted Flask app.

This post covers the architecture decisions, the hardest engineering problems, the security audit findings, and the production bugs that taught me the most. Everything described is running live at fintrack.gautam-pai.com.

What It Is Now

FinTrack is a complete personal finance operating system. Here's the current feature surface:

Expense Tracker: 14 categories, split expenses (Splitwise-style), trip ledger, bubble calendar with drill-down, merchant analytics, AI insights, monthly review

Income Tracker: 8 sources, active vs passive split, annual view, tax estimator (US, India, HK, UAE), cashflow charts

Investment Tracker: 9 asset types, XIRR, live prices (Finnhub → Yahoo → CoinGecko), broker sync (Zerodha + Groww), research tab with AI analysis

Credit Cards: Real card UI, utilisation bar, bill history, calendar view showing statement and due dates, card-linked expenses

FIRE Calculator: Auto-fills from live data, wealth accumulation chart with crossover marker, return scenarios table, what-if sliders

Portfolio Stress Test: 3 simultaneous sliders (crash %, income loss, expense surge), quick presets, live-updating FIRE delay and cash runway

Spending Forecast: Weighted 6-month prediction, confidence band, per-category trend arrows, half-circle gauge

Financial Health Score: 0–100 composite across 5 pillars: savings rate, debt health, investment health, budget control, emergency fund

The stack: Python 3.10+ / Flask, Supabase (PostgreSQL), vanilla HTML/CSS/JS, Chart.js, Railway hosting, and a Service Worker making the whole thing installable as a PWA.

How It Grew: 16 Phases

Rather than a big-bang build, FinTrack grew incrementally. Each phase was a working, deployed app — nothing sat half-finished. Here's the evolution:

Phase 0: Initial build. Flask + local JSON storage. Expense tracker, investment tracker with XIRR, dark theme.

Phase 1: Expense overhaul. Recurring expenses, CSV export, insights bar, bubble calendar, budget mode.

Phase 1.5: Supabase migration. Moved from local JSON to PostgreSQL. Service key, .env, gitignore. 4 tables.

Phase 2–2.5: Investment overhaul + Net Worth Dashboard. Transactions, goals, analytics, rebalance tab, 8 AI insights.

Phase 3–4: Currency + auth. 8-currency support with live rates, per-entry currency, invite-only Flask sessions, bcrypt.

Phase 5–6: Live prices + UI redesign. 3-provider fallback chain, two-row grouped nav with scroll-collapse, full data export (ZIP of 10 CSVs).

Phase 7: PWA. Web App Manifest, Service Worker, 8 icon sizes, install banner, app shortcuts, notch support.

Phase 8: Calculators + milestones. FIRE calculator, EMI vs Invest calculator, goal milestones with progress bar ticks, credit cards calendar view.

Phase 9: Split expenses. Splitwise-style modal, saved contacts, equal/% split modes, payer selector, Splits Tracker with settle flow.

Phase 10: Monthly review + merchant analytics. AI-generated end-of-month report, top merchants with two-stage normalisation.

Phase 11: Stress test + lifetime stats. Portfolio stress tester with live-updating sliders, lifetime spend stats page.

Phase 12–13: Income annual view + tax estimator. Full year table, active vs passive split, client-side tax calculator for US/India/HK/UAE.

Phase 14: Calendar drill-down + dashboard layout. Click any calendar bubble to see all transactions that day with per-day CSV export.

Phase 15: Trip ledger. Group expenses under a trip, trip-scoped balances, archive/restore, copy summary for sharing.

Phase 16: Blueprint refactor + test suite. 5,800-line app.py split into 8 blueprints, global error handlers, pytest suite, dependency pinning.

The Architecture: Thin Helpers, Fat Routes

All database access goes through five shared Supabase helper functions in common.py: db_rows, db_insert, db_update, db_delete, and db_upsert. Every route in all 8 blueprints calls these, never the Supabase client directly, so when I needed to change a query pattern, I changed it once.

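def db_rows(table, filters=None, order=None, cols="*"):
    """Select rows from a table with optional equality filters and ordering."""
    # sb is the module-level Supabase client
    q = sb.table(table).select(cols)
    for key, val in (filters or {}).items():
        q = q.eq(key, val)  # every filter is an equality match
    if order:
        # A leading "-" on the column name means descending order
        desc = order.startswith("-")
        q = q.order(order.lstrip("-"), desc=desc)
    return q.execute().data or []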

The Phase 16 refactor split a 5,800-line monolithic app.py (147 routes) into 8 blueprint files grouped by feature area — expenses, investments, income_networth, calculators, stocks, brokers, misc_admin, auth. The key constraint: zero behavior changes. I verified with a route-table diff (all 147 routes identical before/after) and a Flask test-client smoke pass across every blueprint.

Lesson: Don't wait until 5,800 lines to refactor. I knew by Phase 6 the file was getting unwieldy. The refactor itself took three days and had two bugs — a blueprint registration ordering issue and a circular import from a decorator that referenced the app object before it was ready. Both were caught by the smoke tests.

Multi-Currency: Store Original, Convert at Display

Every entry (expense, income, or investment) is stored with its original currency and amount, with no conversion at write time. Conversion to the base_currency from user_settings happens at read time, using exchange rates held in a 1-hour in-process cache and fetched through a two-provider fallback chain.

The Telegram bot preserves this: "spent 500 on groceries" saves INR 500 unchanged. The LLM prompt explicitly says: "NEVER convert amounts. Save exactly what the user said." This single rule prevented an entire class of data corruption bugs.
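
To make the read-time conversion concrete, here is a minimal sketch of how such a lookup could work. It is not FinTrack's actual code: get_rate, to_base, the providers list, and the entry field names are all hypothetical, and each provider is assumed to be a callable that returns a rate or raises.

```python
import time

RATE_TTL_SECONDS = 3600  # 1-hour cache, matching the description above
_rate_cache = {}  # (from_ccy, to_ccy) -> {"rate": float, "ts": float}


def get_rate(from_ccy, to_ccy, providers, now=None):
    """Return an exchange rate, trying each provider in order on a cache miss."""
    if from_ccy == to_ccy:
        return 1.0
    now = time.time() if now is None else now
    key = (from_ccy, to_ccy)
    cached = _rate_cache.get(key)
    if cached and now - cached["ts"] < RATE_TTL_SECONDS:
        return cached["rate"]
    for fetch in providers:  # hypothetical callables: fetch(from_ccy, to_ccy) -> float
        try:
            rate = fetch(from_ccy, to_ccy)
        except Exception:
            continue  # this provider failed, fall through to the next one
        _rate_cache[key] = {"rate": rate, "ts": now}
        return rate
    if cached:
        return cached["rate"]  # assumption: a stale rate beats no rate at all
    raise RuntimeError(f"no rate available for {from_ccy}->{to_ccy}")


def to_base(entry, base_ccy, providers):
    """Convert a stored entry for display without modifying the stored values."""
    rate = get_rate(entry["currency"], base_ccy, providers)
    return round(entry["amount"] * rate, 2)
```

Because to_base only reads the entry, changing base_currency later simply changes what is displayed; the stored amounts never need a migration.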

The Spending Forecast: Weighted Moving Average

A naive average of past months gives too much weight to a bad month six months ago. I use a weighted moving average where recent behaviour matters more:

# spend_values must be ordered most recent first so the weights line up:
# most recent month = 3×, previous = 2×, older 1× each
weights = [3, 2, 1, 1, 1, 1]  # up to 6 months
forecast = (
    sum(s * w for s, w in zip(spend_values, weights))
    / sum(weights[:len(spend_values)])
)
confidence = (
    "High"   if len(months) >= 4 else
    "Medium" if len(months) >= 2 else
    "Low"
)

The confidence band (±1 standard deviation across historical totals) gives an honest range. Per-category trend arrows (▲ rising / ▼ falling / — stable) make the forecast immediately actionable rather than just a number.
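
Putting the weighted average and the band together, a self-contained version might look like the sketch below. The function name forecast_spend and the returned keys are hypothetical, and the use of population standard deviation (statistics.pstdev) is an assumption, since the choice between population and sample deviation isn't stated above.

```python
from statistics import pstdev

WEIGHTS = [3, 2, 1, 1, 1, 1]  # most recent month first, up to 6 months


def forecast_spend(monthly_totals):
    """Forecast next month's spend from totals ordered most recent first."""
    recent = monthly_totals[:len(WEIGHTS)]
    if not recent:
        return None  # nothing to forecast from
    weights = WEIGHTS[:len(recent)]
    forecast = sum(s * w for s, w in zip(recent, weights)) / sum(weights)
    # Band of +/- 1 standard deviation across the historical totals
    # (population std dev here; that choice is an assumption).
    spread = pstdev(recent) if len(recent) > 1 else 0.0
    confidence = "High" if len(recent) >= 4 else "Medium" if len(recent) >= 2 else "Low"
    return {
        "forecast": round(forecast, 2),
        "low": round(max(forecast - spread, 0.0), 2),
        "high": round(forecast + spread, 2),
        "confidence": confidence,
    }
```

With totals of 1200, 1000, and 800 (most recent first), the weighted forecast is (3×1200 + 2×1000 + 1×800) / 6 ≈ 1066.67, noticeably above the plain average of 1000 because the latest month counts three times.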

AI Insights: Cache Invalidation Done Right

AI insights via Groq GPT-OSS 120B are expensive to generate — you don't want to hit the API on every page load, but you also can't serve stale insights after new expenses are logged.

Solution: cache the insights in a dict keyed by (user_id, month). On every expense create, update, or delete, that month's cache entry is wiped. Next page load regenerates fresh. The monthly review (AI-generated end-of-month report) uses a separate 1-hour TTL cache since it's intentionally backward-looking and doesn't need to invalidate on individual expense changes.

# In every expense mutation endpoint:
invalidate_insights_cache(uid, month)

# Cache lookup:
cached = _insights_cache.get((uid, month))
if cached and time.time() - cached["ts"] < 3600:
    return jsonify(cached["data"])
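
For completeness, here is one way the two halves of that cache could fit together. This is a sketch rather than the app's real implementation: get_insights and the generate callable are hypothetical stand-ins for the route body and the LLM request.

```python
import time

INSIGHTS_TTL_SECONDS = 3600
_insights_cache = {}  # (user_id, month) -> {"data": ..., "ts": float}


def get_insights(uid, month, generate):
    """Return cached insights, calling generate(uid, month) only on a miss."""
    cached = _insights_cache.get((uid, month))
    if cached and time.time() - cached["ts"] < INSIGHTS_TTL_SECONDS:
        return cached["data"]
    data = generate(uid, month)  # hypothetical callable wrapping the LLM request
    _insights_cache[(uid, month)] = {"data": data, "ts": time.time()}
    return data


def invalidate_insights_cache(uid, month):
    """Drop one user's insights for one month; a missing key is not an error."""
    _insights_cache.pop((uid, month), None)
```

One detail worth handling under this scheme: if an edit moves an expense from one month to another, both the old and the new month need invalidating.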

Merchant Analytics: Two-Stage Normalisation

Grouping expenses by merchant sounds simple — until you realise "Dollar tree", "Dollar Tree", and "Dollar Tree #247" are all the same place. I built a two-stage normalisation before grouping:

Stage 1 — Case merge: all descriptions lowercased for comparison. The display name is taken from the form with the highest total spend (so "ALDI" wins over "aldi" if you spend more when using caps).

Stage 2 — Word-boundary prefix merge: if a shorter name is a strict word-boundary prefix of a longer one ("Aldi" vs "Aldi Junk"), the longer is absorbed into the shorter. This catches chain names that vary by store number or suffix without needing regex.

No schema changes — this runs entirely over the existing description field. The same logic is reused on the Lifetime Spend Stats page.
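
The two stages can be sketched in a few lines of Python. This is an illustration of the approach described above, not the production code: group_merchants is a hypothetical name, and the input is assumed to be a list of (description, amount) pairs.

```python
from collections import defaultdict


def group_merchants(expenses):
    """Group (description, amount) pairs by merchant in two merge stages."""
    # Stage 1: merge spellings that differ only by case, tracking spend per spelling.
    totals = defaultdict(float)                          # lowercase name -> spend
    spellings = defaultdict(lambda: defaultdict(float))  # lowercase name -> spelling -> spend
    for description, amount in expenses:
        name = " ".join(description.split())  # collapse stray whitespace
        totals[name.lower()] += amount
        spellings[name.lower()][name] += amount

    # Stage 2: visit shortest names first; a longer name is absorbed when an
    # existing root followed by a space is its prefix ("aldi" + " " matches
    # "aldi junk" but not "aldino cafe").
    roots = []
    merged = defaultdict(float)
    for key in sorted(totals, key=len):
        root = next((r for r in roots if key.startswith(r + " ")), None)
        if root is None:
            root = key
            roots.append(root)
        merged[root] += totals[key]

    # Display name: the root's spelling with the highest total spend.
    return {
        max(spellings[root], key=spellings[root].get): round(merged[root], 2)
        for root in roots
    }
```

Requiring a space after the prefix is what makes the match word-bounded, so "Aldino Cafe" stays separate from "Aldi" while "Dollar Tree #247" folds into "Dollar Tree".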

The Trip Ledger: Shared Splits Across Pages

Adding the Trip Ledger in Phase 15 forced me to solve a real engineering problem: the split expense logic (modal, balance ledger, settle flow) existed entirely in expenses.js and expenses.html. Duplicating it for trips would mean maintaining two implementations of the same complex logic.

Solution: extracted everything into static/js/splits-shared.js and shared CSS in main.css. Both the Expenses page and the Trips page import the same module. One implementation, two contexts. The settle endpoint was also extended to accept an optional trip_id — without it, settling clears debts globally; with it, it only clears debts on that specific trip.

Bug this caught: Before the trip-scoped settle, clicking "Settle all with Alice" on a trip accidentally cleared Alice's debts on every other trip and all standalone expenses too. The fix was one parameter — but the impact of not having it would have been silent data corruption.
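
The scoping rule is small enough to show in full. The sketch below assumes debts are dicts with hypothetical id, user_id, contact, trip_id, and settled fields; the real schema may differ.

```python
def debts_to_settle(debts, user_id, contact, trip_id=None):
    """Select the unsettled debts a settle request should clear.

    Without trip_id, every open debt with the contact is selected (global
    settle). With trip_id, only debts recorded on that trip are selected.
    """
    selected = []
    for debt in debts:
        if debt["user_id"] != user_id or debt["contact"] != contact:
            continue  # someone else's data or a different contact
        if debt["settled"]:
            continue  # already cleared
        if trip_id is not None and debt.get("trip_id") != trip_id:
            continue  # trip-scoped settle: leave other trips and standalone debts alone
        selected.append(debt["id"])
    return selected
```

The important line is the trip_id check: leaving it out reproduces the original bug, where a settle issued from one trip selects every open debt with that contact.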

The PWA: Service Worker Strategy

Making FinTrack installable as a PWA required careful thought about what to cache and what not to. The strategy:

Cache-first for static assets (/static/) — JS, CSS, icons served from cache instantly on repeat visits. Versioned filenames ensure stale cache is never a problem.

Network-first for HTML pages — always try the network; fall back to a cached shell if offline. This means users see real data when online and a graceful offline page when not.

Pass-through for cross-origin requests: anything served from another origin (Google Fonts, the Chart.js CDN) is never intercepted. This avoids opaque-response cache issues entirely.

self.addEventListener('fetch', event => {
    const url = new URL(event.request.url);

    // Pass through cross-origin requests entirely
    if (url.origin !== location.origin) return;

    // Cache-first for static assets
    if (url.pathname.startsWith('/static/')) {
        event.respondWith(cacheFirst(event.request));
        return;
    }

    // Network-first for everything else
    event.respondWith(networkFirst(event.request));
});

Security: Two Full Audits

FinTrack has gone through two full security audits (latest: May 2026). The findings shaped several non-obvious design decisions.

CSRF protection without tokens: All state-changing API routes require the X-Requested-With: XMLHttpRequest header. The shared apiFetch() utility sends this automatically. This covers all AJAX calls — HTML form submissions (login, register, password reset) are protected by rate limiting and token validation instead.
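
The header check works because a cross-site HTML form cannot set custom request headers, and a cross-origin fetch that sets one triggers a CORS preflight the server can refuse. A framework-free sketch of the check follows; passes_csrf_check is a hypothetical name, and in a Flask app this logic would typically sit in a before_request hook or a route decorator.

```python
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}


def passes_csrf_check(method, headers):
    """Allow safe methods; require the custom AJAX header on everything else."""
    if method.upper() in SAFE_METHODS:
        return True
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    return normalised.get("x-requested-with") == "XMLHttpRequest"
```

This protection depends on the server not answering preflights permissively for arbitrary origins; a wildcard CORS policy that allows the header would undo it.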

IDOR protection: Every DB write filters by user_id from the server session. Card bill routes additionally verify card ownership before any operation. There's no "trust the client-supplied ID" anywhere in the codebase.

Admin role re-queried every time: The is_admin() check re-queries the DB on every admin request. The session role is never trusted — a demoted admin is locked out on their next request, not their next login.

Telegram ID linking via one-time code: Previously, linking a Telegram account accepted a bare Telegram ID with no verification — anyone could claim someone else's ID. Fixed in Phase 16 with a link-code flow: the app generates a one-time code the user sends from their own Telegram account, proving ownership before the ID is stored.
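
A link-code flow of this kind can be sketched with the standard library alone. Everything here is illustrative: the function names, the in-memory stores, the 8-character code format, and the 10-minute expiry are assumptions, not details taken from FinTrack.

```python
import secrets
import time

LINK_CODE_TTL_SECONDS = 600  # assumed 10-minute validity window
_pending_codes = {}  # code -> {"user_id": ..., "expires": float}


def issue_link_code(user_id, now=None):
    """Create a short-lived code for the logged-in user to send from Telegram."""
    now = time.time() if now is None else now
    code = secrets.token_hex(4).upper()  # 8 hex characters from the OS CSPRNG
    _pending_codes[code] = {"user_id": user_id, "expires": now + LINK_CODE_TTL_SECONDS}
    return code


def redeem_link_code(code, telegram_id, links, now=None):
    """Bind telegram_id to the code's owner; each code works at most once."""
    now = time.time() if now is None else now
    pending = _pending_codes.pop(code.strip().upper(), None)  # pop makes it single-use
    if pending is None or now > pending["expires"]:
        return False
    links[pending["user_id"]] = telegram_id
    return True
```

The Telegram ID stored is the one attached to the incoming bot message, never one typed by the user, which is what proves ownership of the account.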

Security note: The in-process rate limiter (login: 10/5min, registration: 5/5min) keeps its counters in memory, so they reset on server restart and are not shared between workers. For multi-worker production deployments, replace it with Redis-backed limiting via Flask-Limiter. For a single-instance Railway deployment it's sufficient.
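
An in-process limiter with those limits can be as small as the sketch below. The class name and the sliding-window approach are assumptions for illustration; the real limiter may count attempts differently.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` attempts per key within `window` seconds."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)  # key -> timestamps of recent attempts

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        while hits and now - hits[0] >= self.window:
            hits.popleft()  # forget attempts that have left the window
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


login_limiter = SlidingWindowLimiter(limit=10, window=300)        # 10 per 5 min
registration_limiter = SlidingWindowLimiter(limit=5, window=300)  # 5 per 5 min
```

All state lives in the _hits dict of one process, which is exactly why a restart clears it and why two workers would each enforce their own separate limit.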

The Production Bug That Taught Me to Pin Dependencies

After Phase 16 I pinned every dependency to exact versions in requirements.txt. The reason: a deploy crashed on boot with:

TypeError: Client.__init__() got an unexpected keyword argument 'proxies'

groq==0.11.0 passes a proxies= kwarg to httpx.Client(), and httpx removed that argument in version 0.28. It worked locally because httpx wasn't pinned and my local environment had resolved an older, compatible version. The deploy container resolved a newer, incompatible one.

Fix: upgrade to groq==1.5.0 (which removed that kwarg) and pin httpx==0.27.2 explicitly. The lesson is simple: unpinned dependencies are a time bomb. The failure mode is always "worked locally, broke in prod" and always at the worst time.

The Tax Estimator: Fully Client-Side

The tax estimate feature covers US (federal 2024 brackets + all 50 states + DC), India (New Regime and Old Regime, FY 2024-25, including 87A rebate and 4% cess), Hong Kong (2024-25 salaries tax), and Dubai/UAE (zero tax). All calculation logic runs entirely in the browser — no API calls, no server round trips.

The interesting part is the US state logic: 9 no-tax states, 14 flat-rate states, and the rest with full marginal brackets — all encoded as a lookup table with a common calculation function. The annual income auto-populates from the Annual View endpoint for the selected year and stays editable.
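
The real estimator runs in the browser, but the lookup-table idea is language-independent, so here is a Python sketch of it. The rates and thresholds in STATE_RULES are made-up placeholders, not real tax data, and the state keys, marginal_tax, and state_tax are hypothetical names. Deductions and exemptions are ignored.

```python
def marginal_tax(income, brackets):
    """Tax on income given (lower_bound, rate) brackets sorted ascending."""
    tax = 0.0
    for i, (lower, rate) in enumerate(brackets):
        if income <= lower:
            break
        upper = brackets[i + 1][0] if i + 1 < len(brackets) else float("inf")
        # Only the slice of income inside this bracket is taxed at its rate.
        tax += (min(income, upper) - lower) * rate
    return round(tax, 2)


# Placeholder entries with made-up rates, NOT real tax data.
STATE_RULES = {
    "NO_TAX": [],                                            # no-income-tax state
    "FLAT": [(0, 0.04)],                                     # flat-rate state
    "TIERED": [(0, 0.02), (10_000, 0.05), (50_000, 0.08)],   # marginal brackets
}


def state_tax(state, income):
    """Look up the state's bracket table and apply the shared calculation."""
    return marginal_tax(income, STATE_RULES[state])
```

Representing a flat rate as a single bracket and a no-tax state as an empty list is what lets one function cover all three kinds of state.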

What I'd Do Differently

Start with blueprints. The Phase 16 refactor was three days of careful work that should have been baked in from Phase 3. A single app.py works for a weekend project. It doesn't work for 147 routes.

Pin dependencies from day one. The production boot crash was entirely avoidable. Requirements files without exact pins are documentation, not specifications.

True XIRR with Newton-Raphson. The current CAGR approximation handles single lump-sum investments fine but is wrong for SIPs and multiple tranches. The investment transactions table now has enough data to implement proper cashflow-based XIRR — it's the next planned feature.
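
As a sketch of what that feature could look like, the function below solves for the rate at which the net present value of dated cashflows is zero, using Newton-Raphson. The signature, the 365-day year, the sign convention, and the clamp that keeps the rate above -100% are assumptions for illustration, not the planned implementation.

```python
from datetime import date


def xirr(cashflows, guess=0.1, tol=1e-7, max_iter=100):
    """Annualised return for dated cashflows [(date, amount), ...].

    Investments are negative amounts; withdrawals or current value are positive.
    """
    t0 = min(d for d, _ in cashflows)
    flows = [((d - t0).days / 365.0, amount) for d, amount in cashflows]
    rate = guess
    for _ in range(max_iter):
        npv = sum(a / (1 + rate) ** t for t, a in flows)
        # d/d(rate) of a * (1 + rate)^-t is -t * a * (1 + rate)^(-t - 1)
        slope = sum(-t * a / (1 + rate) ** (t + 1) for t, a in flows)
        if slope == 0:
            break  # flat derivative, Newton's step is undefined
        next_rate = rate - npv / slope
        if next_rate <= -1:
            next_rate = (rate - 1) / 2  # stay inside the domain rate > -1
        if abs(next_rate - rate) < tol:
            return next_rate
        rate = next_rate
    raise ValueError("XIRR did not converge; try a different guess")
```

For a single lump sum this reduces to CAGR: investing 1000 on 2023-01-01 and holding 1100 on 2024-01-01 gives 10%. The difference only shows up with multiple tranches, where each cashflow is discounted by its own holding period.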

Background price refresh. Prices update on button click. A background job refreshing on a schedule would make the investment dashboard feel live without user interaction.

Current Status

FinTrack is complete and deployed. Every major feature is live. The two remaining items on the roadmap are STCG/LTCG tax summary (using actual buy/sell transaction history) and proper recurring investment (SIP) support in the transaction tracker.

The app is self-hosted — the demo page is open with no login required. The code structure, security model, and API surface are all documented in the README.
