Full-Stack Python · Flask Finance · AI LLM May 2026 · 15 min read

Building FinTrack: A Full-Stack Personal Finance OS in Flask

From a simple expense tracker to a complete financial operating system — XIRR, live prices, AI insights, spending forecasts, debt tracking, and a Telegram bot, all in one self-hosted app.

What started as "I want to stop using spreadsheets" turned into a six-month engineering project I use every single day. FinTrack is now a full personal finance OS — not just an expense tracker, but a complete system for understanding where your money is, where it's going, and what to do about it.

This post covers the architecture, the interesting engineering problems, the mistakes, and what I'd do differently. Everything is based on real code running at fintrack.gautam-pai.com.


What It Does Now

The scope has grown significantly since the first version. Here's the full feature surface:

Net Worth Dashboard: KPI strip, portfolio XIRR, allocation donut, 8 AI-generated insights, monthly scorecard (A+ to F grade)
Expense Tracker: 12 dynamic categories, recurring templates, bubble calendar, AI spending insights via Groq LLaMA 3.3 70B
Income Tracker: Salary, freelance, dividends; cashflow view, recurring income templates, MoM comparison
Investment Tracker: 9 asset types, XIRR, live prices, broker sync (Zerodha + Groww), Research tab with AI analysis
Spending Forecast: Weighted 6-month prediction, confidence level, per-category trend arrows, half-circle gauge
Financial Health Score: 0–100 composite score across 5 pillars: savings rate, debt health, investment health, budget control, emergency fund
Debt Tracker: Standard EMI, Flexi/Tranche, Overdraft; payment schedules, urgency colours, utilisation tracking
Telegram Bot: Natural language logging, weekly digest, /summary, /scorecard, /streak, /undo; Groq powered

The stack: Python 3.10+ / Flask, Supabase (PostgreSQL), vanilla HTML/CSS/JS, Chart.js, Railway for hosting.


The Database Layer: Five Helper Functions

Every route goes through five thin Supabase wrapper functions — db_rows, db_insert, db_update, db_delete, db_upsert. This pattern made the entire codebase consistent and refactoring painless. When I switched a query from .eq() to .in_() globally, I changed one function.

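def db_rows(table, filters=None, order=None, cols="*"):
    """Select rows from `table`, with optional equality filters and ordering."""
    # `sb` is the module-level Supabase client.
    q = sb.table(table).select(cols)
    for key, val in (filters or {}).items():
        q = q.eq(key, val)
    if order:
        # A leading "-" on the column name means descending order.
        desc = order.startswith("-")
        q = q.order(order.lstrip("-"), desc=desc)
    return q.execute().data or []

Bug I hit: Using the anon key in the Telegram bot caused RLS policy conflicts, because the bot wrote rows whose user_id didn't match the authenticated session. The fix was to use the service key in the bot process, which bypasses RLS. Since RLS no longer protects those writes, every bot request is gated by a DB lookup of the Telegram ID against user_settings.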
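
The other four helpers are not shown in this post. As a rough sketch of the same wrapper pattern, here is what a delete helper could look like. The explicit `sb` parameter, the `apply_filters` name, and the rule that a list value becomes an `.in_()` filter are all assumptions for illustration, not FinTrack's actual code.

```python
def apply_filters(query, filters=None):
    """Apply equality filters; a list/tuple/set value becomes an IN filter."""
    for key, val in (filters or {}).items():
        if isinstance(val, (list, tuple, set)):
            query = query.in_(key, list(val))
        else:
            query = query.eq(key, val)
    return query


def db_delete(sb, table, filters):
    """Delete matching rows; refuses to run without filters (hypothetical guard)."""
    if not filters:
        raise ValueError("db_delete requires at least one filter")
    query = apply_filters(sb.table(table).delete(), filters)
    return query.execute().data or []
```

Keeping the filter logic in one place is what makes a global change such as `.eq()` to `.in_()` a single-function edit.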

Multi-Currency: Store Original, Convert at Display

Every entry — expense, income, investment — is stored with its original currency and amount. No conversion at write time. The base_currency in user_settings is applied at read time using a 1-hour in-process cache with a two-provider fallback chain (open.er-api.com → Frankfurter).

The Telegram bot preserves this: "spent 500 on groceries" saves INR 500, not a converted USD value. The LLM prompt explicitly says: "NEVER convert amounts. Save exactly what the user said."
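
A minimal sketch of read-time conversion with a 1-hour in-process cache and an ordered fallback chain. The function names and the shape of the rates dict are assumptions; providers are passed in as plain callables so the sketch assumes nothing about the response format of open.er-api.com or Frankfurter.

```python
import time

CACHE_TTL_SECONDS = 3600  # 1-hour in-process cache
_rate_cache = {}  # base currency -> (fetched_at, {currency: units per 1 base})


def get_rates(base, providers, now=time.time):
    """Return rates for `base`, trying each provider callable in order."""
    cached = _rate_cache.get(base)
    if cached and now() - cached[0] < CACHE_TTL_SECONDS:
        return cached[1]
    last_error = None
    for fetch in providers:
        try:
            rates = fetch(base)
        except Exception as exc:  # any provider failure falls through to the next
            last_error = exc
            continue
        _rate_cache[base] = (now(), rates)
        return rates
    raise RuntimeError(f"all rate providers failed for {base}") from last_error


def convert(amount, currency, base, providers, now=time.time):
    """Convert a stored original amount into the base currency at display time."""
    if currency == base:
        return amount
    rates = get_rates(base, providers, now)
    return amount / rates[currency]
```

Because the stored row is never modified, changing base_currency later only changes what the read path displays.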


Spending Forecast: Weighted Prediction

The forecast page is one of the most useful features. Instead of a simple average of past months, I use a weighted prediction where recent behaviour matters more:

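# Weights: most recent month = 3x, previous = 2x, older = 1x each.
# weights[0] belongs to the most recent month, so spend_values must be
# ordered newest-first for zip() to pair each month with its weight.
weights = []
for i, _month in enumerate(reversed(months)):
    if i == 0:
        weights.append(3)
    elif i == 1:
        weights.append(2)
    else:
        weights.append(1)

forecast = sum(s * w for s, w in zip(spend_values, weights)) / sum(weights)
# Confidence depends only on how many months of history exist.
confidence = "High" if len(months) >= 4 else "Medium" if len(months) >= 2 else "Low"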

The variance band (±1 standard deviation) gives an honest range, and the current month pace projection shows exactly where you'll land vs the forecast. Per-category trend arrows (▲ rising / ▼ falling / — stable) make it immediately actionable.
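
The band and the pace projection can be sketched as follows. This assumes a population standard deviation and a straight-line extrapolation of month-to-date spend; the post does not say which variant FinTrack uses, and the function names are hypothetical.

```python
import calendar
import statistics
from datetime import date


def variance_band(spend_values, forecast):
    """Return (low, high) as forecast ± 1 population standard deviation."""
    if len(spend_values) < 2:
        return forecast, forecast  # not enough history for a spread
    sd = statistics.pstdev(spend_values)
    return max(0.0, forecast - sd), forecast + sd


def pace_projection(spent_so_far, today):
    """Linearly extrapolate month-to-date spend to the end of the month."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return spent_so_far / today.day * days_in_month


low, high = variance_band([100.0, 200.0, 300.0], 200.0)
projected = pace_projection(300.0, date(2026, 5, 10))
```

Comparing `projected` against `high` is one simple way to decide whether the current month is running outside the expected range.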


Financial Health Score: Five Pillars

The health score is a composite 0–100 metric across five pillars, each graded A+ to F:

Savings Rate (25 pts): Income vs expenses. Are you actually saving?
Debt Health (20 pts): Debt-to-income ratio. Is your debt load manageable?
Investment Health (25 pts): Portfolio return + diversification across asset classes
Budget Control (15 pts): Spend vs monthly budget. Are you staying on track?
Emergency Fund (15 pts): Months of expenses covered by safe assets (FD/PPF/Bonds)
Output: Animated arc dial, pillar cards, prioritised recommendations

The score is entirely derived from the user's own data — no manual input. Every pillar recalculates on the fly from expenses, income, investments, and debts already in the system.
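
How the five pillars roll up into one number can be sketched like this. The point weights come from the table above; the grade cut-offs and the idea of expressing each pillar as a 0.0 to 1.0 fraction are assumptions for illustration, since the post does not give the per-pillar formulas.

```python
PILLAR_POINTS = {
    "savings_rate": 25,
    "debt_health": 20,
    "investment_health": 25,
    "budget_control": 15,
    "emergency_fund": 15,
}

# Hypothetical grade cut-offs (percent of a pillar achieved); not from the post.
GRADE_BANDS = [(90, "A+"), (80, "A"), (70, "B"), (60, "C"), (50, "D")]


def grade(percent):
    """Map a 0-100 percentage to a letter grade."""
    for cutoff, letter in GRADE_BANDS:
        if percent >= cutoff:
            return letter
    return "F"


def health_score(pillar_fractions):
    """pillar_fractions maps pillar name -> fraction (0.0..1.0) of that pillar achieved."""
    total = 0.0
    grades = {}
    for name, points in PILLAR_POINTS.items():
        frac = min(1.0, max(0.0, pillar_fractions.get(name, 0.0)))
        total += frac * points
        grades[name] = grade(frac * 100)
    return round(total), grades
```

A missing pillar counts as zero here, so a user with no emergency fund data loses those 15 points rather than having them redistributed.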


The Investment Research Tab

The Research tab grew into something I didn't originally plan but now use constantly. For any ticker across US, Indian, or crypto markets:

Fundamental data: P/E, P/B, 52-week range, EPS, dividend yield, ROE, net margin, and beta, pulled from Finnhub and Yahoo Finance.

Analyst consensus: colour bar showing buy/hold/sell counts with an overall verdict.

Price target: mean, high, low, and median, with upside % from the current price.

Earnings history: EPS estimate vs actual with a beat/miss/in-line badge.

AI analysis: sends the full fundamentals to Groq LLaMA 3.3 70B and gets back a structured 6-section report.

The watchlist persists to the database — tickers saved on any device show up everywhere with live prices in a sticky sidebar.


Debt Tracker: Three Loan Structures

Most finance apps treat all debt as one thing. Real loans aren't. I built three distinct structures:

Standard EMI — fixed monthly repayment for home, car, personal loans. Amortisation schedule auto-calculated from principal, rate, and tenure.

Flexi / Tranche — for education and construction-linked loans where disbursement happens in stages. Tracks sanctioned amount vs total disbursed separately, with EMI phases: moratorium EMI → full repayment EMI, with a changeover date.

Overdraft / Credit Card — revolving credit with utilisation tracking. Credit limit vs current balance, utilisation %, and urgency colours for payment due dates.
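
For the Standard EMI structure, the schedule follows from the usual reducing-balance formula EMI = P·r·(1+r)^n / ((1+r)^n − 1), where r is the monthly rate. A minimal sketch, with hypothetical function and field names:

```python
def emi(principal, annual_rate_pct, months):
    """Fixed monthly payment for a reducing-balance loan."""
    r = annual_rate_pct / 12 / 100  # monthly interest rate as a fraction
    if r == 0:
        return principal / months
    growth = (1 + r) ** months
    return principal * r * growth / (growth - 1)


def amortisation_schedule(principal, annual_rate_pct, months):
    """Split each payment into interest and principal, tracking the balance."""
    payment = emi(principal, annual_rate_pct, months)
    r = annual_rate_pct / 12 / 100
    balance = principal
    rows = []
    for n in range(1, months + 1):
        interest = balance * r
        principal_part = payment - interest
        balance = max(0.0, balance - principal_part)
        rows.append({
            "month": n,
            "interest": interest,
            "principal": principal_part,
            "balance": balance,
        })
    return rows
```

A Flexi / Tranche loan would need the balance to grow at each disbursement and the payment to switch at the changeover date, which this sketch does not attempt.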


Live Price Fetching: Three-Provider Fallback

The fallback chain — Finnhub → Yahoo Finance → CoinGecko — handles three market types with different quirks:

Indian stocks need exchange prefixes: RELIANCE won't resolve — you need NSE:RELIANCE on Finnhub or RELIANCE.NS on Yahoo. Auto-appended as candidates.

Yahoo Finance rejects requests that use an HTTP client's default User-Agent. Setting a browser UA string fixed it at personal-use volume.

Crypto on Finnhub requires exchange-prefixed symbols: BINANCE:BTCUSDT, COINBASE:BTC-USD. Four candidate formats tried automatically.
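
The candidate-symbol and fallback logic above might be sketched like this. Providers are passed in as plain callables that return a price or None, so no real Finnhub, Yahoo, or CoinGecko API is assumed; the function names and market labels are hypothetical, and only the symbol formats named in the text are generated.

```python
def symbol_candidates(ticker, market, provider):
    """Return the provider-specific symbols to try, in order."""
    t = ticker.upper()
    if market == "india":
        if provider == "finnhub":
            return [f"NSE:{t}"]
        if provider == "yahoo":
            return [f"{t}.NS"]
    if market == "crypto" and provider == "finnhub":
        # Only the two formats named in the text; the post mentions four in total.
        return [f"BINANCE:{t}USDT", f"COINBASE:{t}-USD"]
    return [t]


def fetch_price(ticker, market, providers):
    """providers: ordered list of (name, fetch), where fetch(symbol) -> price or None."""
    for name, fetch in providers:
        for symbol in symbol_candidates(ticker, market, name):
            try:
                price = fetch(symbol)
            except Exception:
                continue  # treat a provider error like a miss and keep going
            if price is not None:
                return price, name, symbol
    return None
```

Returning the provider name and resolved symbol alongside the price makes it possible to remember which format worked and try it first next time.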

Groww broker sync was added alongside Zerodha — one-click sync pulls all holdings from both brokers into the investments table via their respective APIs.


The Telegram Bot: Groq LLaMA for NLP

The bot classifies every message into expense, income, query, or unknown using Groq LLaMA 3.3 70B at temperature=0, which keeps the JSON output as repeatable as the API allows. Strip markdown fences before parsing; the model wraps output in backticks even when told not to.
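
A minimal sketch of the fence-stripping step, assuming the model returns a JSON object with a `type` field (that field name, and the fallback to unknown on any parse failure, are assumptions rather than FinTrack's actual schema):

```python
import json

FENCE = "`" * 3  # three backticks, built this way to keep the example fence-safe
VALID_TYPES = {"expense", "income", "query", "unknown"}


def strip_fences(raw):
    """Remove a wrapping markdown code fence, including an optional language tag."""
    text = raw.strip()
    if len(text) >= 2 * len(FENCE) and text.startswith(FENCE) and text.endswith(FENCE):
        text = text[len(FENCE):-len(FENCE)]
        first_line, newline, rest = text.partition("\n")
        if newline and first_line.strip().isalnum():
            text = rest  # drop a tag such as "json" on the opening fence line
    return text.strip()


def parse_llm_json(raw):
    """Parse the model reply, falling back to 'unknown' on anything unexpected."""
    try:
        data = json.loads(strip_fences(raw))
    except json.JSONDecodeError:
        return {"type": "unknown"}
    if not isinstance(data, dict) or data.get("type") not in VALID_TYPES:
        return {"type": "unknown"}
    return data
```

Validating the `type` value after parsing means a malformed or off-script reply degrades to the unknown branch instead of raising inside the bot handler.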

New additions since the first version: /digest sends a full weekly financial summary, /scorecard returns your monthly grade, and a weekly digest auto-sends every Sunday at 9 AM UTC using the job queue scheduler built into python-telegram-bot.

New: Income logging now works the same way as expenses — "received salary 80000" logs directly to the income table with source auto-detected.

AI Spending Insights: Cache Invalidation Done Right

The AI insights on the expense page are generated by Groq on demand. The expensive part is the API call — I didn't want to hit Groq on every page load, but I also didn't want stale insights after new expenses were logged.

Solution: cache the insights in the database keyed by (user_id, month) with a generated-at timestamp. On every expense create/update/delete, the cache row for that month is deleted. Next page load regenerates. This means insights are always fresh relative to your data, not time-based.

# Invalidate cache on any expense change
def invalidate_insights_cache(uid, month):
    db_delete("ai_insights_cache", {
        "user_id": uid,
        "month": month
    })
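
The read side of that cache could look roughly like this. The helpers are passed in as parameters to keep the sketch self-contained; the `db_upsert(table, row)` signature and the `insights` and `generated_at` column names are assumptions, and `generate` stands in for the Groq call.

```python
from datetime import datetime, timezone


def get_insights(uid, month, db_rows, db_upsert, generate):
    """Return cached insights for (uid, month), generating them on a cache miss."""
    rows = db_rows("ai_insights_cache", {"user_id": uid, "month": month})
    if rows:
        return rows[0]["insights"]
    insights = generate(uid, month)  # the expensive LLM call happens only here
    db_upsert("ai_insights_cache", {
        "user_id": uid,
        "month": month,
        "insights": insights,
        "generated_at": datetime.now(timezone.utc).isoformat(),
    })
    return insights
```

With invalidation handled on the write path, this function never needs to inspect the timestamp; it is stored only for display and debugging.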

What I'd Do Differently

True XIRR with Newton-Raphson. The current CAGR approximation works for single lump-sum holdings but doesn't correctly handle SIPs or multiple buy tranches. The transaction table now has enough data to implement proper cashflow-based XIRR — it's next.

Background job queue for price refreshes. Currently prices update on button click. A Celery or APScheduler background job refreshing prices every 15 minutes would make the investment dashboard feel live without user interaction.

Rate limiting on AI endpoints. The insights, research analysis, and health score recommendations each make Groq API calls. Under concurrent users this could get expensive. A short Redis cache keyed by content hash would eliminate redundant calls.

Separate the Telegram bot cleanly. The bot and web app share environment variables and a copy-pasted Supabase client init. A proper shared library package would eliminate the duplication.
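
For the first item above, a cashflow-based XIRR solved with Newton-Raphson might be sketched as follows. It assumes buys are negative amounts, the current value is a final positive amount, and a 365-day year; none of this is FinTrack's existing code.

```python
from datetime import date


def _year_fractions(cashflows):
    """Convert [(date, amount), ...] into [(years since first flow, amount), ...]."""
    t0 = min(d for d, _ in cashflows)
    return [((d - t0).days / 365.0, a) for d, a in cashflows]


def xnpv(rate, cashflows):
    """Net present value of dated cashflows at an annual rate."""
    return sum(a / (1 + rate) ** t for t, a in _year_fractions(cashflows))


def xirr(cashflows, guess=0.1, tol=1e-9, max_iter=100):
    """Find the rate where xnpv is zero using Newton-Raphson iteration."""
    flows = _year_fractions(cashflows)
    rate = guess
    for _ in range(max_iter):
        f = sum(a / (1 + rate) ** t for t, a in flows)
        df = sum(-t * a / (1 + rate) ** (t + 1) for t, a in flows)
        if df == 0:
            break
        new_rate = rate - f / df
        if new_rate <= -1:
            new_rate = (rate - 1) / 2  # step halfway toward -1 so (1 + rate) stays positive
        if abs(new_rate - rate) < tol:
            return new_rate
        rate = new_rate
    raise ValueError("XIRR did not converge")


# Two buy tranches followed by the current portfolio value.
sip = [
    (date(2023, 1, 1), -1000.0),
    (date(2023, 7, 1), -1000.0),
    (date(2024, 1, 1), 2200.0),
]
annual_return = xirr(sip)
```

Unlike a CAGR on total invested vs current value, this weights each tranche by how long it has actually been invested.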


Running It Yourself

FinTrack is self-hosted. You need a free Supabase project (run schema.sql), a Finnhub API key, a Groq API key, and a Telegram bot token from BotFather. Railway handles deployment via the included Procfile.

Or just try the live version — the demo is open.

Gautam Pai
MS student in Business Analytics & AI at UT Dallas. Previously at Cognizant building data pipelines and analytics systems. Likes building things that are actually used.