Live Demo · Reference Build
AI Product Development | Document Intelligence
Docflow Document AI
Operations teams lose hours every week manually classifying documents, extracting data, and routing them to the right people. Docflow is a working agentic pipeline that does all three across 8 document types, in seconds. It is a capability demo: try it below.
8
Document types classified
Real
Persistent FastAPI + Supabase backend
pgvector
Semantic search built in
Live
Interactive demo — try it now
The Problem
Smart teams. Buried in paperwork.
Operations teams receive hundreds of business documents weekly — contracts, invoices, intake forms, financial statements — and someone has to manually read, categorize, and route every single one.
Critical information — payment terms, renewal clauses, liability caps, overdue amounts, missing fields — is buried in unstructured text. Extracting it manually is slow and error-prone, especially under volume.
Anomalies — auto-renewal traps, past-due invoices, missing consent fields, unusual contract clauses — are frequently missed during high-volume review periods, creating downstream risk.
Routing documents to the right team — legal, finance, HR, compliance — depends on someone reading the document first. That creates a bottleneck at the intake stage before any real work has even started.
What It Does
Classifies incoming documents across 8 types with confidence scores.
Extracts structured fields and lifts key values into searchable columns.
Flags anomalies and routes high-severity items to a human review queue.
Persists everything with a full audit chain and semantic (pgvector) search.
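To give a rough idea of what semantic search over pgvector can look like, here is a minimal sketch. The table and column names (`documents`, `embedding`, `doc_type`, `summary`) are assumptions made for illustration, not Docflow's actual schema. The `<=>` operator is pgvector's cosine-distance operator, and the pure-Python functions mirror the same ordering on a toy scale.

```python
import math

# Hypothetical table and column names; "<=>" is pgvector's cosine-distance operator.
SEARCH_SQL = """
SELECT id, doc_type, summary
FROM documents
ORDER BY embedding <=> %(query_embedding)s::vector
LIMIT %(limit)s
"""


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity), the quantity `<=>` orders by."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm  # assumes neither vector is all zeros


def rank(query, docs, limit=3):
    """Return document ids ordered from most to least similar to the query vector."""
    ordered = sorted(docs, key=lambda d: cosine_distance(query, d["embedding"]))
    return [d["id"] for d in ordered[:limit]]
```

In a real deployment the ranking would run inside Postgres. The Python version is only there to show what the query orders by.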
What We Built
Four stages. One pipeline.
Docflow runs every uploaded document through a four-stage agentic pipeline. Each stage is visible in real time, so users can see the agent's reasoning, not just the output. It supports TXT, PDF, DOCX, and images, including scanned documents via vision.
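One way to make each stage visible as it finishes is to run the stages through a generator that yields every result immediately. This is a sketch under stated assumptions, not Docflow's implementation: `run_pipeline`, the `Stage` type, and the shared `context` dictionary are hypothetical names, and each stage callable stands in for a model call.

```python
from typing import Any, Callable, Dict, Iterator, List, Tuple

# A stage is a (name, callable) pair; the callable receives the document text
# and the results of all earlier stages.
Stage = Tuple[str, Callable[[str, Dict[str, Any]], Any]]


def run_pipeline(text: str, stages: List[Stage]) -> Iterator[Tuple[str, Any]]:
    """Run stages in order, yielding each result as soon as it is ready."""
    context: Dict[str, Any] = {}
    for name, stage in stages:
        result = stage(text, context)  # later stages can read earlier results
        context[name] = result
        yield name, result
```

A streaming endpoint could forward each yielded pair to the UI, so the classify result appears before extraction has finished.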
STAGE 01 — CLASSIFY
Document Classification
Identifies the document type across 8 categories — contract, invoice, intake form, HR document, financial statement, healthcare record, real estate document, or general business document — with a confidence score and plain-English summary.
Input: "Commercial Lease Agreement — Harbor Point Properties"
→ Type: Real Estate · 98% confidence
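A classification result like the one above could be represented and checked as follows. This is a minimal sketch: the eight categories come from the description above, while the key names (`type`, `confidence`, `summary`) and the snake_case values are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class DocType(str, Enum):
    CONTRACT = "contract"
    INVOICE = "invoice"
    INTAKE_FORM = "intake_form"
    HR_DOCUMENT = "hr_document"
    FINANCIAL_STATEMENT = "financial_statement"
    HEALTHCARE_RECORD = "healthcare_record"
    REAL_ESTATE = "real_estate"
    GENERAL_BUSINESS = "general_business"


@dataclass(frozen=True)
class Classification:
    doc_type: DocType
    confidence: float  # 0.0 to 1.0
    summary: str


def parse_classification(raw: dict) -> Classification:
    """Validate a raw model response; key names here are assumed."""
    confidence = float(raw["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    # DocType(...) raises ValueError for any type outside the eight categories.
    return Classification(DocType(raw["type"]), confidence, str(raw["summary"]))


def label(c: Classification) -> str:
    """Format a result the way the demo output displays it."""
    name = c.doc_type.value.replace("_", " ").title()
    return f"{name} · {round(c.confidence * 100)}% confidence"
```

Rejecting unknown types and out-of-range confidences at this boundary keeps a malformed model response from reaching the later stages.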
STAGE 02 — EXTRACT
Structured Field Extraction
Pulls 6–10 of the most important structured fields from the document — parties, dates, amounts, IDs, terms, and conditions — and surfaces them as clean, labeled data ready for downstream systems.
Fields extracted: Tenant, Landlord, Monthly Rent, Term, Security Deposit, TI Allowance, Renewal Option...
STAGE 03 — ANALYZE
Anomaly & Risk Flagging
Detects 2–5 anomalies, risks, or missing items worth human attention — auto-renewal traps, overdue payments, uncapped liability clauses, missing consent fields — severity-ranked as high, medium, or low.
High: "No auto-renewal cap — agreement renews indefinitely without written notice"
→ Flagged for legal review
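Severity ranking and the hand-off to human review can be sketched in a few lines. The flag shape (a dict with `severity` and `message` keys) and the function names are assumptions for illustration. Only the three severity levels come from the description above.

```python
# Lower number sorts first, so high-severity flags lead the list.
SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def rank_flags(flags):
    """Return flags ordered high, medium, low; ties keep their original order."""
    return sorted(flags, key=lambda f: SEVERITY_ORDER[f["severity"]])


def needs_human_review(flags):
    """Select the flags that should go to a human review queue."""
    return [f for f in flags if f["severity"] == "high"]
```

An unknown severity raises `KeyError` in `rank_flags`, which surfaces a malformed flag instead of silently mis-sorting it.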
STAGE 04 — ROUTE
Downstream Routing Actions
Generates 3–4 concrete next actions with realistic destinations — legal review, AP queue, EHR system, compliance team, finance sign-off — each tagged with a priority level so nothing gets lost in a pile.
Action: "Escalate to Legal for liability clause review" → Legal Team · Urgent
Tech Stack
Technology used
AI Model
Claude Sonnet
Frontend
React + Vite
API Layer
FastAPI (Python)
Frontend Hosting
Vercel
Backend Hosting
Render
File Parsing
PDF.js · Mammoth · Vision
What It Demonstrates
What this demonstrates
A real persistent backend. FastAPI + Supabase with stored results — not a client-side mock.
Strict JSON contract. Classification, ordered field extraction, flags, and routing on every document.
Semantic search built in. Everything processed is embedded and searchable via pgvector.
Human-in-the-loop. High-severity items route to a review queue with a full audit chain.
Handles the common formats. Digital PDFs, scans (via vision), Word, images, and plain text.
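A strict JSON contract of the kind listed above might be enforced along these lines. This is a sketch, not the contract Docflow actually uses: the top-level key names are assumptions, while the count ranges (6 to 10 fields, 2 to 5 flags, 3 to 4 actions) come from the stage descriptions.

```python
import json

# Assumed top-level keys; the real contract may name them differently.
REQUIRED_KEYS = ("classification", "fields", "flags", "actions")
# Count ranges taken from the stage descriptions.
COUNT_LIMITS = {"fields": (6, 10), "flags": (2, 5), "actions": (3, 4)}


def validate_result(payload: str) -> dict:
    """Parse a model response and reject anything outside the contract."""
    data = json.loads(payload)
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value must be an object")
    missing = [k for k in REQUIRED_KEYS if k not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    for key, (low, high) in COUNT_LIMITS.items():
        items = data[key]
        if not isinstance(items, list):
            raise ValueError(f"{key} must be a list")
        if not low <= len(items) <= high:
            raise ValueError(f"{key} has {len(items)} items, expected {low}-{high}")
    return data
```

Keeping `fields` as a JSON array rather than an object preserves the extraction order through parsing.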
Want something built like this?
Khoda Consulting builds AI-powered products and AI agents for founders and operators — idea to working app in weeks.