Data Pipeline & Analytics
Company Y Analytics Platform
A growing e-commerce retailer was making decisions based on stale, siloed data spread across five tools. We built a unified data pipeline and a near-real-time analytics dashboard, refreshed every 4 hours, that consolidated everything into a single source of truth.
5wk
// concept_to_deploy
6
// data_sources_unified
3+ hrs
// reporting_time_saved_weekly
1
// source_of_truth
// the_problem
Five tools. Zero visibility.
Sales data lived in Shopify, ad spend in Meta and Google Ads, inventory in a separate warehouse system, and customer data in a CRM — none of them talked to each other.
The team was spending 3+ hours every week manually pulling reports from each platform, pasting them into spreadsheets, and reconciling numbers that never quite matched.
Marketing decisions were being made on week-old data. By the time a campaign was flagged as underperforming, the budget was already spent.
There was no single view of customer lifetime value, acquisition cost, or product margin — the three metrics that actually drive e-commerce profitability.
// measured_impact
Before & after
Before
3+ hrs
Weekly manual reporting across 5 platforms
→
After
0 min
Fully automated — dashboard refreshes every 4 hours
Before
7 days
Lag between campaign performance and team awareness
→
After
<4h
Near real-time visibility — act before budget is wasted
Before
0
Unified view of CAC, LTV, and product margin
→
After
Full view
Single dashboard — all core metrics in one place
// system_architecture
How the pipeline is built
The system runs on a 4-layer architecture — from raw source data through transformation and storage, to the analytics layer the team uses daily. Each layer is independently maintainable and can be extended as new data sources are added.
Company Y — Data Pipeline Architecture
L1 Data Sources
Shopify Orders API
Meta Ads API
Google Ads API
Inventory System (CSV)
Stripe Payments
HubSpot CRM
↓ scheduled extraction → raw data layer
L2 Pipeline & Transformation (Python + dbt)
Python extractors
Airflow scheduling
dbt models
Data cleaning
Schema normalization
Incremental loads
↓ transformed data → warehouse
L3 Data Warehouse (Supabase + PostgreSQL)
Supabase (Postgres)
Unified customer table
Orders & revenue facts
CAC / LTV models
Product margin views
↓ SQL queries → visualization layer
L4 Analytics & Visualization (Metabase)
Metabase dashboards
Automated alerts
Scheduled email reports
Self-serve queries
Mobile-friendly
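The "Incremental loads" step in layer 2 can be pictured with a minimal Python sketch. It assumes each source row carries an `id` and an ISO-8601 `updated_at` field; the `fetch_orders_since` helper, the in-memory `warehouse` dict, and the sample rows are hypothetical stand-ins for a real API call and a Postgres upsert, not the code that runs in production.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, Iterable

Row = Dict[str, object]


def incremental_load(
    fetch_since: Callable[[datetime], Iterable[Row]],
    warehouse: Dict[object, Row],
    watermark: datetime,
) -> datetime:
    """Upsert rows changed since `watermark` and return the new watermark."""
    newest = watermark
    for row in fetch_since(watermark):
        # Upsert keyed on the primary key, so re-running a load is idempotent.
        warehouse[row["id"]] = row
        updated = datetime.fromisoformat(str(row["updated_at"]))
        newest = max(newest, updated)
    return newest


# Hypothetical source data standing in for an orders API response.
SOURCE = [
    {"id": 1, "total": 40.0, "updated_at": "2024-01-01T08:00:00+00:00"},
    {"id": 2, "total": 25.0, "updated_at": "2024-01-01T12:30:00+00:00"},
]


def fetch_orders_since(ts: datetime) -> Iterable[Row]:
    """Return only the rows modified after the given timestamp."""
    return [r for r in SOURCE if datetime.fromisoformat(r["updated_at"]) > ts]


warehouse: Dict[object, Row] = {}
wm = datetime(2024, 1, 1, 10, 0, tzinfo=timezone.utc)
wm = incremental_load(fetch_orders_since, warehouse, wm)
```

Only the order updated after the 10:00 watermark is loaded, and the watermark advances to that row's timestamp. Each scheduled run therefore pulls only what changed since the previous one rather than re-reading the full history.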
// what_we_built
Six deliverables, one source of truth
Revenue overview. Daily, weekly, and monthly revenue with trend lines — refreshed every 4 hours from Shopify and Stripe.
Marketing performance. Blended CAC across Meta and Google, ROAS by campaign and ad set, and spend pacing vs. budget — all in one view.
Customer LTV. Cohort-based lifetime value by acquisition channel — showing which channels bring customers who actually come back.
Product margin. Gross margin per SKU combining Shopify revenue with inventory cost data — surfacing which products are actually profitable.
Inventory alerts. Automated Slack notifications when any SKU drops below its reorder threshold, so the team can restock before it sells out.
Weekly summary email. Automated Monday morning report delivered to the founder's inbox — key metrics, week-over-week changes, and flags that need attention.
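For readers who want the definitions behind the metrics named in the list above, here is a small Python sketch using the standard formulas for blended CAC, ROAS, and gross margin, plus a reorder-threshold check. The function names and sample figures are illustrative assumptions; the project's actual logic lives in dbt models and is not reproduced here.

```python
from typing import Dict, List


def blended_cac(spend_by_channel: Dict[str, float], new_customers: int) -> float:
    """Blended CAC: total ad spend across all channels / new customers acquired."""
    if new_customers <= 0:
        raise ValueError("new_customers must be positive")
    return sum(spend_by_channel.values()) / new_customers


def roas(attributed_revenue: float, spend: float) -> float:
    """Return on ad spend: revenue attributed to a campaign / its spend."""
    if spend <= 0:
        raise ValueError("spend must be positive")
    return attributed_revenue / spend


def gross_margin(revenue: float, cost: float) -> float:
    """Gross margin as a fraction of revenue; a negative value means a loss."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - cost) / revenue


def skus_to_reorder(stock: Dict[str, int], thresholds: Dict[str, int]) -> List[str]:
    """SKUs whose on-hand quantity has dropped below their reorder threshold."""
    return sorted(sku for sku, qty in stock.items() if qty < thresholds.get(sku, 0))


# Illustrative numbers only, not figures from the engagement.
cac = blended_cac({"meta": 1200.0, "google": 800.0}, new_customers=50)
campaign_roas = roas(attributed_revenue=3000.0, spend=1200.0)
healthy_sku = gross_margin(revenue=50.0, cost=30.0)
losing_sku = gross_margin(revenue=20.0, cost=25.0)
low_stock = skus_to_reorder({"SKU-A": 3, "SKU-B": 40}, {"SKU-A": 10, "SKU-B": 25})
```

With these inputs the blended CAC is 40.0 per customer, the campaign ROAS is 2.5, the two margins are 0.4 and -0.25, and only `SKU-A` is flagged for reorder. A SKU can sell well and still show a negative margin once all of its costs are included in `cost`.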
// tech_stack
Technology used
Extraction
Python + APIs
Scheduling
Apache Airflow
Transformation
dbt
Warehouse
Supabase
Visualization
Metabase
Alerts
Slack API
// outcomes
What this delivered
3+ hours saved every week. Reporting that was done manually on Fridays is now automated. The team gets better data faster and spends that time on decisions instead.
First-ever product margin view. The team discovered three top-selling SKUs had negative margins after shipping costs. That finding alone justified the entire engagement.
Marketing spend reallocated within week one. With ROAS data refreshed every 4 hours, the team cut spend on two underperforming campaigns and shifted budget to their best-performing channel.
Built to extend. Adding a new data source is a new Python extractor and a dbt model — no changes to the core pipeline required.
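One way to get the "new source, no core changes" property is a registry pattern: each extractor registers itself under a name and the pipeline loop simply iterates the registry. The sketch below is an assumption about how that could look in Python; the `extractor` decorator, the `EXTRACTORS` registry, and the placeholder rows are hypothetical and are not taken from the delivered codebase.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]
EXTRACTORS: Dict[str, Callable[[], Iterable[Row]]] = {}


def extractor(name: str):
    """Decorator that registers an extractor function under a source name."""
    def register(fn: Callable[[], Iterable[Row]]):
        if name in EXTRACTORS:
            raise ValueError(f"duplicate extractor: {name}")
        EXTRACTORS[name] = fn
        return fn
    return register


@extractor("shopify_orders")
def extract_shopify_orders() -> Iterable[Row]:
    # Placeholder rows; a real extractor would call the Shopify Orders API.
    return [{"order_id": 1001, "total": 59.0}]


@extractor("inventory_csv")
def extract_inventory() -> Iterable[Row]:
    # Placeholder rows; a real extractor would parse the warehouse CSV export.
    return [{"sku": "SKU-A", "on_hand": 3}]


def run_all() -> Dict[str, List[Row]]:
    """Core loop: runs every registered extractor and never needs editing."""
    return {name: list(fn()) for name, fn in EXTRACTORS.items()}


raw = run_all()
```

Adding a seventh source under this design means writing one more decorated function and a matching dbt model, while `run_all` stays untouched.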
Want something built like this?
Khoda Consulting designs and ships data pipelines, analytics tools, and AI solutions for growing businesses.