ML / Predictive Model

Company Z Delivery Predictor

A regional logistics operator was absorbing significant cost from failed delivery attempts: drivers arriving when no one was home, at wrong addresses, or at sites they could not access. We built a machine learning model that flagged at-risk deliveries before dispatch so the team could intervene proactively.

Client: Company Z · Khoda Consulting · Python · scikit-learn · FastAPI · PostgreSQL · February 2026
38%
// failed_attempts_reduced
6wk
// concept_to_deploy
91%
// model_precision
$0
// third_party_ml_platform

Every failed delivery costs twice.

Company Z was processing 800–1,200 deliveries per day across a regional network. Roughly 14% were failing on the first attempt, each one meaning a driver visit, a failed attempt, a redelivery, and an unhappy customer.
Each failed attempt cost the business fuel, driver time, and redelivery logistics, estimated at $8–12 per failed stop. At a 14% failure rate, that added up to over $10,000 per month in avoidable cost.
The operations team had no predictive signal — failures only became visible after they happened. There was no way to prioritize or intervene before dispatch.
Historical delivery data existed but was siloed in a legacy system — never analyzed, never used to inform routing or scheduling decisions.

Before & after

Before
14%
First-attempt delivery failure rate
After
8.6%
38% reduction in failed attempts within 60 days
Before
None
Predictive signal — failures only visible after the fact
After
91%
Precision on at-risk delivery flags before dispatch
Before
~$10k
Monthly cost from failed delivery attempts
After
~$6k
~$4,000/month saved in redelivery costs
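
The headline figures follow directly from the before and after numbers. A quick check of the arithmetic, using only the values quoted above (the variable names are ours, chosen for illustration):

```python
# Failure rates quoted in the before/after comparison.
before_rate = 0.14
after_rate = 0.086

# Relative reduction in failed first attempts.
relative_reduction = (before_rate - after_rate) / before_rate

# Approximate monthly cost figures quoted above, in USD.
before_cost = 10_000
after_cost = 6_000
monthly_saving = before_cost - after_cost

print(f"Relative reduction: {relative_reduction:.1%}")
print(f"First-attempt success: {1 - before_rate:.1%} -> {1 - after_rate:.1%}")
print(f"Monthly saving: ~${monthly_saving:,}")
```

The relative reduction works out to roughly 38.6%, which the page reports as 38%.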

What the model learns from

We built and trained a gradient boosting classifier on 18 months of historical delivery data: 340,000+ delivery records with known outcomes. The model scores each delivery before dispatch using the six feature categories below.
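
For readers who want a feel for the modeling step, here is a minimal sketch of training a gradient boosting classifier with scikit-learn. It is not the production code: the data is synthetic, the feature count and hyperparameters are placeholders, and the class balance is only set to resemble the roughly 14% failure rate described earlier.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for delivery records: a few numeric features and
# roughly 14% positive labels (1 = failed first attempt). Not real data.
X, y = make_classification(
    n_samples=2_000,
    n_features=6,
    n_informative=4,
    weights=[0.86, 0.14],
    random_state=0,
)

# Hold out 20% of the rows, keeping the class balance in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Hyperparameters are placeholders, not the tuned production values.
model = GradientBoostingClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# Column 1 of predict_proba is the estimated probability of failure.
failure_proba = model.predict_proba(X_test)[:, 1]
print(failure_proba[:5].round(3))
```

The per-delivery failure probability is the quantity that a downstream threshold would act on.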


Feature 01
Address History
Prior delivery success rate at the exact address and surrounding area — the strongest single predictor.
Feature 02
Time Window
Requested delivery window vs. historical success rates by time-of-day and day-of-week for that zone.
Feature 03
Customer History
The recipient's own delivery success rate across all prior orders — repeat offenders are predictable.
Feature 04
Package Type
Signature-required and oversized packages fail at significantly higher rates — encoded as model features.
Feature 05
Zone Density
Delivery zone characteristics — apartment buildings, gated communities, and commercial addresses each have distinct patterns.
Feature 06
Weather Signal
Adverse weather correlates with both driver delays and recipient unavailability — integrated via weather API.
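
As an illustration of the address-history feature, the sketch below computes each delivery's prior success rate at the same address using only earlier records, so the feature never sees the outcome it is meant to predict. The record layout, the function name, and the fallback value for addresses with no history are all assumptions made for this example.

```python
from collections import defaultdict

def prior_success_rates(records, default=0.86):
    """Return, for each record, the success rate of earlier deliveries
    to the same address. Records must be sorted oldest first.

    `default` is used when an address has no history. 0.86 mirrors the
    network-wide first-attempt success rate implied by the 14% failure
    rate; how cold-start addresses were really handled is an assumption.
    """
    attempts = defaultdict(int)
    successes = defaultdict(int)
    rates = []
    for address, succeeded in records:
        if attempts[address] == 0:
            rates.append(default)
        else:
            rates.append(successes[address] / attempts[address])
        # Update the history only after computing the feature.
        attempts[address] += 1
        successes[address] += int(succeeded)
    return rates

# Toy history of (address, first_attempt_succeeded), oldest first.
history = [
    ("12 Oak St", True),
    ("9 Elm Ave", False),
    ("12 Oak St", False),
    ("12 Oak St", True),
    ("9 Elm Ave", True),
]
print(prior_success_rates(history))  # [0.86, 0.86, 1.0, 0.5, 0.0]
```

The same running-count pattern would apply to the customer-history feature, keyed on recipient instead of address.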

How the model performs

Evaluated on a held-out test set of 40,000 deliveries the model had never seen. We optimized for precision over recall: the operations team can only act on a limited number of flags each morning, so a false positive (flagging a delivery that would have succeeded) spends that capacity on a delivery that did not need it.
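
The precision/recall trade-off is governed by the decision threshold: raising it flags fewer deliveries, which tends to raise precision and lower recall. The toy example below shows the effect on made-up scores. The helper name and the data are illustrative only and are not taken from the project.

```python
def precision_recall_at(threshold, scores, labels):
    """Precision and recall when flagging every score >= threshold."""
    flagged = [label for score, label in zip(scores, labels) if score >= threshold]
    true_positives = sum(flagged)
    total_positives = sum(labels)
    precision = true_positives / len(flagged) if flagged else 1.0
    recall = true_positives / total_positives if total_positives else 0.0
    return precision, recall

# Toy model scores and outcomes (1 = delivery failed). Not real data.
scores = [0.95, 0.90, 0.80, 0.75, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]

for threshold in (0.3, 0.5, 0.7, 0.9):
    precision, recall = precision_recall_at(threshold, scores, labels)
    print(f"threshold={threshold:.1f} precision={precision:.2f} recall={recall:.2f}")
```

On this toy data, precision climbs from 0.62 to 1.00 as the threshold rises from 0.3 to 0.9, while recall falls from 1.00 to 0.40.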


91%
Precision
84%
Recall
0.94
AUC-ROC
87%
F1 Score
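
The reported F1 score is consistent with the precision and recall above, since F1 is their harmonic mean:

```python
precision = 0.91
recall = 0.84

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(f"F1 = {f1:.4f}")  # F1 = 0.8736
```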

From prediction to action

1
Nightly batch scoring
scheduler → extract_deliveries() → score_batch()
Every evening, the next day's delivery manifest is extracted from the operations system and passed through the model. Each delivery gets a risk score from 0–100.
2
At-risk deliveries flagged
threshold = 0.72 → flag_high_risk(deliveries)
Deliveries scoring above the 0.72 probability threshold (equivalent to a risk score of 72 on the 0–100 scale) are flagged as high-risk. The threshold was tuned on validation data to balance precision and operational capacity.
3
Operations team notified
send_daily_brief(flagged_deliveries)
A morning brief is sent to the operations team listing flagged deliveries with their risk scores and top contributing factors. The team can proactively contact recipients, reschedule, or assign experienced drivers.
4
Outcomes feed back into the model
log_outcome() → retrain_monthly()
Every delivery outcome is logged and the model is retrained monthly on the latest data. It gets sharper over time as it learns the business's current delivery patterns.
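
Steps 2 and 3 can be sketched in a few lines. This is an illustrative outline, not the deployed code: the `ScoredDelivery` type, the function signatures, the sample manifest, and the brief format are assumptions, as is the idea that the 0–100 risk score is simply the failure probability multiplied by 100.

```python
from dataclasses import dataclass

THRESHOLD = 0.72  # probability threshold quoted in step 2

@dataclass
class ScoredDelivery:
    delivery_id: str
    failure_probability: float  # model output in [0, 1]
    top_factors: list[str]      # hypothetical: names of the main risk drivers

    @property
    def risk_score(self) -> int:
        """Probability rescaled to a 0-100 score (assumed mapping)."""
        return round(self.failure_probability * 100)

def flag_high_risk(deliveries, threshold=THRESHOLD):
    """Keep deliveries above the threshold, highest risk first."""
    flagged = [d for d in deliveries if d.failure_probability > threshold]
    return sorted(flagged, key=lambda d: d.failure_probability, reverse=True)

def format_daily_brief(flagged):
    """Render flagged deliveries as plain-text lines for a morning brief."""
    lines = [f"{len(flagged)} deliveries flagged as high-risk"]
    for d in flagged:
        factors = ", ".join(d.top_factors)
        lines.append(f"{d.delivery_id}: risk {d.risk_score} ({factors})")
    return "\n".join(lines)

# Made-up manifest for illustration.
manifest = [
    ScoredDelivery("D-1001", 0.91, ["address history", "signature required"]),
    ScoredDelivery("D-1002", 0.35, ["time window"]),
    ScoredDelivery("D-1003", 0.74, ["customer history"]),
]
print(format_daily_brief(flag_high_risk(manifest)))
```

Sorting by risk puts the deliveries most worth a phone call at the top of the brief, which matters when the team cannot follow up on every flag.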

Technology used

Modeling
scikit-learn
Language
Python
API
FastAPI
Database
PostgreSQL
Scheduling
Airflow
Hosting
Vercel + AWS

What this delivered

38% reduction in failed delivery attempts. Within 60 days of deployment, first-attempt success rate improved from 86% to 91.4% — directly measurable in the operations data.
~$4,000/month in redelivery cost saved. Conservative estimate based on per-failed-stop cost and volume. ROI positive within the first month of operation.
Operations team has a morning brief they actually use. The daily flagged-deliveries report became a core part of the morning standup within two weeks of launch.
No third-party ML platform costs. The model runs on infrastructure the client already had — no $2,000/month SaaS subscription required.
Self-improving over time. Monthly retraining means the model gets sharper as it accumulates more outcome data — the value compounds.

Operational data you're not using?

We build predictive models for logistics, operations, and supply chain teams — purpose-built for your data.


Want something built like this?

Khoda Consulting designs and ships ML models, data pipelines, and AI solutions for growing businesses.
