ML / Predictive Model

Company Z Delivery Predictor

A regional logistics operator was absorbing significant cost from failed delivery attempts: drivers arriving when no one was home, at wrong addresses, or at sites they could not access. We built a machine learning model that flags at-risk deliveries before dispatch so the team can intervene proactively.

Client: Company Z · Khoda Consulting · Python · scikit-learn · FastAPI · PostgreSQL · February 2026

38%

// failed_attempts_reduced

6wk

// concept_to_deploy

91%

// model_precision

// third_party_ml_platform

// the_problem

Every failed delivery costs twice.

Company Z was processing 800–1,200 deliveries per day across a regional network. Roughly 14% failed on the first attempt, and each failure meant a wasted driver visit, a redelivery, and an unhappy customer.

Each failed attempt cost the business in fuel, driver time, and redelivery logistics, estimated at $8–12 per failed stop. At a 14% failure rate, that added up to over $10,000 per month in avoidable cost.

The operations team had no predictive signal — failures only became visible after they happened. There was no way to prioritize or intervene before dispatch.

Historical delivery data existed but was siloed in a legacy system — never analyzed, never used to inform routing or scheduling decisions.

// measured_impact

Before & after

Before

14%

First-attempt delivery failure rate

→

After

8.6%

38% reduction in failed attempts within 60 days

Before

No predictive signal: failures only visible after the fact

→

After

91%

Precision on at-risk delivery flags before dispatch

Before

~$10k

Monthly cost from failed delivery attempts

→

After

~$6k

~$4,000/month saved in redelivery costs
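The figures in these cards can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above and assumes, as a simplification, that monthly cost scales linearly with the number of failed attempts.

```python
# Figures quoted in the before/after cards.
failure_before = 0.14         # first-attempt failure rate before deployment
failure_after = 0.086         # first-attempt failure rate after 60 days
monthly_cost_before = 10_000  # approximate monthly cost of failed attempts (USD)

# Relative reduction in failed attempts.
relative_reduction = (failure_before - failure_after) / failure_before

# Assumption: cost is proportional to the number of failed attempts.
monthly_savings = monthly_cost_before * relative_reduction

print(f"relative reduction: {relative_reduction:.1%}")
print(f"estimated monthly savings: ${monthly_savings:,.0f}")
```

Under that assumption the reduction comes out just above 38% and the savings just under $4,000 per month, in line with the rounded figures shown.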

// model_features

What the model learns from

We trained a gradient boosting classifier on 18 months of historical delivery data: 340,000+ delivery records with known outcomes. The model uses eight feature categories to score each delivery before dispatch; six of them are described here.

Feature 01

Address History

Prior delivery success rate at the exact address and surrounding area — the strongest single predictor.

Feature 02

Time Window

Requested delivery window vs. historical success rates by time-of-day and day-of-week for that zone.

Feature 03

Customer History

The recipient's own delivery success rate across all prior orders — repeat offenders are predictable.

Feature 04

Package Type

Signature-required and oversized packages fail at significantly higher rates — encoded as model features.

Feature 05

Zone Density

Delivery zone characteristics — apartment buildings, gated communities, and commercial addresses each have distinct patterns.

Feature 06

Weather Signal

Adverse weather correlates with both driver delays and recipient unavailability — integrated via weather API.
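To make the setup concrete, here is a minimal sketch of training a gradient boosting classifier on features like those described above. It is not the production pipeline: the data is synthetic, the column names and the label rule are hypothetical, and the choice of scikit-learn's `GradientBoostingClassifier` with default settings is an assumption.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000

# Synthetic stand-ins for six feature categories (hypothetical columns).
X = np.column_stack([
    rng.uniform(0.5, 1.0, n),  # address_success_rate
    rng.uniform(0.5, 1.0, n),  # window_success_rate
    rng.uniform(0.5, 1.0, n),  # customer_success_rate
    rng.integers(0, 2, n),     # signature_required (0/1)
    rng.integers(0, 3, n),     # zone_type (integer-coded)
    rng.integers(0, 2, n),     # adverse_weather (0/1)
])

# Synthetic label (1 = failed first attempt): lower historical success
# rates, signature requirements, and bad weather all raise the risk.
risk = 1.5 - X[:, 0] - 0.5 * X[:, 2] + 0.2 * X[:, 3] + 0.1 * X[:, 5]
y = (risk + rng.normal(0.0, 0.1, n) > 0.7).astype(int)

# Hold out a test set, keeping the failure rate equal in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = GradientBoostingClassifier(random_state=0)
model.fit(X_train, y_train)

# Column 1 of predict_proba is the probability of class 1 (failure).
failure_probability = model.predict_proba(X_test)[:, 1]
print(f"held-out accuracy: {model.score(X_test, y_test):.2f}")
```

Tree-based models tolerate integer-coded categories such as `zone_type` reasonably well, which is one reason they suit mixed tabular data like this.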

// model_performance

How the model performs

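Evaluated on a held-out test set of 40,000 deliveries the model had never seen. We optimized for precision over recall because every flag consumes limited operations time: a false positive (flagging a delivery that would have succeeded) spends an intervention for nothing, whereas a false negative (missing an at-risk delivery) leaves that delivery no worse off than it was before the model existed.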

91%

Precision

84%

Recall

0.94

AUC-ROC

87%

F1 Score
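The F1 score is the harmonic mean of precision and recall, so the fourth figure follows from the first two. A quick check using the reported values:

```python
precision = 0.91  # reported precision
recall = 0.84     # reported recall

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"F1 = {f1:.3f}")
```

This rounds to the 87% shown above.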

// how_it_works

From prediction to action

Nightly batch scoring

scheduler → extract_deliveries() → score_batch()

Every evening, the next day's delivery manifest is extracted from the operations system and passed through the model. Each delivery gets a risk score from 0–100.
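As an illustration of the scoring step, the sketch below converts a model's failure probability into a 0–100 risk score. The `score_batch` signature, the `ScoredDelivery` type, and the `StubModel` are hypothetical; the only assumption about the real model is that it exposes a scikit-learn style `predict_proba`.

```python
from dataclasses import dataclass


@dataclass
class ScoredDelivery:
    delivery_id: str
    risk_score: int  # 0-100, higher means more likely to fail


def score_batch(model, delivery_ids, features):
    """Score each delivery; column 1 of predict_proba is P(failure)."""
    probabilities = model.predict_proba(features)
    return [
        ScoredDelivery(delivery_id, round(100 * float(p[1])))
        for delivery_id, p in zip(delivery_ids, probabilities)
    ]


class StubModel:
    """Stand-in for the trained classifier (illustration only).

    Treats the first feature as a historical success rate and returns
    [P(success), P(failure)] for each row.
    """

    def predict_proba(self, X):
        return [[row[0], 1 - row[0]] for row in X]


scored = score_batch(StubModel(), ["D-1", "D-2"], [[0.9], [0.2]])
for item in scored:
    print(item.delivery_id, item.risk_score)
```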

At-risk deliveries flagged

threshold = 0.72 → flag_high_risk(deliveries)

Deliveries with a predicted failure probability above 0.72 (a risk score above 72 on the 0–100 scale) are flagged as high-risk. The threshold was tuned on validation data to balance precision against the operations team's capacity to act on flags.
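A minimal sketch of the flagging step, plus a helper for checking precision at a candidate threshold on validation data. The dictionary keys, the sample manifest, and the validation values are hypothetical; only the 0.72 threshold comes from the text.

```python
THRESHOLD = 0.72  # probability threshold quoted in the text


def flag_high_risk(deliveries, threshold=THRESHOLD):
    """Return deliveries whose failure probability exceeds the threshold."""
    return [d for d in deliveries if d["failure_probability"] > threshold]


def precision_at(threshold, probabilities, outcomes):
    """Share of flagged deliveries that actually failed (1 = failed)."""
    flagged = [y for p, y in zip(probabilities, outcomes) if p > threshold]
    return sum(flagged) / len(flagged) if flagged else None


# Hypothetical manifest rows for illustration.
manifest = [
    {"delivery_id": "D-1", "failure_probability": 0.91},
    {"delivery_id": "D-2", "failure_probability": 0.72},
    {"delivery_id": "D-3", "failure_probability": 0.15},
]
flagged_ids = [d["delivery_id"] for d in flag_high_risk(manifest)]
print(flagged_ids)

# Hypothetical validation scores and outcomes.
validation_precision = precision_at(0.72, [0.9, 0.8, 0.75, 0.3], [1, 1, 0, 0])
print(validation_precision)
```

Raising the threshold generally trades recall for precision and shortens the daily list, which is how it can be matched to the number of interventions the team can handle.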

Operations team notified

send_daily_brief(flagged_deliveries)

A morning brief is sent to the operations team listing flagged deliveries with their risk scores and top contributing factors. The team can proactively contact recipients, reschedule, or assign experienced drivers.
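One way the brief could be assembled is sketched below. The field names and the message layout are hypothetical, and the contributing factors are taken as given inputs, since the source does not say how they are computed.

```python
def build_daily_brief(flagged):
    """Format flagged deliveries as plain text, highest risk first."""
    ordered = sorted(flagged, key=lambda d: d["risk_score"], reverse=True)
    lines = [f"Flagged deliveries: {len(ordered)}"]
    for d in ordered:
        factors = ", ".join(d["top_factors"])
        lines.append(f"{d['delivery_id']}  risk {d['risk_score']}  ({factors})")
    return "\n".join(lines)


# Hypothetical flagged deliveries for illustration.
flagged = [
    {"delivery_id": "D-1", "risk_score": 78,
     "top_factors": ["signature required", "evening window"]},
    {"delivery_id": "D-2", "risk_score": 93,
     "top_factors": ["address history", "gated community"]},
]
brief = build_daily_brief(flagged)
print(brief)
```

Sorting by risk score puts the deliveries most worth a phone call or a reschedule at the top of the list.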

Outcomes feed back into the model

log_outcome() → retrain_monthly()

Every delivery outcome is logged and the model is retrained monthly on the latest data. It gets sharper over time as it learns from the business's current patterns.
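The feedback loop can be sketched as an outcome log plus a query that selects the rows for the next retraining run. This example uses an in-memory SQLite database as a stand-in for PostgreSQL; the table name, the columns, and the trailing window of roughly 18 months are assumptions, not details from the project.

```python
import sqlite3
from datetime import date, timedelta


def init_db(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS delivery_outcomes ("
        "delivery_id TEXT PRIMARY KEY, "
        "delivered_on TEXT NOT NULL, "
        "failed INTEGER NOT NULL)"
    )


def log_outcome(conn, delivery_id, delivered_on, failed):
    """Record whether a delivery failed on its first attempt."""
    conn.execute(
        "INSERT OR REPLACE INTO delivery_outcomes VALUES (?, ?, ?)",
        (delivery_id, delivered_on.isoformat(), int(failed)),
    )
    conn.commit()


def load_training_rows(conn, as_of, window_days=548):
    """Fetch outcomes from the trailing window (about 18 months, assumed)."""
    cutoff = (as_of - timedelta(days=window_days)).isoformat()
    # ISO-8601 date strings sort chronologically, so text comparison works.
    return conn.execute(
        "SELECT delivery_id, delivered_on, failed FROM delivery_outcomes "
        "WHERE delivered_on >= ? AND delivered_on <= ? ORDER BY delivered_on",
        (cutoff, as_of.isoformat()),
    ).fetchall()


conn = sqlite3.connect(":memory:")
init_db(conn)
log_outcome(conn, "D-1", date(2026, 1, 10), failed=True)
log_outcome(conn, "D-2", date(2023, 1, 1), failed=False)  # outside the window
rows = load_training_rows(conn, as_of=date(2026, 2, 1))
print(rows)
```

A monthly job would pass the returned rows, joined with their features, to the same training routine used for the initial model.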

// tech_stack

Technology used

Modeling

scikit-learn

Language

Python

API

FastAPI

Database

PostgreSQL

Scheduling

Airflow

Hosting

Vercel + AWS

// outcomes

What this delivered

38% reduction in failed delivery attempts. Within 60 days of deployment, first-attempt success rate improved from 86% to 91.4% — directly measurable in the operations data.

~$4,000/month in redelivery cost saved. Conservative estimate based on per-failed-stop cost and volume. ROI positive within the first month of operation.

Operations team has a morning brief they actually use. The daily flagged-deliveries report became a core part of the morning standup within two weeks of launch.

No third-party ML platform costs. The model runs on infrastructure the client already had — no $2,000/month SaaS subscription required.

Self-improving over time. Monthly retraining means the model gets sharper as it accumulates more outcome data — the value compounds.

Want something built like this?

Khoda Consulting designs and ships ML models, data pipelines, and AI solutions for growing businesses.
