Vision + OCR document AI

Premier Scan

A logistics invoice processing workflow that combines vision models, OCR, fuzzy matching, and structured exports for document-heavy operations.

  • 80% efficiency gain
  • 45-column CSV export
  • 85% fuzzy matching
  • Vision + OCR document AI: Public architecture view with sensitive client data removed
  • System flow: Inputs, tools, model routing, storage, and review states
  • Production controls: Auth, cost tracking, audit logs, and human review

Problem

Operations teams had to manually extract fields from shipping invoices and reconcile them into structured logistics formats.

Approach

  • Combined OCR and vision extraction so low-quality scans could be cross-checked from multiple signals.
  • Normalized currencies, countries, and carrier fields before export.
  • Used fuzzy matching thresholds to align extracted charges with expected charge-code mappings.
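
The threshold-based matching described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual matcher: it uses the standard-library `difflib.SequenceMatcher` as a stand-in similarity measure, the `CHARGE_CODES` table and `match_charge` helper are hypothetical, and the 0.85 cutoff is an assumed reading of the "85% fuzzy matching" figure.

```python
from difflib import SequenceMatcher

# Hypothetical charge-code table; the real mappings are not public.
CHARGE_CODES = {
    "FUEL SURCHARGE": "FSC",
    "CUSTOMS CLEARANCE": "CUS",
    "TERMINAL HANDLING": "THC",
}


def match_charge(label, threshold=0.85):
    """Return (code, score) for the best match, or (None, score) below the threshold."""
    # Collapse whitespace and uppercase so formatting noise does not lower the score.
    cleaned = " ".join(label.upper().split())
    best_name, best_score = None, 0.0
    for name in CHARGE_CODES:
        score = SequenceMatcher(None, cleaned, name).ratio()
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= threshold:
        return CHARGE_CODES[best_name], best_score
    # Below the threshold the charge stays unmapped rather than being guessed.
    return None, best_score
```

With this sketch, a slightly misread label such as "Fuel Surcharg" still maps to the hypothetical code FSC, while an unrelated label such as "Insurance" comes back unmapped and can be left for a person to resolve.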

Architecture

  • PDF or image input is converted into processable page images.
  • OCR and vision extraction produce candidate fields.
  • Normalization and matching rules clean extracted data.
  • Exports are generated for downstream logistics workflows.
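
The middle stages of this flow (merging candidate fields, normalizing them, and exporting) might look something like the sketch below. It assumes the OCR and vision steps have already produced plain dictionaries of candidate fields. All names here (`merge_candidates`, `normalize`, `export_csv`, the lookup tables) are hypothetical, and the rule that the vision value is kept when the two extractors disagree is an assumption made for illustration.

```python
import csv
import io

# Hypothetical lookup tables; a real deployment would hold far more entries.
COUNTRY_ALIASES = {"UNITED STATES": "US", "USA": "US", "GERMANY": "DE"}
CURRENCY_SYMBOLS = {"$": "USD", "€": "EUR", "£": "GBP"}


def merge_candidates(ocr_fields, vision_fields):
    """Combine both extractions and list the fields where they disagree."""
    merged, needs_review = {}, []
    for key in sorted(set(ocr_fields) | set(vision_fields)):
        a, b = ocr_fields.get(key), vision_fields.get(key)
        # Assumption: keep the vision value when present, else fall back to OCR.
        merged[key] = b if b is not None else a
        if a is not None and b is not None and a != b:
            needs_review.append(key)
    return merged, needs_review


def normalize(fields):
    """Map countries, currencies, and carrier names onto consistent values."""
    out = dict(fields)
    country = out.get("country", "").strip().upper()
    out["country"] = COUNTRY_ALIASES.get(country, country)
    currency = out.get("currency", "").strip().upper()
    out["currency"] = CURRENCY_SYMBOLS.get(currency, currency)
    out["carrier"] = " ".join(out.get("carrier", "").split()).upper()
    return out


def export_csv(rows, columns):
    """Write rows to CSV text with a fixed column order; missing fields stay blank."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, extrasaction="ignore", restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Fixing the column list up front keeps every export the same shape, which is what a wide fixed-format export needs: fields that were not extracted come through as empty cells instead of shifting the columns.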

Production

  • Per-document cost tracking keeps vision-model usage visible.
  • Public examples use synthetic or heavily redacted invoice visuals.
  • Human review remains available for low-confidence fields.
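
As a rough sketch of two of these controls, the snippet below records vision-model usage per document in SQLite and lists the fields that fall under a review cutoff. The table layout, the per-token prices, the 0.80 cutoff, and all function names are assumptions for illustration; none of them come from the production system.

```python
import sqlite3

# Hypothetical per-1K-token prices; real rates depend on the deployment.
PRICE_PER_1K = {"input": 0.005, "output": 0.015}
REVIEW_THRESHOLD = 0.80  # assumed cutoff for routing a field to human review


def open_db(path=":memory:"):
    """Open the usage database and create the (assumed) usage table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS usage ("
        "document_id TEXT, input_tokens INTEGER, output_tokens INTEGER, cost_usd REAL)"
    )
    return conn


def record_usage(conn, document_id, input_tokens, output_tokens):
    """Store one model call against a document and return its estimated cost."""
    cost = (
        input_tokens * PRICE_PER_1K["input"] + output_tokens * PRICE_PER_1K["output"]
    ) / 1000
    conn.execute(
        "INSERT INTO usage VALUES (?, ?, ?, ?)",
        (document_id, input_tokens, output_tokens, cost),
    )
    conn.commit()
    return cost


def document_cost(conn, document_id):
    """Sum the estimated cost of every recorded call for one document."""
    row = conn.execute(
        "SELECT COALESCE(SUM(cost_usd), 0) FROM usage WHERE document_id = ?",
        (document_id,),
    ).fetchone()
    return row[0]


def fields_for_review(confidences, threshold=REVIEW_THRESHOLD):
    """Return the names of fields whose confidence is below the threshold."""
    return sorted(name for name, score in confidences.items() if score < threshold)
```

Keying every usage row by document means a single expensive invoice (for example, one retried several times) shows up directly in a per-document total.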

Result

The workflow reduced repetitive data entry and made document extraction more auditable.

Confidentiality note: Raw invoice screenshots are not included in the public build. Use synthetic visuals or heavily redacted media only.

Stack

Python, Flask, Azure OpenAI, GPT-4o Vision, Tesseract OCR, SQLite, Docker