A Scalable AI-Powered eReceipt Intelligence System
Tech Lead, Senior Data Scientist | Fetch Rewards
Product Strategy · Technical Execution · Cross-Functional Delivery
10M+ users · 1M+ receipts/day · 600+ brand partners
Executive Summary
At Fetch Rewards, a consumer rewards app with 10M+ users, I led the product strategy and cross-functional delivery of an in-house AI-powered receipt intelligence platform that replaced a high-cost third-party vendor. The solution improved product match accuracy across thousands of retailers, restaurants, and fast food chains, reduced operational overhead, and delivered multimillion-dollar annual cost savings while supporting 10M+ users and 600+ brand partners.
The Business Problem
Fetch's receipt processing was handled by a third-party vendor, creating major issues:
- High recurring operational cost: vendor pricing scaled with receipt volume, increasingly expensive as the user base grew
- Limited control over accuracy and roadmap: no ability to iterate on extraction logic or prioritize improvements that mattered to our business
- Slow iteration cycles: every improvement request or addition of a new partner/retailer went through the vendor's timeline, not ours
- Poor customer support: vendor dependency created bottlenecks in resolving user-facing issues quickly
The business needed higher accuracy, faster iteration, lower cost, and scalable infrastructure. This case study covers the product strategy and leadership perspective. For the technical deep-dive into the ML system, see Digital Receipt Information Extraction System.
Product Vision
Build a scalable, in-house receipt intelligence platform that:
- Extracts structured purchase data from digital receipts (eReceipts)
- Improves match accuracy and reward attribution
- Improves customer experience
- Reduces vendor dependency and cost
- Integrates seamlessly into the mobile app experience
- Enables future experimentation and personalization
The goal was not just technical replacement; it was strategic product ownership.
Key Stakeholders
The receipt processing flow spans the full user journey: a user connects their email account, the app retrieves eReceipts via IMAP, the extraction service parses structured data from HTML, the matching system maps items to a product catalog, and rewards are attributed to the user and reported to brand partners.
This required coordination across Product Management, Mobile Engineering, Backend Engineering, MLOps/SRE, QA, Brand Partnerships, and Operations. Downstream teams in analytics, reporting, and partner management depended on the quality of extracted data.
My role was to integrate this system into Fetch's receipt processing flow without breaking downstream services, user experience, or revenue.
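The flow above can be sketched as a composition of stages. This is a minimal illustration only; the function and field names are invented for the example and do not reflect Fetch's actual services.

```python
from dataclasses import dataclass, field

@dataclass
class Receipt:
    raw_html: str
    items: list = field(default_factory=list)
    matched: list = field(default_factory=list)

def extract_items(receipt: Receipt) -> Receipt:
    """Parse structured line items out of the eReceipt HTML (placeholder logic)."""
    receipt.items = [line.strip() for line in receipt.raw_html.splitlines() if line.strip()]
    return receipt

def match_to_catalog(receipt: Receipt, catalog: dict) -> Receipt:
    """Map each extracted item to a product catalog entry, if one exists."""
    receipt.matched = [catalog[item] for item in receipt.items if item in catalog]
    return receipt

def attribute_rewards(receipt: Receipt, points_per_match: int = 10) -> int:
    """Award points for every successfully matched item."""
    return points_per_match * len(receipt.matched)

# The stages compose into the end-to-end flow: extract -> match -> attribute.
catalog = {"OAT MILK 1L": "sku-123", "GRANOLA BAR": "sku-456"}
r = Receipt(raw_html="OAT MILK 1L\nGRANOLA BAR\nUNKNOWN ITEM")
points = attribute_rewards(match_to_catalog(extract_items(r), catalog))
```

Keeping each stage a pure transformation is what made it possible to swap the vendor's extraction step for the in-house model without touching matching or attribution.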
Defining Success Metrics
Before building, we aligned on measurable KPIs across three dimensions:
Accuracy Metrics
- Product match accuracy
- Extraction precision & recall
Business Metrics
- Reward attribution accuracy
- Points awarded accuracy
- Offer redemption accuracy
- Partner reporting reliability
- Retailer-level coverage and volume
- Cost per receipt processed
Operational Metrics
- Processing latency
- Manual review volume
- System uptime
Clear metrics helped align engineering effort with business impact.
Strategy & Tradeoffs
Build vs Buy Analysis
We evaluated vendor cost against internal investment, speed to market against long-term scalability, and accuracy gains against operational complexity, alongside the customer-experience benefits of in-house debugging and rapid fixes. The decision: invest in building an internal platform for long-term cost efficiency and product control.
Key Tradeoffs Managed
- Accuracy vs Processing Latency
- Model Complexity vs Operational Stability
- Automation vs Human-in-the-Loop Review
- On-Device Model vs Cloud-Based Inference
- Short-Term Fixes vs Scalable Architecture
These decisions were made collaboratively across engineering and business teams.
Roadmap & Execution
Phase 1: Research & Foundation
- R&D and benchmarking of ML models for information extraction
- Build placeholder model to enable parallel infrastructure development
- Collaboration with mobile engineers for IMAP integrations to retrieve HTML receipts
- PII masking pipeline
- Training and inference pipeline setup
- Build internal annotation team, data annotation infrastructure, and tooling
- Establish standardized experimentation framework for ML model development
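The PII masking step can be illustrated with a minimal regex-based sketch. This is a toy example only: a production pipeline would cover many more entity types (names, addresses, card numbers) and typically combines pattern matching with NER models.

```python
import re

# Illustrative patterns; real-world email and phone formats are far messier.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b")

def mask_pii(text: str) -> str:
    """Replace emails and phone numbers with fixed placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text

print(mask_pii("Ship to jane.doe@example.com, call 555-123-4567"))
# → Ship to [EMAIL], call [PHONE]
```

Masking before annotation and training keeps raw PII out of datasets and model inputs entirely, rather than relying on downstream access controls.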
Phase 2: Accuracy Optimization
- Improve extraction quality
- Refine matching algorithms
- Shadow testing against vendor output
- Introduce monitoring dashboards
Phase 3: Operational Scaling
- CI/CD automation
- Production deployment on AWS
- Load testing and horizontal scaling of inference endpoints
- Define SLAs for receipt processing pipeline
- Service telemetry (uptime, p90, p99 latency)
- Monitoring & alerting system
- Human-in-the-loop feedback loop
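The p90/p99 telemetry in this phase boils down to percentile computations over latency samples; a minimal nearest-rank sketch (sample values invented for illustration):

```python
import math

def percentile(samples: list, p: float):
    """Nearest-rank percentile, e.g. p=0.99 for p99 latency."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered)))
    return ordered[rank - 1]

latencies_ms = [120, 95, 240, 180, 310, 150, 400, 130, 170, 110]
p90 = percentile(latencies_ms, 0.90)
p99 = percentile(latencies_ms, 0.99)

# An SLA check is then a simple threshold comparison against the tail latency.
SLA_P99_MS = 500
assert p99 <= SLA_P99_MS
```

Tracking p90/p99 rather than the mean matters here because tail latency, not average latency, is what users and partner-facing SLAs actually experience.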
What I Owned
Dual role: hands-on IC on ML R&D while leading the team
- Technical design of the ML pipeline
- R&D of NLP/ML models
- Backlog prioritization and sprint planning
- Cross-team coordination
- Technical Design Review (TDR) presentations
- Annotation team management
- KPI monitoring
- Maintained project Kanban board, codebase, Bitbucket repos, and Confluence documentation
- UAT and release readiness
Cross-Functional Leadership
I led an 8-member cross-functional team of data scientists, ML engineers, backend engineers, a data analyst, and QA:
- Led daily standups and brokered cross-team agreements
- Translated ambiguous business goals into technical requirements
- Defined sprint-ready user stories
- Negotiated scope tradeoffs
- Removed blockers during execution
- Presented progress and impact to C-suite executives
- Partnered with the data integrity team to establish human-in-the-loop weekly audits on real-world production data, creating a structured feedback loop between model performance and ground truth
- Influenced product and engineering teams to prioritize building an internal debugging and observability app where the team could inspect model predictions end-to-end and identify whether a failed offer trigger was a model extraction failure or a downstream service issue
- Designed and led the production deployment strategy with phased rollout stages: shadow testing against vendor output, canary deployment on a subset of live traffic, initial release on Android (lower volume) to allow time for reviewing and monitoring production data before ramping up to iOS (higher volume), gradual rollout by retailer volume tiers, and full production cutover with rollback safeguards
The platform touched multiple product surfaces (mobile app, backend services, and analytics dashboards) requiring tight coordination under startup-mode deadlines.
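The shadow-testing stage of that rollout can be sketched as a per-receipt diff between vendor output and the in-house model's output. The function and field names here are invented for illustration; the real comparison also weighed item quantities, prices, and retailer metadata.

```python
def shadow_compare(vendor_items: list, model_items: list) -> dict:
    """Per-receipt agreement stats between vendor output and shadow model output."""
    vendor, model = set(vendor_items), set(model_items)
    agreed = vendor & model
    return {
        "agreement_rate": len(agreed) / len(vendor) if vendor else 1.0,
        "model_only": sorted(model - vendor),   # candidate improvements
        "vendor_only": sorted(vendor - model),  # candidate regressions
    }

stats = shadow_compare(["MILK", "EGGS", "BREAD"], ["MILK", "EGGS", "SALSA"])
# Disagreements like these are exactly what the weekly
# human-in-the-loop audits adjudicated against ground truth.
```

Because the shadow model ran on live traffic without affecting user-facing results, disagreement rates could be tracked per retailer before any cutover decision.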
Results & Impact
- Accuracy: significantly improved product match accuracy, validated by human evaluations on production data and meeting senior leadership targets
- Cost: delivered multimillion-dollar annual savings by replacing third-party vendor
- Operations: reduced manual review workload through structured human-in-the-loop feedback loop
- Business: improved reward attribution reliability for 600+ brand partners, enabling higher targeting precision and improved reporting trust
- Scalability: successfully processed 1M+ receipts per day at production scale
- Innovation: developed a novel, state-of-the-art algorithm for digital receipt extraction directly from unstructured HTML layout/DOM, bypassing traditional OCR pipelines entirely (ML Technical Deep-Dive · US Patent)
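To make the OCR-free idea concrete, here is a toy illustration of pulling line items straight from receipt markup using Python's standard-library HTML parser. The class names (`item-name`, `item-price`) are invented for this example; real eReceipts vary wildly in markup across retailers, which is precisely why a learned model over the DOM, rather than hand-written rules like these, was needed.

```python
from html.parser import HTMLParser

class ReceiptItemParser(HTMLParser):
    """Collect (name, price) pairs from spans with known class names."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._field = None  # which field the next text node belongs to

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class", "")
        if cls in ("item-name", "item-price"):
            self._field = cls

    def handle_data(self, data):
        if self._field == "item-name":
            self.items.append({"name": data.strip()})
        elif self._field == "item-price":
            self.items[-1]["price"] = data.strip()
        self._field = None

html_receipt = """
<div class="line"><span class="item-name">Oat Milk 1L</span>
<span class="item-price">$3.49</span></div>
"""
parser = ReceiptItemParser()
parser.feed(html_receipt)
```

Working on the DOM directly preserves structure (which text is a name vs a price) that OCR pipelines first destroy by rasterizing and then have to recover.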
What Made This a Product Win (Not Just ML)
This initiative succeeded because:
- Defined measurable business KPIs before building
- Treated it as a platform, not a one-off feature
- Balanced short-term business pressure with long-term architecture
- Adopted a fast-experimentation, fail-early mindset
- Aligned stakeholders early
- Implemented monitoring & feedback loops
- Relied on a team of hardworking and talented data scientists, ML engineers, and backend engineers who made this possible
Lessons Learned
- Vendor dependency limits product innovation
- Early KPI alignment prevents misaligned engineering effort
- Monitoring and post-launch feedback are as critical as initial accuracy
- Cross-functional trust accelerates delivery more than technical brilliance alone
- Product success requires operational scalability, not just algorithmic performance