AI-Driven Call Analytics & Summarization Pipeline

An end-to-end, event-driven pipeline built on Azure Service Bus that ingests call center recordings, transcribes the audio to text, and uses generative AI to produce structured summaries, payment-intent scores, and agent-performance metrics. It runs in production and captures call content and intent with 80–85% accuracy.

Project Overview

The business needed actionable insights from thousands of call center interactions but lacked an automated way to process and evaluate them. After assessing different call types, the business team identified collections calls as the highest-value starting point — these calls have clearly defined objectives (payment commitment, due-date negotiation) that make them ideal for structured evaluation.

The project delivers a fully automated pipeline that:

Ingests call metadata and audio recordings from the telephony provider.
Transcribes the audio into text using Azure Speech Services.
Applies generative AI to extract key metrics: general summary, customer willingness to pay, committed payment date, and agent-customer interaction quality.
Persists every step and result in CosmosDB, building a per-customer call history for downstream systems.

This pipeline was later adopted and extended by the Data Engineering team to cover additional call types and feed broader analytics across the organization.

Architecture

The system spans two cloud providers: the call center provider's tool requires an entry point on AWS, while the core processing stays on the company's Azure infrastructure.

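High-Level Flow

Call center provider webhook -> AWS Lambda -> Topic 1 (Call Metadata) -> Topic 2 (Audio-to-Text) -> Topic 3 (AI Metrics) -> enriched record in CosmosDB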

Pipeline Stages (Service Bus Topics)

The pipeline is sequenced through Azure Service Bus topics: each stage handles one concern and forwards its result to the next.

Stage	Service Bus Topic	Responsibility
Ingestion	Topic 1 — Call Metadata	Receives call data from the AWS Lambda bridge, downloads the audio file from the call center provider’s platform, and stores metadata in CosmosDB to avoid redundant lookups
Transcription	Topic 2 — Audio-to-Text	Sends the audio to Azure Speech Services for transcription and attaches the resulting text to the call record
Analysis	Topic 3 — AI Metrics	Processes the transcript through generative AI models to extract structured metrics, then persists the final enriched record in CosmosDB
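
The hand-off between the three stages can be sketched in plain Python. This is a minimal in-memory sketch under stated assumptions, not the production code: the topic names, the CallRecord fields, and the transcribe and analyze callables are hypothetical stand-ins for the Service Bus topics, Azure Speech Services, and Azure OpenAI, and a dict stands in for CosmosDB.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical topic names; the real Service Bus topic names are not given here.
TOPIC_METADATA = "call-metadata"
TOPIC_TRANSCRIPTION = "audio-to-text"
TOPIC_ANALYSIS = "ai-metrics"


@dataclass
class CallRecord:
    call_id: str
    audio_url: str
    transcript: Optional[str] = None
    metrics: dict = field(default_factory=dict)


def make_handlers(
    store: Dict[str, CallRecord],
    transcribe: Callable[[str], str],
    analyze: Callable[[str], dict],
) -> Dict[str, Callable[[CallRecord], Optional[str]]]:
    """Build one handler per topic; each returns the next topic or None."""

    def ingest(record: CallRecord) -> Optional[str]:
        # Metadata is stored once so later stages do not look it up again.
        store[record.call_id] = record
        return TOPIC_TRANSCRIPTION

    def transcription(record: CallRecord) -> Optional[str]:
        record.transcript = transcribe(record.audio_url)
        return TOPIC_ANALYSIS

    def analysis(record: CallRecord) -> Optional[str]:
        record.metrics = analyze(record.transcript or "")
        store[record.call_id] = record  # final enriched record
        return None  # last stage: nothing to forward

    return {
        TOPIC_METADATA: ingest,
        TOPIC_TRANSCRIPTION: transcription,
        TOPIC_ANALYSIS: analysis,
    }


def run_pipeline(record: CallRecord, handlers) -> List[str]:
    """Simulate message forwarding and return the topics visited, in order."""
    topic: Optional[str] = TOPIC_METADATA
    visited: List[str] = []
    while topic is not None:
        visited.append(topic)
        topic = handlers[topic](record)
    return visited
```

Because each handler only knows the name of the next topic, a stage can be retried or scaled without touching the others, which is the decoupling the topic-per-stage layout is meant to provide.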

Component Breakdown

AWS Lambda — Entry point that receives webhook events from the call center provider tool and forwards them to Azure Service Bus. This cross-cloud bridge was required by the provider’s integration contract.
Azure Service Bus — Backbone of the pipeline. Topics and subscriptions decouple each processing stage, enabling independent scaling and retry logic.
Azure Functions — Stateless compute units attached to each Service Bus topic subscription, handling the processing logic for each stage.
Azure Speech Services — Converts call audio recordings into text transcripts.
Azure OpenAI — Generative AI models that analyze transcripts and produce structured output (summaries, scores, dates).
CosmosDB — Document database storing call metadata, transcripts, AI-generated metrics, and the full call history per customer.
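
The AWS Lambda bridge described in the first item above could look roughly like the following. It is a sketch under assumptions: the webhook field names are hypothetical (the provider's payload schema is not shown in this write-up), the event is assumed to carry the webhook JSON as a string under "body", and the actual send to Azure Service Bus is injected as a plain callable so the example runs without any cloud SDK.

```python
import json
from typing import Callable

# Hypothetical webhook fields; the provider's real schema may differ.
REQUIRED_FIELDS = ("call_id", "customer_id", "recording_url")


def build_message(webhook_body: str) -> str:
    """Validate the webhook payload and return the JSON body for the first topic."""
    payload = json.loads(webhook_body)
    if not isinstance(payload, dict):
        raise ValueError("webhook payload must be a JSON object")
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        raise ValueError("webhook payload is missing: " + ", ".join(missing))
    # Forward only the fields the pipeline needs.
    return json.dumps({name: payload[name] for name in REQUIRED_FIELDS}, sort_keys=True)


def make_handler(send: Callable[[str], None]):
    """Return a Lambda-style handler that forwards valid events through `send`."""

    def handler(event, context):
        try:
            message = build_message(event.get("body") or "{}")
        except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
            return {"statusCode": 400, "body": str(exc)}
        send(message)
        return {"statusCode": 202, "body": "accepted"}

    return handler
```

In a deployed bridge, send would wrap a Service Bus topic sender; keeping it as a parameter also makes the validation logic easy to test locally.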

AI-Powered Metrics Extraction

The analysis stage applies prompt-engineered generative AI models to extract the following from each call transcript:

Extracted Metrics

General Summary — A concise overview of the call’s content and outcome.
Payment Willingness — Whether the customer expressed willingness to pay or rejected the collection attempt.
Committed Payment Date — The specific date (if any) the customer agreed to make a payment.
Agent-Customer Interaction Quality — An evaluation of the dialogue to determine if the agent followed proper protocols and maintained professional communication.

These metrics are returned as structured data and stored alongside the original transcript in CosmosDB, making them queryable by the internal CRM and automated outbound calling systems.
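
To illustrate what "structured data" can mean here, the sketch below parses a model response into a typed record and rejects malformed output before it is stored. The JSON field names and the 1 to 5 interaction-quality scale are assumptions made for the example; the write-up does not specify the actual output schema.

```python
import json
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class CallMetrics:
    summary: str
    willing_to_pay: bool
    committed_payment_date: Optional[date]
    interaction_quality: int  # hypothetical scale: 1 (poor) to 5 (excellent)


def parse_metrics(model_output: str) -> CallMetrics:
    """Turn the model's JSON response into a validated CallMetrics record."""
    data = json.loads(model_output)

    willing = data["willing_to_pay"]
    if not isinstance(willing, bool):
        raise ValueError("willing_to_pay must be true or false")

    quality = int(data["interaction_quality"])
    if not 1 <= quality <= 5:
        raise ValueError("interaction_quality must be between 1 and 5")

    # The date is optional: many calls end without a payment commitment.
    raw_date = data.get("committed_payment_date")
    committed = date.fromisoformat(raw_date) if raw_date else None

    return CallMetrics(
        summary=str(data["summary"]).strip(),
        willing_to_pay=willing,
        committed_payment_date=committed,
        interaction_quality=quality,
    )
```

Validating at this boundary keeps free-form model text out of the stored record, so downstream consumers can rely on the field types.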

Data Model & Customer History

A single customer can receive multiple collections calls because of system retries, payment follow-ups, or unfulfilled commitments. A key design decision was how to model that history.

CosmosDB Document Structure

Each customer is represented by a single CosmosDB document that accumulates their full call history:

The top-level record contains customer identifiers and current status.
Each call is appended as an entry within the document, preserving the complete timeline of interactions.
Every call entry includes its own metadata, transcript, AI-generated summary, and metric scores.
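
The structure described above can be sketched as a plain dictionary plus an append helper. All field names here (id, status, calls, call_id, started_at) are hypothetical; only the overall shape, one document per customer with a growing list of call entries, comes from the description. The helper returns a new document rather than writing to CosmosDB, and it skips a call that is already present so that a redelivered message does not create a duplicate entry.

```python
from copy import deepcopy


def append_call(customer_doc: dict, call_entry: dict) -> dict:
    """Return a copy of the customer document with call_entry in its history."""
    doc = deepcopy(customer_doc)
    calls = doc.setdefault("calls", [])

    # Ignore a call that was already recorded (for example after a retry).
    if any(existing["call_id"] == call_entry["call_id"] for existing in calls):
        return doc

    calls.append(deepcopy(call_entry))
    # Assumes UTC ISO 8601 timestamps in one format, which sort chronologically as strings.
    calls.sort(key=lambda entry: entry["started_at"])
    return doc


# Example shape of a customer document with one call entry.
example_doc = append_call(
    {"id": "customer-42", "status": "in_collections", "calls": []},
    {
        "call_id": "c1",
        "started_at": "2024-05-02T09:30:00Z",
        "transcript": "...",
        "summary": "Customer asked for more time.",
        "metrics": {"willing_to_pay": True},
    },
)
```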

This approach enables:

Quick lookup of a customer’s entire collections history in a single read.
Trend analysis across multiple calls (e.g., tracking shifts in payment willingness over time).
Efficient feeds to the CRM dashboard and automated dialing systems.
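
As an example of the trend analysis mentioned above, the following sketch lists the points at which a customer's payment willingness changed between consecutive calls. It assumes each call entry carries a call_id, a started_at timestamp (UTC ISO 8601), and a metrics.willing_to_pay boolean; those names are illustrative, not the actual schema.

```python
from typing import List, Tuple


def willingness_shifts(calls: List[dict]) -> List[Tuple[str, bool, bool]]:
    """Return (call_id, previous, current) for each change in payment willingness."""
    ordered = sorted(calls, key=lambda entry: entry["started_at"])
    shifts = []
    # Compare every call with the one immediately before it.
    for previous, current in zip(ordered, ordered[1:]):
        before = previous["metrics"]["willing_to_pay"]
        after = current["metrics"]["willing_to_pay"]
        if before != after:
            shifts.append((current["call_id"], before, after))
    return shifts
```

Because the whole history lives in one document, this kind of comparison needs a single read and no cross-document join.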

Results

The pipeline was deployed to production, with the following results:

80–85% accuracy in capturing and summarizing call content and intent.
A significant reduction in manual call-review effort for the collections operations team.
Data-driven prioritization of follow-up calls based on AI-assessed payment likelihood.
Real-time call intelligence in the internal CRM, giving agents immediate context on a customer's history before they place a new call.

Tech Stack

Layer	Technology
Ingestion	AWS Lambda, Azure Service Bus
Processing	Azure Functions, Python
Transcription	Azure Speech Services
AI Analysis	Azure OpenAI
Storage	Azure CosmosDB
Orchestration	Azure Service Bus Topics & Subscriptions

Conclusion

This project demonstrated the value of combining cloud-native event-driven architecture with generative AI to transform raw call recordings into structured, actionable business intelligence. The pipeline’s success with collections calls validated the approach, leading the Data Engineering team to expand the system to cover additional call categories and feed organization-wide analytics initiatives.