The Scale Problem
At Asurion, thousands of sales agents make live calls every day. Every call must follow regulatory compliance guidelines — required disclosures, no high-pressure tactics, proper consent before enrollment. This isn't optional; it's legally mandated.
Manual review can't keep up. Keyword-based automated systems generate too many false positives (flagging "cancel" in "I'll cancel my lunch plans to stay on the call"). We needed a system that understands context.
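The keyword failure mode is easy to reproduce. A minimal sketch (the watch-word list and transcript lines are illustrative, not our production rules):

```python
import re

# Naive keyword rule: flag any utterance containing a watch-word.
WATCH_WORDS = {"cancel", "refund", "guarantee"}

def keyword_flag(utterance: str) -> bool:
    """Return True if any watch-word appears, with no regard for context."""
    tokens = set(re.findall(r"[a-z']+", utterance.lower()))
    return bool(tokens & WATCH_WORDS)

# A benign sentence trips the rule (false positive)...
print(keyword_flag("I'll cancel my lunch plans to stay on the call"))  # True
# ...while a paraphrased real violation sails through (false negative).
print(keyword_flag("You can back out anytime, no paperwork needed"))   # False
```

Both errors come from the same root cause: the rule sees words, not meaning.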
Three-Stage Pipeline
Stage 1: Real-Time Transcription
Call audio streams through AWS Transcribe, producing near-real-time transcripts. The key challenge was speaker diarization — knowing who said what. A disclosure only counts if the agent made it; a customer repeating the disclosure language back does not satisfy the requirement.
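Why diarization matters becomes concrete once utterances are attributed to speakers. A minimal sketch, assuming a simplified transcript structure (the real AWS Transcribe output schema is richer, and the disclosure phrase here is invented):

```python
from dataclasses import dataclass

@dataclass
class Utterance:
    speaker: str  # "AGENT" or "CUSTOMER", after mapping diarization labels
    text: str

DISCLOSURE_PHRASE = "this plan renews monthly until you cancel"  # illustrative

def agent_made_disclosure(transcript: list[Utterance]) -> bool:
    """A disclosure only counts when the agent says it, not the customer."""
    return any(
        u.speaker == "AGENT" and DISCLOSURE_PHRASE in u.text.lower()
        for u in transcript
    )

call = [
    Utterance("CUSTOMER", "So this plan renews monthly until you cancel it?"),
    Utterance("AGENT", "That's right."),
]
print(agent_made_disclosure(call))  # False: only the customer said it
```

Without the speaker labels, a plain text search over this call would wrongly mark the disclosure as made.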
Stage 2: LLM-Powered Analysis
Each completed transcript gets processed by Claude 3.7 Sonnet via AWS Bedrock Flows. The system evaluates four compliance dimensions:
- Disclosure verification — did the agent make all legally required disclosures?
- Pressure detection — did the agent use high-pressure sales tactics?
- Consent validation — was proper consent obtained before enrollment?
- Accuracy check — were product details described correctly?
We use structured JSON output schemas so every analysis produces comparable, scoreable results. This lets us track compliance trends over time and spot systemic issues across teams or scripts.
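A sketch of the schema enforcement idea, with illustrative field names (the real dimensions and scoring come from the compliance handbook, not this example):

```python
import json

# Illustrative response schema: one entry per compliance dimension.
REQUIRED_FIELDS = {
    "disclosure_verification": bool,
    "pressure_detected": bool,
    "consent_validated": bool,
    "accuracy_score": float,
}

def parse_analysis(raw: str) -> dict:
    """Parse the model's JSON output and enforce the schema, so every
    call is scored on the same dimensions and results stay comparable."""
    result = json.loads(raw)
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(result.get(field), expected_type):
            raise ValueError(f"missing or mistyped field: {field}")
    return result

raw = ('{"disclosure_verification": true, "pressure_detected": false, '
       '"consent_validated": true, "accuracy_score": 0.92}')
analysis = parse_analysis(raw)
print(analysis["accuracy_score"])  # 0.92
```

Rejecting malformed responses at the boundary is what makes trend tracking possible: a free-text verdict can't be aggregated, but a validated record can.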
Stage 3: RAG for Living Policy
Compliance rules change frequently — new regulations, updated scripts, revised disclosure requirements. Instead of re-prompting or retraining on every change, a RAG pipeline pulls from our living compliance handbook. The AI's definition of "compliant" evolves with policy automatically.
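The mechanism can be illustrated with a toy retriever. This sketch scores handbook chunks by term overlap; production retrieval uses embeddings and a vector store, and the policy text below is invented:

```python
import re

# Invented handbook chunks standing in for the living compliance handbook.
HANDBOOK = [
    "Agents must disclose the monthly price before requesting consent.",
    "Enrollment requires an explicit verbal yes from the customer.",
    "Cancellation instructions must be offered whenever the customer asks.",
]

def tokenize(text: str) -> set[str]:
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k handbook chunks with the most terms in common."""
    q = tokenize(query)
    return sorted(HANDBOOK, key=lambda c: -len(q & tokenize(c)))[:k]

# Retrieved chunks are injected into the compliance prompt, so editing
# HANDBOOK changes the model's definition of "compliant" with no retraining.
for chunk in retrieve("was consent obtained before enrollment"):
    print(chunk)
```

The point is the data flow, not the scoring function: policy lives in one editable source, and the prompt is assembled from it at analysis time.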
Why These Specific Technical Choices
Claude over GPT for compliance. In compliance monitoring, a false negative (missing a real violation) is far more costly than a false positive. Claude's thorough, cautious analytical style aligned with this asymmetry better than alternatives that optimize for conciseness.
Bedrock Flows over direct API calls. At 150K+ calls per month, we need built-in orchestration, retry logic, and observability. Bedrock Flows provides all of this without us building and maintaining our own queue infrastructure.
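For a sense of what "not building it ourselves" saves, here is the kind of plumbing a direct-API approach would need. A generic exponential-backoff wrapper, written as an illustration rather than our production code:

```python
import random
import time

def with_retries(fn, max_attempts: int = 4, base_delay: float = 0.5):
    """Retry fn with exponential backoff and jitter -- the sort of
    orchestration Bedrock Flows provides so we don't maintain it."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the real error
            # 0.5s, 1s, 2s... scaled by random jitter to avoid thundering herds
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))
```

Multiply this by queueing, dead-lettering, and per-stage observability, and the build-vs-buy math at 150K+ calls per month favors the managed service.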
Structured output for consistency. Free-text compliance analysis is impossible to score programmatically. JSON schemas ensure every call gets evaluated on the same dimensions with comparable scores.
Results
- 150K+ monthly calls analyzed in near-real-time
- 45% improvement in call review efficiency
- $1.6M annual savings from automated review
- 15 supervised and generative AI models deployed across the pipeline
Three Hard-Won Lessons
Start with the ambiguous 2%. The 98% of calls that are clearly compliant or clearly non-compliant are easy. The value is in the 2% of ambiguous calls where context determines whether a statement is a violation. We built our evaluation set from these edge cases and iterated the prompts against them.
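A sketch of how such an evaluation set is used, with invented examples; in practice the cases come from calls human reviewers disagreed on or escalated:

```python
# Labeled edge cases: (utterance, is_violation). All examples invented.
EVAL_SET = [
    ("I'll cancel my lunch plans to stay on the call", False),
    ("You have to decide right now or lose the discount", True),
    ("This offer is only good while we're on the phone", True),
]

def false_negative_rate(classify, cases) -> float:
    """Share of true violations the classifier misses -- the costly
    error in compliance monitoring."""
    violations = [text for text, is_violation in cases if is_violation]
    missed = sum(1 for text in violations if not classify(text))
    return missed / len(violations)

# A keyword baseline misses paraphrased pressure tactics entirely.
baseline = lambda text: "cancel" in text.lower()
print(false_negative_rate(baseline, EVAL_SET))  # 1.0
```

Iterating prompts against a metric like this keeps attention on the ambiguous cases, rather than on the 98% any approach gets right.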
Human-in-the-loop is non-negotiable. The AI flags; humans decide. This builds trust with the compliance team and generates a continuous training signal. Fully automated enforcement would be faster but would never get organizational buy-in — nor should it.
Latency budgets are constraints, not targets. For real-time monitoring, every second matters. We optimized the pipeline to deliver verdicts within 30 seconds of call completion. This required careful attention to transcript buffering, model inference time, and result delivery — not just model accuracy.
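Treating the budget as a constraint means making it explicit per stage and checking it. A sketch with an illustrative split (the per-stage numbers are assumptions, not our production figures):

```python
# Illustrative per-stage budget (seconds) summing to the 30 s verdict SLA.
BUDGET = {
    "transcript_finalization": 8.0,
    "rag_retrieval": 2.0,
    "model_inference": 15.0,
    "result_delivery": 5.0,
}
SLA_SECONDS = 30.0

def over_budget(measured: dict[str, float]) -> list[str]:
    """Return the stages that exceeded their individual budget."""
    return [stage for stage, t in measured.items() if t > BUDGET[stage]]

assert sum(BUDGET.values()) <= SLA_SECONDS  # the split must respect the SLA
print(over_budget({"transcript_finalization": 9.2, "rag_retrieval": 1.1,
                   "model_inference": 14.0, "result_delivery": 3.0}))
# ['transcript_finalization']
```

Attributing overruns to a stage is what turns "every second matters" into an actionable alert: you know whether to tune buffering, inference, or delivery.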