Yagnesh L Pazhaniyappan/Enterprise Conversational AI Platform: Improved Response Latency from 20+ sec to <5 sec & Reduced Drop-offs by 25%

IDFC First Bank

Associate Product Manager - Enterprise Conversational AI Platform

July 2025 – Present

Not captured in resume.

Owned platform development for IDFC First Bank's enterprise conversational AI, improving response latency from 20+ seconds to under 5 seconds and reducing drop-offs by 25% through intelligent context management and guardrail systems.

Key outcomes

Improved response latency from 20+ seconds to <5 seconds through prompt and context engineering

Reduced user drop-offs by 25% through HITL fallback and escalation strategies

Cut language detection time from 3+ seconds to 100 ms with a multi-model ensemble system

Delivered 4/4 Phase 1 modules for go-live on schedule

Reduced the system false negative rate to <1% through automated red-teaming

Automated test case creation for 40+ products across 7 systems

The problem

IDFC First Bank needed to build an enterprise-grade conversational AI platform from scratch to deliver personalized customer experiences at scale. The platform had to handle complex challenges: slow response times (20+ seconds), fragmented customer data across 8 different propensity models, probabilistic drift in LLM outputs, and the need for robust safety guardrails in a highly regulated banking environment. Without a cohesive platform strategy, we risked launching a system that would frustrate users with slow, inconsistent, and potentially non-compliant responses.

In banking, conversational AI platforms are becoming critical touchpoints for customer engagement and service delivery. A slow, unreliable, or non-compliant chatbot could damage customer trust and expose the bank to regulatory risk. This platform needed to deliver personalized, fast, and accurate responses while maintaining strict compliance standards, directly impacting customer satisfaction, operational efficiency, and the bank's ability to cross-sell products.

Solution

I owned the complete platform development charter and organized platform steering committee discussions to align stakeholders. I developed a comprehensive 11-layered context engine that enabled true multi-turn conversations and integrated 8 fragmented customer propensity models into an evolving customer DNA (cDNA) system that powered 6 layers of personalization. To address latency, I balanced cost, latency, and response quality through prompt and context engineering. I designed sub-second ensemble systems for language identification (cutting detection time from 3+ seconds to 100 ms) and for intent classification, using hierarchical intent mapping to reduce disambiguation rates. For safety and compliance, I incorporated an agentic guardrail system with closed feedback loops and automated red-teaming. I also established a comprehensive LLMOps strategy with technical and business observability requirements, multi-turn automated regression testing, and 8 self-serve dashboards to track impact against baseline metrics from the incumbent NLP bot.
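To make the ensemble idea concrete, here is a minimal sketch of weighted voting across lightweight language detectors. It is an illustration only, not the platform's implementation: the two detectors, the keyword list, the weights, and the names `script_detector`, `keyword_detector`, and `detect_language` are all hypothetical, and nothing here reproduces the 100 ms figure.

```python
from collections import defaultdict
from typing import Callable, List, Tuple

# A detector returns (language_code, confidence in [0, 1]).
# "und" (undetermined) with confidence 0.0 means the detector abstains.
Detector = Callable[[str], Tuple[str, float]]

# Hypothetical hint words for romanized Hindi; a real system would use a model.
ROMANIZED_HINDI_HINTS = {"kya", "hai", "nahi", "mera", "kaise"}


def script_detector(text: str) -> Tuple[str, float]:
    """Hypothetical detector: votes Hindi when most letters are Devanagari."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return ("und", 0.0)
    share = sum(1 for ch in letters if "\u0900" <= ch <= "\u097f") / len(letters)
    # Latin script alone is only weak evidence for English.
    return ("hi", share) if share >= 0.5 else ("en", 0.5)


def keyword_detector(text: str) -> Tuple[str, float]:
    """Hypothetical detector: spots romanized Hindi via a small hint list."""
    words = [w.strip("?.,!") for w in text.lower().split()]
    latin = [w for w in words if w.isascii() and w.isalpha()]
    if not latin:
        return ("und", 0.0)
    share = sum(1 for w in latin if w in ROMANIZED_HINDI_HINTS) / len(latin)
    return ("hi", share) if share >= 0.3 else ("en", 1.0 - share)


# Assumed equal weights; in practice these would be tuned on labeled traffic.
DETECTORS: List[Tuple[Detector, float]] = [
    (script_detector, 1.0),
    (keyword_detector, 1.0),
]


def detect_language(text: str) -> Tuple[str, float]:
    """Sum weight * confidence per language and return the top-scoring one."""
    scores = defaultdict(float)
    for detector, weight in DETECTORS:
        lang, conf = detector(text)
        scores[lang] += weight * conf
    total = sum(scores.values())
    if total == 0:
        return ("und", 0.0)
    best = max(scores, key=scores.get)
    return (best, scores[best] / total)
```

Because each detector is a cheap, deterministic function, the ensemble can disagree gracefully: a romanized Hindi query is outvoted away from "en" by the keyword detector even though the script detector only sees Latin characters.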

Key decisions & trade-offs

Key trade-offs included balancing response quality against latency and cost through prompt engineering rather than always using the most powerful (and expensive) models. I chose an ensemble approach for language detection and intent classification to reduce probabilistic drift, accepting additional system complexity for reliability. For personalization, I had to integrate 8 fragmented propensity models into a unified cDNA system rather than building from scratch, prioritizing speed to market. I implemented HITL (human-in-the-loop) fallback strategies for edge cases rather than trying to automate everything, recognizing that some scenarios require human judgment in a banking context.
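The HITL trade-off can be pictured as a small routing rule: answer automatically when confidence is high, ask a clarifying question in the middle band, and hand off to a human otherwise. The sketch below is an assumption-laden illustration; the thresholds, the `SENSITIVE_INTENTS` set, and the names `Turn`, `Route`, and `route_turn` are hypothetical and not taken from the actual platform.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    BOT = "bot"          # answer automatically
    CLARIFY = "clarify"  # ask a disambiguating question
    HUMAN = "human"      # escalate to a human agent


@dataclass
class Turn:
    intent: str
    confidence: float     # classifier confidence in [0, 1]
    failed_attempts: int  # consecutive turns the bot failed to resolve


# Hypothetical intents that always go to a human in a banking context.
SENSITIVE_INTENTS = {"fraud_report", "account_closure"}


def route_turn(turn: Turn, low: float = 0.4, high: float = 0.75,
               max_failures: int = 2) -> Route:
    """Pick a route from intent sensitivity, failure count, and confidence."""
    if turn.intent in SENSITIVE_INTENTS or turn.failed_attempts >= max_failures:
        return Route.HUMAN
    if turn.confidence >= high:
        return Route.BOT
    if turn.confidence >= low:
        return Route.CLARIFY
    return Route.HUMAN
```

Escalating after repeated failures, rather than only on low confidence, targets the situation most likely to cause a drop-off: a user stuck in a loop with a bot that is confidently wrong.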

Results

Metric | Before | After | Delta | Time
Response latency | 20+ seconds | <5 seconds | 75%+ improvement | Phase 1 development
User drop-off rate | n/a | -25% | -25% | Post-HITL implementation
Language detection speed | 3+ seconds | 100ms | 97% improvement | Post-ensemble deployment
System false negative rate | n/a | <1% | Reduced to <1% | Post-red-teaming implementation
Phase 1 module delivery | n/a | 4/4 delivered | n/a | Go-live
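The latency and language detection deltas follow from a simple percent-reduction calculation. The short sketch below reproduces them from the reported before and after values; the helper name `percent_reduction` is introduced here for illustration. Since the "before" figures are lower bounds (20+ seconds, 3+ seconds), the results are minimum improvements.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the reduction from `before` to `after` as a percentage."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# Response latency: 20 s -> 5 s
latency_gain = percent_reduction(20.0, 5.0)

# Language detection: 3000 ms -> 100 ms
language_gain = percent_reduction(3000.0, 100.0)

print(f"Response latency: {latency_gain:.0f}% reduction")
print(f"Language detection: {language_gain:.0f}% reduction")
```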

Challenges & learnings

Challenges

This taught me that I should work hard.

What I'd do differently

This taught me that I should work hard.

Skills demonstrated

GenAI, LLM, Agentic Systems, Prompt Engineering, LLMOps, Product Strategy, Platform Thinking, Context Engineering, Multi-model Ensemble Systems, Red-teaming, Observability, Regression Testing, Stakeholder Management
