How SpeechTranslate Works
Four steps between a caller speaking Mandarin and an agent hearing English. Under 300 milliseconds. Fully bidirectional. No app to install. Now with Speech-to-Speech AI.
Customer Speaks in Their Language
The caller speaks naturally in their native language. SpeechTranslate captures the audio stream in real time using WebRTC and processes it through advanced speech recognition powered by AWS Transcribe, Azure Speech, or Google Speech-to-Text — automatically selecting the best provider for each language.
Language Auto-Detected
SpeechTranslate's multi-provider detection system identifies the caller's language within seconds — voting across AWS Transcribe, Google Cloud, and Azure Speech for high-confidence results. If the caller switches languages mid-call, the system detects and adapts automatically.
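A multi-provider vote of this kind can be sketched in a few lines. The provider names come from the text above, but the interface and confidence handling are illustrative assumptions; SpeechTranslate's actual internals are not public:

```python
from collections import defaultdict

def vote_language(detections):
    """Pick the language with the highest total confidence across providers.

    `detections` maps provider name -> (language_code, confidence).
    Hypothetical shape, for illustration only.
    """
    totals = defaultdict(float)
    for provider, (lang, confidence) in detections.items():
        totals[lang] += confidence
    lang, score = max(totals.items(), key=lambda kv: kv[1])
    # Normalize by provider count so a confidence threshold is
    # independent of how many providers voted.
    return lang, score / len(detections)

# Two of three providers agree on Mandarin with high confidence.
lang, confidence = vote_language({
    "aws_transcribe": ("zh-CN", 0.92),
    "google_cloud":   ("zh-CN", 0.88),
    "azure_speech":   ("yue-CN", 0.41),
})
```

Mid-call switching follows the same idea: re-run the vote on a rolling audio window and switch when a new language wins above the configured threshold.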
AI Translates in Real Time
SpeechTranslate offers two translation pathways: Speech-to-Speech mode uses Amazon Nova Sonic V2 or Google Gemini Live to translate audio directly — no text intermediary, lower latency. Pipeline mode uses neural STT, machine translation, and TTS with multi-provider failover. Both modes support custom glossaries for domain-specific terminology (medical, legal, financial).
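The pipeline mode's multi-provider failover can be illustrated as a priority-ordered retry loop. The provider functions below are stand-ins, not real SDK calls:

```python
def translate_with_failover(text, providers):
    """Try each translation provider in priority order; fall through on error.

    `providers` is an ordered list of (name, translate_fn).
    Illustrative sketch only.
    """
    errors = {}
    for name, translate in providers:
        try:
            return name, translate(text)
        except Exception as exc:  # a real system would catch provider-specific errors
            errors[name] = exc
    raise RuntimeError(f"all providers failed: {errors}")

def aws_translate(text):
    raise TimeoutError("simulated AWS outage")

def azure_translate(text):
    return "Hello, how can I help you?"

provider_used, result = translate_with_failover(
    "您好，请问有什么可以帮您？",
    [("aws", aws_translate), ("azure", azure_translate), ("google", lambda t: t)],
)
```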
Agent Hears Translated Speech
In pipeline mode, the translated text is converted to natural-sounding speech using text-to-speech (TTS) synthesis; in Speech-to-Speech mode, translated audio is generated directly, with no text intermediary. Either way, the agent hears English within 200-300 milliseconds. The process works bidirectionally: the agent's English responses are simultaneously translated back into the caller's language.
AI Agent Assist — Your Agent's Real-Time Copilot
While SpeechTranslate handles the language barrier, Agent Assist gives your agents an AI copilot that listens to the live transcript and provides instant answers, suggested actions, and automated workflows — powered by Amazon Bedrock Agents and AgentCore with retrieval-augmented generation (RAG). Use Prompt Studio to define when the AI chatbot triggers based on customer utterances and what intents and entities to extract — no code changes required.
Live Transcript Classification
Automatically classifies caller intent from the live transcript — routing questions, complaints, and requests to the right action without the agent manually tagging anything.
Knowledge Base Retrieval (RAG)
Pulls answers from your company's documents, FAQs, and policies in real time. Supports S3 vector stores, DynamoDB, and Aurora Serverless as knowledge sources.
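At its core, the retrieval step ranks indexed document chunks by vector similarity to the query. A toy sketch with hand-made 3-dimensional embeddings shows the idea; in the product, the Bedrock knowledge base handles embedding, chunking, and storage for you:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, index, top_k=1):
    """Return the text of the top_k chunks most similar to the query vector."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item["vector"]),
                    reverse=True)
    return [item["text"] for item in ranked[:top_k]]

# Toy index: two chunks with made-up embeddings.
index = [
    {"text": "Refunds are issued within 5 business days.", "vector": [0.9, 0.1, 0.0]},
    {"text": "Our support line is open 24/7.",             "vector": [0.1, 0.9, 0.2]},
]
answer = retrieve([0.85, 0.15, 0.05], index)  # query embedding near the refund chunk
```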
Action Groups & Tool Use
The agent can trigger real actions — look up an order, create a ticket in Zendesk, update a record in Salesforce — directly from the conversation context.
Multi-Turn Session Memory
Maintains full conversation context across turns. The AI assistant remembers what was discussed earlier in the call, not just the last message.
Prompt Studio — Intent Triggers & Entity Extraction
Define when the AI chatbot activates based on customer utterances — configure classification prompts, intents to detect, and entities to extract from the conversation. When a customer says something matching your triggers, Prompt Studio sends the right context to Bedrock Agents or AgentCore for an intelligent response. Supports draft/publish workflow.
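A trigger definition of this shape might look like the sketch below. The schema, intent names, and regex matching are illustrative assumptions; in the product, triggers are configured in Prompt Studio, not in code:

```python
import re

# Hypothetical trigger definitions: which utterances activate which intent,
# and which entities to pull out of the matching utterance.
TRIGGERS = [
    {
        "intent": "order_status",
        "utterance_patterns": [r"\bwhere is my order\b", r"\btrack(ing)?\b.*\border\b"],
        "entities": {"order_id": r"\b(ORD-\d{6})\b"},
    },
    {
        "intent": "refund_request",
        "utterance_patterns": [r"\brefund\b", r"\bmoney back\b"],
        "entities": {},
    },
]

def classify(utterance):
    """Return (intent, extracted_entities) for the first matching trigger, else None."""
    for trigger in TRIGGERS:
        if any(re.search(p, utterance, re.IGNORECASE)
               for p in trigger["utterance_patterns"]):
            entities = {
                name: m.group(1)
                for name, pattern in trigger["entities"].items()
                if (m := re.search(pattern, utterance))
            }
            return trigger["intent"], entities
    return None

match = classify("Hi, where is my order ORD-123456?")
```

On a match, the intent and entities would be packaged as context for the Bedrock Agents or AgentCore call.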
Bring Your Own Knowledge
Agent Assist connects to your existing data — wherever it lives. Upload documents to S3, connect a DynamoDB table, or query Aurora Serverless directly. The Bedrock knowledge base indexes your content into a vector store and retrieves the most relevant answers in real time during every call. Choose your runtime: Bedrock Agents for fully managed orchestration, or AgentCore for advanced streaming traces and custom tool integrations via MCP gateway.
S3 Vector Store
PDFs, Word docs, CSVs, HTML, Markdown — drop files into S3 and they're automatically chunked, embedded, and searchable. Supports all major document formats.
DynamoDB
Connect live operational data — product catalogs, customer records, policy tables. Agent Assist queries your DynamoDB tables in real time during calls.
Aurora Serverless
For structured data at scale — connect Aurora Serverless PostgreSQL or MySQL as a knowledge source with full SQL query support via Bedrock action groups.
Integrates With Your Stack
Agent Assist uses Bedrock action groups to connect to any external system via REST APIs. Out-of-the-box support for major CRM and helpdesk platforms — plus any custom API.
Salesforce
CRM lookup, case creation, contact updates
Zendesk
Ticket creation, knowledge base search
Zoho
CRM, Desk, and custom module integration
ServiceNow
Incident management, CMDB queries
HubSpot
Contact records, deal pipeline, tickets
Custom APIs
Any REST API via Bedrock action groups
ConnectIQ v2 — See What Your Calls Are Really Telling You
Most contact centers review 2-5% of calls manually. ConnectIQ v2 scores 100% of them automatically, monitors translation quality in real time, tracks provider health, and detects cross-lingual patterns across your entire call volume — in all 66 supported languages. It extends Amazon Connect Contact Lens with automated QA, Translation Quality Index (TQI), pipeline health monitoring, cost quantification, and AI-powered root cause analysis.
100% Automated QA Scoring
Every single call gets scored — not just the 2-5% that traditional QA teams can review. ConnectIQ evaluates agent performance, compliance, and customer sentiment across all calls automatically.
Cross-Call Pattern Detection
ConnectIQ doesn't just analyze individual calls — it detects patterns across your entire call volume. Repeated complaints about a product, emerging billing issues, or training gaps that affect multiple agents are surfaced before they escalate.
Cost Quantification
Every detected issue is tagged with an estimated cost impact. When ConnectIQ flags that 12% of calls about "returns" result in repeat contacts, it also tells you that's costing $47K/month — making prioritization decisions instant.
AI Root Cause Analysis
When a pattern is detected, ConnectIQ doesn't just show you the data — it generates a root cause analysis explaining why it's happening, which agents and teams are affected, and what specific actions to take.
Translation Quality Index (TQI)
Every call receives a quality score computed from latency, confidence, language detection accuracy, and translation fidelity, measured via async back-translation scoring. Track quality trends per language pair and per provider, at hourly and daily granularity.
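One plausible way to fold those four signals into a single 0-100 score is a weighted average with a latency budget, sketched below. The weights and budget are illustrative assumptions, not SpeechTranslate's published formula:

```python
def tqi(latency_ms, stt_confidence, detection_accuracy, back_translation_similarity,
        weights=(0.25, 0.25, 0.25, 0.25), latency_budget_ms=300):
    """Illustrative 0-100 Translation Quality Index from four normalized signals."""
    # Latency contributes fully at 0 ms and nothing at or beyond the budget.
    latency_score = max(0.0, 1.0 - latency_ms / latency_budget_ms)
    components = (latency_score, stt_confidence, detection_accuracy,
                  back_translation_similarity)
    return round(100 * sum(w * c for w, c in zip(weights, components)), 1)

# A call at 150 ms with strong recognition and a good back-translation match.
score = tqi(latency_ms=150, stt_confidence=0.95,
            detection_accuracy=1.0, back_translation_similarity=0.88)
```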
Pipeline Health Monitoring
Real-time health metrics for every translation provider, aggregated every 5 minutes. When a provider degrades, ConnectIQ detects it automatically and alerts your team — before customers notice.
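A sliding-window success-rate tracker captures the idea behind per-provider health checks. The window size and degradation threshold here are assumptions for illustration:

```python
from collections import deque

class ProviderHealth:
    """Track recent success/failure per provider over a sliding window and flag
    degradation below a threshold. A simplified stand-in for the 5-minute
    aggregation described above.
    """
    def __init__(self, window=100, min_success_rate=0.95):
        self.results = deque(maxlen=window)
        self.min_success_rate = min_success_rate

    def record(self, success):
        self.results.append(bool(success))

    def success_rate(self):
        return sum(self.results) / len(self.results) if self.results else 1.0

    def degraded(self):
        return self.success_rate() < self.min_success_rate

azure = ProviderHealth(window=20)
for _ in range(18):
    azure.record(True)
for _ in range(2):
    azure.record(False)  # two recent failures -> 90% success rate, below threshold
```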
Cross-Lingual Pattern Detection
Correlates complaint patterns across languages — a Spanish complaint call and a Mandarin complaint call about the same product issue are linked together. Something monolingual QA teams simply cannot do.
Automated Action Triggers
Detected patterns automatically trigger configurable actions: webhook notifications, email alerts, Amazon Connect tasks, or SQS messages — turning intelligence into immediate operational response.
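The trigger-to-action fan-out can be sketched as a small dispatch table. The action names mirror the list above; the routing config and handlers are hypothetical, and here they just record what would fire:

```python
fired = []

# Stand-in handlers; a real system would call a webhook, SES, the Amazon
# Connect API, or SQS here.
def send_webhook(pattern):        fired.append(("webhook", pattern["id"]))
def send_email(pattern):          fired.append(("email", pattern["id"]))
def create_connect_task(pattern): fired.append(("connect_task", pattern["id"]))
def send_sqs_message(pattern):    fired.append(("sqs", pattern["id"]))

# Hypothetical routing: which actions fire for each pattern severity.
ROUTES = {
    "high":   [send_webhook, send_email, create_connect_task],
    "medium": [send_webhook, send_sqs_message],
}

def dispatch(pattern):
    for action in ROUTES.get(pattern["severity"], []):
        action(pattern)

dispatch({"id": "returns-spike", "severity": "medium"})
```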
Why ConnectIQ Is Different
Not Just Analytics — Intelligence
Traditional analytics dashboards show you charts. ConnectIQ generates actionable intelligence: root cause explanations, cost impact estimates, and specific recommendations for each detected issue.
Cross-Call, Not Per-Call
Individual call analysis misses systemic problems. ConnectIQ correlates patterns across thousands of calls to surface issues that no single call review would catch — like a product defect causing a 40% spike in returns-related calls across three regions.
Translation Quality + Cross-Lingual Intelligence
ConnectIQ v2 monitors translation quality per call via TQI scoring and correlates patterns across all 66 languages — a Spanish complaint and a Mandarin complaint about the same issue get linked together, with per-provider health tracking ensuring consistent quality.
Under the Hood
Built on Enterprise-Grade AWS Infrastructure
SpeechTranslate combines multiple speech and translation providers with a custom real-time audio pipeline to deliver the fastest, most reliable speech-to-speech translation available.
Dual-Mode: Speech-to-Speech + Pipeline
Choose Speech-to-Speech via Nova Sonic V2 or Gemini Live for direct audio translation, or the multi-provider STT+MT+TTS pipeline with automatic failover across AWS, Azure, and Google.
Sub-300ms Latency
AudioWorklet-based gapless playback pipeline with WebRTC transport delivers near-instant translation with no perceptible delay.
Custom Glossary Support
Define domain-specific terminology for healthcare, legal, financial, or any industry to ensure critical terms are translated correctly.
Bidirectional Translation
Both sides of the conversation are translated simultaneously. The caller and agent each hear the other in their own language.
Amazon Connect Integration
Native PSTN integration via Amazon Connect — works with real phone calls, not just browser-to-browser. Agents use the familiar CCP softphone.
Automatic Language Detection
Multi-provider voting system across AWS Transcribe, Google Cloud, and Azure Speech identifies the caller's language from the first seconds of speech — with configurable confidence thresholds and mid-call language switching when the caller changes languages.
Mid-Call Language Switching
Callers aren't limited to one language per call. SpeechTranslate detects language changes in real time and switches translation seamlessly — no agent action required.
Ready to Translate Your First Call?
See how SpeechTranslate, Agent Assist, and ConnectIQ work together for your specific use case.
