San Francisco - Deepgram, a voice artificial intelligence platform, has secured $130 million in Series C funding at a $1.3 billion valuation, marking a major milestone in the booming voice AI sector.
The San Francisco-based company, which provides real-time speech-to-text and text-to-speech APIs for enterprises and developers, announced the funding on January 13, 2026. The round was led by AVP, a global investment platform focused on high-growth technology companies. The raise reflects surging demand for AI-powered voice solutions across industries from customer service to healthcare.
Key Facts
• Deepgram raised $130 million in Series C funding at a $1.3 billion valuation, bringing total funding to over $215 million. AVP led the round, which was announced on January 13, 2026.
• More than 1,300 organizations currently use Deepgram APIs, including major companies like Twilio, AWS Connect, and Cloudflare, demonstrating widespread enterprise adoption.
• The company acquired OfOne, a Y Combinator-backed startup specializing in restaurant and drive-through voice ordering technology, as part of its Series C announcement.
Voice AI has rapidly transformed from a futuristic experiment into essential business infrastructure. Enterprises now deploy voice technology for customer support, clinical documentation, sales automation, and compliance monitoring.
Deepgram powers this shift by providing developers and companies with production-ready APIs that handle real-time conversations across multiple languages.
The funding round attracted both established investors and new strategic backers. Existing investors Alkeon, In-Q-Tel, Madrona, Tiger, Wing, Y Combinator, and BlackRock participated alongside new investors Alumni Ventures and Princeville Capital, as well as strategic backers including Twilio, ServiceNow Ventures, SAP, and Citi Ventures.
This diverse investor base signals strong confidence in voice AI's future across different sectors.
Deepgram's platform stands apart because it handles the technical challenges that have historically hindered voice AI adoption. The company's models support code-switching, meaning they seamlessly transition between languages mid-conversation.
Their APIs deliver sub-second latency, enabling truly natural, human-like interactions that don't feel delayed or robotic. These capabilities make voice AI viable for sensitive, high-value applications.
With this capital infusion, Deepgram plans aggressive global expansion and product development. The company will expand its Nova-3 speech-to-text model to support more than 100 languages and dialects.
They're also investing heavily in infrastructure, building edge locations and computing capacity across multiple regions to serve customers worldwide with consistently fast performance.
The acquisition of OfOne represents Deepgram's strategic push into specific industry verticals. Restaurants and quick-service establishments face enormous operational challenges in taking orders, handling interruptions, and answering customer queries simultaneously.
OfOne's specialized voice ordering technology, now integrated into Deepgram's platform, positions the company to become the default voice AI infrastructure for the restaurant and drive-through sector.
CEO Scott Stephenson emphasized that Deepgram's success stems from solving real enterprise problems at scale.
Stephenson also noted that the company was cash-flow positive last year, meaning the funding accelerates growth rather than extending survival.
The broader voice AI market has attracted massive investment recently. Competitors like ElevenLabs raised $180 million in Series C, while Sesame secured $250 million in Series B funding.
This competition validates Deepgram's market position and the sector's explosive potential. Industry analysts predict that voice will become the primary human-computer interface, much as touchscreens displaced physical keyboards on phones.
Did You Know?
Deepgram has secured multiple U.S. patents for voice AI innovations, including patents for hardware-efficient automatic speech recognition and deep learning techniques that enable faster audio search and more accurate classification at scale. These patents protect the company's core technology and give it an advantage that rivals cannot easily replicate.
Key Terms
• Speech-to-Text (STT): Technology that converts spoken words into written text in real-time, commonly used in transcription services, meeting notes, and voice commands.
• Text-to-Speech (TTS): AI technology that converts written text into natural-sounding human voice, used in voice assistants, customer service automation, and accessibility features.
• Code-Switching: The ability to seamlessly transition between multiple languages within a single conversation, essential for multilingual customer support and international business operations.
• Latency: The time delay between when someone speaks and when the AI processes and responds, measured in milliseconds. Lower latency means more natural, human-like conversations.
• Voice AI API: A software interface allowing developers to build voice-powered applications without developing underlying speech technology from scratch, making deployment faster and more accessible.
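To make the latency definition above concrete, here is a minimal Python sketch of how a developer might time a single speech-to-text call and check it against a sub-second budget. The function `fake_transcribe` is a hypothetical stand-in that simulates processing time; it is not Deepgram's API, and the helper names and the 1,000 ms threshold are assumptions chosen only for illustration.

```python
import time


def fake_transcribe(audio_chunk: bytes) -> str:
    """Hypothetical stand-in for a real speech-to-text API call."""
    time.sleep(0.05)  # simulate roughly 50 ms of processing
    return "hello world"


def measure_latency_ms(fn, *args):
    """Call fn(*args) and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


def is_sub_second(latency_ms: float) -> bool:
    """Return True if the measured latency is under 1,000 ms."""
    return latency_ms < 1000.0


if __name__ == "__main__":
    text, latency = measure_latency_ms(fake_transcribe, b"\x00" * 320)
    print(f"transcript={text!r} latency={latency:.1f} ms "
          f"sub_second={is_sub_second(latency)}")
```

In a real application the timed call would be a network request to a speech service, so the measured figure would also include network round-trip time, not just model processing.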