11.8 C
New York

OpenAI Unveils Realtime API for Low-Latency AI Voice Interactions at DevDay 2024

OpenAI showcased a suite of new tools at its annual DevDay conference on September 24th, 2024. The event came amidst a period of significant change for OpenAI, marked by recent high-profile executive departures.

Despite the internal restructuring, OpenAI emphasized its commitment to developers and building a robust platform for creating AI-powered applications. They announced several new features in the open-source API to improve user experience and address developer needs.

Realtime API: Voice-Based AI Interactions

The main announcement was the public beta of the Realtime API. This new tool lets developers build apps that have low-latency, speech-to-speech AI interactions. It comes with six unique voices from OpenAI, not the same as the voices in ChatGPT’s Advanced Voice Mode. Due to copyright issues, third-party voice integration is not available.

OpenAI showed a trip planning app using the Realtime API. Users could talk to an AI assistant and ask about a trip to London and get immediate answers. The app would integrate with mapping tools and show relevant restaurant locations. The Realtime API can also do interactions with humans via phone calls, so users can ask about food orders for events. But like Google’s discontinued Duo app, the API can’t directly call restaurants or businesses. You need to integrate with a third-party calling API like Twilio to do that.

Transparency and Safety

Missing from the Realtime API is automatic identification of AI models during phone calls. OpenAI leaves it up to developers to add those disclosures.

OpenAI also added vision fine-tuning in their API. This lets developers use both images and text to fine-tune GPT-4o models, which could improve performance on tasks that require visual understanding. OpenAI is all about data security, so you can’t upload copyrighted images or content that violates safety protocols.

Also OpenAI announced a prompt caching feature so developers can store frequently used information between API calls. Anthropic already has this feature. While OpenAI estimates a 50% cost reduction through this feature, Anthropic promises a more significant 90% discount.

Finally OpenAI announced a model distillation feature so developers can use larger models like o1-preview and GPT-4o to fine-tune smaller models. This can improve performance of smaller, more cost-effective models. To make it easier to try, OpenAI released a beta tool to measure fine-tuning performance in the OpenAI API.

DevDay 2024 had some big announcements but some things missing. The GPT Store that was announced last year is still MIA. OpenAI has been piloting a revenue sharing program with popular GPT creators but no details. And no new AI models announced, like o1 or Sora.

Subscribe

Related articles

Top 7 Mobile App Development Mistakes and How to Avoid Them

Mobile app development brings many chances but also has...

Microsoft Patents Speech-to-Image Technology

Microsoft has just filed a patent for a game...

OpenAI’s Swarm Framework: AI Automation and Job Concerns

Swarm is the new experimental framework from OpenAI and...

Almost Half of All Fraud Attempts Now Use AI, New Data Reveals

As artificial intelligence (AI) advances, its use in fraud...

Author

editorialteam
editorialteam
If you wish to publish a sponsored article or like to get featured in our magazine please reach us at contact@alltechmagazine.com