OpenAI showcased a suite of new tools at its annual DevDay conference on September 24th, 2024. The event came amidst a period of significant change for OpenAI, marked by recent high-profile executive departures.
Despite the internal restructuring, OpenAI emphasized its commitment to developers and building a robust platform for creating AI-powered applications. They announced several new features in the open-source API to improve user experience and address developer needs.
Realtime API: Voice-Based AI Interactions
The main announcement was the public beta of the Realtime API. This new tool lets developers build apps that have low-latency, speech-to-speech AI interactions. It comes with six unique voices from OpenAI, not the same as the voices in ChatGPT’s Advanced Voice Mode. Due to copyright issues, third-party voice integration is not available.
OpenAI showed a trip planning app using the Realtime API. Users could talk to an AI assistant and ask about a trip to London and get immediate answers. The app would integrate with mapping tools and show relevant restaurant locations. The Realtime API can also do interactions with humans via phone calls, so users can ask about food orders for events. But like Google’s discontinued Duo app, the API can’t directly call restaurants or businesses. You need to integrate with a third-party calling API like Twilio to do that.
Transparency and Safety
Missing from the Realtime API is automatic identification of AI models during phone calls. OpenAI leaves it up to developers to add those disclosures.
OpenAI also added vision fine-tuning in their API. This lets developers use both images and text to fine-tune GPT-4o models, which could improve performance on tasks that require visual understanding. OpenAI is all about data security, so you can’t upload copyrighted images or content that violates safety protocols.
Also OpenAI announced a prompt caching feature so developers can store frequently used information between API calls. Anthropic already has this feature. While OpenAI estimates a 50% cost reduction through this feature, Anthropic promises a more significant 90% discount.
Finally OpenAI announced a model distillation feature so developers can use larger models like o1-preview and GPT-4o to fine-tune smaller models. This can improve performance of smaller, more cost-effective models. To make it easier to try, OpenAI released a beta tool to measure fine-tuning performance in the OpenAI API.
DevDay 2024 had some big announcements but some things missing. The GPT Store that was announced last year is still MIA. OpenAI has been piloting a revenue sharing program with popular GPT creators but no details. And no new AI models announced, like o1 or Sora.