Update app to latest OpenAI API #177
Merged
jamesrochabrun merged 25 commits on Nov 12, 2025
This commit adds full support for OpenAI's Realtime API, enabling bidirectional voice conversations with GPT-4o models. The implementation is based on the proven AIProxySwift library and adapted for SwiftOpenAI.

Key Features:
- Real-time bidirectional audio streaming over WebSockets
- Voice Activity Detection (VAD) for automatic turn-taking
- Audio transcription of user and AI speech
- Function calling support
- Platform-aware audio processing (iOS, macOS, watchOS)
- Automatic echo cancellation and voice processing

New Components:

Audio Infrastructure:
- AudioController: Main controller for audio recording and playback
- AudioPCMPlayer: Plays PCM16 audio from OpenAI
- MicrophonePCMSampleVendor: Protocol and implementations for mic input
  - AVAudioEngine-based implementation (headphones)
  - AudioToolbox-based implementation (speakers)
- AudioUtils: Platform-specific audio utilities

Realtime API:
- OpenAIRealtimeSession: Manages WebSocket connection and message flow
- OpenAIRealtimeSessionConfiguration: Session configuration options
- OpenAIRealtimeMessage: All message types from server
- RealtimeActor: Global actor for thread-safe audio operations

Parameters:
- OpenAIRealtimeSessionUpdate
- OpenAIRealtimeInputAudioBufferAppend
- OpenAIRealtimeResponseCreate
- OpenAIRealtimeConversationItemCreate

Response Models:
- OpenAIRealtimeMessage (enum with all event types)
- OpenAIRealtimeInputAudioBufferSpeechStarted
- OpenAIRealtimeResponseFunctionCallArgumentsDone

Shared Utilities:
- OpenAIJSONValue: Type-safe JSON handling for tool schemas
- OpenAIError: Extended error types
- AudioController: Public API for audio management

Service Integration:
- Added realtimeSession() method to OpenAIService protocol
- Implemented WebSocket connection setup in DefaultOpenAIService

Examples:
- Complete RealtimeExample with usage documentation
- README with configuration options and troubleshooting

Credit: Implementation based on AIProxySwift by Lou Zell
https://github.com/lzell/AIProxySwift

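To illustrate what "enum with all event types" can look like in practice, here is a minimal sketch of decoding a server event by its `type` discriminator. `RealtimeServerEvent`, its cases, and `decodeEvent` are hypothetical names for illustration and are not the `OpenAIRealtimeMessage` API from this PR; the JSON field names (`audio_start_ms`, `call_id`, `arguments`) are assumptions about the event payloads.

```swift
import Foundation

/// Hypothetical, simplified stand-in for a server-event enum.
/// Unrecognized event types fall through to `.unknown` instead of failing.
enum RealtimeServerEvent: Decodable, Sendable, Equatable {
    case speechStarted(audioStartMs: Int)
    case functionCallArgumentsDone(callId: String, arguments: String)
    case unknown(type: String)

    private enum Keys: String, CodingKey {
        case type
        case audioStartMs = "audio_start_ms" // assumed field name
        case callId = "call_id"              // assumed field name
        case arguments
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: Keys.self)
        // The "type" string decides which case to build.
        let type = try container.decode(String.self, forKey: .type)
        switch type {
        case "input_audio_buffer.speech_started":
            self = .speechStarted(
                audioStartMs: try container.decode(Int.self, forKey: .audioStartMs))
        case "response.function_call_arguments.done":
            self = .functionCallArgumentsDone(
                callId: try container.decode(String.self, forKey: .callId),
                arguments: try container.decode(String.self, forKey: .arguments))
        default:
            self = .unknown(type: type)
        }
    }
}

/// Decodes one JSON text frame into an event.
func decodeEvent(_ json: String) throws -> RealtimeServerEvent {
    try JSONDecoder().decode(RealtimeServerEvent.self, from: Data(json.utf8))
}
```

Keeping an `.unknown` case means that event types the client does not yet model are surfaced rather than treated as decoding failures.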
Mark the initializer as nonisolated so that a session can be created from outside the RealtimeActor context. Also mark receiveMessage() as nonisolated, since it only sets up callbacks. This fixes the compilation error raised when creating a realtime session from DefaultOpenAIService.realtimeSession().

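The isolation issue described above can be sketched as follows. This is a minimal illustration, assuming a global actor shaped like `RealtimeActor`; `RealtimeSessionSketch`, `makeSession`, and the model string are hypothetical and do not reproduce the real `OpenAIRealtimeSession`.

```swift
/// A global actor in the style of the PR's RealtimeActor (re-declared here for the sketch).
@globalActor
actor RealtimeActor {
    static let shared = RealtimeActor()
}

/// Hypothetical session type: everything is isolated to RealtimeActor
/// except the members explicitly marked `nonisolated`.
@RealtimeActor
final class RealtimeSessionSketch {
    nonisolated let model: String
    private var receivedMessages = 0

    /// `nonisolated` lets synchronous, non-actor code construct the session
    /// without `await` and without already running on RealtimeActor.
    nonisolated init(model: String) {
        self.model = model
    }

    /// Still isolated: callers outside the actor must `await` this.
    func recordMessage() -> Int {
        receivedMessages += 1
        return receivedMessages
    }
}

/// A plain synchronous factory, standing in for a service method
/// that is not isolated to RealtimeActor.
func makeSession(model: String) -> RealtimeSessionSketch {
    RealtimeSessionSketch(model: model)
}
```

Without `nonisolated` on the initializer, the call inside `makeSession` would have to hop onto `RealtimeActor`, which a synchronous function cannot do.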
Added fatalError placeholders to:
- AIProxyService
- DefaultOpenAIAzureService
- LocalModelService

These services now conform to the OpenAIService protocol. The Realtime API is only fully implemented in DefaultOpenAIService.

Changed AudioUtils enum and its methods to public so users can:
- Access base64EncodeAudioPCMBuffer() for encoding audio buffers
- Check headphonesConnected status if needed

This is required for the RealtimeExample code.

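For readers unfamiliar with why audio buffers are base64-encoded, the following sketch shows the general idea using plain `Int16` samples rather than an AVFoundation buffer. `base64EncodePCM16` and `AudioAppendEventSketch` are hypothetical names, not the `AudioUtils` API; the sketch assumes little-endian PCM16 and an `input_audio_buffer.append` event carrying the audio as a base64 string.

```swift
import Foundation

/// Serializes 16-bit samples as little-endian bytes and base64-encodes them.
func base64EncodePCM16(_ samples: [Int16]) -> String {
    var bytes = [UInt8]()
    bytes.reserveCapacity(samples.count * 2)
    for sample in samples {
        bytes.append(UInt8(truncatingIfNeeded: sample))      // low byte first
        bytes.append(UInt8(truncatingIfNeeded: sample >> 8)) // then high byte
    }
    return Data(bytes).base64EncodedString()
}

/// Hypothetical client event wrapping the encoded audio.
struct AudioAppendEventSketch: Encodable {
    let type = "input_audio_buffer.append"
    let audio: String
}

// Usage: three samples become six bytes (00 00 01 00 FF FF).
let event = AudioAppendEventSketch(audio: base64EncodePCM16([0, 1, -1]))
```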
Apply automatic formatting fixes to resolve CI lint failures. Fixes formatting in 26 files including audio controller, realtime session, and service implementations.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

Apply formatting fix to debug logging statement.

Remove consecutive blank lines and trailing space.

Wrap all audio and realtime API code with #if canImport(AVFoundation) to support Linux builds where AVFoundation is not available.

Changes:
- Wrap audio files (AudioPCMPlayer, AudioUtils, AudioController, etc.)
- Wrap realtime session files
- Wrap realtimeSession methods in all service implementations
- Maintain full functionality on Apple platforms

Fixes Linux CI build errors.

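The conditional-compilation guard mentioned above follows this general shape. The sketch is illustrative only: `makeRealtimeAudioEngine`, `realtimeAudioSupported`, and `realtimeSupportSummary` are hypothetical names, not symbols from the package.

```swift
#if canImport(AVFoundation)
import AVFoundation

/// Compiled only where AVFoundation exists (Apple platforms).
func makeRealtimeAudioEngine() -> AVAudioEngine {
    AVAudioEngine()
}

let realtimeAudioSupported = true
#else
/// On Linux the audio code above is never seen by the compiler,
/// so the package still builds without AVFoundation.
let realtimeAudioSupported = false
#endif

/// Code outside the guard can branch on the flag at runtime.
func realtimeSupportSummary() -> String {
    realtimeAudioSupported
        ? "Realtime audio: available"
        : "Realtime audio: unavailable on this platform"
}
```

Callers that reference guarded symbols must themselves sit inside the same `#if` block, which is why the commit also wraps the realtimeSession methods in every service implementation.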
The nonisolated modifier cannot be applied to enum declarations. This fixes build errors on Swift 6.0.1 (Linux CI).

…sage
Removed the custom initializer that was assigning values to the deprecated status and statusDetails fields. The struct now relies on the synthesized Decodable initializer, which still decodes these deprecated fields from JSON for backward compatibility, but we no longer actively assign to them.

This resolves CI warnings/errors about using deprecated fields while maintaining full backward compatibility with API responses.

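A small sketch of the pattern described above, assuming a struct with two deprecated optional fields. `LegacyItemSketch` and `roundTrip` are hypothetical and are not the struct changed in this commit; the sketch relies on the compiler-synthesized `Codable` members, so no hand-written code touches the deprecated properties.

```swift
import Foundation

/// Hypothetical model with deprecated fields that must still decode.
struct LegacyItemSketch: Codable {
    let id: String

    @available(*, deprecated, message: "Kept only so older payloads still decode")
    let status: String?

    @available(*, deprecated, message: "Kept only so older payloads still decode")
    let statusDetails: String?

    enum CodingKeys: String, CodingKey {
        case id
        case status
        case statusDetails = "status_details"
    }
    // No custom init(from:): the synthesized one reads every field,
    // including the deprecated ones.
}

/// Decodes and re-encodes a payload, returning the result as a dictionary
/// so the deprecated values can be inspected without referencing them in source.
func roundTrip(_ json: String) throws -> [String: Any] {
    let item = try JSONDecoder().decode(LegacyItemSketch.self, from: Data(json.utf8))
    let encoded = try JSONEncoder().encode(item)
    return (try JSONSerialization.jsonObject(with: encoded) as? [String: Any]) ?? [:]
}
```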
The nonisolated modifier cannot be applied to struct or enum declarations. Removed it from:
- OpenAIRealtimeInputAudioBufferSpeechStarted
- OpenAIRealtimeMessage
- OpenAIRealtimeResponseFunctionCallArgumentsDone

These types are already marked as Sendable which is the correct way to make them safe for concurrent use.

Fixes Swift 6.0.1 compilation errors on Linux CI.

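As a rough illustration of relying on `Sendable` instead of a type-level `nonisolated`, the sketch below declares plain value types and checks the conformance at compile time. `SpeechStartedSketch`, `PlayerErrorSketch`, and `requireSendable` are hypothetical names, and the `audio_start_ms` field is an assumption about the payload.

```swift
import Foundation

/// A value type made of Sendable members needs no isolation modifier;
/// the Sendable conformance is what permits sharing across actors.
struct SpeechStartedSketch: Decodable, Sendable {
    let audioStartMs: Int

    enum CodingKeys: String, CodingKey {
        case audioStartMs = "audio_start_ms" // assumed field name
    }
}

/// Error enums follow the same rule.
enum PlayerErrorSketch: Error, Sendable, Equatable {
    case couldNotStart(reason: String)
}

/// Compiles only when T is Sendable, so a call is a static check.
func requireSendable<T: Sendable>(_ value: T) -> T { value }

// Usage: both calls type-check because the types conform to Sendable.
let decoded = try! JSONDecoder().decode(
    SpeechStartedSketch.self, from: Data(#"{"audio_start_ms": 640}"#.utf8))
let checkedEvent = requireSendable(decoded)
let checkedError = requireSendable(PlayerErrorSketch.couldNotStart(reason: "no input device"))
```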
Moved AudioToolbox cleanup operations (AudioOutputUnitStop, AudioUnitUninitialize, AudioComponentInstanceDispose) to a background queue to prevent priority inversion where a User-initiated QoS thread waits on a Default QoS thread.

The issue occurred because:
- The stop() method runs on @RealtimeActor
- AudioComponentInstanceDispose is a synchronous C API that may block
- The audio render callback runs on a high-priority real-time thread
- This created a priority inversion during cleanup

Solution:
- Capture the AudioUnit reference immediately
- Clear the audioUnit property on RealtimeActor (no blocking)
- Dispatch the actual AudioToolbox cleanup to .utility QoS queue
- AudioToolbox cleanup APIs are thread-safe, making this safe

This improves responsiveness when stopping audio capture and eliminates the priority inversion warning.

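The capture, clear, and dispatch steps listed above can be sketched without AudioToolbox. In this illustration `FakeAudioUnit` stands in for a real AudioUnit handle, and `SampleVendorSketch`, its `onCleanup` callback, and the queue label are hypothetical; the real code runs `stop()` on `RealtimeActor`, which is omitted here to keep the sketch portable.

```swift
import Foundation
import Dispatch

/// Stand-in for an AudioUnit handle; `stopAndDispose()` models the blocking C calls.
final class FakeAudioUnit: @unchecked Sendable {
    private let lock = NSLock()
    private var disposed = false

    var isDisposed: Bool {
        lock.lock(); defer { lock.unlock() }
        return disposed
    }

    func stopAndDispose() {
        lock.lock(); disposed = true; lock.unlock()
    }
}

final class SampleVendorSketch {
    private var audioUnit: FakeAudioUnit?
    private let cleanupQueue = DispatchQueue(label: "example.audio.cleanup", qos: .utility)

    init(audioUnit: FakeAudioUnit) { self.audioUnit = audioUnit }

    var isRunning: Bool { audioUnit != nil }

    /// `onCleanup` is a hypothetical hook, used here only to observe completion.
    func stop(onCleanup: (@Sendable () -> Void)? = nil) {
        // 1. Capture the reference immediately.
        guard let unit = audioUnit else { return }
        // 2. Clear the property right away; nothing blocks on the caller's thread.
        audioUnit = nil
        // 3. Run the potentially blocking teardown on a .utility queue.
        cleanupQueue.async {
            unit.stopAndDispose()
            onCleanup?()
        }
    }
}
```

The caller sees `isRunning == false` as soon as `stop()` returns, while the slow teardown finishes later on the utility queue.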
…and enums
The nonisolated modifier cannot be applied to struct or enum declarations. Removed it from all realtime parameter files:
- OpenAIRealtimeInputAudioBufferAppend
- OpenAIRealtimeConversationItemCreate
- OpenAIRealtimeResponseCreate (and nested Response, Tool)
- OpenAIRealtimeSessionConfiguration (and nested types: ToolChoice, InputAudioTranscription, MaxResponseOutputTokens, Tool, TurnDetection, AudioFormat, Modality, DetectionType, Eagerness)
- OpenAIRealtimeSessionUpdate

These types are already marked as Sendable/Encodable which provides the necessary thread-safety guarantees.

Fixes Swift 6.0.1 compilation errors on Linux CI.

The nonisolated modifier cannot be applied to enum declarations. Removed it from:
- AudioPCMPlayerError
- MicrophonePCMSampleVendorError

These error enums are already marked as Sendable which provides the necessary thread-safety guarantees.

Fixes Swift 6.0.1 compilation errors on Linux CI.

Added detailed documentation for the new Realtime API functionality:
- Updated Table of Contents with Realtime entry under Audio section
- Added comprehensive "Audio Realtime" section (~280 lines) covering:
  - Introduction and platform requirements
  - Session configuration with all available parameters
  - AudioController usage for recording and playback
  - Message types for bidirectional communication
  - Complete basic usage example with 8-step workflow
  - Function calling example with tools
  - Advanced features overview
  - Reference to RealtimeExample

The documentation follows the established README style with:
- Inline commented Swift code blocks
- Practical, copy-paste ready examples
- Platform-specific notes
- Links to OpenAI's official documentation

Documentation covers all key features:
- WebSocket-based bidirectional audio streaming
- Voice activity detection (server VAD and semantic VAD)
- Real-time transcription with Whisper
- Function calling support
- Session configuration updates
- Audio playback without feedback

This duplicate README was accidentally included and was inflating the PR diff by 4,331 lines. The main README.md at the repository root is the authoritative documentation.

The entire SwiftOpenAIExample project was accidentally duplicated at the wrong path: Examples/SwiftOpenAIExample/Examples/...

Removed 72 duplicate files that already exist at the correct location: Examples/SwiftOpenAIExample/SwiftOpenAIExample/...

This cleans up the PR diff significantly.

The Examples/SwiftOpenAIExample project should use the package via Swift Package Manager dependency, not contain its own copy of the entire SwiftOpenAI source code.

Removed 112 duplicate source files that created unnecessary bloat in the PR diff.

- Remove IDEWorkspaceChecks.plist files (Xcode internal state)
- Add IDEWorkspaceChecks.plist to .gitignore
- Remove duplicate .github/workflows/ci.yml from Examples folder

These files don't affect package functionality and create unnecessary noise in commits.

Added mention of the new Realtime API feature in the main package description to better showcase this major capability.

- Changed "Session Configuration" to "Parameters" heading
- Changed "Message Types" to "Response" heading
- Added "Supporting Types" section for AudioController and AudioUtils
- Changed "Basic Usage" to "Usage" heading
- Completed message type enum (added missing transcription cases)

Now follows the same Parameters → Response → Usage pattern as other API sections (Audio Transcriptions, Translations, Speech).

- Remove Examples/SwiftOpenAIExample/Tests/ directory (6 duplicate test files)
- Remove testTarget from Package.swift

These test files were accidentally added in commit 3e86ea1 along with other duplicate content. They are exact copies of the main Tests/ directory and not needed in the Examples folder.

Removed from Examples/SwiftOpenAIExample/:
- CONTRIBUTING.md (duplicate of root)
- LICENSE (duplicate of root)
- rules.swiftformat (duplicate of root)
- .gitignore (duplicate of root)
- Package.resolved (SPM lock file, should not be tracked)

Also added Package.resolved to root .gitignore to prevent tracking lock files throughout the project.

These files were accidentally added in commit 3e86ea1 along with other duplicate content.

The Examples/SwiftOpenAIExample/Package.swift was accidentally added and tried to redefine the SwiftOpenAI library. This is not needed because:
- The root Package.swift already defines the SwiftOpenAI library
- The Examples folder uses SwiftOpenAIExample.xcodeproj (Xcode project)
- This would create conflicts with the root package definition