Update app to latest OpenAI API #177

Merged: jamesrochabrun merged 25 commits into main from claude/update-openai-api-011CUydfJyQkbmowkGChjeoK on Nov 12, 2025
Conversation

@jamesrochabrun

Owner

No description provided.

claude and others added 25 commits November 10, 2025 05:41
This commit adds full support for OpenAI's Realtime API, enabling
bidirectional voice conversations with GPT-4o models. The implementation
is based on the proven AIProxySwift library and adapted for SwiftOpenAI.

Key Features:
- Real-time bidirectional audio streaming over WebSockets
- Voice Activity Detection (VAD) for automatic turn-taking
- Audio transcription of user and AI speech
- Function calling support
- Platform-aware audio processing (iOS, macOS, watchOS)
- Automatic echo cancellation and voice processing

New Components:

Audio Infrastructure:
- AudioController: Main controller for audio recording and playback
- AudioPCMPlayer: Plays PCM16 audio from OpenAI
- MicrophonePCMSampleVendor: Protocol and implementations for mic input
  - AVAudioEngine-based implementation (headphones)
  - AudioToolbox-based implementation (speakers)
- AudioUtils: Platform-specific audio utilities

Realtime API:
- OpenAIRealtimeSession: Manages WebSocket connection and message flow
- OpenAIRealtimeSessionConfiguration: Session configuration options
- OpenAIRealtimeMessage: All message types from server
- RealtimeActor: Global actor for thread-safe audio operations

Parameters:
- OpenAIRealtimeSessionUpdate
- OpenAIRealtimeInputAudioBufferAppend
- OpenAIRealtimeResponseCreate
- OpenAIRealtimeConversationItemCreate

Response Models:
- OpenAIRealtimeMessage (enum with all event types)
- OpenAIRealtimeInputAudioBufferSpeechStarted
- OpenAIRealtimeResponseFunctionCallArgumentsDone

Shared Utilities:
- OpenAIJSONValue: Type-safe JSON handling for tool schemas
- OpenAIError: Extended error types
- AudioController: Public API for audio management

Service Integration:
- Added realtimeSession() method to OpenAIService protocol
- Implemented WebSocket connection setup in DefaultOpenAIService

Examples:
- Complete RealtimeExample with usage documentation
- README with configuration options and troubleshooting

Credit:
Implementation based on AIProxySwift by Lou Zell
https://github.com/lzell/AIProxySwift
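
As an illustration of how a parameter type like those listed above could map onto a WebSocket text frame, here is a minimal sketch. The struct name `InputAudioBufferAppendSketch` and the helper `encodeEvent` are hypothetical and are not taken from the PR; the sketch assumes the event is sent as JSON with a `type` discriminator and a base64 `audio` field.

```swift
import Foundation

// Hypothetical stand-in for a client event type such as
// OpenAIRealtimeInputAudioBufferAppend; the real struct may differ.
struct InputAudioBufferAppendSketch: Encodable, Sendable {
    // Event discriminator sent to the server.
    let type = "input_audio_buffer.append"
    // Base64-encoded audio bytes.
    let audio: String
}

// Serializes the event to a JSON string suitable for a WebSocket text frame.
// Sorted keys keep the output deterministic.
func encodeEvent(_ event: InputAudioBufferAppendSketch) throws -> String {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.sortedKeys]
    let data = try encoder.encode(event)
    return String(decoding: data, as: UTF8.self)
}
```

Keeping each event as a small `Encodable` value type is one way to keep the wire format in a single place; the actual types in the PR may carry more fields.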

Mark the initializer as nonisolated to allow creation from
outside the RealtimeActor context. Also mark receiveMessage()
as nonisolated since it only sets up callbacks.

This fixes the compilation error when creating a realtime session
from DefaultOpenAIService.realtimeSession().
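
The pattern described in this commit can be sketched as follows. `SketchRealtimeActor`, `SessionSketch`, and `makeSession` are hypothetical names used only for illustration; the sketch assumes a class isolated to a global actor whose initializer touches nothing but immutable state.

```swift
// Hypothetical global actor standing in for RealtimeActor.
@globalActor
actor SketchRealtimeActor {
    static let shared = SketchRealtimeActor()
}

// All members are isolated to SketchRealtimeActor unless marked otherwise.
@SketchRealtimeActor
final class SessionSketch {
    let name: String
    private var messagesSent = 0

    // nonisolated: callable synchronously from any context.
    nonisolated init(name: String) {
        self.name = name
    }

    // Still isolated: callers outside the actor must use `await`.
    func send() -> Int {
        messagesSent += 1
        return messagesSent
    }
}

// A synchronous, non-isolated factory, similar in spirit to a
// service method that creates a session.
func makeSession() -> SessionSketch {
    SessionSketch(name: "demo")
}
```

Without `nonisolated` on the initializer, `makeSession()` would need to be `async` or itself isolated to the actor.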

Added fatalError placeholders for realtimeSession() to:
- AIProxyService
- DefaultOpenAIAzureService
- LocalModelService

These services now conform to the OpenAIService protocol.
The Realtime API is only fully implemented in DefaultOpenAIService.
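
A minimal sketch of the placeholder approach, with hypothetical names (`ServiceSketch`, `FullService`, `PlaceholderService`) and a simplified signature that is not the real `OpenAIService` requirement:

```swift
// Hypothetical, simplified protocol; the real requirement differs.
protocol ServiceSketch {
    func chat(_ prompt: String) -> String
    func realtimeSession(model: String) -> String
}

// Fully implements both requirements.
struct FullService: ServiceSketch {
    func chat(_ prompt: String) -> String { "echo: \(prompt)" }
    func realtimeSession(model: String) -> String { "session:\(model)" }
}

// Satisfies the protocol so the type keeps compiling,
// but traps at runtime if the unsupported method is called.
struct PlaceholderService: ServiceSketch {
    func chat(_ prompt: String) -> String { "echo: \(prompt)" }
    func realtimeSession(model: String) -> String {
        fatalError("realtimeSession is not implemented for PlaceholderService")
    }
}
```

The trade-off is that misuse surfaces as a crash rather than a compile-time or thrown error, so callers need to know which services support the Realtime API.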

Changed AudioUtils enum and its methods to public so users can:
- Access base64EncodeAudioPCMBuffer() for encoding audio buffers
- Check headphonesConnected status if needed

This is required for the RealtimeExample code.
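
For readers unfamiliar with what encoding an audio buffer involves, the following Foundation-only sketch shows the general idea on raw samples. It is not the implementation of `base64EncodeAudioPCMBuffer()`, which operates on an AVFoundation buffer; the function `base64EncodePCM16` is hypothetical and assumes 16-bit little-endian PCM.

```swift
import Foundation

// Hypothetical helper: packs 16-bit samples as little-endian bytes
// and returns them as a base64 string.
func base64EncodePCM16(_ samples: [Int16]) -> String {
    var data = Data(capacity: samples.count * 2)
    for sample in samples {
        let bits = UInt16(bitPattern: sample)
        data.append(UInt8(bits & 0xFF))  // low byte first
        data.append(UInt8(bits >> 8))    // then high byte
    }
    return data.base64EncodedString()
}
```

Writing the bytes explicitly keeps the result independent of the host's byte order.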

Apply automatic formatting fixes to resolve CI lint failures.
Fixes formatting in 26 files including audio controller, realtime
session, and service implementations.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Apply formatting fix to debug logging statement.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Remove consecutive blank lines and trailing space.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Wrap all audio and realtime API code with #if canImport(AVFoundation)
to support Linux builds where AVFoundation is not available.

Changes:
- Wrap audio files (AudioPCMPlayer, AudioUtils, AudioController, etc.)
- Wrap realtime session files
- Wrap realtimeSession methods in all service implementations
- Maintain full functionality on Apple platforms

Fixes Linux CI build errors.
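
The conditional-compilation guard described above can be sketched like this. The function names `realtimeAudioAvailable` and `supportedFeatures` are hypothetical; only the `#if canImport(AVFoundation)` check itself comes from the commit.

```swift
#if canImport(AVFoundation)
import AVFoundation
#endif

// Reports whether the audio-dependent code path was compiled in.
func realtimeAudioAvailable() -> Bool {
    #if canImport(AVFoundation)
    return true
    #else
    return false
    #endif
}

// Hypothetical feature list: the realtime entry only exists
// on platforms where AVFoundation can be imported.
func supportedFeatures() -> [String] {
    var features = ["chat"]
    #if canImport(AVFoundation)
    features.append("realtime")
    #endif
    return features
}
```

On Linux the guarded branches are skipped at compile time, so no AVFoundation symbol is referenced.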

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
The nonisolated modifier cannot be applied to enum declarations.
This fixes build errors on Swift 6.0.1 (Linux CI).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…sage

Removed the custom initializer that was assigning values to the deprecated
status and statusDetails fields. The struct now relies on the synthesized
Decodable initializer, which still decodes these deprecated fields from
JSON for backward compatibility, but we no longer actively assign to them.

This resolves CI warnings/errors about using deprecated fields while
maintaining full backward compatibility with API responses.
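
A minimal sketch of the idea, assuming a struct with one deprecated optional field. `MessageSketch` and `decodeMessage` are hypothetical names, and the field types are simplified; the struct affected by this commit is not named in the truncated title.

```swift
import Foundation

// Hypothetical response model with a deprecated field.
struct MessageSketch: Decodable {
    let id: String

    // Still decoded from JSON by the synthesized initializer,
    // but no hand-written code assigns to it.
    @available(*, deprecated, message: "No longer populated by newer API versions")
    let status: String?
}

// Decodes a message from a JSON string using the synthesized Decodable conformance.
func decodeMessage(_ json: String) throws -> MessageSketch {
    try JSONDecoder().decode(MessageSketch.self, from: Data(json.utf8))
}
```

Because no custom `init` mentions `status`, the only references to the deprecated field are in compiler-generated code.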

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
The nonisolated modifier cannot be applied to struct or enum declarations.
Removed it from:
- OpenAIRealtimeInputAudioBufferSpeechStarted
- OpenAIRealtimeMessage
- OpenAIRealtimeResponseFunctionCallArgumentsDone

These types are already marked as Sendable which is the correct way to
make them safe for concurrent use.

Fixes Swift 6.0.1 compilation errors on Linux CI.
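
To illustrate why `Sendable` is sufficient here, the sketch below uses hypothetical types (`SpeechStartedSketch`, `RealtimeEventSketch`) rather than the real ones. It assumes value types whose stored members are themselves `Sendable`, so they can be captured by a `@Sendable` closure without any isolation annotation.

```swift
// Hypothetical payload; a struct of Sendable members can be Sendable.
struct SpeechStartedSketch: Sendable {
    let audioStartMs: Int
}

// Hypothetical event enum; associated values must also be Sendable.
enum RealtimeEventSketch: Sendable {
    case sessionCreated
    case speechStarted(SpeechStartedSketch)
}

func describe(_ event: RealtimeEventSketch) -> String {
    switch event {
    case .sessionCreated:
        return "session created"
    case .speechStarted(let payload):
        return "speech started at \(payload.audioStartMs) ms"
    }
}

// The returned closure may run on any thread; capturing `event`
// is allowed because the enum is Sendable.
func makeHandler(for event: RealtimeEventSketch) -> @Sendable () -> String {
    { describe(event) }
}
```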

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Moved AudioToolbox cleanup operations (AudioOutputUnitStop,
AudioUnitUninitialize, AudioComponentInstanceDispose) to a background
queue to prevent priority inversion where a User-initiated QoS thread
waits on a Default QoS thread.

The issue occurred because:
- The stop() method runs on @RealtimeActor
- AudioComponentInstanceDispose is a synchronous C API that may block
- The audio render callback runs on a high-priority real-time thread
- This created a priority inversion during cleanup

Solution:
- Capture the AudioUnit reference immediately
- Clear the audioUnit property on RealtimeActor (no blocking)
- Dispatch the actual AudioToolbox cleanup to .utility QoS queue
- AudioToolbox cleanup APIs are thread-safe, making this safe

This improves responsiveness when stopping audio capture and eliminates
the priority inversion warning.
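
The capture, clear, then dispatch sequence above can be sketched without AudioToolbox. `AudioHandleSketch` and `VendorSketch` are hypothetical stand-ins for the AudioUnit and the microphone vendor, and the sketch omits the `@RealtimeActor` isolation that the real `stop()` has.

```swift
import Foundation

// Hypothetical stand-in for an AudioUnit whose disposal may block.
final class AudioHandleSketch: @unchecked Sendable {
    private let lock = NSLock()
    private var disposed = false

    func dispose() {
        lock.lock()
        disposed = true
        lock.unlock()
    }

    var isDisposed: Bool {
        lock.lock()
        defer { lock.unlock() }
        return disposed
    }
}

final class VendorSketch {
    private var handle: AudioHandleSketch?
    // Serial queue at .utility QoS, used only for teardown work.
    private let cleanupQueue = DispatchQueue(label: "sketch.audio.cleanup", qos: .utility)

    init(handle: AudioHandleSketch) { self.handle = handle }

    var hasHandle: Bool { handle != nil }

    func stop() {
        // 1. Capture the reference immediately.
        guard let captured = handle else { return }
        // 2. Clear the property without blocking the caller.
        handle = nil
        // 3. Run the potentially blocking cleanup off the caller's thread.
        cleanupQueue.async { captured.dispose() }
    }

    // Test helper: blocks until previously queued cleanup has run.
    func waitForCleanup() { cleanupQueue.sync {} }
}
```

After `stop()` returns, the vendor no longer holds the handle even though disposal may still be pending on the cleanup queue.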

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…and enums

The nonisolated modifier cannot be applied to struct or enum declarations.
Removed it from all realtime parameter files:
- OpenAIRealtimeInputAudioBufferAppend
- OpenAIRealtimeConversationItemCreate
- OpenAIRealtimeResponseCreate (and nested Response, Tool)
- OpenAIRealtimeSessionConfiguration (and nested types:
  ToolChoice, InputAudioTranscription, MaxResponseOutputTokens,
  Tool, TurnDetection, AudioFormat, Modality, DetectionType, Eagerness)
- OpenAIRealtimeSessionUpdate

These types are already marked as Sendable/Encodable which provides
the necessary thread-safety guarantees.

Fixes Swift 6.0.1 compilation errors on Linux CI.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
The nonisolated modifier cannot be applied to enum declarations.
Removed it from:
- AudioPCMPlayerError
- MicrophonePCMSampleVendorError

These error enums are already marked as Sendable which provides
the necessary thread-safety guarantees.

Fixes Swift 6.0.1 compilation errors on Linux CI.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Added detailed documentation for the new Realtime API functionality:

- Updated Table of Contents with Realtime entry under Audio section
- Added comprehensive "Audio Realtime" section (~280 lines) covering:
  - Introduction and platform requirements
  - Session configuration with all available parameters
  - AudioController usage for recording and playback
  - Message types for bidirectional communication
  - Complete basic usage example with 8-step workflow
  - Function calling example with tools
  - Advanced features overview
  - Reference to RealtimeExample

The documentation follows the established README style with:
- Inline commented Swift code blocks
- Practical, copy-paste ready examples
- Platform-specific notes
- Links to OpenAI's official documentation

Documentation covers all key features:
- WebSocket-based bidirectional audio streaming
- Voice activity detection (server VAD and semantic VAD)
- Real-time transcription with Whisper
- Function calling support
- Session configuration updates
- Audio playback without feedback

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This duplicate README was accidentally included and was inflating
the PR diff by 4,331 lines. The main README.md at the repository
root is the authoritative documentation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
The entire SwiftOpenAIExample project was accidentally duplicated
at the wrong path: Examples/SwiftOpenAIExample/Examples/...

Removed 72 duplicate files that already exist at the correct location:
Examples/SwiftOpenAIExample/SwiftOpenAIExample/...

This cleans up the PR diff significantly.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
The Examples/SwiftOpenAIExample project should use the package via
Swift Package Manager dependency, not contain its own copy of the
entire SwiftOpenAI source code.

Removed 112 duplicate source files that created unnecessary bloat
in the PR diff.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Remove IDEWorkspaceChecks.plist files (Xcode internal state)
- Add IDEWorkspaceChecks.plist to .gitignore
- Remove duplicate .github/workflows/ci.yml from Examples folder

These files don't affect package functionality and create unnecessary noise in commits.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Added mention of the new Realtime API feature in the main package description to better showcase this major capability.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Changed "Session Configuration" to "Parameters" heading
- Changed "Message Types" to "Response" heading
- Added "Supporting Types" section for AudioController and AudioUtils
- Changed "Basic Usage" to "Usage" heading
- Completed message type enum (added missing transcription cases)

Now follows the same Parameters → Response → Usage pattern as other API sections (Audio Transcriptions, Translations, Speech).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Remove Examples/SwiftOpenAIExample/Tests/ directory (6 duplicate test files)
- Remove testTarget from Package.swift

These test files were accidentally added in commit 3e86ea1 along with other duplicate content. They are exact copies of the main Tests/ directory and not needed in the Examples folder.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Removed from Examples/SwiftOpenAIExample/:
- CONTRIBUTING.md (duplicate of root)
- LICENSE (duplicate of root)
- rules.swiftformat (duplicate of root)
- .gitignore (duplicate of root)
- Package.resolved (SPM lock file, should not be tracked)

Also added Package.resolved to root .gitignore to prevent tracking lock files throughout the project.

These files were accidentally added in commit 3e86ea1 along with other duplicate content.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
The Examples/SwiftOpenAIExample/Package.swift was accidentally added and tried to redefine the SwiftOpenAI library. This is not needed because:
- The root Package.swift already defines the SwiftOpenAI library
- The Examples folder uses SwiftOpenAIExample.xcodeproj (Xcode project)
- This would create conflicts with the root package definition

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@jamesrochabrun merged commit eacb892 into main on Nov 12, 2025
3 checks passed
@jamesrochabrun deleted the claude/update-openai-api-011CUydfJyQkbmowkGChjeoK branch on November 12, 2025 08:40
2 participants