Perso Interactive On-Device SDK lets you build interactive AI conversations with photorealistic human avatars, with AI Lip Sync driven by an on-device model. The SDK provides a flexible conversational AI pipeline (STT, LLM, TTS) into which you can integrate your own custom models. Running inference on-device keeps conversations private and latency low.
- Realistic AI Human Avatars: Natural expressions with AI Lip Sync
- Conversational AI Pipeline: Speech-to-Text, Large Language Model, and Text-to-Speech
- Customization: Integrate your own STT, LLM, and TTS models into the Conversational AI Pipeline
- On-Device Processing: Private, low-latency ML inference powered by Apple Neural Engine
- iOS 18.0+ / macOS 15.0+ / visionOS 2.0+
- Swift 6.0+
- Xcode 16.0+
- Open your Xcode project
- Navigate to File > Add Package Dependencies...
- Enter the URL of this repository: https://github.com/perso-ai/perso-interactive-ondevice-sdk-swift
- Choose the version range or specific version
- Click Add Package to add PersoInteractiveOnDeviceSDK to your project
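If you manage dependencies through a `Package.swift` manifest rather than the Xcode dialog, the same steps can be sketched as below. This is a minimal example under stated assumptions: the package and target name `MyApp` is hypothetical, the `from: "1.0.0"` version requirement is a placeholder (check the repository's releases for the actual versions), and the product name is assumed to match the `PersoInteractiveOnDeviceSDK` module name.

```swift
// swift-tools-version: 6.0
import PackageDescription

let package = Package(
    name: "MyApp", // hypothetical package name
    platforms: [
        // Minimum platform versions from the requirements list
        .iOS(.v18), .macOS(.v15), .visionOS(.v2)
    ],
    dependencies: [
        // Placeholder version requirement; not taken from the SDK's release history
        .package(
            url: "https://github.com/perso-ai/perso-interactive-ondevice-sdk-swift",
            from: "1.0.0"
        )
    ],
    targets: [
        .target(
            name: "MyApp", // hypothetical target name
            dependencies: [
                // Product name assumed to match the imported module name
                .product(
                    name: "PersoInteractiveOnDeviceSDK",
                    package: "perso-interactive-ondevice-sdk-swift"
                )
            ]
        )
    ]
)
```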
Once this setup is complete, you can import PersoInteractiveOnDeviceSDK and start using the SDK in your Swift code.
import PersoInteractiveOnDeviceSDK
// 1. Initialize SDK
PersoInteractive.apiKey = "YOUR_API_KEY"
PersoInteractive.computeUnits = .ane
// 2. Load on-device models
try await PersoInteractive.load()
try await PersoInteractive.warmup()
// 3. Fetch and prepare model-style
let modelStyles = try await PersoInteractive.fetchAvailableModelStyles()
// Prefer a model-style that is already downloaded; otherwise fall back to the first one
guard let modelStyle = modelStyles.first(where: { $0.availability == .available })
        ?? modelStyles.first else {
    return
}
// Download model-style resources if needed
if case .unavailable = modelStyle.availability {
    let stream = PersoInteractive.loadModelStyle(with: modelStyle)
    for try await progress in stream {
        if case .progressing(let value) = progress {
            print("Downloading: \(Int(value.fractionCompleted * 100))%")
        }
    }
}
// 4. Configure audio session (iOS/visionOS only)
#if os(iOS) || os(visionOS)
try PersoInteractive.setAudioSession(
    category: .playAndRecord,
    options: [.defaultToSpeaker, .allowBluetooth]
)
#endif
// 5. Fetch features
let sttModels = try await PersoInteractive.fetchAvailableSTTModels()
let llmModels = try await PersoInteractive.fetchAvailableLLMModels()
let ttsModels = try await PersoInteractive.fetchAvailableTTSModels()
let prompts = try await PersoInteractive.fetchAvailablePrompts()
// 6. Create a session
let session = try await PersoInteractive.createSession(
    for: [
        .speechToText(type: sttModels.first!),
        .largeLanguageModel(llmType: llmModels.first!, promptID: prompts.first!.id),
        .textToSpeech(type: ttsModels.first!)
    ],
    modelStyle: modelStyle,
    statusHandler: { status in
        print("Session status: \(status)")
    }
)
// 7. Display AI Human
let videoView = PersoInteractiveVideoView(session: session)
videoView.videoContentMode = .aspectFit
try videoView.start()
// 8. Start conversation
let userMessage = UserMessage(content: "Hello!")
let stream = session.completeChat(message: userMessage)
for try await message in stream {
    if case .assistant(let assistantMessage, _) = message,
       let chunk = assistantMessage.chunks.last {
        try? videoView.push(text: chunk)
    }
}

Start with the Getting Started guide, and explore other articles in the documentation as needed. Check out the Sample Project to see the Perso Interactive SDK in action.
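The quick-start code above force-unwraps `sttModels.first!`, `llmModels.first!`, `prompts.first!`, and `ttsModels.first!`, which crashes if any fetch returns an empty array. One way to handle that more gracefully is sketched below. The `SetupError` type and the `requireFirst` helper are hypothetical names introduced for illustration and are not part of the SDK; the arrays are stand-in data so the example runs without the SDK.

```swift
import Foundation

/// Hypothetical error type for this sketch; not part of the SDK.
enum SetupError: Error, Equatable {
    case nothingAvailable(String)
}

/// Returns the first element, or throws an error naming what was missing.
func requireFirst<T>(_ items: [T], named name: String) throws -> T {
    guard let first = items.first else {
        throw SetupError.nothingAvailable(name)
    }
    return first
}

// Stand-in data; in the quick start these arrays would come from the
// fetchAvailable... calls.
let availableSTT = ["stt-a", "stt-b"]
let availablePrompts: [String] = []

do {
    let stt = try requireFirst(availableSTT, named: "STT model")
    print("Using \(stt)")
    // Throws here because the stand-in prompt list is empty.
    let prompt = try requireFirst(availablePrompts, named: "prompt")
    print("Using \(prompt)")
} catch {
    print("Setup failed: \(error)")
}
```

In the quick start, an expression such as `sttModels.first!` could then become `try requireFirst(sttModels, named: "STT model")`, so an empty list surfaces as a thrown error rather than a runtime crash.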
Perso Interactive SDK for Swift is commercial software. Contact our sales team.