2 changes: 1 addition & 1 deletion Package.resolved

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion Package.swift
@@ -15,7 +15,7 @@ let package = Package(
    ],
    dependencies: [
        .package(url: "https://github.com/livekit/client-sdk-swift.git", from: "2.0.0"),
        .package(url: "https://github.com/devicekit/DeviceKit.git", from: "5.8.0"),
        .package(url: "https://github.com/devicekit/DeviceKit.git", from: "5.6.0"),
    ],
    targets: [
        .target(
376 changes: 215 additions & 161 deletions README.md
@@ -1,161 +1,215 @@
# ElevenLabs Conversational AI Swift SDK

![ElevenLabs ConvAI](https://github.com/user-attachments/assets/ca4fa726-5e98-4bbc-91b2-d055e957df7d)

The ElevenLabs Conversational AI Swift SDK is a framework for integrating ElevenLabs' conversational AI capabilities into your Swift applications. It pairs advanced audio processing with WebSocket communication to create interactive, intelligent conversational voice experiences.

For detailed documentation, visit the [ElevenLabs Swift SDK documentation](https://elevenlabs.io/docs/conversational-ai/libraries/conversational-ai-sdk-swift).

> [!NOTE]
> This library currently focuses on Conversational AI. Support for speech synthesis and other, more general use cases is planned for the future.

## Install

Add the ElevenLabs Conversational AI Swift SDK to your project using Swift Package Manager:

1. Open Your Project in Xcode
   - Navigate to your project directory and open it in Xcode.
2. Add Package Dependency
   - Go to `File` > `Add Packages...`
3. Enter Repository URL
   - Input the following URL: `https://github.com/elevenlabs/ElevenLabsSwift`
4. Select Version
5. Import the SDK
   ```swift
   import ElevenLabsSDK
   ```
6. Ensure `NSMicrophoneUsageDescription` is added to your Info.plist to explain microphone access.

## Usage

### Setting Up a Conversation Session

1. Configure the Session
   - Create a `SessionConfig` with either an `agentId` or `signedUrl` (a signed-URL sketch follows the snippet below).

```swift
let config = ElevenLabsSDK.SessionConfig(agentId: "your-agent-id")
```
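
If your agent is private, step 1 can start from a signed URL instead of an agent ID. A minimal sketch, assuming the `signedUrl` parameter described above; the helper function and backend endpoint below are placeholders for illustration:

```swift
import Foundation

// Hypothetical helper (not part of the SDK): fetch a signed URL from your own backend.
func fetchSignedUrl() async throws -> String {
    guard let url = URL(string: "https://your-backend-api.com/api/signed-url") else {
        throw URLError(.badURL)
    }
    let (data, _) = try await URLSession.shared.data(from: url)
    return String(decoding: data, as: UTF8.self)
}

// `signedUrl` is the alternative to `agentId` mentioned in step 1.
let config = ElevenLabsSDK.SessionConfig(signedUrl: try await fetchSignedUrl())
```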

2. Define Callbacks
   - Implement callbacks to handle the various conversation events.

```swift
var callbacks = ElevenLabsSDK.Callbacks()
callbacks.onConnect = { conversationId in
    print("Connected with ID: \(conversationId)")
}
callbacks.onMessage = { message, role in
    print("\(role.rawValue): \(message)")
}
callbacks.onError = { error, info in
    print("Error: \(error), Info: \(String(describing: info))")
}
callbacks.onStatusChange = { status in
    print("Status changed to: \(status.rawValue)")
}
callbacks.onModeChange = { mode in
    print("Mode changed to: \(mode.rawValue)")
}
callbacks.onVolumeUpdate = { volume in
    print("Input volume: \(volume)")
}
callbacks.onOutputVolumeUpdate = { volume in
    print("Output volume: \(volume)")
}
callbacks.onMessageCorrection = { original, corrected, role in
    print("Message corrected - Original: \(original), Corrected: \(corrected), Role: \(role.rawValue)")
}
```

3. Start the Conversation
   - Initiate the conversation session asynchronously.

```swift
Task {
    do {
        let conversation = try await ElevenLabsSDK.startSession(config: config, callbacks: callbacks)
        // Use the conversation instance as needed
    } catch {
        print("Failed to start conversation: \(error)")
    }
}
```

### Advanced Configuration

1. Using Client Tools

```swift
var clientTools = ElevenLabsSDK.ClientTools()
clientTools.register("weather") { parameters in
    print("Weather parameters received:", parameters)
    // Handle the weather tool call and return a response
    return "The weather is sunny today"
}

let conversation = try await ElevenLabsSDK.startSession(
    config: config,
    callbacks: callbacks,
    clientTools: clientTools
)
```

2. Using Overrides

```swift
let overrides = ElevenLabsSDK.ConversationConfigOverride(
    agent: ElevenLabsSDK.AgentConfig(
        prompt: ElevenLabsSDK.AgentPrompt(prompt: "You are a helpful assistant"),
        language: .en
    )
)

let config = ElevenLabsSDK.SessionConfig(
    agentId: "your-agent-id",
    overrides: overrides
)
```

### Managing the Session

- End Session

```swift
conversation.endSession()
```

- Control Recording

```swift
conversation.startRecording()
conversation.stopRecording()
```

- Send Messages and Updates

```swift
// Send a contextual update to the conversation
conversation.sendContextualUpdate("The user is now in the kitchen")

// Send a user message directly
conversation.sendUserMessage("Hello, how are you?")

// Send user activity signal
conversation.sendUserActivity()
```

- Volume Controls

```swift
// Get current input/output volume levels
let inputVolume = conversation.getInputVolume()
let outputVolume = conversation.getOutputVolume()

// Set conversation volume (0.0 to 1.0)
conversation.conversationVolume = 0.8
```

## Example

Explore examples in the [ElevenLabs Examples repository](https://github.com/elevenlabs/elevenlabs-examples/tree/main/examples/conversational-ai/swift).
# ElevenLabs Conversational AI Swift SDK

<img src="https://github.com/user-attachments/assets/ca4fa726-5e98-4bbc-91b2-d055e957df7d" alt="ElevenLabs ConvAI" width="400">

A Swift SDK for integrating ElevenLabs' conversational AI capabilities into your iOS and macOS applications. Built on top of LiveKit WebRTC for real-time audio streaming and communication.

[![Swift Package Manager](https://img.shields.io/badge/Swift%20Package%20Manager-compatible-brightgreen.svg)](https://github.com/apple/swift-package-manager)
[![Platform](https://img.shields.io/badge/platform-iOS%20%7C%20macOS-lightgrey.svg)](https://github.com/elevenlabs/ElevenLabsSwift)

## Quick Start

### Installation

Add to your project using Swift Package Manager:

```swift
dependencies: [
    .package(url: "https://github.com/elevenlabs/ElevenLabsSwift.git", from: "1.2.0")
]
```
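
Then add the library to your app target's dependencies. A minimal sketch; the product name below is an assumption, so check the package manifest for the exact name:

```swift
targets: [
    .target(
        name: "YourApp",
        dependencies: [
            // Product name assumed to match the package name; verify in Package.swift.
            .product(name: "ElevenLabsSwift", package: "ElevenLabsSwift")
        ]
    )
]
```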

### Basic Usage

```swift
import ElevenLabsSwift

// 1. Configure your session
let config = ElevenLabsSDK.SessionConfig(agentId: "your-agent-id")

// 2. Set up callbacks
var callbacks = ElevenLabsSDK.Callbacks()
callbacks.onConnect = { conversationId in
    print("🟢 Connected: \(conversationId)")
}
callbacks.onMessage = { message, role in
    print("💬 \(role.rawValue): \(message)")
}
callbacks.onError = { error, _ in
    print("❌ Error: \(error)")
}

// 3. Start conversation
Task {
    do {
        let conversation = try await ElevenLabsSDK.startSession(
            config: config,
            callbacks: callbacks
        )

        // Send messages
        conversation.sendUserMessage("Hello!")
        conversation.sendContextualUpdate("User is in the kitchen")

        // Control recording
        conversation.startRecording()
        conversation.stopRecording()

        // End session
        conversation.endSession()
    } catch {
        print("Failed to start conversation: \(error)")
    }
}
```

### Requirements

- iOS 16.0+ / macOS 10.15+
- Swift 5.9+
- Add `NSMicrophoneUsageDescription` to your Info.plist (a runtime-permission sketch follows this list)
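
The sketch below is not part of the SDK; it simply requests microphone permission up front with AVFoundation so a session can capture audio:

```swift
import AVFoundation

// Ask for microphone access before starting a session.
// The NSMicrophoneUsageDescription key above is still required for the prompt to appear.
func ensureMicrophoneAccess() async -> Bool {
    switch AVCaptureDevice.authorizationStatus(for: .audio) {
    case .authorized:
        return true
    case .notDetermined:
        return await AVCaptureDevice.requestAccess(for: .audio)
    default:
        return false
    }
}
```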

## Advanced Features

### Private Agents

For private agents that require authentication, provide a conversation token in your `SessionConfig`.

The conversation token should be generated on your backend with a valid ElevenLabs API key. Do NOT store the API key within your app.

```js
// Node.js (Express) server
app.get("/api/conversation-token", yourAuthMiddleware, async (req, res) => {
  const response = await fetch(
    `https://api.elevenlabs.io/v1/convai/conversation/token?agent_id=${process.env.AGENT_ID}`,
    {
      headers: {
        // Requesting a conversation token requires your ElevenLabs API key
        // Do NOT expose your API key to the client!
        "xi-api-key": process.env.ELEVENLABS_API_KEY,
      },
    }
  );

  if (!response.ok) {
    return res.status(500).send("Failed to get conversation token");
  }

  const body = await response.json();
  // Return JSON so the Swift client below can decode the token
  res.json({ conversationToken: body.token });
});
```

```swift
guard let url = URL(string: "https://your-backend-api.com/api/conversation-token") else {
    throw URLError(.badURL)
}

// Create the request. This is a minimal implementation; a real-world app should add authentication/security headers.
var request = URLRequest(url: url)
request.httpMethod = "GET"

// Make the request
let (data, _) = try await URLSession.shared.data(for: request)

// Parse the response
let response = try JSONDecoder().decode([String: String].self, from: data)
guard let conversationToken = response["conversationToken"] else {
    throw NSError(domain: "TokenError", code: 0, userInfo: [NSLocalizedDescriptionKey: "No token received"])
}

// An agent ID isn't required when providing a conversation token
let config = ElevenLabsSDK.SessionConfig(conversationToken: conversationToken)

let conversation = try await ElevenLabsSDK.startSession(config: config)
```

### Client Tools

Register custom tools that your agent can call:

```swift
var clientTools = ElevenLabsSDK.ClientTools()
clientTools.register("get_weather") { parameters in
    let location = parameters["location"] as? String ?? "Unknown"
    return "The weather in \(location) is sunny, 72°F"
}

let conversation = try await ElevenLabsSDK.startSession(
    config: config,
    callbacks: callbacks,
    clientTools: clientTools
)
```
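
The tool examples above return strings, so structured results can be JSON-encoded before returning. A minimal sketch continuing from the snippet above; the `get_forecast` tool name and its fields are made up for illustration:

```swift
import Foundation

struct Forecast: Codable {
    let location: String
    let summary: String
    let highFahrenheit: Int
}

clientTools.register("get_forecast") { parameters in
    let location = parameters["location"] as? String ?? "Unknown"
    let forecast = Forecast(location: location, summary: "Sunny", highFahrenheit: 72)
    // Encode to a JSON string; fall back to plain text if encoding fails.
    if let data = try? JSONEncoder().encode(forecast) {
        return String(decoding: data, as: UTF8.self)
    }
    return "Forecast for \(location): sunny, high of 72°F"
}
```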

### Agent Configuration

Override agent settings:

```swift
let overrides = ElevenLabsSDK.ConversationConfigOverride(
    agent: ElevenLabsSDK.AgentConfig(
        prompt: ElevenLabsSDK.AgentPrompt(prompt: "You are a helpful cooking assistant"),
        language: .en,
        firstMessage: "Hello! How can I help you cook today?"
    ),
    tts: ElevenLabsSDK.TTSConfig(voiceId: "your-voice-id")
)

let config = ElevenLabsSDK.SessionConfig(
    agentId: "your-agent-id",
    overrides: overrides
)
```

### Audio Controls

```swift
// Volume management
conversation.conversationVolume = 0.8
let inputLevel = conversation.getInputVolume()
let outputLevel = conversation.getOutputVolume()

// Recording controls
conversation.startRecording()
conversation.stopRecording()

// Volume callbacks
callbacks.onVolumeUpdate = { level in
    print("🎤 Input: \(level)")
}
callbacks.onOutputVolumeUpdate = { level in
    print("🔊 Output: \(level)")
}
```

## Architecture

The SDK is built with clean architecture principles:

```
ElevenLabsSDK (Main API)
├── LiveKitConversation (WebRTC Management)
├── RTCLiveKitAudioManager (Audio Streaming)
├── DataChannelManager (Message Handling)
└── NetworkService (Token Management)
```

## Examples

Check out complete examples in the [ElevenLabs Examples repository](https://github.com/elevenlabs/elevenlabs-examples/tree/main/examples/conversational-ai/swift).

## Contributing

We welcome contributions! Please check out our [Contributing Guide](CONTRIBUTING.md) and join us in the [ElevenLabs Discord](https://discord.gg/elevenlabs).

## License

This SDK is licensed under the MIT License. See [LICENSE](LICENSE) for details.

## Support

- 📚 [Documentation](https://elevenlabs.io/docs/conversational-ai/libraries/conversational-ai-sdk-swift)
- 💬 [Discord Community](https://discord.gg/elevenlabs)
- 🐛 [Issues](https://github.com/elevenlabs/ElevenLabsSwift/issues)
- 📧 [Support Email](mailto:[email protected])