Build your own AI conversation logic with OpenAI, Llama, and other language models
Learn how to bypass Anam’s built-in language models and integrate your own custom LLM for complete control over conversation logic. This guide uses OpenAI as an example, but the pattern works with any LLM provider (Anthropic, Cohere, Hugging Face, Ollama, etc.).
New Feature: Anam now supports server-side custom LLMs where we
handle the LLM calls for you, improving latency and simplifying development. This guide shows the
client-side approach where you manage the LLM calls yourself.
By the end of this guide, you'll have a sophisticated persona application featuring:

- Custom AI Brain powered by your own language model (OpenAI's GPT-4o-mini in this example)
- Streaming Responses with real-time text-to-speech conversion
- Turn-taking Management that handles conversation flow naturally
- Message History Integration that maintains conversation context
- Production Security with proper API key handling and rate limiting
- Error Handling & Recovery for a robust user experience
- Modern UI/UX with loading states and real-time feedback
After completing the initial setup (Steps 1-4), you can extend this foundation by adding features
like conversation memory, different LLM providers, custom system prompts, or specialized AI
behaviors.
This guide uses OpenAI’s GPT-4o-mini as an example custom LLM for demonstration purposes. In
your actual application, you would replace the OpenAI integration with calls to your specific LLM
provider. The core integration pattern remains the same regardless of your LLM choice.
Before diving into the implementation, it’s important to understand how custom LLM integration works with Anam personas. Regardless of your custom LLM provider, the implementation pattern will always follow these steps:
1. Disable Default Brain
The llmId: "CUSTOMER_CLIENT_V1" setting in the session token request disables Anam’s default AI, allowing you to handle all conversation logic.
2. Listen for User Input
The MESSAGE_HISTORY_UPDATED event fires when the user finishes speaking, providing the complete
conversation history including the new user message.
3. Process with Custom LLM
Your server endpoint receives the conversation history and generates a streaming response using
your chosen LLM (OpenAI in this example).
4. Stream to Persona
The LLM response is streamed back to the client and forwarded to the persona using createTalkMessageStream() for natural text-to-speech conversion.
Using these core concepts, we’ll build a simple web application that allows you to chat with your custom LLM-powered persona.
Let’s start by building the foundation with custom LLM integration. This setup creates a web application with four main components:
```
anam-custom-llm-app/
├── server.js       # Express server with streaming LLM endpoint
├── package.json    # Node.js dependencies
├── public/         # Static files served to the browser
│   ├── index.html  # Main HTML page with video element
│   └── script.js   # Client-side JavaScript for persona control
└── .env            # Environment variables
```
1. Create project directory
```bash
mkdir anam-custom-llm-app
cd anam-custom-llm-app
```
2. Initialize Node.js project
```bash
npm init -y
```

This creates a package.json file for managing dependencies.
3. Create public directory
```bash
mkdir public
```
The public folder will contain your HTML and JavaScript files that are served to the browser.
4. Install dependencies
```bash
npm install express dotenv openai
```
We’re installing Express for the server, dotenv for environment variables, and the OpenAI SDK
for custom LLM integration. The Anam SDK will be loaded directly from a CDN in the browser.
5. Configure environment variables
Create a .env file in your project root to store your API keys securely:
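The variable names below are the ones server.js reads in the next step; the values are placeholders to replace with your own keys. Keep this file out of version control.

```
# .env (placeholder values; replace with your own keys)
ANAM_API_KEY=your_anam_api_key_here
OPENAI_API_KEY=your_openai_api_key_here
```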
Create an Express server that handles both session token generation and LLM streaming:
server.js
```javascript
require('dotenv').config();
const express = require('express');
const OpenAI = require('openai');

const app = express();

// Initialize OpenAI client
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

app.use(express.json());
app.use(express.static('public'));

// Session token endpoint with custom brain configuration
app.post('/api/session-token', async (req, res) => {
  try {
    const response = await fetch('https://api.anam.ai/v1/auth/session-token', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${process.env.ANAM_API_KEY}`,
      },
      body: JSON.stringify({
        personaConfig: {
          name: 'Cara',
          avatarId: '30fa96d0-26c4-4e55-94a0-517025942e18',
          voiceId: '6bfbe25a-979d-40f3-a92b-5394170af54b',
          // This disables Anam's default brain and enables custom LLM integration
          llmId: 'CUSTOMER_CLIENT_V1',
        },
      }),
    });
    const data = await response.json();
    res.json({ sessionToken: data.sessionToken });
  } catch (error) {
    console.error('Session token error:', error);
    res.status(500).json({ error: 'Failed to create session' });
  }
});

// Custom LLM streaming endpoint
app.post('/api/chat-stream', async (req, res) => {
  try {
    const { messages } = req.body;

    // Create a streaming response from OpenAI
    const stream = await openai.chat.completions.create({
      model: 'gpt-4o-mini',
      messages: [
        {
          role: 'system',
          content:
            'You are Cara, a helpful AI assistant. Be friendly, concise, and conversational in your responses. Keep responses under 100 words unless specifically asked for detailed information.',
        },
        ...messages,
      ],
      stream: true,
      temperature: 0.7,
    });

    // Set headers for streaming response
    res.setHeader('Content-Type', 'text/event-stream');
    res.setHeader('Cache-Control', 'no-cache');
    res.setHeader('Connection', 'keep-alive');

    // Process the OpenAI stream and forward to client
    for await (const chunk of stream) {
      const content = chunk.choices[0]?.delta?.content || '';
      if (content) {
        // Send each chunk as JSON
        res.write(JSON.stringify({ content }) + '\n');
      }
    }

    res.end();
  } catch (error) {
    console.error('LLM streaming error:', error);
    res.status(500).json({ error: 'An error occurred while streaming response' });
  }
});

app.listen(8000, () => {
  console.log('Server running on http://localhost:8000');
  console.log('Custom LLM integration ready!');
});
```
The key difference here is setting llmId: "CUSTOMER_CLIENT_V1" which disables Anam’s default AI
and enables custom LLM integration. The /api/chat-stream endpoint handles the actual AI
conversation logic.
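On the client, you need an HTML page with a video element and a start button, plus a script that wires the Anam SDK to your two server endpoints. Below is a minimal sketch: the element IDs, the esm.sh CDN import, the exact SDK method names (createClient, streamToVideoElement, addListener, createTalkMessageStream, streamMessageChunk, endMessage), and the message field and role names are assumptions, so verify them against the current Anam SDK reference.

public/index.html

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <title>Custom LLM Persona</title>
  </head>
  <body>
    <!-- The persona's video stream is attached to this element by script.js -->
    <video id="persona-video" autoplay playsinline></video>
    <button id="start-button">Start Conversation</button>
    <!-- Load as a module so the Anam SDK can be imported from a CDN -->
    <script type="module" src="script.js"></script>
  </body>
</html>
```

public/script.js fetches a session token, starts the persona stream, and forwards each user turn to the /api/chat-stream endpoint, piping the streamed chunks into a talk message stream:

```javascript
// Import path and method names below are assumptions; check the Anam SDK docs.
import { createClient, AnamEvent } from 'https://esm.sh/@anam-ai/js-sdk';

let anamClient;

async function startConversation() {
  // 1. Get a session token from our server (which sets llmId: "CUSTOMER_CLIENT_V1")
  const tokenResponse = await fetch('/api/session-token', { method: 'POST' });
  const { sessionToken } = await tokenResponse.json();

  // 2. Create the Anam client and attach the persona's video stream to the page
  anamClient = createClient(sessionToken);
  await anamClient.streamToVideoElement('persona-video');

  // 3. When the user finishes speaking, send the updated history to our LLM endpoint
  anamClient.addListener(AnamEvent.MESSAGE_HISTORY_UPDATED, async (messages) => {
    const lastMessage = messages[messages.length - 1];
    if (!lastMessage || lastMessage.role !== 'user') return; // only respond to user turns

    // Map Anam's history to OpenAI-style roles (field and role names are assumptions)
    const openAiMessages = messages.map((message) => ({
      role: message.role === 'user' ? 'user' : 'assistant',
      content: message.content,
    }));

    // 4. Stream the LLM response from our server into the persona's voice
    const talkStream = anamClient.createTalkMessageStream();
    const response = await fetch('/api/chat-stream', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ messages: openAiMessages }),
    });

    const reader = response.body.getReader();
    const decoder = new TextDecoder();
    let buffer = '';

    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      buffer += decoder.decode(value, { stream: true });

      // The server writes one JSON object per line: { "content": "..." }
      const lines = buffer.split('\n');
      buffer = lines.pop(); // keep any partial line for the next read
      for (const line of lines) {
        if (!line.trim()) continue;
        const { content } = JSON.parse(line);
        talkStream.streamMessageChunk(content, false); // false = more chunks coming
      }
    }

    talkStream.endMessage(); // signal that the persona's turn is complete
  });
}

document.getElementById('start-button').addEventListener('click', startConversation);
```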
Start the server with node server.js, open http://localhost:8000 in your browser, and click “Start Conversation” to begin chatting with your custom LLM-powered persona!
You should see Cara appear and greet you, powered by your custom OpenAI integration. Try having a
conversation - your voice will be transcribed, sent to OpenAI’s GPT-4o-mini, and the response will
be streamed back through the persona’s voice and video.
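If the persona appears but stays silent, you can test the streaming endpoint on its own before debugging the browser code. The request below is a hand-written example payload in the shape the endpoint expects; the -N flag keeps curl from buffering so you can watch the chunks arrive line by line:

```bash
curl -N -X POST http://localhost:8000/api/chat-stream \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Hello! Who are you?"}]}'
```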
Congratulations! You've successfully integrated a custom language model with Anam's persona system. Your application now features:

- Custom AI Brain: Complete control over your persona's intelligence using OpenAI's GPT-4o-mini, with the ability to customize personality, knowledge, and behavior.
- Real-time Streaming: Responses stream naturally from your LLM through the persona's voice, creating fluid conversations without noticeable delays.
- Conversation Context: Full conversation history is maintained and provided to your LLM for contextually aware responses.
- Production Ready: Error handling, retry logic, and proper API key management make this suitable for real-world deployment.
- Extensible Architecture: The modular design allows you to easily swap LLM providers, add custom logic, or integrate with other AI services.