Build a production-ready web application that showcases the full capabilities of Anam personas. This comprehensive guide walks you through creating an interactive AI assistant with modern UI components, real-time event handling, chat history, talk commands, and production-grade security practices.
How to use this guide
This guide takes an additive approach, covering a wide range of Anam features and implementation patterns. Each section builds on the previous one, so you can add functionality incrementally.
We intentionally provide detailed, complete code examples to make integration with AI coding assistants seamless. Use the buttons in the top-right corner of this page to open the content in your preferred LLM or copy the entire guide for easy reference.
You likely won’t need every component shown in this guide. After completing the initial setup (Steps 1-4), use the right-hand navigation to jump directly to the specific features you want to implement.
Start with the basic setup, then selectively add the advanced features that match your use case.
What You’ll Build
By the end of this guide, you’ll have a sophisticated persona application featuring:
- Modern UI/UX with loading states, connection status, and responsive design
- Chat History Panel showing conversation transcripts in real-time
- Advanced Event Handling for connection management and user feedback
- Talk Commands for programmatic persona control
- Audio Management with mute/unmute functionality
- Production Security with proper authentication and rate limiting
- Error Handling & Retry Logic for robust user experience
Prerequisites
- Node.js (version 18 or higher) and npm installed
- Understanding of modern JavaScript/TypeScript
- An Anam API key
- Basic knowledge of Express.js and modern web development
- A microphone and speakers for voice interaction
Basic App
Let’s start by building a simple working version, then enhance it with advanced features. This foundational setup creates a minimal web application with three main files:
anam-production-app/
├── server.js # Express server for secure API key handling
├── package.json # Node.js dependencies
├── public/ # Static files served to the browser
│ ├── index.html # Main HTML page with video element
│ └── script.js # Client-side JavaScript for persona control
└── .env # Environment variables
Create project directory
mkdir anam-production-app
cd anam-production-app
Initialize Node.js project
npm init -y
This creates a package.json file for managing dependencies.
Create public directory
mkdir public
The public folder will contain your HTML and JavaScript files that are served to the browser.
Install dependencies
npm install express dotenv
We’re installing Express for the server and dotenv for environment variable management. The Anam SDK will be loaded directly from a CDN in the browser.
Configure environment variables
Create a .env file in your project root to store your API key securely:
ANAM_API_KEY=your-api-key-here
Replace your-api-key-here with your actual Anam API key. Never commit this file to version control.
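A missing key would otherwise only surface later as a failed session request, so it can help to check the environment when the server starts. The following is a minimal sketch, assuming a hypothetical helper named requireEnv (it is not part of dotenv or the Anam SDK):

```javascript
// Hypothetical helper: fail fast when a required environment variable is missing.
// `env` defaults to process.env and is a parameter only to keep the function testable.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (typeof value !== 'string' || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Example usage in server.js, after require('dotenv').config():
// const apiKey = requireEnv('ANAM_API_KEY');
```

Throwing at startup makes a misconfigured deployment obvious immediately, rather than on the first user request.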
Step 1: Set up your server
Create a basic Express server to handle session token generation:
require('dotenv').config();
const express = require('express');

const app = express();
app.use(express.json());
app.use(express.static('public'));

// Exchange the server-side API key for a short-lived session token
app.post('/api/session-token', async (req, res) => {
  try {
    const response = await fetch("https://api.anam.ai/v1/auth/session-token", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${process.env.ANAM_API_KEY}`,
      },
      body: JSON.stringify({
        personaConfig: {
          name: "Cara",
          avatarId: "30fa96d0-26c4-4e55-94a0-517025942e18",
          voiceId: "6bfbe25a-979d-40f3-a92b-5394170af54b",
          brainType: "ANAM_GPT_4O_MINI_V1",
          systemPrompt: "You are Cara, a helpful AI assistant. Be friendly, concise, and helpful in your responses.",
        },
      }),
    });
    // Treat upstream failures as errors instead of returning an undefined token
    if (!response.ok) {
      throw new Error(`Anam API responded with status ${response.status}`);
    }
    const data = await response.json();
    res.json({ sessionToken: data.sessionToken });
  } catch (error) {
    console.error('Session token request failed:', error);
    res.status(500).json({ error: 'Failed to create session' });
  }
});

app.listen(8000, () => {
  console.log('Server running on http://localhost:8000');
});
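The feature list at the top of this guide mentions rate limiting, but the server above does not implement it. One possible approach is a small fixed-window limiter written as Express-style middleware. This is a sketch under stated assumptions: createRateLimiter and its options are hypothetical names, the state is kept in memory, and a maintained package or a shared store would be a better fit for a multi-process deployment.

```javascript
// Hypothetical fixed-window rate limiter, written as Express-style middleware.
// State lives in memory: it resets on restart and is not shared across processes.
function createRateLimiter({ windowMs = 60_000, max = 10, now = Date.now } = {}) {
  const hits = new Map(); // client key -> { count, windowStart }

  return function rateLimit(req, res, next) {
    const key = req.ip;
    const current = now();
    const entry = hits.get(key);

    // Start a new window for unseen clients or once the previous window has expired
    if (!entry || current - entry.windowStart >= windowMs) {
      hits.set(key, { count: 1, windowStart: current });
      return next();
    }

    entry.count += 1;
    if (entry.count > max) {
      return res.status(429).json({ error: 'Too many requests' });
    }
    return next();
  };
}

// Example usage in server.js (register before the route handler):
// app.use('/api/session-token', createRateLimiter({ windowMs: 60_000, max: 10 }));
```

Entries are never pruned in this sketch, so a long-running server would also need a periodic cleanup of expired windows.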
Step 2: Set up your HTML
Create a simple HTML page with a video element and basic controls:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Anam AI Assistant - Production App</title>
</head>
<body>
<div style="text-align: center; padding: 20px;">
<h1>Chat with Cara</h1>
<video id="persona-video" autoplay playsinline style="max-width: 100%; border-radius: 8px;"></video>
<div style="margin-top: 20px;">
<button id="start-button">Start Chat</button>
<button id="stop-button" disabled>Stop Chat</button>
</div>
</div>
<script type="module" src="script.js"></script>
</body>
</html>
Step 3: Initialize the Anam client
Create the client-side JavaScript to control your persona:
import { createClient } from "https://esm.sh/@anam-ai/js-sdk@latest";
let anamClient = null;
// Get DOM elements
const startButton = document.getElementById("start-button");
const stopButton = document.getElementById("stop-button");
const videoElement = document.getElementById("persona-video");
async function startChat() {
try {
startButton.disabled = true;
// Get session token from your server
const response = await fetch("/api/session-token", {
method: "POST",
});
const { sessionToken } = await response.json();
// Create the Anam client
anamClient = createClient(sessionToken);
// Start streaming to the video element
await anamClient.streamToVideoElement("persona-video");
// Update button states
startButton.disabled = true;
stopButton.disabled = false;
console.log("Chat started successfully!");
} catch (error) {
console.error("Failed to start chat:", error);
startButton.disabled = false;
}
}
function stopChat() {
if (anamClient) {
// Disconnect the client
anamClient.stopStreaming();
anamClient = null;
// Clear video element
videoElement.srcObject = null;
// Update button states
startButton.disabled = false;
stopButton.disabled = true;
console.log("Chat stopped.");
}
}
// Add event listeners
startButton.addEventListener("click", startChat);
stopButton.addEventListener("click", stopChat);
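The startChat function above destructures sessionToken without checking whether the request succeeded, so a server error would pass undefined to createClient. A minimal sketch of a stricter token request follows; fetchSessionToken and its fetchImpl parameter are hypothetical names introduced here for illustration.

```javascript
// Hypothetical helper: request a session token and reject on any unusable response.
// `fetchImpl` defaults to the global fetch and is injectable for testing.
async function fetchSessionToken(fetchImpl = fetch) {
  const response = await fetchImpl("/api/session-token", { method: "POST" });
  if (!response.ok) {
    throw new Error(`Session token request failed with status ${response.status}`);
  }
  const { sessionToken } = await response.json();
  if (!sessionToken) {
    throw new Error("Server response did not include a sessionToken");
  }
  return sessionToken;
}

// Example usage inside startChat():
// anamClient = createClient(await fetchSessionToken());
```

Because the helper throws, the existing catch block in startChat would handle the failure and re-enable the Start button.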
Step 4: Test your basic setup
- Start your server:
node server.js
- Open http://localhost:8000 in your browser
- Click “Start Chat” to begin your conversation!
You should see Cara appear in the video element and be ready to chat through voice interaction.
Now that you have a working basic app, let’s enhance it with production-ready features.
Listening to events
Anam personas communicate through an event system that allows your application to respond to connection changes, conversation updates, and user interactions. Let’s enhance our basic app with event handling to create a more responsive experience.
Understanding the Event System
The Anam SDK uses an event-driven architecture where you can listen for specific events and react accordingly. Here’s the process for listening to events:
Import the event types
Import the specific event types you want to listen to:
import { AnamEvent } from "https://esm.sh/@anam-ai/js-sdk@latest/dist/module/types";
Define your event listener function
Create a function to handle the event:
function onSessionReady() {
console.log('Session Ready!');
// Do something when the session is ready
}
Add your event listener to the client
Register your listener function with the client:
anamClient.addListener(AnamEvent.SESSION_READY, onSessionReady);
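The same listener pattern can be wrapped in a promise when you want to wait for a single event and give up after a timeout. This is a sketch, not an SDK feature: waitForEvent is a hypothetical helper, and it assumes the client exposes removeListener(event, fn) alongside the addListener call shown above.

```javascript
// Hypothetical helper: resolve when `eventName` fires, reject after `timeoutMs`.
// Assumes the client exposes addListener(event, fn) and removeListener(event, fn).
function waitForEvent(client, eventName, timeoutMs = 10000) {
  return new Promise((resolve, reject) => {
    const onEvent = (payload) => {
      clearTimeout(timer);
      client.removeListener(eventName, onEvent);
      resolve(payload);
    };
    const timer = setTimeout(() => {
      client.removeListener(eventName, onEvent);
      reject(new Error(`Timed out waiting for ${eventName}`));
    }, timeoutMs);
    client.addListener(eventName, onEvent);
  });
}

// Example usage, registered before streaming starts so the event cannot be missed:
// const ready = waitForEvent(anamClient, AnamEvent.SESSION_READY, 15000);
// await anamClient.streamToVideoElement("persona-video");
// await ready;
```

Removing the listener in both branches keeps a reused client from accumulating stale handlers.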
Now let’s implement this pattern in our app by adding a loading state.
Adding a loading state
Loading states provide important user feedback during connection establishment. Let’s add a simple loading indicator to our HTML:
Step 1: Update the UI
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Anam AI Assistant - Production App</title>
</head>
<body>
<div style="text-align: center; padding: 20px;">
<h1>🤖 Chat with Cara</h1>
<div id="loading-message" style="display: none;">Loading...</div>
<video id="persona-video" autoplay playsinline style="max-width: 100%; border-radius: 8px;"></video>
<div style="margin-top: 20px;">
<button id="start-button">Start Chat</button>
<button id="stop-button" disabled>Stop Chat</button>
</div>
</div>
<script type="module" src="script.js"></script>
</body>
</html>
Step 2: Adding the event listener
Now let’s update our JavaScript to handle the loading state by adding the following to our script.js file:
import { createClient } from "https://esm.sh/@anam-ai/js-sdk@latest";
import { AnamEvent } from "https://esm.sh/@anam-ai/js-sdk@latest/dist/module/types";
let anamClient = null;
// Get DOM elements
const startButton = document.getElementById("start-button");
const stopButton = document.getElementById("stop-button");
const videoElement = document.getElementById("persona-video");
const loadingMessage = document.getElementById("loading-message");
function showLoadingState() {
if (loadingMessage) {
loadingMessage.style.display = 'block';
}
}
function hideLoadingState() {
if (loadingMessage) {
loadingMessage.style.display = 'none';
}
}
async function startChat() {
try {
startButton.disabled = true;
showLoadingState();
// Get session token from your server
const response = await fetch("/api/session-token", {
method: "POST",
});
const { sessionToken } = await response.json();
// Create the Anam client
anamClient = createClient(sessionToken);
// Listen for SESSION_READY event to hide loading state
anamClient.addListener(AnamEvent.SESSION_READY, () => {
console.log('Session is ready!');
hideLoadingState();
startButton.disabled = true;
stopButton.disabled = false;
});
// Start streaming to the video element
await anamClient.streamToVideoElement("persona-video");
console.log("Chat started successfully!");
} catch (error) {
console.error("Failed to start chat:", error);
startButton.disabled = false;
hideLoadingState();
}
}
function stopChat() {
if (anamClient) {
// Disconnect the client
anamClient.stopStreaming();
anamClient = null;
// Clear video element
videoElement.srcObject = null;
// Update button states
startButton.disabled = false;
stopButton.disabled = true;
console.log("Chat stopped.");
}
}
// Add event listeners
startButton.addEventListener("click", startChat);
stopButton.addEventListener("click", stopChat);
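This guide lists retry logic among its goals, while the catch block in startChat only re-enables the button. One way to add retries is a small exponential backoff wrapper around the failing step. The sketch below uses a hypothetical helper named withRetry; the retry count and delays are illustrative defaults, not recommended values.

```javascript
// Hypothetical helper: retry an async operation with exponential backoff.
async function withRetry(operation, { retries = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      if (attempt === retries) break;
      // Wait 500 ms, 1 s, 2 s, ... (with the default baseDelayMs) before the next attempt
      const delayMs = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}

// Example usage: retry only the token request, not the whole startChat flow
// const response = await withRetry(() => fetch("/api/session-token", { method: "POST" }));
```

Note that fetch only rejects on network failures, so the wrapped operation should throw on a non-OK status if HTTP errors are meant to trigger a retry.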
Adding a chat history panel
A common use case for event listeners is to update the UI based on the transcription of the conversation. Let’s look at how we can implement this using Anam events.
Understanding Message Events
Anam provides two key events for tracking conversation transcriptions:
| Event | Description |
|---|---|
| MESSAGE_HISTORY_UPDATED | Provides the complete conversation history each time someone finishes speaking |
| MESSAGE_STREAM_EVENT_RECEIVED | Provides real-time transcription updates as speech occurs |
Let’s start with the MESSAGE_HISTORY_UPDATED event to build a basic chat history.
Step 1: Add chat history UI
First, update your HTML to include a simple chat panel:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Anam AI Assistant - Production App</title>
</head>
<body>
<div style="text-align: center; padding: 20px;">
<h1>🤖 Chat with Cara</h1>
<div id="loading-message" style="display: none;">Loading...</div>
<video id="persona-video" autoplay playsinline style="max-width: 100%; border-radius: 8px;"></video>
<!-- Chat History Panel -->
<div id="chat-history" style="margin-top: 20px; padding: 10px; border: 1px solid #ccc; border-radius: 5px; max-height: 200px; overflow-y: auto;">
<p>Start a conversation to see your chat history</p>
</div>
<div style="margin-top: 20px;">
<button id="start-button">Start Chat</button>
<button id="stop-button" disabled>Stop Chat</button>
</div>
</div>
<script type="module" src="script.js"></script>
</body>
</html>
Step 2: Listen for updates
Now let’s add the event listener to handle complete conversation updates:
import { createClient } from "https://esm.sh/@anam-ai/js-sdk@latest";
import { AnamEvent } from "https://esm.sh/@anam-ai/js-sdk@latest/dist/module/types";
let anamClient = null;
// Get DOM elements
const startButton = document.getElementById("start-button");
const stopButton = document.getElementById("stop-button");
const videoElement = document.getElementById("persona-video");
const loadingMessage = document.getElementById("loading-message");
const chatHistory = document.getElementById("chat-history");
function showLoadingState() {
if (loadingMessage) {
loadingMessage.style.display = 'block';
}
}
function hideLoadingState() {
if (loadingMessage) {
loadingMessage.style.display = 'none';
}
}
function updateChatHistory(messages) {
  if (!chatHistory) return;
  // Clear existing content
  chatHistory.innerHTML = '';
  if (messages.length === 0) {
    chatHistory.innerHTML = '<p>Start a conversation to see your chat history</p>';
    return;
  }
  // Add each message to the chat history
  messages.forEach(message => {
    const isUser = message.role === 'user';
    const messageDiv = document.createElement('div');
    messageDiv.style.marginBottom = '10px';
    messageDiv.style.padding = '5px';
    messageDiv.style.borderRadius = '5px';
    messageDiv.style.backgroundColor = isUser ? '#e3f2fd' : '#f1f8e9';
    // Build the label and text as DOM nodes so transcript text is never parsed as HTML
    const label = document.createElement('strong');
    label.textContent = isUser ? 'You:' : 'Cara:';
    messageDiv.append(label, ` ${message.content}`);
    chatHistory.appendChild(messageDiv);
  });
  // Scroll to bottom
  chatHistory.scrollTop = chatHistory.scrollHeight;
}
async function startChat() {
try {
startButton.disabled = true;
showLoadingState();
// Get session token from your server
const response = await fetch("/api/session-token", {
method: "POST",
});
const { sessionToken } = await response.json();
// Create the Anam client
anamClient = createClient(sessionToken);
// Listen for SESSION_READY event to hide loading state
anamClient.addListener(AnamEvent.SESSION_READY, () => {
console.log('Session is ready!');
hideLoadingState();
startButton.disabled = true;
stopButton.disabled = false;
});
// Listen for MESSAGE_HISTORY_UPDATED to update chat history
anamClient.addListener(AnamEvent.MESSAGE_HISTORY_UPDATED, (messages) => {
console.log('Conversation updated:', messages);
updateChatHistory(messages);
});
// Start streaming to the video element
await anamClient.streamToVideoElement("persona-video");
console.log("Chat started successfully!");
} catch (error) {
console.error("Failed to start chat:", error);
startButton.disabled = false;
hideLoadingState();
}
}
function stopChat() {
if (anamClient) {
// Disconnect the client
anamClient.stopStreaming();
anamClient = null;
// Clear video element and chat history
videoElement.srcObject = null;
updateChatHistory([]);
// Update button states
startButton.disabled = false;
stopButton.disabled = true;
console.log("Chat stopped.");
}
}
// Add event listeners
startButton.addEventListener("click", startChat);
stopButton.addEventListener("click", stopChat);
The MESSAGE_HISTORY_UPDATED event provides an array of message objects with role (“user” or “assistant”) and content properties. This gives you the complete conversation transcript each time someone finishes speaking.
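Because the event delivers the whole conversation as plain objects, the same array can also feed logging or export. The sketch below uses a hypothetical helper named formatTranscript and hand-written sample data; it assumes only the role and content properties described above and attributes every non-user role to the persona.

```javascript
// Hypothetical helper: turn the message array into a plain-text transcript.
// Assumes each message has `role` and `content`, as described above;
// any role other than "user" is attributed to the persona.
function formatTranscript(messages, personaName = "Cara") {
  return messages
    .map(({ role, content }) => `${role === "user" ? "You" : personaName}: ${content}`)
    .join("\n");
}

// Example with hand-written sample data (not real SDK output):
const sample = [
  { role: "user", content: "Hi there" },
  { role: "assistant", content: "Hello! How can I help?" },
];
console.log(formatTranscript(sample));
```

Calling this from the MESSAGE_HISTORY_UPDATED listener would give you a text copy of the conversation each time it changes.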
Step 3: Add real-time transcription
Now let’s enhance the experience by showing live transcription as the persona speaks using the MESSAGE_STREAM_EVENT_RECEIVED event:
<!-- Add this to your HTML after the chat history -->
<div id="live-transcript" style="margin-top: 10px; padding: 10px; background-color: #e6e0ff; border-radius: 5px;">
<strong>Persona Live:</strong> <span id="transcript-text"></span>
</div>
And update your JavaScript to handle real-time events:
// Add this event listener inside your startChat function after the MESSAGE_HISTORY_UPDATED listener
// Listen for real-time transcription events
anamClient.addListener(AnamEvent.MESSAGE_STREAM_EVENT_RECEIVED, (event) => {
const liveTranscript = document.getElementById("live-transcript");
const transcriptText = document.getElementById("transcript-text");
console.log("event", event);
if (event.role === "persona") {
// Show persona speaking in real-time
if (liveTranscript && transcriptText) {
transcriptText.textContent =
transcriptText.textContent + event.content;
}
} else if (event.role === "user") {
// clear the persona live transcript when the user speaks
if (liveTranscript && transcriptText) {
transcriptText.textContent = "";
}
}
});
The MESSAGE_STREAM_EVENT_RECEIVED event fires continuously as speech is being processed. Use event.role to distinguish between ‘persona’ (AI speaking) and ‘user’ (user speaking) events.
Your app now displays both complete conversation history and real-time transcription as the persona speaks!
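The listener's append-or-clear behavior can also be expressed as a pure function, which keeps it separate from the DOM and easy to test. This is a sketch with a hypothetical name, applyStreamEvent, and hand-written sample events; it relies only on the role and content fields used in the listener above.

```javascript
// Hypothetical pure function mirroring the listener above:
// persona chunks are appended, and a user event clears the live transcript.
function applyStreamEvent(transcript, event) {
  if (event.role === "persona") return transcript + event.content;
  if (event.role === "user") return "";
  return transcript; // ignore unknown roles
}

// Example with hand-written sample events (not real SDK output):
const events = [
  { role: "persona", content: "Hello" },
  { role: "persona", content: ", world" },
];
const liveText = events.reduce(applyStreamEvent, "");
console.log(liveText);
```

Inside the listener you would then assign the result to transcriptText.textContent instead of concatenating in place.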
Sending commands
Beyond voice interaction, you can programmatically send messages to your persona using the talk command. This is useful for creating interactive experiences, automated workflows, or custom UI controls.
Using the talk command
The talk command allows you to send text messages that the persona will speak directly. This can be used to instruct the persona to say something in response to a user action other than voice input, such as a button click or when certain UI elements are shown on screen.
You can send a talk command by calling the talk method on the Anam client during a session.
anamClient.talk("Hello, how are you?");
Let’s use this to add a talk command UI to our app.
Step 1: Update the UI
First, update your HTML to include a text input and send button:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Anam AI Assistant - Production App</title>
</head>
<body>
<div style="text-align: center; padding: 20px;">
<h1>🤖 Chat with Cara</h1>
<div id="loading-message" style="display: none;">Loading...</div>
<video id="persona-video" autoplay playsinline style="max-width: 100%; border-radius: 8px;"></video>
<!-- Chat History Panel -->
<div id="chat-history" style="margin-top: 20px; padding: 10px; border: 1px solid #ccc; border-radius: 5px; max-height: 200px; overflow-y: auto;">
<p>Start a conversation to see your chat history</p>
</div>
<!-- Live Transcript -->
<div id="live-transcript" style="margin-top: 10px; padding: 10px; background-color: #e6e0ff; border-radius: 5px;">
<strong>Persona Live:</strong> <span id="transcript-text"></span>
</div>
<!-- Talk Command Panel -->
<div style="margin-top: 20px; padding: 15px; border: 1px solid #ccc; border-radius: 5px;">
<h3 style="margin: 0 0 10px 0;">💬 Send Message</h3>
<div style="display: flex; gap: 10px; align-items: flex-end;">
<textarea
id="message-input"
placeholder="Type a message for Cara to respond to..."
rows="3"
style="flex: 1; padding: 10px; border: 1px solid #ddd; border-radius: 4px; font-family: inherit;"
></textarea>
<button id="send-message" disabled style="padding: 10px 20px; border: none; border-radius: 4px; background: #4CAF50; color: white; cursor: pointer;">
Send Message
</button>
</div>
</div>
<div style="margin-top: 20px;">
<button id="start-button">Start Chat</button>
<button id="stop-button" disabled>Stop Chat</button>
</div>
</div>
<script type="module" src="script.js"></script>
</body>
</html>
Step 2: Add the code
Now let’s add the code to our script.js file to send the command to the persona:
// Add these functions to your existing script.js file
async function sendTalkMessage() {
const messageInput = document.getElementById('message-input');
const sendButton = document.getElementById('send-message');
if (!anamClient || !messageInput) return;
const message = messageInput.value.trim();
if (!message) {
alert('Please enter a message');
return;
}
try {
sendButton.disabled = true;
sendButton.textContent = 'Sending...';
// Send the message to the persona
await anamClient.talk(message);
// Clear the input
messageInput.value = '';
} catch (error) {
console.error('Failed to send message:', error);
alert('Failed to send message. Please try again.');
} finally {
sendButton.disabled = false;
sendButton.textContent = 'Send Message';
}
}
function updateTalkControls(connected) {
const sendButton = document.getElementById('send-message');
const messageInput = document.getElementById('message-input');
if (sendButton) {
sendButton.disabled = !connected;
sendButton.style.opacity = connected ? '1' : '0.5';
}
if (messageInput) {
messageInput.disabled = !connected;
messageInput.placeholder = connected
? 'Type a message for Cara to respond to...'
: 'Connect to start chatting...';
}
}
// Update your startChat function to enable talk controls
async function startChat() {
try {
startButton.disabled = true;
showLoadingState();
const response = await fetch("/api/session-token", {
method: "POST",
});
const { sessionToken } = await response.json();
anamClient = createClient(sessionToken);
anamClient.addListener(AnamEvent.SESSION_READY, () => {
console.log('Session is ready!');
hideLoadingState();
startButton.disabled = true;
stopButton.disabled = false;
updateTalkControls(true); // Enable talk controls
});
anamClient.addListener(AnamEvent.MESSAGE_HISTORY_UPDATED, (messages) => {
console.log('Conversation updated:', messages);
updateChatHistory(messages);
});
anamClient.addListener(AnamEvent.MESSAGE_STREAM_EVENT_RECEIVED, (event) => {
const liveTranscript = document.getElementById("live-transcript");
const transcriptText = document.getElementById("transcript-text");
if (event.role === "persona") {
if (liveTranscript && transcriptText) {
transcriptText.textContent = transcriptText.textContent + event.content;
}
} else if (event.role === "user") {
if (liveTranscript && transcriptText) {
transcriptText.textContent = "";
}
}
});
await anamClient.streamToVideoElement("persona-video");
console.log("Chat started successfully!");
} catch (error) {
console.error("Failed to start chat:", error);
startButton.disabled = false;
hideLoadingState();
}
}
// Add event listeners for talk functionality
document.addEventListener('DOMContentLoaded', () => {
const sendButton = document.getElementById('send-message');
const messageInput = document.getElementById('message-input');
if (sendButton) {
sendButton.addEventListener('click', sendTalkMessage);
}
if (messageInput) {
// Send message with Enter key (Shift+Enter for new line)
messageInput.addEventListener('keydown', (e) => {
if (e.key === 'Enter' && !e.shiftKey) {
e.preventDefault();
sendTalkMessage();
}
});
}
});
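When talk commands are triggered by UI events rather than typed by hand, several can fire in quick succession. If you prefer to send them strictly one after another, a small promise chain is one option. This is a sketch under assumptions: createTalkQueue is a hypothetical helper, and it assumes talk returns a promise, as the await in sendTalkMessage implies. Whether serializing is needed depends on how the SDK handles overlapping commands, which this guide does not specify.

```javascript
// Hypothetical helper: send talk commands one at a time, in the order requested.
// Assumes client.talk(text) returns a promise, as the await in sendTalkMessage implies.
function createTalkQueue(client) {
  let tail = Promise.resolve();

  return function enqueue(text) {
    // Chain onto the previous send; continue even if that send failed
    const result = tail.then(() => client.talk(text));
    tail = result.catch(() => {});
    return result;
  };
}

// Example usage:
// const say = createTalkQueue(anamClient);
// say("Welcome back!");
// say("Here is what changed since your last visit.");
```

Each call still returns its own promise, so callers can handle individual failures without blocking later messages.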
Advanced Stream Management
For applications that require more control over audio and video streams, you have two main approaches for customizing the streaming behavior:
- Custom Input Streams: Pass your own audio input to streamToVideoElement
- Output Stream Capture: Use stream() to capture and process persona output
The streamToVideoElement() method accepts an optional audio input stream, allowing you to process or record user audio before sending it to the persona:
// Get and potentially process user audio
const userInputStream = await navigator.mediaDevices.getUserMedia({ audio: true });
// Start recording user input
const userRecorder = new MediaRecorder(userInputStream);
userRecorder.start();
// Use the same stream for persona input
await anamClient.streamToVideoElement("persona-video", userInputStream);
This approach is useful when you want to:
- Record user audio input
- Apply audio filters or effects to user input
- Monitor user audio levels
- Handle custom microphone setups
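As one illustration of monitoring user audio levels, the root-mean-square of a block of samples gives a simple loudness estimate. The sketch below defines a hypothetical helper named rmsLevel; the commented browser wiring uses the standard Web Audio AnalyserNode and the userInputStream variable from the example above, and is shown only as one possible way to obtain samples.

```javascript
// Hypothetical helper: root-mean-square level of a block of audio samples in [-1, 1].
function rmsLevel(samples) {
  if (samples.length === 0) return 0;
  let sumOfSquares = 0;
  for (const sample of samples) {
    sumOfSquares += sample * sample;
  }
  return Math.sqrt(sumOfSquares / samples.length);
}

// In the browser, samples could come from a Web Audio AnalyserNode (not run here):
// const audioContext = new AudioContext();
// const analyser = audioContext.createAnalyser();
// audioContext.createMediaStreamSource(userInputStream).connect(analyser);
// const buffer = new Float32Array(analyser.fftSize);
// analyser.getFloatTimeDomainData(buffer);
// const level = rmsLevel(buffer);
```

Sampling the level on an interval or animation frame is enough to drive a basic microphone activity indicator.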
Output Stream Capture
For capturing and processing the persona’s output (video and audio), use the stream() method which returns the raw output streams:
const userInputStream = await navigator.mediaDevices.getUserMedia({ audio: true });
const [videoStream] = await anamClient.stream(userInputStream);
// Now you have full control over the output stream
videoElement.srcObject = videoStream;
// Extract audio for separate processing
const audioTracks = videoStream.getAudioTracks();
const personaAudioStream = new MediaStream(audioTracks);
This approach enables:
- Recording persona video and audio output
- Custom video rendering and effects
- Audio analysis and processing
- Streaming to multiple destinations
What You’ve Built
Congratulations! You’ve built a comprehensive Anam application that demonstrates both basic and advanced stream management capabilities. Your application now includes the core patterns needed for building production persona experiences: secure authentication, event-driven updates, real-time communication, programmatic control, and advanced media stream handling.
Core Persona Functionality: A complete WebRTC-based streaming connection that enables natural voice conversations with an AI persona, including video rendering and bidirectional audio communication.
Event-Driven Architecture: Robust event handling that responds to session state changes, connection events, and conversation updates in real-time using the Anam SDK’s event system.
Live Conversation Tracking: Real-time transcription display using MESSAGE_STREAM_EVENT_RECEIVED events that shows what the persona is saying as they speak, plus complete conversation history through MESSAGE_HISTORY_UPDATED events.
Programmatic Control: Talk command functionality that allows you to send text messages directly to the persona, enabling you to react to user interactions and trigger automated responses.
Session Management: Secure server-side session token generation that properly handles API key protection and provides temporary access tokens for client-side connections.
Loading State Management: Responsive loading indicators that use the SESSION_READY event to provide proper user feedback during connection establishment and session initialization.
Advanced Stream Recording: Manual stream management with separate user and persona audio recording capabilities, enabling conversation logging, quality assurance, and custom audio processing workflows.
What’s Next
Ready to take your persona application further? Here are your next steps:
Explore Advanced Features: Continue reading our guides to learn about audio control, custom personas, and production deployment best practices.
Integrate Your Own AI: Want to use your own language model instead of Anam’s default brain? Check out our Custom LLM Integration guide to learn how to connect your persona to custom AI models and APIs.
Advanced Customization
Persona Configuration
Modify the getDefaultPersonaConfig() method to customize:
- Avatar appearance and voice selection
- System prompts and personality traits
- Custom brain types and LLM integration
UI Theming
Update the CSS variables in app.css to match your brand:
- Color schemes and gradients
- Typography and spacing
- Animation timing and effects
Event Integration
Extend the event handlers to integrate with your existing systems:
- Analytics and user behavior tracking
- Customer service platforms
- CRM and database systems