How is usage time calculated and billed?

Usage time starts when you call the stream() method and ends when that specific stream closes. The time is tracked in seconds and billed per minute.
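As a sketch of the arithmetic (assuming partial minutes round up — confirm against your plan's billing terms; `billableMinutes` is an illustrative helper, not an SDK function):

```javascript
// Illustrative only: convert tracked usage seconds to billed minutes.
// Assumption: partial minutes are rounded up — verify against your plan.
function billableMinutes(usageSeconds) {
  return Math.ceil(usageSeconds / 60);
}
```

Under this assumption, a stream open for 90 seconds would bill as 2 minutes.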

What causes latency and how can I optimize it?

Latency can come from several sources:

  • Connection setup time (usually 1-2 seconds, but can be up to 5 seconds)
  • LLM processing time
  • TTS generation
  • Face generation
  • Network conditions

To optimize latency:

  • Consider using our turnkey solution instead of a custom LLM
  • Use the streaming API for custom LLM implementations

How do I handle multilingual conversations?

Current language support:

  • Speech recognition currently struggles outside of English and will often translate non-English speech to English
  • TTS supports multiple languages but voice quality may vary
  • System prompts can be set to specific languages
  • Language handling is primarily controlled via the system prompt
  • Auto-language detection is planned for future releases
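For example, a persona can be pinned to a language through its system prompt. The `systemPrompt` field name below is an assumption about the persona configuration shape — check the SDK docs for the exact key:

```javascript
// Hypothetical persona configuration: the system prompt pins the
// persona to one language rather than relying on auto-detection.
const personaConfig = {
  personaId: 'chosen-persona-id',
  systemPrompt: 'You are a helpful assistant. Always reply in German.',
};
```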

Can I interrupt the persona while it’s speaking?

Yes, you can interrupt the persona in two ways:

  1. Send a new talk() command, which overrides the current speech
  2. When using streaming, user speech will automatically interrupt the current stream

Note: Currently there isn’t a way to completely silence the persona mid-speech, but sending a short punctuation mark (like “.” or “!”) through the talk command can achieve a similar effect.
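A minimal sketch of that workaround — `silencePersona` is an illustrative wrapper, not an SDK method; it relies only on the talk() command described above:

```javascript
// Workaround sketch: override the current speech with a single
// punctuation mark, which effectively silences the persona.
function silencePersona(client) {
  client.talk('.');
}
```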

How do I integrate my own LLM?

See our custom brains guide for more information.

What are the browser compatibility requirements?

The SDK should work well with any recent Chromium-based browser. It requires:

  • Modern browser with WebRTC support
  • Microphone permissions for audio input
  • Autoplay capabilities for video/audio
  • WebAssembly support
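These requirements can be probed up front with standard browser APIs. `checkBrowserSupport` is an illustrative helper, not part of the SDK:

```javascript
// Sketch: probe for the capabilities listed above using standard
// browser globals. Pass `window` (or `globalThis`) as the scope.
function checkBrowserSupport(scope) {
  return {
    webrtc: typeof scope.RTCPeerConnection === 'function',
    webassembly: typeof scope.WebAssembly !== 'undefined',
    microphone: Boolean(
      scope.navigator &&
      scope.navigator.mediaDevices &&
      scope.navigator.mediaDevices.getUserMedia
    ),
  };
}
```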

Safari/iOS notes:

  • Requires explicit user interaction for audio playback
  • May have additional security policy requirements
  • WebKit engine has specific autoplay restrictions

The SDK does not currently support Firefox.
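The Safari/iOS autoplay restriction is usually handled by starting playback inside an explicit user gesture. A minimal sketch, where the button and video elements are placeholders for your own UI:

```javascript
// Sketch: start media playback from a click handler so Safari/iOS
// treat it as an explicit user interaction rather than autoplay.
function wirePlaybackToGesture(button, video) {
  button.addEventListener('click', () => {
    video.play().catch((err) => {
      console.warn('Playback was blocked:', err);
    });
  });
}
```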

How do I monitor current usage?

Usage tracking options:

  • Available in Anam Lab
  • API endpoint for usage stats coming soon
  • Session logs available in the Anam Lab

What’s the difference between development and production setup?

Development:

const client = unsafe_createClientWithApiKey('your-api-key', {
  personaId: 'chosen-persona-id',
});

Production:

  1. Exchange your API key for a session token server-side
  2. Pass the session token to the client

const client = createClient('session-token', {
  personaId: 'chosen-persona-id',
});
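The token exchange in step 1 keeps your API key off the client. A sketch of the client side of that flow — `/api/anam-session-token` is a hypothetical endpoint on your own server, and the response shape is an assumption:

```javascript
// Sketch: fetch a short-lived session token from your own backend,
// which holds the API key and performs the exchange server-side.
async function fetchSessionToken(tokenUrl) {
  const response = await fetch(tokenUrl, { method: 'POST' });
  if (!response.ok) {
    throw new Error(`Token exchange failed: ${response.status}`);
  }
  const { sessionToken } = await response.json();
  return sessionToken;
}
```

The returned token is then passed to createClient as in the production snippet above.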

How do I handle connection issues?

Common issues and solutions:

  • For “403 Forbidden” errors, verify API key/session token
  • If video doesn’t appear, check element IDs match exactly
  • Connection timeouts may require retry logic
  • Session tokens expire and need refresh
  • Monitor CONNECTION_CLOSED events for network issues
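The retry logic mentioned above can be sketched as exponential backoff around whatever connect call you use. `connectFn` is a placeholder for your own connection routine, not an SDK API:

```javascript
// Sketch: retry an async connect function with exponential backoff.
// With the defaults, waits grow 1s, 2s, 4s, ... between attempts.
async function connectWithRetry(connectFn, maxAttempts = 3, baseDelayMs = 1000) {
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    try {
      return await connectFn();
    } catch (err) {
      lastError = err;
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```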

What features are coming soon?

Near-term roadmap includes:

  • Improved server-to-server streaming support for custom LLMs
  • Improved usage dashboard and analytics
  • Ability to create avatars from a single image
  • Much more!