Frequently Asked Questions
Answers to common questions about Anam
How is usage time calculated and billed?
Usage time starts when you call the stream() method and ends when that specific stream closes. Time is tracked in seconds and billed per minute.
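For example, a minimal sketch of the billable window (the package name @anam-ai/js-sdk, createClient, and stopStreaming are assumptions to check against the SDK reference; stream() is the method described above):

```typescript
import { createClient } from "@anam-ai/js-sdk"; // assumed package and export

// Session token obtained from your server (see the production setup question below).
const sessionToken = "<session-token>";
const anamClient = createClient(sessionToken);

// Billable usage starts when the stream is opened...
await anamClient.stream();

// ...and ends when that stream is closed.
anamClient.stopStreaming();
```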
What causes latency and how can I optimize it?
Latency can come from several sources:
- Connection setup time (usually 1-2 seconds, but can be up to 5 seconds)
- LLM processing time
- TTS generation
- Face generation
- Network conditions
To optimize latency:
- Consider using our turnkey solution instead of a custom LLM
- Use the streaming API for custom LLM implementations (see the sketch after this list)
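As a rough sketch of the streaming approach (createTalkMessageStream, streamMessageChunk, and endMessage are assumed names; see the custom brains guide for the current API), forwarding LLM output as it arrives avoids waiting for the full completion:

```typescript
// Forward LLM output chunk by chunk so TTS and face generation can start
// before the full completion is ready, reducing time-to-first-word.
async function speakStreamed(anamClient: any, llmChunks: AsyncIterable<string>) {
  const talkStream = anamClient.createTalkMessageStream(); // assumed API
  for await (const chunk of llmChunks) {
    talkStream.streamMessageChunk(chunk); // assumed signature
  }
  talkStream.endMessage(); // assumed: marks the reply as complete
}
```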
How do I handle multilingual conversations?
Current language support:
- Speech recognition currently struggles outside of English and will often translate non-English speech to English
- TTS supports multiple languages but voice quality may vary
- System prompts can be set to specific languages
- Language handling is primarily controlled via the system prompt (see the sketch after this list)
- Auto-language detection is planned for future releases
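For example, a sketch of steering the conversation language through the system prompt (the personaConfig shape and the systemPrompt and voiceId field names are illustrative, not the exact schema):

```typescript
// Persona configured to always reply in French, regardless of input language.
// The prompt translates to: "You are a helpful assistant. Always reply in
// French, even if the user speaks another language."
const personaConfig = {
  name: "Aimée",
  voiceId: "<id-of-a-french-capable-voice>", // pick a voice suited to the target language
  systemPrompt:
    "Tu es une assistante serviable. Réponds toujours en français, " +
    "même si l'utilisateur parle une autre langue.",
};
```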
Can I interrupt the persona while it’s speaking?
Yes, you can interrupt the persona in two ways:
- Send a new talk() command, which will override the current speech
- When using streaming, user speech will automatically interrupt the current stream
Note: Currently there isn't a way to completely silence the persona mid-speech, but sending a short punctuation mark (like "." or "!") through the talk() command can achieve a similar effect.
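Continuing with the anamClient instance from the earlier sketch, interruption and the punctuation workaround look roughly like this (talk() is the command described above):

```typescript
// A new talk() call overrides whatever the persona is currently saying.
await anamClient.talk("Actually, let me address your latest question instead.");

// Workaround for silencing mid-speech: replace the current speech with a
// short punctuation mark, which amounts to near-silence.
await anamClient.talk(".");
```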
How do I integrate my own LLM?
See our custom brains guide for more information.
What are the browser compatibility requirements?
The SDK should work well with any recent Chromium-based browser. It requires:
- Modern browser with WebRTC support
- Microphone permissions for audio input
- Autoplay capabilities for video/audio
- WebAssembly support
Safari/iOS notes:
- Requires explicit user interaction for audio playback
- May have additional security policy requirements
- WebKit engine has specific autoplay restrictions
The SDK does not currently support Firefox.
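A small capability check along these lines can surface unsupported browsers before you start a session (these are standard web APIs, not part of the Anam SDK):

```typescript
// Report which requirements the current browser is missing.
function checkBrowserSupport(): string[] {
  const missing: string[] = [];
  if (typeof RTCPeerConnection === "undefined") missing.push("WebRTC");
  if (typeof WebAssembly === "undefined") missing.push("WebAssembly");
  if (!navigator.mediaDevices?.getUserMedia) missing.push("getUserMedia (microphone)");
  return missing;
}

// On Safari/iOS, request the microphone and start playback from a user
// gesture (e.g. a click handler) to satisfy autoplay and permission policies.
async function requestMicrophone(): Promise<boolean> {
  try {
    await navigator.mediaDevices.getUserMedia({ audio: true });
    return true;
  } catch {
    return false;
  }
}
```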
How do I monitor current usage?
Usage tracking options:
- Usage statistics are available in the Anam Lab
- An API endpoint for usage stats is coming soon
- Session logs are also available in the Anam Lab
What’s the difference between development and production setup?
Development:
- Use your API key directly in the client for quick local testing (never expose the key in production)
Production:
- Exchange your API key for a session token server-side
- Pass the session token to the client (see the sketch after this list)
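A minimal server-side token exchange might look like the sketch below (Express is used for illustration; the endpoint URL and response field name are assumptions, so check the API reference). The API key never leaves your server; the browser only ever sees the short-lived session token.

```typescript
import express from "express";

const app = express();

// Exchange the secret API key for a short-lived session token, server-side.
app.get("/api/session-token", async (_req, res) => {
  const response = await fetch("https://api.anam.ai/v1/auth/session-token", { // assumed URL
    method: "POST",
    headers: { Authorization: `Bearer ${process.env.ANAM_API_KEY}` },
  });
  const { sessionToken } = await response.json(); // assumed response shape
  res.json({ sessionToken }); // only the session token reaches the client
});

app.listen(3000);
```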
How do I handle connection issues?
Common issues and solutions:
- For “403 Forbidden” errors, verify API key/session token
- If video doesn’t appear, check element IDs match exactly
- Connection timeouts may require retry logic
- Session tokens expire and need refresh
- Monitor CONNECTION_CLOSED events for network issues (see the sketch after this list)
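A rough reconnection sketch using the anamClient instance from the earlier examples (addListener and AnamEvent.CONNECTION_CLOSED are assumed SDK names based on the event above; reconnect is a hypothetical helper that re-creates the client and restarts the stream):

```typescript
import { AnamEvent } from "@anam-ai/js-sdk"; // assumed export

declare function reconnect(sessionToken: string): Promise<void>; // hypothetical helper

// Session tokens expire, so fetch a fresh one from your backend before retrying.
async function fetchFreshSessionToken(): Promise<string> {
  const res = await fetch("/api/session-token");
  const { sessionToken } = await res.json();
  return sessionToken;
}

anamClient.addListener(AnamEvent.CONNECTION_CLOSED, async () => {
  // Retry a few times with a simple linear backoff.
  for (let attempt = 1; attempt <= 3; attempt++) {
    try {
      await reconnect(await fetchFreshSessionToken());
      return;
    } catch {
      await new Promise((resolve) => setTimeout(resolve, 1000 * attempt));
    }
  }
});
```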
What features are coming soon?
Near-term roadmap includes:
- Improved server-to-server streaming support for custom LLMs
- Improved usage dashboard and analytics
- Ability to create avatars from a single image
- Much more!