Frequently Asked Questions
Answers to common questions about Anam
What persona presets are currently available?
There are 6 default personas, each available as presets in a range of backgrounds:
- Leo (leo_desk, leo_windowdesk, leo_windowsofacorner)
- Alister (alister_desk, alister_windowdesk, alister_windowsofa)
- Astrid (astrid_desk, astrid_windowdesk, astrid_windowsofacorner)
- Cara (cara_desk, cara_windowdesk, cara_windowsofa)
- Evelyn (evelyn_desk)
- Pablo (pablo_desk, pablo_windowdesk, pablo_windowsofa)
You can list available persona presets using the /v1/personas/presets
endpoint or view them when creating a persona in the Anam Lab.
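For example, a minimal sketch of listing presets with fetch (this assumes the API base URL is https://api.anam.ai and that your API key is sent as a bearer token; check the API reference for the exact auth scheme):

```typescript
// List the available persona presets.
// Base URL and bearer-token auth are assumptions; verify against the API reference.
const response = await fetch("https://api.anam.ai/v1/personas/presets", {
  headers: {
    Authorization: `Bearer ${process.env.ANAM_API_KEY}`,
  },
});

const presets = await response.json();
console.log(presets); // e.g. leo_desk, cara_windowsofa, ...
```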
How is usage time calculated and billed?
Usage time starts when you call the stream() method and ends when that specific stream closes. The time is tracked in seconds and billed per minute. The starter plan includes:
- 60 free minutes per month
- $0.18/minute for overages
- Up to 5 concurrent conversations
- Access to 6 personas and multiple backgrounds
Usage is billed retrospectively at the end of each month.
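As a rough illustration of how those numbers combine (this sketch assumes overage is charged on total minutes beyond the free 60, rounded up to whole minutes; check the pricing page for the exact rounding rules):

```typescript
// Rough monthly cost estimate for the starter plan.
// Assumes usage is summed in seconds, rounded up to whole minutes, and only
// minutes beyond the 60 free ones are billed at $0.18/minute (an assumption).
const FREE_MINUTES = 60;
const OVERAGE_RATE_USD = 0.18;

function estimateMonthlyCost(totalSeconds: number): number {
  const totalMinutes = Math.ceil(totalSeconds / 60);
  const overageMinutes = Math.max(0, totalMinutes - FREE_MINUTES);
  return overageMinutes * OVERAGE_RATE_USD;
}

// 90 minutes of usage -> 30 overage minutes at $0.18 = $5.40
console.log(estimateMonthlyCost(90 * 60).toFixed(2));
```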
What causes latency and how can I optimize it?
Latency can come from several sources:
- Connection setup time (usually 1-2 seconds, but can be up to 5 seconds; you can measure this yourself with the timing sketch below)
- LLM processing time
- TTS generation
- Network conditions
To optimize latency:
- Use shorter initial sentences in responses
- Take advantage of phrase caching (repeated phrases will be faster)
- Consider using our turnkey solution instead of a custom LLM for the lowest latency
- Use the streaming API for custom LLM implementations
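To see how much of your perceived latency is connection setup, you can time the call that starts the stream. A minimal sketch, assuming an already-constructed anamClient whose stream() method resolves once the connection is established (the client shape and signature shown here are assumptions):

```typescript
// Measure how long connection setup takes before the persona stream starts.
// The client shape and stream() signature are assumptions; use the method
// your SDK version actually exposes.
declare const anamClient: {
  stream: (videoElementId: string, audioElementId: string) => Promise<void>;
};

const setupStart = performance.now();
await anamClient.stream("persona-video", "persona-audio"); // hypothetical element IDs

const setupSeconds = (performance.now() - setupStart) / 1000;
console.log(`Connection setup took ${setupSeconds.toFixed(1)}s`);
```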
How do I handle multilingual conversations?
Current language support:
- Speech recognition currently struggles with languages other than English and will often translate non-English speech into English
- Language selection is coming soon to address this
- TTS supports multiple languages but voice quality may vary
- System prompts can be set to specific languages (see the example below)
- Language handling is primarily controlled via the system prompt
- Auto-language detection is planned for future releases
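In practice, the most reliable way to keep a conversation in a given language today is to write the persona's system prompt in that language. A hedged sketch of persona creation; the endpoint path and field names are assumptions, so check the personas API reference for the exact payload:

```typescript
// Create a persona whose system prompt pins the conversation to French.
// The /v1/personas path and body fields are assumptions; verify in the API docs.
const response = await fetch("https://api.anam.ai/v1/personas", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.ANAM_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    name: "French concierge",
    personaPreset: "cara_desk", // one of the presets listed above
    brain: {
      systemPrompt:
        "Tu es une concierge serviable. Réponds toujours en français, " +
        "même si l'utilisateur s'adresse à toi dans une autre langue.",
    },
  }),
});

console.log(await response.json());
```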
Can I interrupt the persona while it’s speaking?
Yes, you can interrupt the persona in two ways:
- Send a new talk() command, which will override the current speech
- When using streaming, user speech will automatically interrupt the current stream
Note: Currently there isn’t a way to completely silence the persona mid-speech, but sending a short punctuation mark (like "." or "!") through the talk command can achieve a similar effect.
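For example, assuming talk() takes the text for the persona to speak:

```typescript
// Cut the persona off by overriding its current speech with a near-empty utterance.
// The client shape is an assumption; talk() is the command described above.
declare const anamClient: { talk: (text: string) => Promise<void> };

await anamClient.talk("."); // overrides current speech, effectively silencing it
```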
How do I integrate my own LLM?
See our custom brains guide for more information.
What are the browser compatibility requirements?
The SDK requires:
- Modern browser with WebRTC support
- Microphone permissions for audio input
- Autoplay capabilities for video/audio
- WebAssembly support
Safari/iOS notes:
- Requires explicit user interaction for audio playback
- May have additional security policy requirements
- WebKit engine has specific autoplay restrictions
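In practice this means starting the stream from an explicit user gesture on Safari/iOS. A minimal sketch, using the same assumed stream() signature as above:

```typescript
// Start the persona stream only after a user gesture, which satisfies
// Safari/iOS autoplay and audio-playback policies.
// The client shape and stream() signature are assumptions.
declare const anamClient: {
  stream: (videoElementId: string, audioElementId: string) => Promise<void>;
};

document.getElementById("start-button")?.addEventListener("click", async () => {
  await anamClient.stream("persona-video", "persona-audio"); // hypothetical element IDs
});
```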
The SDK does not currently support Firefox.
How do I monitor current usage?
Usage tracking options:
- Available in Anam Lab
- API endpoint for usage stats coming soon
- Session logs available in the Anam Lab
What’s the difference between development and production setup?
Development:
- Use your API key directly in the client for quick local testing (never expose your API key in production)
Production:
- Exchange API key for session token server-side
- Pass session token to client
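A minimal server-side sketch of that exchange (the session-token endpoint path and response shape are assumptions; follow the authentication guide for the exact flow):

```typescript
// Server-side only: exchange the long-lived API key for a short-lived session
// token, then hand just the session token to the browser.
// The endpoint path and response shape are assumptions; check the auth docs.
export async function getSessionToken(): Promise<string> {
  const response = await fetch("https://api.anam.ai/v1/auth/session-token", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.ANAM_API_KEY}`,
      "Content-Type": "application/json",
    },
  });

  const { sessionToken } = await response.json();
  return sessionToken; // return this to the client, never the API key itself
}
```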
How do I handle connection issues?
Common issues and solutions:
- For “403 Forbidden” errors, verify API key/session token
- If video doesn’t appear, check element IDs match exactly
- Connection timeouts may require retry logic
- Session tokens expire and need refresh
- Monitor CONNECTION_CLOSED events for network issues
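A sketch of a listener with simple retry logic; the addListener call and event name shown here are assumptions, so confirm the exact event API in the SDK reference:

```typescript
// Reconnect with a short backoff when the connection closes unexpectedly.
// The client shape, addListener call, and event name are assumptions.
declare const anamClient: {
  addListener: (event: string, handler: () => void) => void;
  stream: (videoElementId: string, audioElementId: string) => Promise<void>;
};

let retries = 0;

anamClient.addListener("CONNECTION_CLOSED", async () => {
  if (retries >= 3) {
    console.error("Connection lost and retries exhausted");
    return;
  }
  retries += 1;
  await new Promise((resolve) => setTimeout(resolve, 1000 * retries)); // linear backoff
  await anamClient.stream("persona-video", "persona-audio"); // hypothetical element IDs
});
```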
What features are coming soon?
Near-term roadmap includes:
- One-shot model for custom persona creation
- Improved streaming support for custom LLMs
- Improved usage dashboard and analytics
- Enhanced multilingual support
- Function calling capabilities
- Additional persona options