Using Custom Brains
Implement custom intelligence for your personas
Overview
Out of the box, Anam provides an LLM to act as the brains of your personas. This is great for most use cases and produces the lowest possible latency in conversation. However, you may already have your own LLM or other intelligence that you want to integrate with your persona. Anam supports this by letting you disable the default brains and use the SDK's `talk()` method to give speech instructions directly to the persona.
The Process
Using your own custom brains requires three steps to configure:
- Disable the default Anam brains.
- Listen for updates to the conversation history to know when you need to respond.
- Use the `talk` method to instruct the persona to speak your custom response.
1. Disabling the Default Brain
You can turn off the Anam LLM with the `disableBrains` configuration option when initialising the Anam client. When disabled, the persona will simply listen to user input and only respond to explicit `talk` commands.
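A minimal sketch, assuming the client is created with `createClient` from `@anam-ai/js-sdk` (the `disableBrains` option name comes from this guide; the session-token handling is illustrative):

```typescript
import { createClient } from '@anam-ai/js-sdk';

// Assumption: a session token obtained from your server via the Anam API.
const sessionToken = '<your-session-token>';

// Disable the default brains so the persona only speaks
// when you issue explicit talk commands.
const anamClient = createClient(sessionToken, {
  disableBrains: true,
});
```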
2. Listening for Conversation History
In order to know when to respond, you’ll need to listen for updates to the conversation. You can do this by adding a listener for one of the following events.
| Event Name | Description |
|---|---|
| `MESSAGE_HISTORY_UPDATED` | Called with the message history transcription each time the user or persona finishes speaking |
| `MESSAGE_STREAM_EVENT_RECEIVED` | For persona speech: updated with each transcribed chunk as the persona speaks. For user speech: updated with the complete transcription once they finish speaking |
For more information about session events see the Events guide.
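For example, a custom brain can watch `MESSAGE_HISTORY_UPDATED` and react whenever the user finishes a turn. A sketch, assuming the SDK's `addListener` API and a message shape with `role` and `content` fields (check the Events guide for the exact payload and import path):

```typescript
// Import path may vary by SDK version; see the Events guide.
import { AnamEvent } from '@anam-ai/js-sdk/dist/module/types';

anamClient.addListener(AnamEvent.MESSAGE_HISTORY_UPDATED, (messages) => {
  const lastMessage = messages[messages.length - 1];
  // Respond only when the user (not the persona) has just finished speaking.
  if (lastMessage?.role === 'user') {
    respondTo(messages); // defined in the next step
  }
});
```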
3. Using Talk Commands
When using your own brain implementation, you'll need to control the persona's responses using the `talk` method. The most basic approach is to pass the full string of your response in a single `talk` call.
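A sketch of that loop, where `myCustomLlm.complete` is a hypothetical stand-in for your own model call:

```typescript
// Hypothetical helper: ask your own LLM for a reply, then speak it in full.
async function respondTo(messages: { role: string; content: string }[]) {
  const reply = await myCustomLlm.complete(messages);
  // Pass the full response string in a single talk call.
  anamClient.talk(reply);
}
```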
Even if you're using Anam's default brains, you may still wish to respond to user interaction events, such as clicking on an element within your application or viewing a specific part of a web page. The `talk` command is a great way to do this.
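For instance, you could have the persona narrate a click (the element ID and spoken line here are hypothetical):

```typescript
// Speak a scripted line when the user opens the pricing tab.
document.getElementById('pricing-tab')?.addEventListener('click', () => {
  anamClient.talk('This is our pricing page. Let me know if you have any questions.');
});
```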
Streaming Responses
For reduced latency when using custom LLMs, you can stream responses in chunks using a `TalkMessageStream`:
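A sketch of one streamed turn, assuming your LLM exposes an async iterable of text chunks (`myCustomLlm.stream` is hypothetical; `createTalkMessageStream`, `streamMessageChunk`, and `endMessage` are the methods named in this guide):

```typescript
// One TalkMessageStream represents one conversation turn.
const talkStream = anamClient.createTalkMessageStream();

// `messages` is the conversation history from the listener in step 2.
for await (const chunk of myCustomLlm.stream(messages)) {
  // endOfSpeech = false: more chunks will follow in this turn.
  talkStream.streamMessageChunk(chunk, false);
}

// Close the turn; a closed stream cannot be reused.
talkStream.endMessage();
```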
Important Considerations
- One `TalkMessageStream` represents one conversation turn.
- A turn can end in two ways:
  - End of speech: when `streamMessageChunk` is called with `endOfSpeech` set to `true`, or the `endMessage()` method is called.
  - User interruption: when the user speaks during streaming.
- Once a turn ends, the `TalkMessageStream` object is closed and cannot be reused.
- Create a new `TalkMessageStream` for each new turn.
Correlation IDs
For tracking purposes, you can attach a correlation ID to your talk streams by passing the optional `correlationId` parameter to the `createTalkMessageStream` method:
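For example (using `crypto.randomUUID()` here is illustrative; any scheme that produces a unique ID per stream works):

```typescript
// Use a fresh, unique ID for every stream instance.
const correlationId = crypto.randomUUID();
const talkStream = anamClient.createTalkMessageStream(correlationId);
```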
The `correlationId` should be unique for every `TalkMessageStream` instance. When a talk stream is interrupted, this ID will be included in the `AnamEvent.TALK_STREAM_INTERRUPTED` event.
Handling Interrupts
When streaming talk commands, your users may interrupt the persona, producing new speech events that you need to respond to. When implementing your own brain, you can listen for interrupt events to know when to stop using the current talk stream.
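A sketch that abandons a stream once its turn is interrupted (the event payload shape is an assumption; this guide only states that the interrupted stream's `correlationId` is included in the event):

```typescript
let interrupted = false;

anamClient.addListener(AnamEvent.TALK_STREAM_INTERRUPTED, (event) => {
  // Assumption: the payload carries the correlationId of the interrupted stream.
  if (event.correlationId === correlationId) {
    interrupted = true;
  }
});

for await (const chunk of myCustomLlm.stream(messages)) {
  if (interrupted) break; // the user spoke; this turn is over
  talkStream.streamMessageChunk(chunk, false);
}
if (!interrupted) {
  talkStream.endMessage(); // create a new TalkMessageStream for the next turn
}
```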