Custom LLMs

Anam now supports integration with custom Large Language Models (LLMs), giving you complete control over the AI powering your digital personas. This feature enables you to use your own models while benefiting from Anam’s persona, voice, and streaming infrastructure.
Custom LLM requests are processed directly from Anam’s servers, which reduces latency and simplifies your development workflow. All API credentials you provide are encrypted for security.

How Custom LLMs Work

When you create a custom LLM configuration in Anam:
  1. Model Registration: You register your LLM details with Anam, including the model endpoint and authentication credentials
  2. Server-Side Processing: Anam handles all LLM calls from our servers, reducing latency and complexity
  3. Secure Storage: Your API keys and credentials are encrypted and securely stored
  4. Seamless Integration: Use your custom LLM ID in place of Anam’s built-in models

Creating a Custom LLM

To create a custom LLM, you’ll need to:
  1. Register your LLM configuration through the Anam API or dashboard
  2. Provide the necessary connection details (endpoint, API keys, model parameters)
  3. Receive a unique LLM ID for your custom model
  4. Use this ID when creating session tokens
Custom LLM creation API endpoints and dashboard features are coming soon. Contact support@anam.ai for early access.

Supported LLM Specifications

Anam supports custom LLMs that comply with one of the following API specifications:
  • OpenAI API Specification - Compatible with OpenAI’s chat completion endpoints
  • Azure OpenAI API Specification - Compatible with Azure’s OpenAI service endpoints
  • Gemini API Specification - Compatible with Google’s Gemini API endpoints
Your custom LLM must support streaming responses. Non-streaming LLMs will not work with Anam’s real-time persona interactions.
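
For reference, an OpenAI-compatible streaming response delivers incremental chunks as Server-Sent Events. A minimal sketch of the wire format (IDs, timestamps, and content are illustrative):

```text
HTTP/1.1 200 OK
Content-Type: text/event-stream

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1700000000,"model":"your-model-name","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1700000000,"model":"your-model-name","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1700000000,"model":"your-model-name","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
```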

Technical Requirements

1. API Compatibility

Your LLM server must implement one of the supported API specifications mentioned above. This includes:
  • Matching the request/response format
  • Supporting the same authentication methods
  • Implementing compatible endpoint paths
2. Streaming Support

Enable streaming responses in your LLM implementation:
  • Return responses with stream: true support
  • Use Server-Sent Events (SSE) for streaming chunks
  • Include proper content types and formatting
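
As a minimal sketch of the chunk framing (the function names here are our own, not part of any Anam SDK; the chunk shape follows OpenAI’s streaming format, so adapt it if you target Azure OpenAI or Gemini):

```typescript
// Format one OpenAI-style chat.completion.chunk as a Server-Sent Event.
function toSseChunk(
  id: string,
  model: string,
  delta: { role?: string; content?: string },
  finishReason: string | null,
): string {
  const chunk = {
    id,
    object: "chat.completion.chunk",
    created: Math.floor(Date.now() / 1000),
    model,
    choices: [{ index: 0, delta, finish_reason: finishReason }],
  };
  // SSE events are "data: <payload>" lines terminated by a blank line.
  return `data: ${JSON.stringify(chunk)}\n\n`;
}

// The stream must end with a literal [DONE] sentinel event.
const doneEvent = "data: [DONE]\n\n";
```

Serve these events with a `Content-Type: text/event-stream` header and flush each event as soon as your model produces the corresponding token.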
3. Validation Testing

When you add your LLM in the Anam Lab, automatic tests verify:
  • API specification compliance
  • Streaming functionality
  • Response format compatibility
  • Authentication mechanisms
The Lab will provide feedback if your LLM doesn’t meet the requirements, helping you identify what needs to be fixed.
Testing Tip: We recommend using curl commands to compare your custom LLM’s raw HTTP responses with those from the actual providers (OpenAI, Azure OpenAI, or Gemini). Client libraries like the OpenAI SDK often transform responses and extract specific values, which can mask differences in the actual HTTP response format. Your custom implementation must match the raw HTTP response structure, not the transformed output from client libraries.
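
For example (the hostname, model names, and keys below are placeholders), you can capture the raw streaming output of the reference provider and of your own server, then diff the two:

```shell
# Raw streaming response from OpenAI (reference); -N disables buffering
curl -sN https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Hi"}],"stream":true}' \
  > openai.raw

# The same request against your custom LLM server
curl -sN https://your-llm.example.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $YOUR_API_KEY" \
  -d '{"model":"your-model-name","messages":[{"role":"user","content":"Hi"}],"stream":true}' \
  > custom.raw

# Compare event framing, field names, and the [DONE] sentinel
diff openai.raw custom.raw
```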

Example Custom LLM Endpoints

If you’re building your own LLM server, ensure your endpoints match one of these patterns:
POST /v1/chat/completions
Content-Type: application/json
Authorization: Bearer YOUR_API_KEY

{
  "model": "your-model-name",
  "messages": [...],
  "stream": true
}
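
If you are implementing the server side, a quick sanity check of incoming request bodies can catch compatibility problems early. A sketch (field names follow the OpenAI chat completion request shape; the validator itself is our own illustration):

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatCompletionRequest {
  model: string;
  messages: ChatMessage[];
  stream?: boolean;
}

// Returns a list of problems; empty means the body looks like a valid
// OpenAI-style chat completion request that supports streaming.
function validateRequest(body: ChatCompletionRequest): string[] {
  const errors: string[] = [];
  if (!body.model) errors.push("missing 'model'");
  if (!Array.isArray(body.messages) || body.messages.length === 0) {
    errors.push("'messages' must be a non-empty array");
  }
  if (body.stream !== true) {
    errors.push("'stream' must be true for real-time persona interactions");
  }
  return errors;
}
```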

Using Custom LLMs

Once you have your custom LLM ID, use it when requesting session tokens:
const response = await fetch('https://api.anam.ai/v1/session-token', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': `Bearer ${process.env.ANAM_API_KEY}`
  },
  body: JSON.stringify({
    personaConfig: {
      name: 'Sebastian',
      avatarId: '30fa96d0-26c4-4e55-94a0-517025942e18',
      voiceId: '6bfbe25a-979d-40f3-a92b-5394170af54b',
      llmId: 'your-custom-llm-id', // Your custom LLM ID
      systemPrompt: 'You are a helpful customer service representative.',
    },
  })
});

Migration from brainType

The brainType parameter is deprecated and has been replaced with llmId to better reflect support for custom models. For backwards compatibility, you can pass your existing brainType value as the llmId and it will continue to work.

Migration Guide

1. Update your code

Replace all instances of brainType with llmId in your session token requests:
personaConfig: {
-  brainType: 'ANAM_GPT_4O_MINI_V1',
+  llmId: 'ANAM_GPT_4O_MINI_V1',
}
2. No functional changes needed

Your existing brain type values will work as LLM IDs:
  • ANAM_GPT_4O_MINI_V1 → Works as llmId
  • ANAM_LLAMA_v3_3_70B_V1 → Works as llmId
  • CUSTOMER_CLIENT_V1 → Works as llmId
3. Consider custom LLMs

Evaluate if a custom LLM would better serve your use case. Custom LLMs provide:
  • Full control over model behavior
  • Ability to use specialized or fine-tuned models
  • Consistent persona experience with your chosen AI

Available Built-in LLM IDs

These built-in models are available as LLM IDs:
| LLM ID | Description | Best For |
| --- | --- | --- |
| ANAM_GPT_4O_MINI_V1 | OpenAI GPT-4o Mini model | Available for backwards compatibility |
| 0934d97d-0c3a-4f33-91b0-5e136a0ef466 | OpenAI GPT-4.1 Mini model | Recommended for new projects |
| ANAM_LLAMA_v3_3_70B_V1 | Llama 3.3 70B model | Open-source preference, larger context |
| 9d8900ee-257d-4401-8817-ba9c835e9d36 | Gemini 2.5 Flash model | Our fastest model |
| CUSTOMER_CLIENT_V1 | Client-side LLM | When you only use .talk() commands to speak |

Security Considerations

Encryption at Rest: All API keys and credentials are encrypted using industry-standard encryption before storage.
Secure Transmission: Credentials are transmitted over HTTPS and never exposed in logs or responses.
Access Control: Only your account can use your custom LLM configurations.

Benefits of Server-Side Processing

By processing custom LLMs from Anam’s servers, you get:
  1. Reduced Latency: Direct server-to-server communication eliminates client-side round trips
  2. Simplified Development: No need to manage LLM connections in your client code
  3. Unified Streaming: Seamless integration with Anam’s voice and video streaming
  4. Credential Security: API keys never exposed to client-side code
  5. Automatic Scaling: Anam handles load balancing and scaling

Next Steps