Anam supports multilingual conversations, allowing your personas to understand and respond to users speaking in their native language. By configuring the transcription language, you ensure accurate speech recognition and optimal response times for non-English speakers.

How It Works

When a user speaks to your persona, their speech is transcribed using speech-to-text technology. By default, the system expects English input. For non-English users, you can specify the expected language to improve:
  • Transcription accuracy: The system correctly interprets words and phrases in the target language
  • Response latency: Optimized processing for the specified language reduces delays
The languageCode controls what language the system expects to hear from users. This is separate from the persona’s voice, which determines what language the persona speaks.

When to Configure Language

For many use cases, the default English setting works well, even when users speak other languages: the LLM can still understand the meaning of what is said and respond appropriately. Setting a specific languageCode becomes important when:
  • You need exact transcription: In English mode, non-English speech may be translated to English in the transcript rather than preserved in the original language. If your application displays transcripts or the persona needs to see the exact words spoken, set the matching language code.
  • You’re building language-focused experiences: For language learning apps, pronunciation practice, or any use case where the specific words matter as much as their meaning.
  • You want optimized latency: Configuring the correct language reduces processing time for non-English speech.

Setting the Language Code

Configure the transcription language when requesting a session token via POST /v1/auth/session-token. Add the languageCode field to your personaConfig:
const response = await fetch("https://api.anam.ai/v1/auth/session-token", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.ANAM_API_KEY}`,
  },
  body: JSON.stringify({
    personaConfig: {
      name: "Marie",
      avatarId: "30fa96d0-26c4-4e55-94a0-517025942e18",
      voiceId: "<french-voice-id>",
      llmId: "0934d97d-0c3a-4f33-91b0-5e136a0ef466",
      systemPrompt:
        "Tu es une assistante virtuelle sympathique. Réponds toujours en français.",
      languageCode: "fr",
    },
  }),
});
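
The request above can be wrapped in a small server-side helper that checks the HTTP status and returns the token. This is a minimal sketch, not the official client: requestSessionToken and its options object are hypothetical names, and reading the token from a sessionToken field of the JSON response is an assumption to verify against the API reference.

```javascript
// Sketch only: requestSessionToken, apiKey and fetchImpl are hypothetical names.
// The endpoint and request body match the example above; the `sessionToken`
// response field is an assumption.
async function requestSessionToken(personaConfig, { apiKey, fetchImpl = fetch } = {}) {
  const response = await fetchImpl("https://api.anam.ai/v1/auth/session-token", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ personaConfig }),
  });

  if (!response.ok) {
    // Surface HTTP failures instead of handing an undefined token to the client.
    throw new Error(`Session token request failed with status ${response.status}`);
  }

  const data = await response.json();
  return data.sessionToken;
}
```

Passing fetchImpl as an option keeps the helper easy to test with a stub. In production you would call it with only apiKey, for example from process.env.ANAM_API_KEY.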

Configuration Reference

languageCode (string)
A 2-character ISO 639-1 language code specifying the expected user speech language. Defaults to en (English) if not provided.

Supported Languages

The following languages are supported for speech recognition:
Language             Code
English              en
French               fr
German               de
Spanish              es
Portuguese           pt
Italian              it
Dutch                nl
Polish               pl
Japanese             ja
Korean               ko
Chinese (Mandarin)   zh
Serving primarily non-English users? Contact Anam support to set a default transcription language for your account. This means you won’t need to specify languageCode on every session token request—all sessions will automatically use your preferred language unless explicitly overridden.
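
If the language code comes from user input or a stored profile, it can help to check it against the supported list before requesting a token. The sketch below is illustrative: resolveLanguageCode and SUPPORTED_LANGUAGE_CODES are hypothetical names, and the codes are copied from the list above.

```javascript
// Codes copied from the supported-languages list above.
const SUPPORTED_LANGUAGE_CODES = new Set([
  "en", "fr", "de", "es", "pt", "it", "nl", "pl", "ja", "ko", "zh",
]);

// Hypothetical helper: returns the normalized code if it is supported,
// otherwise the fallback (English by default).
function resolveLanguageCode(candidate, fallback = "en") {
  const code = String(candidate ?? "").trim().toLowerCase();
  return SUPPORTED_LANGUAGE_CODES.has(code) ? code : fallback;
}
```

For example, resolveLanguageCode("FR") returns "fr", while an unsupported or missing value returns the fallback.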

Understanding Language Configuration

There are two language settings to consider when building multilingual experiences:

Transcription Language (What Users Speak)

The languageCode field tells the speech recognition system what language to expect from users. Set this to match your users’ spoken language for accurate transcription.

Voice Language (What the Persona Speaks)

The voiceId determines the voice and language your persona uses when speaking. Select a voice that matches your target language from the Voice Gallery or the Anam Lab.
For a fully localized experience, configure both the languageCode (for user speech recognition) and select an appropriate voiceId (for persona speech output).

Best Practices

Always set languageCode to match the language your users will speak. Mismatched settings result in poor transcription accuracy. For example, if your users speak German, set languageCode: "de".
The language code is set per session and cannot be changed mid-conversation. If your application supports multiple languages:
  • Create separate sessions with appropriate language codes for each user
  • Determine the user’s preferred language before starting the session
  • Consider using a language selection step in your onboarding flow
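
One way to determine the language before the session starts is to read the user's locale preferences and map them to a language code. This is a sketch under stated assumptions: pickLanguageFromLocales is a hypothetical helper, and the list of supported codes is whatever subset your application offers.

```javascript
// Hypothetical helper: picks the first preferred locale whose primary
// language subtag is in the list your application supports.
function pickLanguageFromLocales(locales, supported, fallback = "en") {
  for (const locale of locales) {
    // "de-DE" -> "de", "pt_BR" -> "pt"
    const primary = String(locale).split(/[-_]/)[0].toLowerCase();
    if (supported.includes(primary)) return primary;
  }
  return fallback;
}

// In a browser, navigator.languages lists locales in preference order:
// const languageCode = pickLanguageFromLocales(navigator.languages, ["en", "fr", "de"]);
```

A detected language is only a guess, so an explicit language selection step should take precedence when the user has made a choice.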
For the best multilingual experience:
  1. Set languageCode for accurate speech recognition
  2. Select a voiceId in the target language
  3. Write your systemPrompt in the target language or instruct the persona to respond in that language
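
The three steps above can be kept together in one per-language lookup so that languageCode, voiceId, and systemPrompt never drift apart. A minimal sketch: LOCALIZED_SETTINGS and buildLocalizedPersonaConfig are hypothetical names, and the voice IDs are placeholders to replace with real IDs.

```javascript
// Hypothetical per-language settings. The voice IDs are placeholders:
// replace them with real IDs for voices in the target language.
const LOCALIZED_SETTINGS = new Map([
  ["fr", {
    voiceId: "<french-voice-id>",
    systemPrompt: "Tu es une assistante virtuelle sympathique. Réponds toujours en français.",
  }],
  ["de", {
    voiceId: "<german-voice-id>",
    systemPrompt: "Du bist ein hilfreicher Assistent. Antworte immer auf Deutsch.",
  }],
]);

// Merges the shared persona fields with the settings for one language.
function buildLocalizedPersonaConfig(baseConfig, languageCode) {
  const settings = LOCALIZED_SETTINGS.get(languageCode);
  if (!settings) {
    throw new Error(`No localized settings for "${languageCode}"`);
  }
  return { ...baseConfig, ...settings, languageCode };
}
```

The returned object can be sent as personaConfig in the session token request. Throwing on an unknown language makes a missing configuration visible early instead of silently mixing languages.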
If your application primarily serves users of one non-English language, contact Anam support to set an account-level default. Benefits include:
  • Simplified integration without passing languageCode every time
  • No changes needed to existing integrations
  • Per-session languageCode still overrides the default when needed
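
With an account-level default in place, a request only needs languageCode when a session should differ from that default. The sketch below shows one way to express this. withOptionalLanguage is a hypothetical helper, and it assumes that leaving the field out entirely is how a request falls back to the default.

```javascript
// Hypothetical helper: adds languageCode only when an override is given,
// so the field is absent from the request body otherwise.
function withOptionalLanguage(personaConfig, languageCode) {
  return languageCode
    ? { ...personaConfig, languageCode }
    : { ...personaConfig };
}
```

Calling withOptionalLanguage(config) leaves the config unchanged, while withOptionalLanguage(config, "es") requests Spanish for that session only.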

SDK Usage

When using the JavaScript SDK with ephemeral personas, the languageCode field in your PersonaConfig is passed through when creating session tokens:
import { createClient } from "@anam-ai/js-sdk";

// getSessionTokenFromServer is a placeholder for your own function: it calls
// your server, which creates the session token with languageCode
const sessionToken = await getSessionTokenFromServer({
  personaConfig: {
    name: "Hans",
    avatarId: "30fa96d0-26c4-4e55-94a0-517025942e18",
    voiceId: "<german-voice-id>",
    llmId: "0934d97d-0c3a-4f33-91b0-5e136a0ef466",
    systemPrompt: "Du bist ein hilfreicher Assistent. Antworte immer auf Deutsch.",
    languageCode: "de",
  },
});

// Client uses the token as normal
const anamClient = createClient(sessionToken);
await anamClient.streamToVideoAndAudioElements("video-id", "audio-id");
The languageCode configuration only applies to ephemeral persona flows where you provide the persona configuration at runtime. For stateful personas created in the Anam Lab, contact Anam support to configure language settings.
