Voice Assignment from a Voice Pool
- "Dynamic Voice Assignment" (DVA) is a feature that assigns voices dynamically according to priority.
The player handle acquires a voice from the "voice pool" when it plays back a sound.
If no voice is available in the voice pool, the library stops the lowest-priority voice that is currently playing and reuses that voice for the new sound.
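The following is a minimal sketch of the stealing behavior described above, not the CRI Atom implementation. The types `Voice` and `VoicePool` and the function `acquire_voice` are hypothetical names introduced for illustration. It also assumes that a request whose priority is lower than every playing voice is rejected, which the text above does not state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a voice pool; not the CRI Atom API. */
#define POOL_SIZE 4

typedef struct {
    int in_use;    /* nonzero while the voice is playing */
    int priority;  /* larger value = more important */
    int sound_id;  /* sound that currently owns the voice */
} Voice;

typedef struct {
    Voice voices[POOL_SIZE];
} VoicePool;

/* Returns the index of the voice that now plays sound_id,
 * or -1 if the request was rejected. */
static int acquire_voice(VoicePool *pool, int sound_id, int priority)
{
    assert(pool != NULL);

    int lowest = -1;
    for (int i = 0; i < POOL_SIZE; i++) {
        Voice *v = &pool->voices[i];
        if (!v->in_use) {
            /* A free voice exists: use it without stopping anything. */
            v->in_use = 1;
            v->priority = priority;
            v->sound_id = sound_id;
            return i;
        }
        /* Track the lowest-priority voice as the stealing candidate. */
        if (lowest < 0 || v->priority < pool->voices[lowest].priority) {
            lowest = i;
        }
    }

    /* Assumption: a request weaker than every playing voice is dropped. */
    if (priority < pool->voices[lowest].priority) {
        return -1;
    }

    /* Pool exhausted: stop the lowest-priority voice and reuse it. */
    pool->voices[lowest].priority = priority;
    pool->voices[lowest].sound_id = sound_id;
    return lowest;
}
```

With a full pool, a high-priority request takes over the slot of the weakest playing sound, while a request weaker than everything in the pool is not played at all.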
Voice pool customization
Voice pool parameters
- When creating a voice pool, you can specify the following parameters:
Number of voices (num_voices)
- Specifies the number of voices to reserve in the voice pool.
This value is the upper limit of the number of voices per voice pool.
Maximum Number of Channels (player_config::max_channels)
- The upper limit of the number of channels that the voice supports.
The created voice can only play audio whose number of channels is equal to or less than the value specified here.
Minimum Number of Channels (min_channels)
- The lower limit of the number of channels that the voice supports.
The created voice can only play audio whose number of channels is equal to or more than the value specified here.
- You can also create a voice pool that supports only one specific number of channels by setting this parameter to the same value as the maximum number of channels described above.
Maximum Sampling Rate (player_config::max_sampling_rate)
- The upper limit of the sampling rate that the voice supports.
The created voice can only play audio with a sampling rate equal to or less than the value specified here.
- Attention
- If you want to change the pitch during audio playback, you need to specify a sampling rate that takes the pitch change into account.
For example, to play 48kHz audio at twice its original sampling rate (pitch +1200 cents), you need to specify 96000 for this parameter.
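As a worked form of the note above: a pitch shift of c cents corresponds to a playback-rate ratio of 2^(c/1200), so the required maximum sampling rate can be estimated as follows. The symbols f_src (sampling rate of the source audio), f_max (value to specify for this parameter) and c (largest upward pitch shift in cents) are introduced here for illustration only.

```latex
f_{\mathrm{max}} \ge f_{\mathrm{src}} \cdot 2^{\,c/1200}
% Example from the text: f_src = 48000, c = +1200
% f_max >= 48000 * 2^{1200/1200} = 48000 * 2 = 96000
```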
Whether to play streaming (player_config::streaming_flag)
- Specifies whether the voice is used for streaming playback.
If you specify CRI_TRUE, the created voice supports both memory playback and stream playback.
If you specify CRI_FALSE, the created voice supports memory playback only.
Whether to play streaming only (is_streaming_only)
- Specifies whether to limit the voice to streaming playback.
If CRI_TRUE is specified, the created voice will be used only for streaming playback, regardless of player_config::streaming_flag.
If CRI_FALSE is specified, the voice will behave as specified by player_config::streaming_flag.
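The sketch below shows how the parameters above could combine into a single eligibility check. The field names follow the text, but `PoolConfig`, `PlayerConfig`, `Sound` and `pool_can_play` are hypothetical types and functions, not the CRI Atom header. The channel bounds are treated as inclusive, which is an assumption based on the remark that equal minimum and maximum values restrict a pool to one specific channel count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the parameters described in the text. */
typedef struct {
    int  max_channels;       /* upper limit of channels */
    int  max_sampling_rate;  /* upper limit of sampling rate in Hz */
    bool streaming_flag;     /* true: memory and stream playback */
} PlayerConfig;

typedef struct {
    int          num_voices;        /* voices reserved in the pool */
    PlayerConfig player_config;
    int          min_channels;      /* lower limit of channels */
    bool         is_streaming_only; /* true: stream playback only */
} PoolConfig;

typedef struct {
    int  channels;
    int  sampling_rate;  /* effective rate, after any pitch change */
    bool streamed;       /* true if played by streaming */
} Sound;

/* Returns true if a voice created with cfg could play the sound s. */
static bool pool_can_play(const PoolConfig *cfg, const Sound *s)
{
    assert(cfg != NULL && s != NULL);

    /* Channel count must lie within [min_channels, max_channels]. */
    if (s->channels < cfg->min_channels) return false;
    if (s->channels > cfg->player_config.max_channels) return false;

    /* Sampling rate must not exceed the configured maximum. */
    if (s->sampling_rate > cfg->player_config.max_sampling_rate) return false;

    /* is_streaming_only overrides streaming_flag. */
    if (cfg->is_streaming_only) return s->streamed;

    /* Otherwise streaming needs streaming_flag; memory playback always works. */
    if (s->streamed) return cfg->player_config.streaming_flag;
    return true;
}
```

For example, a configuration with `min_channels` and `max_channels` both set to 2 accepts stereo audio only, and a configuration with `is_streaming_only` set rejects memory playback even when the other limits are satisfied.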
Handling multiple voice pools
- You can create multiple voice pools.
For example, you can set a different upper limit on the number of voices for each channel configuration by creating voice pool 1 with 100 voices for mono, voice pool 2 with 20 voices for stereo, and voice pool 3 with 4 voices for 5.1 channels.
For more detailed control, you can specify a voice pool identifier (\ref criatom_samples_voice_pool_identifier) on the player handle to explicitly select the voice pool from which that player acquires its voices.
- If multiple voice pools exist, the library prefers the voice pool whose format is the closest match for the audio to be played.
For example, if there is a voice pool A that can only play mono audio, and a voice pool B that can play stereo audio, the voices in voice pool A will be used to play mono audio.
(The voices in voice pool B are used to play mono audio only when all the voices in voice pool A are in use.)
- Supplementary information:
- The Atom library determines which voices to use for the audio to be played, taking into account the number of channels, sampling rate, whether or not streaming is possible, etc.
Specifically, the appropriate voice pool is determined in the following order of priority:
- Whether streaming is possible
- Maximum number of channels
- Minimum number of channels
- Sampling rate
- Number of playable codecs
For example, suppose voice pool C supports memory playback only, up to a maximum of 96kHz, and voice pool D supports streaming playback, up to a maximum of 48kHz. When 24kHz audio is played from memory, it is in principle played using the voices in voice pool C.
(This is because whether streaming is possible has a higher priority than the sampling rate as a criterion for selecting voices.)
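The priority order above can be sketched as a lexicographic comparison between candidate pools. This is an illustration under stated assumptions, not the Atom library's actual algorithm: `Pool`, `Request`, `can_play`, `closer_match` and `select_pool` are hypothetical names, and the notion of "closest" used for each criterion (exact streaming match, smaller maximum channels, larger minimum channels, lower maximum sampling rate, fewer codecs) is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one voice pool. */
typedef struct {
    const char *name;
    int  free_voices;        /* voices not currently in use */
    bool streaming;          /* pool supports streaming playback */
    int  max_channels;
    int  min_channels;
    int  max_sampling_rate;
    int  num_codecs;         /* number of playable codecs */
} Pool;

typedef struct {
    int  channels;
    int  sampling_rate;
    bool streamed;
} Request;

/* A pool is a candidate only if it has a free voice that fits the audio. */
static bool can_play(const Pool *p, const Request *r)
{
    return p->free_voices > 0
        && (!r->streamed || p->streaming)
        && r->channels >= p->min_channels
        && r->channels <= p->max_channels
        && r->sampling_rate <= p->max_sampling_rate;
}

/* Returns true if a is a closer match than b, comparing the criteria
 * in the documented order. The tie-break directions are assumptions. */
static bool closer_match(const Pool *a, const Pool *b, const Request *r)
{
    bool a_exact = (a->streaming == r->streamed);
    bool b_exact = (b->streaming == r->streamed);
    if (a_exact != b_exact) return a_exact;
    if (a->max_channels != b->max_channels)
        return a->max_channels < b->max_channels;
    if (a->min_channels != b->min_channels)
        return a->min_channels > b->min_channels;
    if (a->max_sampling_rate != b->max_sampling_rate)
        return a->max_sampling_rate < b->max_sampling_rate;
    return a->num_codecs < b->num_codecs;
}

/* Returns the preferred pool for the request, or NULL if none fits. */
static const Pool *select_pool(const Pool *pools, size_t count,
                               const Request *r)
{
    assert(pools != NULL && r != NULL);

    const Pool *best = NULL;
    for (size_t i = 0; i < count; i++) {
        const Pool *p = &pools[i];
        if (!can_play(p, r)) continue;
        if (best == NULL || closer_match(p, best, r)) best = p;
    }
    return best;
}
```

Under these assumptions, 24kHz memory playback selects pool C (memory only, 96kHz) over pool D (streaming, 48kHz), because the streaming criterion is compared before the sampling rate. Pool D is used only once pool C has no free voices.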
Voice limit group
- The above is the basic behavior of voice assignment, but more advanced control is also available.
For example, if you want at most three gunshots to play at a time, use a voice limit group.
Phonetic Priority
- Create a voice limit group called "Gunshot Group" and set its maximum number of sounds to three. If you then assign "Gunshot Group" to the various gunshot sounds, at most three sounds belonging to that group can play at a time.
You can also set how the voice limit group behaves when the limit is reached: first-come-first-served or last-come-first-served.
For example, last-come-first-served is suitable for gunshots: the newest playback request stops the oldest sound in the group.
On the other hand, first-come-first-served is suitable for dialogue, because it prevents a line that is playing from being stopped by the next dialogue request.
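The two limit behaviors can be sketched as follows. This is a simplified model rather than the library's implementation: `LimitGroup`, `LimitMode` and `group_request` are hypothetical names, and the sketch ignores voice priorities and considers only the order in which requests arrive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_LIMIT 8

typedef enum {
    PREFER_FIRST,  /* first-come-first-served: reject new requests */
    PREFER_LAST    /* last-come-first-served: stop the oldest sound */
} LimitMode;

/* Hypothetical model of a voice limit group. */
typedef struct {
    int       limit;               /* maximum simultaneous sounds */
    LimitMode mode;
    int       playing[MAX_LIMIT];  /* sound ids, oldest first */
    int       count;
} LimitGroup;

/* Returns true if sound_id starts playing. If an older sound had to be
 * stopped to make room, *stopped receives its id; otherwise -1. */
static bool group_request(LimitGroup *g, int sound_id, int *stopped)
{
    assert(g != NULL && stopped != NULL);
    assert(g->limit > 0 && g->limit <= MAX_LIMIT);

    *stopped = -1;
    if (g->count == g->limit) {
        if (g->mode == PREFER_FIRST) {
            return false;  /* keep what is playing, drop the request */
        }
        /* PREFER_LAST: stop the oldest sound and shift the rest down. */
        *stopped = g->playing[0];
        for (int i = 1; i < g->count; i++) {
            g->playing[i - 1] = g->playing[i];
        }
        g->count--;
    }
    g->playing[g->count++] = sound_id;
    return true;
}
```

With a limit of three and `PREFER_LAST`, a fourth gunshot stops the first one. With a limit of one and `PREFER_FIRST`, a second dialogue request is rejected while the first line is still playing.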