The weights for the acoustic target features (viz. duration, log F0, and log F0 delta) are set to 0 by default, which disables these features for newly built unit-selection voices. These default values were initially hard-coded and never updated, so the weights can only be tuned (or modified at all) by manually editing the mary/halfphoneUnitFeatureDefinition_ac.txt file after it is generated by the AcousticFeatureFileWriter voicebuilding component.
However, virtually all of our published voices do contain manually tweaked acoustic feature weights, enabling prosody for unit selection, and significantly reducing pitch discontinuities in the resulting synthesis output.
To wit,
$ for j in lib/*.jar; do echo "$j"; unzip -c "$j" '*_ac.txt' | tail -n5; done
lib/voice-bits3-5.0-SNAPSHOT.jar
ContinuousFeatureProcessors
500 linear | unit_duration
50 linear | unit_logf0
50 linear | unit_logf0delta
lib/voice-dfki-obadiah-5.0-SNAPSHOT.jar
ContinuousFeatureProcessors
1500 linear | unit_duration
100 linear | unit_logf0
0 linear | unit_logf0delta
lib/voice-dfki-pavoque-neutral-5.0-SNAPSHOT.jar
ContinuousFeatureProcessors
1000 linear | unit_duration
100 linear | unit_logf0
0 linear | unit_logf0delta
lib/voice-dfki-pavoque-styles-5.0-SNAPSHOT.jar
ContinuousFeatureProcessors
1000 linear | unit_duration
100 linear | unit_logf0
0 linear | unit_logf0delta
lib/voice-dfki-poppy-5.0-SNAPSHOT.jar
ContinuousFeatureProcessors
1000 linear | unit_duration
100 linear | unit_logf0
0 linear | unit_logf0delta
lib/voice-dfki-prudence-5.0-SNAPSHOT.jar
ContinuousFeatureProcessors
1000 linear | unit_duration
100 linear | unit_logf0
0 linear | unit_logf0delta
lib/voice-dfki-spike-5.0-SNAPSHOT.jar
ContinuousFeatureProcessors
2000 linear | unit_duration
50 linear | unit_logf0
0 linear | unit_logf0delta
lib/voice-voxforge-ru-nsh-5.0-SNAPSHOT.jar
ContinuousFeatureProcessors
0 linear | unit_duration
0 linear | unit_logf0
0 linear | unit_logf0delta
Accordingly, I believe that enabling the acoustic features, by at least setting reasonable default weights for duration and log F0, would be an improvement. This is especially true when no weight tuning is applied, e.g., in unsupervised voicebuilding workflows.
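Until such defaults exist, the manual edit described above could be scripted. The following is only a minimal sketch, not part of MaryTTS: the helper name `set_acoustic_weights` is hypothetical, and the line shape (`<weight> linear | <feature>` under a `ContinuousFeatureProcessors` header) is assumed from the listing above rather than from a format specification. The weights 1000 and 100 are simply the values most of the listed voices use, not proposed defaults.

```python
import re

# Assumed shape of a weight line, inferred from the listing in this issue:
#   "<weight> linear | <feature name>"
WEIGHT_LINE = re.compile(r"^(\s*)(\S+)(\s+\S+\s*\|\s*)(\S+)\s*$")


def set_acoustic_weights(text, weights):
    """Return `text` with the weights of the named continuous features replaced.

    `weights` maps feature names (e.g. "unit_duration") to new weights.
    Only lines after the "ContinuousFeatureProcessors" header are touched.
    """
    out = []
    in_section = False
    for line in text.splitlines():
        if line.strip() == "ContinuousFeatureProcessors":
            in_section = True
        elif in_section:
            m = WEIGHT_LINE.match(line)
            if m and m.group(4) in weights:
                # Keep indentation, function name, and separator; swap the weight.
                line = f"{m.group(1)}{weights[m.group(4)]}{m.group(3)}{m.group(4)}"
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Stand-in for the tail of a freshly generated
    # mary/halfphoneUnitFeatureDefinition_ac.txt with all-zero weights.
    sample = (
        "ContinuousFeatureProcessors\n"
        "0 linear | unit_duration\n"
        "0 linear | unit_logf0\n"
        "0 linear | unit_logf0delta\n"
    )
    print(set_acoustic_weights(sample, {"unit_duration": 1000, "unit_logf0": 100}), end="")
```

In a voicebuilding workflow, the same function could be applied to the contents of the generated file and the result written back, leaving unit_logf0delta at 0 as most of the published voices do.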