enable acoustic features by default #465

psibre · 2016-02-01T05:59:56Z

The weights for the acoustic target features (viz. duration, log F0, and log F0 delta) are set to 0 by default, which disables these features for newly built unit-selection voices. These default values were initially hard-coded and never updated, so the weights can only be tuned (or modified at all) by manually editing the mary/halfphoneUnitFeatureDefinition_ac.txt file after it is generated by the AcousticFeatureFileWriter voicebuilding component.

However, virtually all of our published voices do contain manually tweaked acoustic feature weights, which enable prosody for unit selection and significantly reduce pitch discontinuities in the resulting synthesis output.
To wit,

$ for j in lib/*.jar; do echo "$j"; unzip -c "$j" '*_ac.txt' | tail -n5; done
lib/voice-bits3-5.0-SNAPSHOT.jar
ContinuousFeatureProcessors
500 linear | unit_duration
50 linear | unit_logf0
50 linear | unit_logf0delta

lib/voice-dfki-obadiah-5.0-SNAPSHOT.jar
ContinuousFeatureProcessors
1500 linear | unit_duration
100 linear | unit_logf0
0 linear | unit_logf0delta

lib/voice-dfki-pavoque-neutral-5.0-SNAPSHOT.jar
ContinuousFeatureProcessors
1000 linear | unit_duration
100 linear | unit_logf0
0 linear | unit_logf0delta

lib/voice-dfki-pavoque-styles-5.0-SNAPSHOT.jar
ContinuousFeatureProcessors
1000 linear | unit_duration
100 linear | unit_logf0
0 linear | unit_logf0delta

lib/voice-dfki-poppy-5.0-SNAPSHOT.jar
ContinuousFeatureProcessors
1000 linear | unit_duration
100 linear | unit_logf0
0 linear | unit_logf0delta

lib/voice-dfki-prudence-5.0-SNAPSHOT.jar
ContinuousFeatureProcessors
1000 linear | unit_duration
100 linear | unit_logf0
0 linear | unit_logf0delta

lib/voice-dfki-spike-5.0-SNAPSHOT.jar
ContinuousFeatureProcessors
2000 linear | unit_duration
50 linear | unit_logf0
0 linear | unit_logf0delta

lib/voice-voxforge-ru-nsh-5.0-SNAPSHOT.jar
ContinuousFeatureProcessors
0 linear | unit_duration
0 linear | unit_logf0
0 linear | unit_logf0delta

Accordingly, I believe that enabling the acoustic features, by at least setting reasonable default weights for duration and log F0, would be an improvement. This is especially true when no weight tuning is applied, e.g., in unsupervised voicebuilding workflows.
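
Until the defaults change, the manual edit described above can be scripted. The following is a minimal sketch and not part of MaryTTS: `set_weight` is a hypothetical helper, and it assumes that every continuous feature line has the form `<weight> linear | <feature>`, as in the listing above.

```shell
# Hypothetical helper (not a MaryTTS tool): set the weight of one continuous
# feature in a feature definition file.
# Assumption: the feature is declared on a line of the form
#   <weight> linear | <feature_name>
# Usage: set_weight FILE FEATURE WEIGHT
set_weight() {
  file=$1
  feature=$2
  weight=$3
  tmp=$(mktemp) || return 1
  # Rewrite only the weight field of the line whose feature name matches
  # exactly; all other lines are printed unchanged.
  awk -v feat="$feature" -v w="$weight" '
    $2 == "linear" && $3 == "|" && $4 == feat { $1 = w }
    { print }
  ' "$file" > "$tmp" && cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

For example, `set_weight mary/halfphoneUnitFeatureDefinition_ac.txt unit_duration 1000` followed by `set_weight mary/halfphoneUnitFeatureDefinition_ac.txt unit_logf0 100` would apply the values most common among the published voices listed above. These numbers are illustrative and not necessarily the defaults chosen for the fix.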

psibre added voicebuilding enhancement labels Feb 1, 2016

psibre self-assigned this Feb 1, 2016

psibre added this to the 5.2 milestone Feb 1, 2016

psibre added a commit to psibre/marytts that referenced this issue Feb 1, 2016

enable some acoustic features by setting their weights to > 0 by default

472b3bc

quickly fixes marytts#465

psibre closed this as completed in 33a3b71 Feb 1, 2016

davidflanagan mentioned this issue Mar 17, 2016

halfphoneUnitFeatureDefinition_ac.txt does not have any continuous features even though halfphoneFeatures_ac.mry does #503

Closed
