enable acoustic features by default #465

Closed
psibre opened this issue Feb 1, 2016 · 0 comments
psibre commented Feb 1, 2016

The weights for the acoustic target features (viz. duration, log F0, and log F0 delta) are set to 0 by default, which disables these features for newly built unit-selection voices. These defaults were hard-coded initially and never updated, so the weights can only be tuned (or modified at all) by manually editing the mary/halfphoneUnitFeatureDefinition_ac.txt file after the AcousticFeatureFileWriter voicebuilding component has generated it.

However, virtually all of our published voices do contain manually tweaked acoustic feature weights, enabling prosody for unit selection, and significantly reducing pitch discontinuities in the resulting synthesis output.
To wit,

$ for j in lib/*.jar; do echo "$j"; unzip -c "$j" '*_ac.txt' | tail -n5; done
lib/voice-bits3-5.0-SNAPSHOT.jar
ContinuousFeatureProcessors
500 linear | unit_duration
50 linear | unit_logf0
50 linear | unit_logf0delta

lib/voice-dfki-obadiah-5.0-SNAPSHOT.jar
ContinuousFeatureProcessors
1500 linear | unit_duration
100 linear | unit_logf0
0 linear | unit_logf0delta

lib/voice-dfki-pavoque-neutral-5.0-SNAPSHOT.jar
ContinuousFeatureProcessors
1000 linear | unit_duration
100 linear | unit_logf0
0 linear | unit_logf0delta

lib/voice-dfki-pavoque-styles-5.0-SNAPSHOT.jar
ContinuousFeatureProcessors
1000 linear | unit_duration
100 linear | unit_logf0
0 linear | unit_logf0delta

lib/voice-dfki-poppy-5.0-SNAPSHOT.jar
ContinuousFeatureProcessors
1000 linear | unit_duration
100 linear | unit_logf0
0 linear | unit_logf0delta

lib/voice-dfki-prudence-5.0-SNAPSHOT.jar
ContinuousFeatureProcessors
1000 linear | unit_duration
100 linear | unit_logf0
0 linear | unit_logf0delta

lib/voice-dfki-spike-5.0-SNAPSHOT.jar
ContinuousFeatureProcessors
2000 linear | unit_duration
50 linear | unit_logf0
0 linear | unit_logf0delta

lib/voice-voxforge-ru-nsh-5.0-SNAPSHOT.jar
ContinuousFeatureProcessors
0 linear | unit_duration
0 linear | unit_logf0
0 linear | unit_logf0delta

Accordingly, I believe that enabling the acoustic features, by at least setting reasonable default weights for duration and log F0, would be an improvement. This is especially true when no weight tuning is applied, e.g., in unsupervised voicebuilding workflows.
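
Until the defaults change, the manual edit described above could be scripted as a post-processing step. The following is only a rough sketch, not part of the voicebuilding tools: the function name set_acoustic_weights is hypothetical, and it assumes the generated file uses the same "<weight> linear | <feature name>" line layout seen in the published voices listed above.

```shell
#!/bin/sh
# Hypothetical helper (not part of MaryTTS): rewrite the weights of the three
# acoustic features in a unit feature definition file, in place.
# Assumes lines of the form "<weight> linear | <feature name>", as in the
# listing above; matched lines are re-joined with single spaces.
set_acoustic_weights() {
    file=$1 duration=$2 logf0=$3 logf0delta=$4
    tmp=$(mktemp) || return 1
    awk -v d="$duration" -v f="$logf0" -v fd="$logf0delta" '
        # Replace the leading weight only on the three acoustic feature lines.
        $NF == "unit_duration"   { $1 = d }
        $NF == "unit_logf0"      { $1 = f }
        $NF == "unit_logf0delta" { $1 = fd }
        # Every line, modified or not, is passed through.
        { print }
    ' "$file" > "$tmp" && mv "$tmp" "$file"
}

# Example call (the values are illustrative, not recommended defaults):
#   set_acoustic_weights mary/halfphoneUnitFeatureDefinition_ac.txt 1000 100 0
```

The values in the example call simply mirror the most common combination among the published voices above (1000 / 100 / 0); whatever defaults are finally chosen would go there instead.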
