Ingmar Steiner
20–31.03.2017
Hello world
MaryXML
<?xml version="1.0" encoding="UTF-8"?>
<maryxml xmlns="http://mary.dfki.de/2002/MaryXML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="0.5" xml:lang="en-US">
<p>
<s>
<phrase>
<t accent="L+H*" g2p_method="lexicon" ph="h @ - ' l @U" pos="UH">
Hello
<syllable ph="h @">
<ph p="h"/>
<ph p="@"/>
</syllable>
<syllable accent="L+H*" ph="l @U" stress="1">
<ph p="l"/>
<ph p="@U"/>
</syllable>
</t>
<t accent="!H*" g2p_method="lexicon" ph="' w r= l d" pos="NN">
world
<syllable accent="!H*" ph="w r= l d" stress="1">
<ph p="w"/>
<ph p="r="/>
<ph p="l"/>
<ph p="d"/>
</syllable>
</t>
<boundary breakindex="5" tone="L-L%"/>
</phrase>
</s>
</p>
</maryxml>
phone | pos_in_syl | accented | ph_cplace |
---|---|---|---|
h | 0 | 0 | g |
@ | 1 | 0 | 0 |
l | 0 | 1 | a |
@U | 1 | 1 | 0 |
w | 0 | 1 | l |
r= | 1 | 1 | 0 |
l | 2 | 1 | a |
d | 3 | 1 | a |
_ | 0 | 0 | 0 |
(small selection)
Target feature vectors used to generate/retrieve audio:
Compute feature vectors from text
then assign them to provided data.
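Three of the columns in the table above can be read directly off the MaryXML listing. The following is a minimal sketch, not the way MaryTTS computes features internally: it assumes one XML element per line (as in the "Hello world" listing), and the function name `phone_features` is made up for illustration. `ph_cplace` and the final pause row (`_`) are left out, since they need the allophone set and the boundary handling.

```shell
# Sketch: derive phone, pos_in_syl and accented from MaryXML with awk.
# Assumes one XML element per line; this is NOT the MaryTTS feature processor.
phone_features() {
  awk '
    # A new syllable resets the position counter; it counts as accented
    # if the <syllable> element carries an accent attribute.
    /<syllable/ {
      pos = 0
      accented = ($0 ~ /accent="/) ? 1 : 0
    }
    # Each <ph p="..."/> element yields one output row.
    /<ph p="/ {
      phone = $0
      sub(/.*<ph p="/, "", phone)   # drop everything up to the opening quote
      sub(/".*/, "", phone)         # drop everything from the closing quote
      print phone, pos, accented
      pos++
    }
  ' "$@"
}
```

Called as `phone_features hello.xml` (a hypothetical file holding the listing), it prints one `phone pos_in_syl accented` row per phone.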
Clone voicebuilding project
git clone https://github.com/marytts/voice-cmu-slt -b v5.2
cd voice-cmu-slt
./gradlew legacyInit
./gradlew build
Run an ad-hoc MaryTTS server with this voice
./gradlew run
data
dependency
Using Praat
./gradlew legacyPraatPitchmarker
wav/*.wav
pm/*.pm
Using ch_track from EST
./gradlew legacyMCEPMaker
wav/*.wav
mcep/*.mcep
Predict phone sequence from text using MaryTTS
./gradlew generateAllophones
text/*.txt
prompt_allophones/*.xml
./gradlew legacyTranscriptionAligner
prompt_allophones/*.xml, lab/*.lab
allophones/*.xml
Compute and assign feature vector to each unit using MaryTTS
./gradlew legacyPhoneUnitFeatureComputer legacyHalfPhoneUnitFeatureComputer
allophones/*.xml, mary/features.txt
phonefeatures/*.pfeats, halfphonefeatures/*.hpfeats
Compile “timeline” files for audio, utterances, and acoustic features
./gradlew legacyWaveTimelineMaker legacyBasenameTimelineMaker legacyMCepTimelineMaker
wav/*.wav, pm/*.pm, mcep/*.mcep
mary/timeline_waveforms.mry, mary/timeline_basenames.mry, mary/timeline_mcep.mry
These contain the actual data from the wav and mcep files, in pitch-synchronous “datagram” packets.
Phone-level and halfphone-level unit and features files
./gradlew legacyPhoneUnitfileWriter legacyHalfPhoneUnitfileWriter legacyPhoneFeatureFileWriter legacyHalfPhoneFeatureFileWriter
pm/*.pm, phonelab/*.lab, phonefeatures/*.pfeats, halfphonelab/*.hplab
mary/phoneUnits.mry, mary/halfphoneUnits.mry, mary/phoneFeatures.mry, mary/phoneUnitFeatureDefinition.txt, mary/halfphoneFeatures.mry, mary/halfphoneUnitFeatureDefinition.txt
Using wagon from EST
./gradlew legacyDurationCARTTrainer legacyF0CARTTrainer
mary/phoneUnits.mry, mary/phoneFeatures.mry, mary/timeline_waveforms.mry
mary/dur.tree, mary/f0.left.tree, mary/f0.mid.tree, mary/f0.right.tree
Ready for deployment in MaryTTS installation
./gradlew assemble
mary/cart.mry, featureSequence.txt, mary/dur.tree, mary/f0.left.tree, mary/f0.mid.tree, mary/f0.right.tree, mary/halfphoneFeatures_ac.mry, mary/joinCostFeatures.mry, mary/joinCostWeights.txt, mary/halfphoneUnits.mry, mary/timeline_basenames.mry, mary/timeline_waveforms.mry
my_voice.zip, my_voice-component.xml
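The Gradle invocations above can be replayed one after another from a small script. This is only a sketch that repeats the task names in the order they appear here; the function name `run_steps` and its optional runner argument are made up for illustration, and Gradle's own task dependencies may well make such an explicit listing unnecessary.

```shell
# Sketch: run the voicebuilding tasks in the order listed above.
# The first argument is the command used to run each step (default: ./gradlew),
# which makes a dry run possible by passing "echo".
run_steps() {
  runner=${1:-./gradlew}
  while read -r tasks; do
    printf '>>> %s %s\n' "$runner" "$tasks"
    # $tasks is deliberately unquoted so that each task name becomes its own
    # argument; stdin is redirected so the runner cannot swallow the task list.
    $runner $tasks </dev/null || return 1
  done <<'EOF'
legacyPraatPitchmarker
legacyMCEPMaker
generateAllophones
legacyTranscriptionAligner
legacyPhoneUnitFeatureComputer legacyHalfPhoneUnitFeatureComputer
legacyWaveTimelineMaker legacyBasenameTimelineMaker legacyMCepTimelineMaker
legacyPhoneUnitfileWriter legacyHalfPhoneUnitfileWriter legacyPhoneFeatureFileWriter legacyHalfPhoneFeatureFileWriter
legacyDurationCARTTrainer legacyF0CARTTrainer
assemble
EOF
}
```

`run_steps echo` prints the steps without executing anything; `run_steps` with no argument calls `./gradlew` and stops at the first failing step.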
Each group will need:
Phonetically balanced, e.g.,
Presentation laptop with HDMI output
DAW (Cubase/ProTools) records multiple channels:
Forced alignment with one of
Don’t forget to analyze and check for errors!
Use Git.
But don’t store big binary files (such as audio) in Git!
Use solutions such as
git-lfs
git-annex
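As an illustration of the git-lfs route, the sketch below tracks WAV files through LFS instead of committing them as ordinary blobs. It assumes git-lfs is installed and that the current directory is already a Git repository; the function name `track_audio_with_lfs` and the choice of the `*.wav` pattern are only examples.

```shell
# Sketch: keep WAV files out of regular Git history with git-lfs.
# Assumes git-lfs is installed and the current directory is a Git repository.
track_audio_with_lfs() {
  # Enable the LFS filters for this repository only (no global config change).
  git lfs install --local || return 1
  # Record the pattern in .gitattributes so matching files are stored via LFS.
  git lfs track "*.wav" || return 1
  # Version the tracking rule itself.
  git add .gitattributes
}
```

After this, `git add wav/*.wav` stores small pointer files in the repository while the audio itself goes to LFS storage.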
Dockerfile
Run
docker build \
--build-arg HTKUSER=***** \
--build-arg HTKPASSWORD=***** \
-t marytts-builder-hsmm .
/bin/voiceimport.sh script from within your voicebuilding project’s build directory
db.marybase property to /marytts; click “Save”
Finally, run
docker run -v $PWD:$PWD -t marytts-builder-hsmm bash -c \
"cd $PWD; \
/marytts/target/marytts-builder-5.2/bin/voiceimport.sh \
HMMVoiceDataPreparation \
HMMVoiceConfigure \
HMMVoiceMakeData \
HMMVoiceMakeVoice"
This will take a long time. Follow the progress by tailing the hts/log-SOMETIMESTAMP log file.
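Since the timestamp in the log name is not known in advance, one way to follow the run is to pick the most recently modified file matching `hts/log-*`. This is a small sketch; the helper name `latest_hts_log` is made up for illustration, and it assumes log file names without unusual characters.

```shell
# Sketch: print the newest file matching hts/log-* below the given directory
# (default: the current directory). Prints nothing if no log exists yet.
latest_hts_log() {
  dir=${1:-.}
  ls -t "$dir"/hts/log-* 2>/dev/null | head -n 1
}
```

From the voicebuilding project's build directory, `tail -f "$(latest_hts_log)"` then follows the current training log.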
Good luck!
Finally, run
docker run -v $PWD:$PWD -t marytts-builder-hsmm bash -c \
"cd $PWD; \
/marytts/target/marytts-builder-5.2/bin/voiceimport.sh \
HMMVoiceCompiler"
This will collect all resources and generate code under the build/mary/voice-YOUR_VOICE_NAME directory. It will also attempt to run Maven and fail (don’t worry about that).
Copy this buildscript into the generated Maven project directory, then run gradle build or gradle run.