
Adding New Language and IPA Symbols in Model Training #22

@lpscr


Hi, thank you so much for the amazing repo—it's really very cool!

I'm trying to add a new language, but I encountered an issue with IPA symbols. Specifically, 5 letters are missing. I checked symbols.py and found some symbols were unused. Here’s what I used:

IPA_letters = 'NQabdefghijklmnopstuvwxyzɑæʃʑçɯɪɔɛɹðəɫɥɸʊɾʒθβŋɦ⁼ʰ^#*=ˈˌ→↓↑ '
I used phonemizer to generate phonemes for my language and replaced the missing symbols with existing ones. In my case:

"r" : "ɾ",
"ɣ" : "g",
"ɲ" : "h",
"c" : "ɔ",
"ɡ" : "g",
"ʎ" : "ɦ"
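
To make the replacement step concrete, here is a minimal sketch of it in Python. It assumes the `IPA_letters` string quoted above is the full symbol inventory; the function names `replace_unsupported` and `find_unsupported` and the sample phoneme string are hypothetical and used only for illustration. They are not part of the repo or of phonemizer.

```python
# Symbol inventory quoted in this post (assumed to be the model's full set).
IPA_LETTERS = 'NQabdefghijklmnopstuvwxyzɑæʃʑçɯɪɔɛɹðəɫɥɸʊɾʒθβŋɦ⁼ʰ^#*=ˈˌ→↓↑ '

# Replacement table from this post: unsupported symbol -> supported symbol.
REPLACEMENTS = {
    "r": "ɾ",
    "ɣ": "g",
    "ɲ": "h",
    "c": "ɔ",
    "ɡ": "g",  # U+0261 (IPA script g) -> ASCII "g"
    "ʎ": "ɦ",
}


def replace_unsupported(phonemes: str) -> str:
    """Apply the one-to-one symbol replacements to a phoneme string."""
    return phonemes.translate(str.maketrans(REPLACEMENTS))


def find_unsupported(phonemes: str) -> set:
    """Return every character that is not in the symbol inventory."""
    return set(phonemes) - set(IPA_LETTERS)


if __name__ == "__main__":
    sample = "kaˈɲa ɣɾi"  # hypothetical phonemizer output
    print("unsupported before:", sorted(find_unsupported(sample)))
    cleaned = replace_unsupported(sample)
    print("cleaned:", cleaned)
    print("unsupported after:", sorted(find_unsupported(cleaned)))
```

Running `find_unsupported` over the whole phonemized dataset before training shows whether any symbol is still missing. Note that a table like this merges sounds: both "ɣ" and "ɡ" become "g", so the model can no longer tell them apart.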

Is this the correct method? Or how can I add more symbols, such as the 5 I need? When I tried this, I got an error.

After ensuring all symbols were accounted for, I trained the model using the pre-trained checkpoint_0.pt model for fine-tuning over 40 hours. The model can produce speech, so I guess the symbol replacement worked. However, the timing is off: the speech sounds bad, with incorrect word speed, though the sound quality is okay, with no noise or robotic artifacts.

I used the pre-trained model for fine-tuning by copying it to the checkpoint folder and starting the training. I haven't trained the model from scratch yet, as I think it would take too long.

Do I need to change something in the config, such as the learning rate?

Here are the training results after about 4 hours of training:
Rank 0, Epoch 24, Loss 2.3834068775177

Do I need more steps to fix the timing issue?

I would really appreciate any help with this!
