Welcome to Role-Playing-TTS!
output.mp4
First, you need to clone this project and download the required model files.
a. Clone this project:
git clone https://github.com/Riddae/Role-Playing-TTS.git
cd Role-Playing-TTSb. Download dependencies: This project depends on MMAudio and CosyVoice. Please download them as follows.
# Download MMAudio library
git clone https://github.com/hkchengrex/MMAudio.git
cd TTS # Enter the TTS directory
# Download CosyVoice library
git clone https://github.com/FunAudioLLM/CosyVoice.git
cd CosyVoice # Enter the CosyVoice directoryc. Download model files:
python download.pyTo ensure the system runs stably, please configure your Python environment according to the following steps.
# Create a new environment named 'audio' with Python 3.10
conda create -n audio -y python=3.10
# Activate the newly created environment
conda activate audio
pip install -r requirements.txt
cd /TTS/CosyVoice/pretrained_models/CosyVoice-ttsfrd
unzip resource.zip -d .
pip install ttsfrd_dependency-0.1-py3-none-any.whl
pip install ttsfrd-0.4.2-cp310-cp310-linux_x86_64.whlNow you can run Role-Playing-TTS in the following ways. We provide two main modes: an API service and a command-line pipeline.
a. Start the API Service: First, start the backend API service so that subsequent programs can interact with it.
# Execute in the root directory of Role-Playing-TTS
python services.pySet environment variables for using API services GPT-4 API
Please modify the OPENAI_PROXY and OPENAI_KEY parameter in pipeline.py
OPENAI_PROXY = "your_openai_proxy"
OPENAI_KEY = "your_openai_key"
b. Use the Command-Line Pipeline for TTS Synthesis
You can generate speech directly from the command line. Here is an example of generating an audio drama about "Journey to the West".
# Make sure you have activated the 'audio' environment
# Make sure services.py is running
python pipeline.py --text "请你创造一个唐僧师徒四人西天取经的有声短剧" --output_path output--text: The content you want to synthesize (characters + theme).
--output_path: The generated audio file will be saved to this path.
--step1: The dialogue content to be generated (optional). Provide the character dialogue, and GPT will analyze it to add sound effects and BGM.
--step2: The complete version with sound effects, BGM, and dialogue (optional). Provide the sound effects, BGM, and dialogue, and the model will generate the audio directly.
For the format of step1 and step2, please refer to the files in the output directory.
c. Launch the Web UI (Recommended): To provide a more user-friendly experience, we offer a web interface.
# Make sure you have activated the 'audio' environment
# Make sure services.py is running
python web.pyWe only provide voices for a dozen or so characters, such as Sun Wukong, Zhu Bajie, Tang Seng, Nezha, Taiyi Zhenren, etc. For details, please see char_to_voice_map.json. You can add more voices by following the format in char_to_voice_map.json if needed.
We have borrowed and used code from the following projects, and we would like to express our gratitude: CosyVoice WavJourney MMAudio
The content provided above is for academic purposes only, intended to demonstrate technical capabilities. Some examples are from the internet. If any content infringes on your rights, please contact us for removal. We are not responsible for any audio generated using this model. Please ensure it is not used for any illegal purposes.