Releases · sipeter/CloneTTS

28 Apr 00:56

sipeter

v0.6.3

409ceb5

CloneTTS v0.6.3 Latest

Bug fix release following v0.6.2.

Highlights

Fixed BGM volume not applying when reusing a playback session for the same book.
Fixed accessibility labels for the main screen search bar, close-search button, and settings button — now properly labelled in Chinese.
Fixed voice export to support selective per-voice export via a selection dialog, instead of always exporting all voices.
Fixed garbled Chinese text on the OpenAI-compatible TTS endpoint (/v1/audio/speech) when callers send percent-encoded text or UTF-8 bytes mis-decoded as Latin-1.
Fixed SettingsActivity triggering deprecated setStatusBarColor / setNavigationBarColor warnings on Android 15: the system translucent theme is replaced with a custom NoActionBar theme, and the slide-right transition animation is now consistent across Android 13 and 15.
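
To illustrate the two garbling cases named in the /v1/audio/speech fix, here is a minimal Python sketch of how a server might repair incoming text. It is not CloneTTS's actual code; the function name repair_text and the order of the two steps are assumptions made for illustration.

```python
import re
from urllib.parse import unquote

# One or more consecutive %XX escapes, e.g. "%E4%BD%A0".
_PERCENT_RUN = re.compile(r"(?:%[0-9A-Fa-f]{2})+")


def repair_text(text: str) -> str:
    """Best-effort repair of percent-encoded or Latin-1-misread UTF-8 text.

    Hypothetical helper: the name and behaviour are illustrative only.
    """
    # Case 1: the caller sent percent-encoded text.
    if _PERCENT_RUN.search(text):
        try:
            text = unquote(text, encoding="utf-8", errors="strict")
        except UnicodeDecodeError:
            pass  # The escapes were not valid UTF-8; keep the text as received.

    # Case 2: UTF-8 bytes were decoded as Latin-1 somewhere upstream.
    # Re-encoding as Latin-1 recovers the raw bytes, which are then decoded
    # as UTF-8. If either step fails, the text was not mojibake of this kind.
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text
```

Text that is already correct passes through unchanged, because real Chinese characters cannot be encoded as Latin-1. A heuristic like this can misfire on input that legitimately contains literal %XX sequences, so a real implementation may need stricter rules.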

Notes

v0.6.3 is a focused bug-fix release. No new major features.
Especially relevant for users on Android 15, TalkBack users, and callers using the OpenAI-compatible API with Chinese text.

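v0.6.3 is a follow-up bug-fix release after v0.6.2.

Key Changes

Fixed the BGM (background music) volume setting having no effect when a session is reused for the same book.
Fixed accessibility labels on the main screen's search bar, close-search button, and settings button that were in English or missing; Chinese labels are now provided.
Fixed voice export being limited to exporting everything; a dialog now lets you tick the specific voices to export.
Fixed garbled Chinese text received by the OpenAI TTS endpoint, handling both percent-encoded input and UTF-8 misread as Latin-1.
Fixed the Settings page triggering deprecated-API warnings and showing black bars on Android 15, and unified the slide-in/slide-out-right transition animation across Android 13 and 15.

Release Scope

Mainly bug fixes; no new features.
Suited to users on Android 15 devices, TalkBack users, and callers that send Chinese text through the OpenAI-compatible API.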

23 Apr 14:00

sipeter

v0.6.2

409ceb5

CloneTTS v0.6.2

Accessibility and compatibility follow-up release after v0.6.1.

Highlights

Improved TalkBack accessibility on the main screen with direct navigation to Voices, Voice Pool, Engine, and Test Console.
Restored easier speed and volume adjustment in accessibility mode with step-based controls.
Fixed group picker dialogs not scrolling when many groups are present during move and batch operations.
Updated the Settings screen for Android 15/16 edge-to-edge behavior and removed deprecated system bar APIs.

Notes

v0.6.2 focuses on accessibility and display compatibility rather than new major features.
This release is especially relevant for users on TalkBack and newer Android versions.

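Chinese Notes (translated)

v0.6.2 is an accessibility and display-compatibility fix release following v0.6.1.

Key Changes

Improved main-screen navigation in accessibility mode, with more direct access to the Hall, Voice Hall, Engine, and Test Console.
Fixed inconvenient speed and volume adjustment in accessibility mode by restoring step-based controls better suited to TalkBack users.
Fixed group picker dialogs (for moving voices, batch operations, and similar) not scrolling when there are many groups.
Updated the Settings page for Android 15/16 edge-to-edge display and removed deprecated system bar APIs.

Release Scope

This is a follow-up fix release focused on accessibility and compatibility.
It adds no major new features and concentrates on improving the usability of the existing 0.6.x interface.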

23 Apr 13:43

sipeter

v0.6.1

409ceb5

CloneTTS v0.6.1

Hotfix release for the v0.6.0 group management regressions.

Highlights

Fixed the crash when creating a new voice group from the main screen.
Fixed grouped voice lists not refreshing immediately after delete / move / import operations.
Fixed append/override import jumping back to the first group instead of staying on the target group.
Added explicit Rename Group / Delete Group actions to the top-right menu for the current group.
New voices created from the current group now stay in that group.
Promoted Imported Voices to a protected system default group that cannot be deleted.

Notes

v0.6.1 is a stability hotfix release after v0.6.0.
No major new features were added in this release.
Focus: group creation, deletion, import behavior, and fallback group protection.

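Chinese Notes (translated)

v0.6.1 is a hotfix release for the group management regressions in v0.6.0.

Key Changes

Fixed the crash when creating a new group from the main screen.
Fixed the voice list within a group not refreshing immediately after delete, move, or import operations.
Fixed the UI jumping back to the first group after an append or override import into a specific group.
Added "Rename Group / Delete Group" entries for the current group to the top-right menu.
Voices added from within the current group are now correctly assigned to that group when saved.
Promoted "Imported Voices" to a system default fallback group that cannot be deleted.

Release Scope

This is a stability hotfix release.
It adds no major new features.
It focuses on the group issues that users actually found after v0.6.0 shipped.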

23 Apr 13:28

sipeter

v0.6.0

409ceb5

CloneTTS v0.6.0

New Features

Voice Groups: Organize voices into custom groups with drag-to-reorder, batch move, and group-aware import/export.
Voice Pools: Attribute-based voice matching system (e.g. "male youth", "narrator") with 11 built-in templates and one-click sync from role tags.
Multi-Role Reading: Automatic role detection from <<Role（Attribute）>> tags, configurable parsing, fixed role bindings, narrator voice, dialogue lock, and role caching — all with an in-app config UI and test console.
Model Download Manager: Download Standard/Fast models directly in Settings with SHA256 verification, resume support, and reactive UI updates.
Ambience System (Experimental): BGM session manager for continuous background music during reading sessions; SFX marker playback ([sfx:event]); BGM/SFX management UI with 4 play modes.
Unified API Route: Single /api/legado/default import rule — no need to import hundreds of voice-specific rules into reading apps.
Settings Overhaul: Drawer-style navigation with fullscreen sub-pages, slide animations, and consolidated help center.
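
The role tags and SFX markers mentioned above lend themselves to a short parsing example. The Python sketch below assumes the tag shapes exactly as written in these notes (<<Role（Attribute）>> with full-width parentheses, and [sfx:event]). The regular expressions, the RoleTag type, and the function names are illustrative assumptions; they do not represent the app's configurable parser.

```python
import re
from typing import List, NamedTuple

# <<Role（Attribute）>> with full-width parentheses, as shown in the notes.
ROLE_TAG = re.compile(r"<<([^<>（）]+)（([^（）]*)）>>")
# [sfx:event] where the event name contains no whitespace or "]".
SFX_MARKER = re.compile(r"\[sfx:([^\]\s]+)\]")


class RoleTag(NamedTuple):
    role: str
    attribute: str


def find_role_tags(text: str) -> List[RoleTag]:
    """Return every role tag in the text, in order of appearance."""
    return [
        RoleTag(m.group(1).strip(), m.group(2).strip())
        for m in ROLE_TAG.finditer(text)
    ]


def find_sfx_events(text: str) -> List[str]:
    """Return the event names of all SFX markers in the text."""
    return SFX_MARKER.findall(text)


def strip_markup(text: str) -> str:
    """Remove role tags and SFX markers, leaving only the text to be spoken."""
    return SFX_MARKER.sub("", ROLE_TAG.sub("", text))
```

For example, find_role_tags("<<Alice（female youth）>>Hello.") yields one tag with role "Alice" and attribute "female youth". That attribute is the kind of value a voice pool could then match against.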

Improvements

Global voice search across all groups
API routes directly copyable in settings; dynamic port assignment
Group-scoped override import (other groups unaffected)

Bug Fixes

Fixed rule append import not working (v0.5.3 regression)

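New Features

Voice group management: create custom groups, with drag-to-reorder, batch move, and per-group import/export.
Voice pool system: automatically matches voices by attribute words (e.g. "male youth", "narrator"), with 11 built-in default templates and one-click sync from role tags.
Multi-role reading: automatically recognizes <<Role（Attribute）>> tags, with custom parsing rules, fixed role bindings, a narrator voice, dialogue lock, and role caching, plus a built-in configuration UI and test console.
Model download manager: download the Standard/Fast models directly from the Settings page, with SHA256 verification, resumable downloads, and live download status updates.
Ambience system (experimental): a BGM session manager that keeps background music playing during reading; SFX marker playback ([sfx:event name]); a BGM/SFX management UI with 4 play modes.
Unified import route: adds a single-rule import at /api/legado/default, so the reading app no longer needs hundreds of per-voice rules imported.
Settings overhaul: left-side drawer navigation, fullscreen sub-pages, slide animations, and a consolidated help center.

Improvements

Global voice search (across groups)
One-tap copying of API routes in Settings; dynamic port assignment
Override import is isolated per group and does not affect other groups

Bug Fixes

Fixed rule append import not working (v0.5.3 regression)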

20 Apr 10:50

sipeter

v0.5.5

6d581b1

CloneTTS v0.5.5

What's New

Improved FAST model: Updated FAST engine model with better overall listening quality
Voice alias fixes: Fixed alias loss during import; append import now auto-deduplicates
Cache refresh on model switch: Automatically clears audition cache when switching between STANDARD and FAST modes

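What's Updated

FAST model update: updated the FAST mode model; overall audition quality is slightly better than in the previous version
Voice alias fix: fixed aliases being lost when importing a backup; append import now deduplicates automatically
Cache clearing fix: the audition cache is now cleared automatically after switching between the Standard and FAST models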

18 Apr 13:26

sipeter

v0.5.4

6d581b1

CloneTTS v0.5.4

New Features

Dual-Engine Mode: Added a FAST engine that can be toggled with the Standard engine in the "Service & Engine" settings. Standard mode offers better quality, while FAST mode prioritizes speed.
Smart Audition Cache: Audition cache is automatically isolated per engine, preventing cache hits from previous engines. The Voice Library menu now includes a "Clear Cache" option for manual cache management.
First-Use Guide: A 2-step onboarding flow (Welcome → Performance Test) automatically displays on first install, helping users quickly understand device performance.
Preset Audition Texts: The audition panel now includes 5 preset text scenarios (Daily Greeting, News Broadcast, Novel Reading, Poetry & Prose, Short Phrases), with one-tap switching and custom input support.
Telegram Community: Added Telegram group link (t.me/CloneTTS) to the About page.
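
One common way to get the per-engine cache isolation described above is to make the engine name part of the cache key. The Python sketch below shows that idea only. The class, the key layout, and the engine names "STANDARD" and "FAST" as strings are assumptions, not the app's actual storage scheme.

```python
import hashlib
from typing import Dict, Optional


def audition_cache_key(engine: str, voice_id: str, text: str) -> str:
    """Derive a cache key that differs whenever the engine, voice, or text differs."""
    digest = hashlib.sha256()
    for part in (engine, voice_id, text):
        data = part.encode("utf-8")
        # Length-prefix each field so ("ab", "c") and ("a", "bc") cannot collide.
        digest.update(len(data).to_bytes(4, "big"))
        digest.update(data)
    return digest.hexdigest()


class AuditionCache:
    """In-memory stand-in for an audition cache (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._entries: Dict[str, bytes] = {}

    def get(self, engine: str, voice_id: str, text: str) -> Optional[bytes]:
        return self._entries.get(audition_cache_key(engine, voice_id, text))

    def put(self, engine: str, voice_id: str, text: str, audio: bytes) -> None:
        self._entries[audition_cache_key(engine, voice_id, text)] = audio

    def clear(self) -> int:
        """Drop every entry (a manual "Clear Cache") and return how many were removed."""
        removed = len(self._entries)
        self._entries.clear()
        return removed
```

With this layout, audio generated by one engine is never returned for a request made under the other, even when the voice and text are identical.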

UI & Interaction Improvements

Engine Description Refinement: Updated descriptions for Standard and FAST modes to help users understand the differences.
Enhanced Logging: Engine model (Standard/FAST) now appears in logs for easier troubleshooting.
Audio Preprocessing Defaults: When creating new voices, silence trimming and noise removal are disabled by default to prevent unintended reference audio modifications.
Performance Test Layout: Reorganized the results layout and enlarged the font of the overall assessment. Reports now show device info and engine mode, making it easier to compare performance across engines.

Accessibility

TalkBack Screen Reader Fix: Expand/collapse arrows now dynamically announce "expand" or "collapse" based on state, eliminating duplicate announcements.

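New Features

Dual-engine mode: adds a FAST engine; switch between Standard and FAST modes on the "Service & Engine" page. Standard mode has better audio quality, while FAST mode is quicker.
Audition cache management: after switching engines, a separate cache is used automatically, so results generated by the previous engine are no longer inherited. The Voice Library menu adds a "Clear Cache" option for clearing the audition cache manually.
First-use guide: a 2-step onboarding flow (Welcome → Performance Test) is shown automatically after installation to help users quickly gauge device performance.
Preset audition texts: the audition panel adds 5 preset text scenarios (Daily Greeting, News Broadcast, Novel Reading, Poetry & Prose, Short Phrase Test), with one-tap switching and support for custom input.
Telegram community: the About page adds a Telegram group link (t.me/CloneTTS).

UI & Interaction Improvements

Engine description refinement: updated the description text for Standard and FAST modes to help users understand how the two differ.
Enhanced logging: logs now show the engine model in use (Standard/FAST) for easier troubleshooting.
Audio preprocessing defaults: silence trimming and noise removal are off by default when creating a new voice, to avoid unintentionally modifying the reference audio.
Performance test improvements: the results layout is reorganized, the overall assessment is shown prominently in a larger font, and the device info and the report show the engine mode for easier performance comparison across engines.

Accessibility

TalkBack screen reader fix: expand/collapse arrows now announce "expand" or "collapse" according to their current state instead of being read twice.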

14 Apr 10:42

sipeter

v0.5.3

0ecea9e

CloneTTS v0.5.3

This release includes all cumulative updates and fixes from the v0.5.3 beta series (beta1 – beta2).

New Features

Built-in Reference Audio: Automatically imports 3 preset TTS voices on first install — experience voice cloning instantly without recording.
Short Text Pronunciation Fix: Automatically appends a period to text fragments not ending with sentence-final punctuation, reducing abrupt cutoffs at the end of short phrases.
Multi-select ZIP Import: The file picker for voice import now supports multi-select mode, allowing batch import of multiple ZIP files at once.
Selective Overwrite Import: Overwrite import now displays a voice list for the user to choose from, instead of blindly overwriting everything.
Import File Validation: Format validation added when creating a new voice (audio formats only); reference audio capped at 30 seconds. ZIP imports are validated for content structure, with clear error messages for invalid files.
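
The short-text fix above amounts to a small text-normalization step. The Python sketch below shows one way it could work. The set of characters treated as sentence-final, the handling of trailing closing quotes, and the default full-width period are all assumptions, since the notes do not specify them.

```python
# Characters treated as sentence-final punctuation (assumed set).
SENTENCE_FINAL = "。！？…．.!?"
# Closing quotes and brackets that may legitimately follow the final punctuation.
TRAILING_CLOSERS = "\"'”’」』）)》】"


def ensure_final_punctuation(fragment: str, period: str = "。") -> str:
    """Append a period when a fragment does not already end a sentence.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    stripped = fragment.rstrip()
    if not stripped:
        return stripped  # Nothing to speak, so nothing to terminate.

    # Look past closing quotes and brackets, so that a fragment ending in
    # a period followed by a closing quote counts as already terminated.
    core = stripped.rstrip(TRAILING_CLOSERS)
    if core and core[-1] in SENTENCE_FINAL:
        return stripped
    return stripped + period
```

For English input the caller could pass period=".", which turns "Hello" into "Hello." and leaves "Hello!" untouched.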

UI & Interaction Improvements

Drag-to-Reorder Overhaul: Drag-and-drop sorting for voice lists and replacement rule lists has been fully rewritten, completely eliminating edge-case jitter.
Smoother Scrolling: Fixed an issue where the entire Voice Tab scrolled in unison.
Benchmark Button Optimization: Engine initialization moved to a background thread — first tap no longer causes a UI freeze.

Bug Fixes & Core Changes

System TTS Playback Artifact Fix: Removed redundant runtime RNNoise denoising, eliminating the audio glitch at the beginning of playback caused by GRU initialization artifacts.
Dual-Engine Consolidation: Voice preview and System TTS / Benchmark now share a single engine instance, saving ~150 MB of memory.
"Restore Original Audio" Button Fix: Added applied-state tracking so the restore button only appears after audio has actually been processed.
Overwrite Import Fix: Fixed an issue where importing multiple files via overwrite would overwrite each other.

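This release includes the cumulative updates and fixes from the v0.5.3 beta series (beta1 - beta2).

New Features

Built-in reference audio: 3 preset TTS voices are imported automatically on first install, so you can try voice cloning right away without recording.
Short text pronunciation improvement: a period is appended automatically, before the text is sent to the model, to fragments that do not end with sentence-final punctuation, reducing cutoffs and rushed pronunciation at the end of short phrases.
Multi-select ZIP import: the voice import file picker supports multi-select, so several ZIP files can be chosen and imported in one batch.
Selective overwrite import: overwrite import now also shows a voice list to choose from first, instead of overwriting everything directly.
Import file format validation: creating a new voice now validates the format (audio formats only), and reference audio is capped at 30 seconds; ZIP imports are validated for content structure, and invalid files are reported immediately.

UI & Interaction Improvements

Drag-to-reorder overhaul: drag sorting for the voice list and the replacement rule list has been fully rewritten, eliminating the edge jitter problem.
Smoother scrolling: fixed the Voice Tab scrolling as a whole page.
Performance test button optimization: engine initialization moved to a background thread, so the first tap no longer freezes the UI.

Bug Fixes & Core Changes

System TTS start-of-playback noise fix: removed the redundant runtime RNNoise denoising pass, eliminating the noise at the start of playback caused by GRU initialization artifacts.
Dual-engine consolidation: voice audition and System TTS / performance test now share a single engine instance, saving about 150 MB of memory.
"Restore Original Audio" button logic fix: added applied-state tracking so the restore button appears only after the audio has actually been processed.
Overwrite import fix: fixed files overwriting each other during a multi-file overwrite import.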
