Releases: sipeter/CloneTTS
CloneTTS v0.6.3
Bug fix release following v0.6.2.
Highlights
- Fixed BGM volume not applying when reusing a playback session for the same book.
- Fixed accessibility labels for the main screen search bar, close-search button, and settings button — now properly labelled in Chinese.
- Fixed voice export to support selective per-voice export via a selection dialog, instead of always exporting all voices.
- Fixed garbled Chinese text on the OpenAI-compatible TTS endpoint (`/v1/audio/speech`) when callers send percent-encoded text or UTF-8 bytes mis-decoded as Latin-1.
- Fixed `SettingsActivity` triggering deprecated `setStatusBarColor`/`setNavigationBarColor` warnings on Android 15; replaced the system translucent theme with a custom `NoActionBar` theme and unified the slide-right transition animation across Android 13 and 15.
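The two garbling cases named in the endpoint fix (percent-encoded input, and UTF-8 bytes misread as Latin-1) can be illustrated with a small heuristic. This is only a sketch of the general technique, not the app's actual code; the function name `repair_text` is hypothetical.

```python
from urllib.parse import unquote


def repair_text(text: str) -> str:
    """Best-effort repair of garbled input text (hypothetical helper, heuristic only)."""
    # Case 1: the caller sent percent-encoded UTF-8, e.g. "%E4%BD%A0%E5%A5%BD".
    if "%" in text:
        try:
            text = unquote(text, encoding="utf-8", errors="strict")
        except UnicodeDecodeError:
            pass  # the escapes are not valid UTF-8; keep the text as received

    # Case 2: UTF-8 bytes were decoded as Latin-1 somewhere upstream.
    # Re-encoding as Latin-1 recovers the raw bytes; if they form valid UTF-8,
    # the text was most likely mis-decoded and the UTF-8 reading is used.
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text holds characters outside Latin-1 (already correct
        # Chinese, for example) or the bytes are not UTF-8 (genuine Latin-1).
        return text
```

Text that is already correct passes through unchanged, because proper Chinese cannot be encoded as Latin-1 and genuine Latin-1 text such as "café" does not form valid UTF-8.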
Notes
- v0.6.3 is a focused bug-fix release. No new major features.
- Especially relevant for users on Android 15, TalkBack users, and callers using the OpenAI-compatible API with Chinese text.
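v0.6.3 is a bug-fix release following v0.6.2.
Highlights
- Fixed BGM volume settings having no effect when a session is reused for the same book.
- Fixed accessibility labels for the main screen search bar, close-search button, and settings button that were in English or missing; Chinese labels are now provided.
- Fixed voice export only supporting a full export; a dialog now lets users tick the specific voices to export.
- Fixed garbled Chinese text received by the OpenAI TTS endpoint; both percent-encoded input and UTF-8 misread as Latin-1 are now handled.
- Fixed the Settings page triggering deprecated-API warnings and showing black bars on Android 15; unified the slide-in/slide-out-right transition animation across Android 13 and 15.
Release scope
- Primarily bug fixes; no new features.
- Recommended for Android 15 users, TalkBack users, and callers sending Chinese text through the OpenAI-compatible API.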
CloneTTS v0.6.2
Accessibility and compatibility follow-up release after v0.6.1.
Highlights
- Improved TalkBack accessibility on the main screen with direct navigation to Voices, Voice Pool, Engine, and Test Console.
- Restored easier speed and volume adjustment in accessibility mode with step-based controls.
- Fixed group picker dialogs not scrolling when many groups are present during move and batch operations.
- Updated the Settings screen for Android 15/16 edge-to-edge behavior and removed deprecated system bar APIs.
Notes
- v0.6.2 focuses on accessibility and display compatibility rather than new major features.
- This release is especially relevant for users on TalkBack and newer Android versions.
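Chinese notes
v0.6.2 is an accessibility and display compatibility fix release following v0.6.1.
Highlights
- Improved main screen navigation in accessibility mode, with more direct access to Voices, Voice Pool, Engine, and Test Console.
- Fixed awkward speed and volume adjustment in accessibility mode by restoring step-based controls better suited to TalkBack users.
- Fixed group picker dialogs (move voice, batch operations) failing to scroll when there are many groups.
- Updated the Settings page for Android 15/16 edge-to-edge display and removed deprecated system bar APIs.
Release scope
- A follow-up fix release focused on accessibility and compatibility.
- No new major features; the focus is on improving the usability of the existing 0.6.x UI.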
CloneTTS v0.6.1
Hotfix release for the v0.6.0 group management regressions.
Highlights
- Fixed the crash when creating a new voice group from the main screen.
- Fixed grouped voice lists not refreshing immediately after delete / move / import operations.
- Fixed append/override import jumping back to the first group instead of staying on the target group.
- Added explicit Rename Group / Delete Group actions to the top-right menu for the current group.
- New voices created from the current group now stay in that group.
- Promoted `Imported Voices` to a protected system default group that cannot be deleted.
Notes
- v0.6.1 is a stability hotfix release after v0.6.0.
- No major new features were added in this release.
- Focus: group creation, deletion, import behavior, and fallback group protection.
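Chinese notes
v0.6.1 is a hotfix release for the group management regressions in v0.6.0.
Highlights
- Fixed the crash when creating a new group from the main screen.
- Fixed voice lists within groups not refreshing immediately after delete, move, or import operations.
- Fixed the UI jumping back to the first group after an append or override import into a specific group.
- Added "Rename Group / Delete Group" entries for the current group to the top-right menu.
- Voices created within the current group are now correctly assigned to that group when saved.
- Promoted "Imported Voices" to a non-deletable system default fallback group.
Release scope
- A stability hotfix release.
- No new major features.
- Focused on group issues that users found in practice after v0.6.0 shipped.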
CloneTTS v0.6.0
New Features
- Voice Groups: Organize voices into custom groups with drag-to-reorder, batch move, and group-aware import/export.
- Voice Pools: Attribute-based voice matching system (e.g. "male youth", "narrator") with 11 built-in templates and one-click sync from role tags.
- Multi-Role Reading: Automatic role detection from `<<Role(Attribute)>>` tags, configurable parsing, fixed role bindings, narrator voice, dialogue lock, and role caching, all with an in-app config UI and test console.
- Model Download Manager: Download Standard/Fast models directly in Settings with SHA256 verification, resume support, and reactive UI updates.
- Ambience System (Experimental): BGM session manager for continuous background music during reading sessions; SFX marker playback (`[sfx:event]`); BGM/SFX management UI with 4 play modes.
- Unified API Route: Single `/api/legado/default` import rule; there is no need to import hundreds of voice-specific rules into reading apps.
- Settings Overhaul: Drawer-style navigation with fullscreen sub-pages, slide animations, and consolidated help center.
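To make the `<<Role(Attribute)>>` and `[sfx:event]` markup above concrete, here is a minimal parsing sketch. It assumes a role tag applies to the text that follows it until the next tag, and that untagged text belongs to the narrator; the release notes do not specify these rules, and the names `Segment` and `parse_segments` are hypothetical rather than part of CloneTTS.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed tag shapes, taken from the examples in the release notes.
ROLE_TAG = re.compile(r"<<([^()<>]+)\(([^()<>]*)\)>>")
SFX_MARKER = re.compile(r"\[sfx:([^\]]+)\]")


@dataclass
class Segment:
    role: Optional[str]          # None means narrator (assumption)
    attribute: Optional[str]     # e.g. "male youth"
    text: str                    # text to speak, with markers removed
    sfx: List[str] = field(default_factory=list)


def _make_segment(role, attribute, chunk) -> Optional[Segment]:
    """Strip SFX markers from a chunk and wrap it in a Segment, or return None if empty."""
    events = SFX_MARKER.findall(chunk)
    text = SFX_MARKER.sub("", chunk).strip()
    if not text and not events:
        return None
    return Segment(role, attribute, text, events)


def parse_segments(raw: str) -> List[Segment]:
    """Split raw text into per-role segments under the assumptions stated above."""
    segments: List[Segment] = []
    role = attribute = None
    pos = 0
    for match in ROLE_TAG.finditer(raw):
        seg = _make_segment(role, attribute, raw[pos:match.start()])
        if seg:
            segments.append(seg)
        role, attribute = match.group(1).strip(), match.group(2).strip()
        pos = match.end()
    seg = _make_segment(role, attribute, raw[pos:])
    if seg:
        segments.append(seg)
    return segments
```

Each resulting segment carries an attribute string, which is the kind of value a voice pool could match against its templates.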
Improvements
- Global voice search across all groups
- API routes directly copyable in settings; dynamic port assignment
- Group-scoped override import (other groups unaffected)
Bug Fixes
- Fixed rule append import not working (v0.5.3 regression)
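New Features
- Voice Groups: Create custom groups, with drag-to-reorder, batch move, and group import/export.
- Voice Pools: Automatically matches voices by attribute words (e.g. "male youth", "narrator"), with 11 built-in default templates and one-click sync from role tags.
- Multi-Role Reading: Automatically recognizes `<<Role(Attribute)>>` tags; supports custom parsing rules, fixed role bindings, narrator voice, dialogue lock, and role caching, with a built-in config UI and test console.
- Model Download Manager: Download Standard/Fast models directly from the Settings page, with SHA256 verification, resumable downloads, and live download status updates.
- Ambience System (Experimental): BGM session manager for continuous background music while reading; SFX marker playback (`[sfx:event name]`); BGM/SFX management UI with 4 play modes.
- Unified Import Route: New single-rule import via `/api/legado/default`; reading apps no longer need to import hundreds of voice rules.
- Settings Overhaul: Left-side drawer navigation, fullscreen sub-pages, slide animations, and a consolidated help center.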
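The SHA256 verification mentioned for the Model Download Manager can be sketched as a chunked file hash compared against a published digest. This shows the general technique only; the function names and the source of the expected digest are assumptions, not details from the release notes.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large model files are never loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: str, expected_sha256: str) -> bool:
    """Return True when the file's digest matches the expected hex digest (hypothetical helper)."""
    return sha256_of_file(path) == expected_sha256.strip().lower()
```

A downloader would typically run this check once the file is complete and discard the file on a mismatch, which also guards against a corrupted resume.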
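Improvements
- Global voice search (across groups)
- One-tap copy of API routes in Settings; dynamic port assignment
- Override import is isolated per group and does not affect other groups
Bug Fixes
- Fixed rule append import not working (v0.5.3 regression)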
CloneTTS v0.5.5
What's New
- Improved FAST model: Updated FAST engine model with better overall listening quality
- Voice alias fixes: Fixed alias loss during import; append import now auto-deduplicates
- Cache refresh on model switch: Automatically clears audition cache when switching between STANDARD and FAST modes
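What's New
- FAST model update: Updated the FAST mode model; overall audition quality is slightly better than in the previous version
- Voice alias fixes: Fixed aliases being lost when importing a backup; append import now deduplicates automatically
- Cache clearing fix: The audition cache is cleared automatically after switching between Standard and FAST models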
CloneTTS v0.5.4
New Features
- Dual-Engine Mode: Added a FAST engine that can be toggled with the Standard engine in the "Service & Engine" settings. Standard mode offers better quality, while FAST mode prioritizes speed.
- Smart Audition Cache: Audition cache is automatically isolated per engine, preventing cache hits from previous engines. The Voice Library menu now includes a "Clear Cache" option for manual cache management.
- First-Use Guide: A 2-step onboarding flow (Welcome → Performance Test) automatically displays on first install, helping users quickly understand device performance.
- Preset Audition Texts: The audition panel now includes 5 preset text scenarios (Daily Greeting, News Broadcast, Novel Reading, Poetry & Prose, Short Phrases), with one-tap switching and custom input support.
- Telegram Community: Added Telegram group link (t.me/CloneTTS) to the About page.
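One common way to isolate an audition cache per engine, as the Smart Audition Cache item describes, is to make the engine name part of the cache key. The sketch below is illustrative only; `AuditionCache`, its methods, and the key layout are hypothetical and not taken from the app.

```python
import hashlib
from typing import Dict, Optional


def audition_cache_key(engine: str, voice_id: str, text: str) -> str:
    """Derive a key that differs whenever the engine, voice, or text differs."""
    # A separator that cannot appear in normal text avoids ambiguous joins.
    payload = "\x1f".join((engine, voice_id, text)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class AuditionCache:
    """In-memory stand-in for an on-disk audition cache (hypothetical)."""

    def __init__(self) -> None:
        self._entries: Dict[str, bytes] = {}

    def get(self, engine: str, voice_id: str, text: str) -> Optional[bytes]:
        return self._entries.get(audition_cache_key(engine, voice_id, text))

    def put(self, engine: str, voice_id: str, text: str, audio: bytes) -> None:
        self._entries[audition_cache_key(engine, voice_id, text)] = audio

    def clear(self) -> None:
        """Manual reset, comparable to a "Clear Cache" menu action."""
        self._entries.clear()
```

With this layout, audio generated by the Standard engine is never returned for a FAST request, even for the same voice and text.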
UI & Interaction Improvements
- Engine Description Refinement: Updated descriptions for Standard and FAST modes to help users understand the differences.
- Enhanced Logging: Engine model (Standard/FAST) now appears in logs for easier troubleshooting.
- Audio Preprocessing Defaults: When creating new voices, silence trimming and noise removal are disabled by default to prevent unintended reference audio modifications.
- Performance Test Layout: Results layout reorganized, with a larger font for the overall assessment; device info and engine mode are now shown in reports so performance can be compared across engines.
Accessibility
- TalkBack Screen Reader Fix: Expand/collapse arrows now dynamically announce "expand" or "collapse" based on state, eliminating duplicate announcements.
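New Features
- Dual-Engine Mode: Added a FAST engine; switch between Standard and FAST modes on the "Service & Engine" page. Standard mode offers better audio quality, while FAST mode is faster.
- Audition Cache Management: Each engine uses its own cache after switching, so results generated by the previous engine are no longer reused. The Voice Library menu adds a "Clear Cache" option for manually clearing the audition cache.
- First-Use Guide: A 2-step onboarding flow (Welcome → Performance Test) is shown automatically after install, helping users quickly understand device performance.
- Preset Audition Texts: The audition panel adds 5 preset text scenarios (Daily Greeting, News Broadcast, Novel Reading, Poetry & Prose, Short Phrases) with one-tap switching, plus custom input.
- Telegram Community: Added a Telegram group link (t.me/CloneTTS) to the About page.
UI & Interaction Improvements
- Engine Description Refinement: Updated the descriptions of Standard and FAST modes to help users understand the differences between them.
- Enhanced Logging: Logs now show the engine model in use (Standard/FAST) for easier troubleshooting.
- Audio Preprocessing Defaults: Silence trimming and noise removal are off by default when creating a new voice, to avoid unintended changes to the reference audio.
- Performance Test Improvements: Results layout rearranged, with the overall assessment shown in a larger font; device info and engine mode are included in reports to make comparing engines easier.
Accessibility
- TalkBack Screen Reader Fix: Expand/collapse arrows now announce "expand" or "collapse" according to the current state and are no longer read out twice.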
CloneTTS v0.5.3
This release includes all cumulative updates and fixes from the v0.5.3 beta series (beta1 – beta2).
New Features
- Built-in Reference Audio: Automatically imports 3 preset TTS voices on first install — experience voice cloning instantly without recording.
- Short Text Pronunciation Fix: Automatically appends a period to text fragments not ending with sentence-final punctuation, reducing abrupt cutoffs at the end of short phrases.
- Multi-select ZIP Import: The file picker for voice import now supports multi-select mode, allowing batch import of multiple ZIP files at once.
- Selective Overwrite Import: Overwrite import now displays a voice list for the user to choose from, instead of blindly overwriting everything.
- Import File Validation: Format validation added when creating a new voice (audio formats only); reference audio capped at 30 seconds. ZIP imports are validated for content structure, with clear error messages for invalid files.
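The Short Text Pronunciation Fix described above amounts to a small text-normalization step before synthesis. The sketch below assumes a particular set of sentence-final punctuation and chooses "。" for text containing CJK characters and "." otherwise; both choices, and the name `ensure_sentence_end`, are assumptions rather than the app's actual rules.

```python
# Assumed punctuation sets; the real list used by the app is not documented.
SENTENCE_FINAL = "。！？….!?"
TRAILING_CLOSERS = "\"'”’」』）)"


def ensure_sentence_end(fragment: str) -> str:
    """Append a period when a fragment does not end with sentence-final punctuation."""
    stripped = fragment.rstrip()
    if not stripped:
        return fragment  # nothing to speak; leave empty input untouched
    # Look past closing quotes or brackets so that “走吧！” counts as finished.
    core = stripped.rstrip(TRAILING_CLOSERS)
    if core and core[-1] in SENTENCE_FINAL:
        return stripped
    has_cjk = any("\u4e00" <= ch <= "\u9fff" for ch in stripped)
    return stripped + ("。" if has_cjk else ".")
```

For example, `ensure_sentence_end("你好")` yields `"你好。"`, while a fragment that already ends in "?" or "！" is returned without an extra period.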
UI & Interaction Improvements
- Drag-to-Reorder Overhaul: Drag-and-drop sorting for voice lists and replacement rule lists has been fully rewritten, completely eliminating edge-case jitter.
- Smoother Scrolling: Fixed an issue where the entire Voice Tab scrolled in unison.
- Benchmark Button Optimization: Engine initialization moved to a background thread — first tap no longer causes a UI freeze.
Bug Fixes & Core Changes
- System TTS Playback Artifact Fix: Removed redundant runtime RNNoise denoising, eliminating the audio glitch at the beginning of playback caused by GRU initialization artifacts.
- Dual-Engine Consolidation: Voice preview and System TTS / Benchmark now share a single engine instance, saving ~150 MB of memory.
- "Restore Original Audio" Button Fix: Added applied-state tracking so the restore button only appears after audio has actually been processed.
- Overwrite Import Fix: Fixed an issue where importing multiple files via overwrite would overwrite each other.
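This release includes the cumulative updates and fixes from the v0.5.3 beta series (beta1 - beta2).
New Features
- Built-in Reference Audio: 3 preset TTS voices are imported automatically on first install, so voice cloning can be tried immediately without recording.
- Short Text Pronunciation Improvement: A period is appended to text fragments that do not end with sentence-final punctuation before they are sent to the model, reducing truncated or rushed endings on short phrases.
- Multi-select ZIP Import: The voice import file picker supports multi-select, so several ZIP files can be chosen and imported in one batch.
- Selective Overwrite Import: Overwrite import now shows the voice list first so users can choose what to import, instead of overwriting everything.
- Import File Validation: Format validation added when creating a new voice (audio formats only), with reference audio capped at 30 seconds; ZIP imports are checked for content structure, and invalid files produce an immediate message.
UI & Interaction Improvements
- Drag-to-Reorder Overhaul: Drag sorting for the voice list and the replacement rule list has been fully rewritten, eliminating edge-case jitter.
- Smoother Scrolling: Fixed the whole Voice Tab page scrolling in unison.
- Benchmark Button Optimization: Engine initialization moved to a background thread, so the first tap no longer stalls the UI.
Bug Fixes & Core Changes
- System TTS Playback Artifact Fix: Removed the redundant runtime RNNoise denoising pass, eliminating the noise at the start of playback caused by GRU initialization artifacts.
- Dual-Engine Consolidation: Voice audition and System TTS / Benchmark share a single engine instance, saving about 150 MB of memory.
- "Restore Original Audio" Button Fix: Added applied-state tracking so the restore button appears only after the audio has actually been processed.
- Overwrite Import Fix: Fixed files overwriting each other when several are imported via overwrite.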