Releases: sipeter/CloneTTS

CloneTTS v0.6.3

28 Apr 00:56

Bug fix release following v0.6.2.

Highlights

  • Fixed BGM volume not applying when reusing a playback session for the same book.
  • Fixed accessibility labels for the main screen search bar, close-search button, and settings button — now properly labelled in Chinese.
  • Fixed voice export to support selective per-voice export via a selection dialog, instead of always exporting all voices.
  • Fixed garbled Chinese text on the OpenAI-compatible TTS endpoint (/v1/audio/speech) when callers send percent-encoded text or UTF-8 bytes mis-decoded as Latin-1.
  • Fixed SettingsActivity triggering deprecated setStatusBarColor / setNavigationBarColor warnings on Android 15. The system translucent theme is replaced with a custom NoActionBar theme, and the slide-right transition animation is now consistent across Android 13 and 15.
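
The two text-encoding failure modes named in the /v1/audio/speech fix can be illustrated with a small sketch. This is not CloneTTS's actual implementation; the function name normalize_tts_input is hypothetical, and the repair order (percent-decoding first, then the Latin-1 round trip) is an assumption made for illustration.

```python
from urllib.parse import unquote


def normalize_tts_input(text: str) -> str:
    """Best-effort repair of percent-encoded or Latin-1-misread UTF-8 text.

    Hypothetical helper: it only shows the general technique, not the
    app's real code path.
    """
    # Case 1: percent-encoded input such as "%E4%BD%A0%E5%A5%BD".
    # Invalid escapes (for example "100% sure") are left untouched by unquote.
    if "%" in text:
        try:
            text = unquote(text, encoding="utf-8", errors="strict")
        except UnicodeDecodeError:
            pass  # not valid UTF-8 after decoding; keep the original text

    # Case 2: UTF-8 bytes that were decoded as Latin-1 upstream.
    # Re-encoding as Latin-1 recovers the raw bytes, which are then decoded
    # as UTF-8. Either step fails on text that was never misread, and the
    # text is kept as-is in that case.
    try:
        return text.encode("latin-1").decode("utf-8")
    except UnicodeError:
        return text
```

A caller-side check might look like normalize_tts_input("%E4%BD%A0%E5%A5%BD"), which yields "你好". Text that is already correct, such as "你好" or "café", passes through unchanged because the Latin-1 round trip fails and the original string is returned.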

Notes

  • v0.6.3 is a focused bug-fix release. No new major features.
  • Especially relevant for users on Android 15, TalkBack users, and callers using the OpenAI-compatible API with Chinese text.

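v0.6.3 is a follow-up bug-fix release after v0.6.2.

Key changes

  • Fixed the BGM (background music) volume setting having no effect when a session is reused for the same book.
  • Fixed accessibility labels on the main screen's search bar, close-search button, and settings button being in English or missing; Chinese labels are now provided.
  • Fixed voice export only supporting a full export; a dialog now lets you tick the specific voices to export.
  • Fixed garbled text when the OpenAI TTS endpoint receives Chinese, covering both percent-encoded input and UTF-8 misread as Latin-1.
  • Fixed the Settings page triggering deprecated-API warnings and showing black edges on Android 15, and unified the slide-in/slide-out-right transition animation across Android 13 and 15.

Release scope

  • Primarily bug fixes; no new features.
  • Suited to users on Android 15 devices, TalkBack users, and callers sending Chinese text through the OpenAI-compatible API.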

CloneTTS v0.6.2

23 Apr 14:00

Accessibility and compatibility follow-up release after v0.6.1.

Highlights

  • Improved TalkBack accessibility on the main screen with direct navigation to Voices, Voice Pool, Engine, and Test Console.
  • Restored easier speed and volume adjustment in accessibility mode with step-based controls.
  • Fixed group picker dialogs not scrolling when many groups are present during move and batch operations.
  • Updated the Settings screen for Android 15/16 edge-to-edge behavior and removed deprecated system bar APIs.

Notes

  • v0.6.2 focuses on accessibility and display compatibility rather than new major features.
  • This release is especially relevant for users on TalkBack and newer Android versions.

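Chinese notes (translated)

v0.6.2 is an accessibility and display-compatibility fix release following v0.6.1.

Key changes

  • Improved main-screen navigation in accessibility mode, with more direct access to Voices, Voice Pool, Engine, and Test Console.
  • Fixed speed and volume being awkward to adjust in accessibility mode by restoring step-based controls better suited to TalkBack users.
  • Fixed group picker dialogs (for moving voices, batch operations, and similar) not scrolling when there are many groups.
  • Updated the Settings page's edge-to-edge display handling for Android 15/16 and removed deprecated system bar APIs.

Release scope

  • A follow-up fix release focused on accessibility and compatibility.
  • No new major features; the focus is on improving the usability of the existing 0.6.x interface.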

CloneTTS v0.6.1

23 Apr 13:43

Hotfix release for the v0.6.0 group management regressions.

Highlights

  • Fixed the crash when creating a new voice group from the main screen.
  • Fixed grouped voice lists not refreshing immediately after delete / move / import operations.
  • Fixed append/override import jumping back to the first group instead of staying on the target group.
  • Added explicit Rename Group / Delete Group actions to the top-right menu for the current group.
  • New voices created from the current group now stay in that group.
  • Promoted Imported Voices to a protected system default group that cannot be deleted.

Notes

  • v0.6.1 is a stability hotfix release after v0.6.0.
  • No major new features were added in this release.
  • Focus: group creation, deletion, import behavior, and fallback group protection.

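Chinese notes (translated)

v0.6.1 is a hotfix release for the group management regressions in v0.6.0.

Key changes

  • Fixed a crash when creating a new group from the main screen.
  • Fixed the voice list within a group not refreshing immediately after delete, move, or import operations.
  • Fixed the UI jumping back to the first group after an append or override import into a specific group.
  • Added "Rename Group / Delete Group" entries for the current group to the top-right menu.
  • Voices added from within the current group are now correctly assigned to that group when saved.
  • Promoted "Imported Voices" to a system default fallback group that cannot be deleted.

Release scope

  • A stability hotfix release.
  • No new major features.
  • Focused on the group issues users actually encountered after v0.6.0 shipped.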

CloneTTS v0.6.0

23 Apr 13:28

New Features

  • Voice Groups: Organize voices into custom groups with drag-to-reorder, batch move, and group-aware import/export.
  • Voice Pools: Attribute-based voice matching system (e.g. "male youth", "narrator") with 11 built-in templates and one-click sync from role tags.
  • Multi-Role Reading: Automatic role detection from <<Role(Attribute)>> tags, configurable parsing, fixed role bindings, narrator voice, dialogue lock, and role caching — all with an in-app config UI and test console.
  • Model Download Manager: Download Standard/Fast models directly in Settings with SHA256 verification, resume support, and reactive UI updates.
  • Ambience System (Experimental): BGM session manager for continuous background music during reading sessions; SFX marker playback ([sfx:event]); BGM/SFX management UI with 4 play modes.
  • Unified API Route: Single /api/legado/default import rule — no need to import hundreds of voice-specific rules into reading apps.
  • Settings Overhaul: Drawer-style navigation with fullscreen sub-pages, slide animations, and consolidated help center.
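
The <<Role(Attribute)>> tags and [sfx:event] markers listed above lend themselves to a short parsing sketch. The release notes do not specify the exact grammar, so the regular expressions below are assumptions (no nested brackets, no whitespace inside an SFX event name), and the function names are hypothetical rather than taken from the app.

```python
import re

# Assumed grammar: <<Role(Attribute)>> with no nested brackets.
ROLE_TAG = re.compile(r"<<(?P<role>[^()<>]+)\((?P<attr>[^()<>]*)\)>>")
# Assumed grammar: [sfx:event] where the event name has no spaces.
SFX_MARKER = re.compile(r"\[sfx:(?P<event>[^\]\s]+)\]")


def find_role_tags(text: str) -> list[tuple[str, str]]:
    """Return (role, attribute) pairs in the order they appear."""
    return [
        (m.group("role").strip(), m.group("attr").strip())
        for m in ROLE_TAG.finditer(text)
    ]


def find_sfx_events(text: str) -> list[str]:
    """Return the event names of all SFX markers in the text."""
    return [m.group("event") for m in SFX_MARKER.finditer(text)]


def strip_markers(text: str) -> str:
    """Remove role tags and SFX markers, leaving only the text to be spoken."""
    without_tags = SFX_MARKER.sub("", ROLE_TAG.sub("", text))
    return re.sub(r"\s+", " ", without_tags).strip()
```

In this sketch the attribute string (for example "male youth" or "narrator") would be the lookup key into a voice pool, while the role name could be used for fixed role bindings and role caching.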

Improvements

  • Global voice search across all groups
  • API routes directly copyable in settings; dynamic port assignment
  • Group-scoped override import (other groups unaffected)

Bug Fixes

  • Fixed rule append import not working (v0.5.3 regression)

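New Features

  • Voice group management: create custom groups with drag-to-reorder, batch move, and per-group import/export.
  • Voice pool system: automatically matches voices by attribute keywords (e.g. "male youth", "narrator"), with 11 built-in default templates and one-click sync from role tags.
  • Multi-role reading: automatically recognizes <<Role(Attribute)>> tags and supports custom parsing rules, fixed role bindings, a narrator voice, dialogue lock, and role caching, with a built-in configuration UI and test console.
  • Model download manager: download Standard/Fast models directly from Settings, with SHA256 verification, resumable downloads, and live download-status updates.
  • Ambience system (experimental): a BGM session manager that keeps background music playing during reading; SFX marker playback ([sfx:event]); a BGM/SFX management UI with 4 play modes.
  • Unified import route: adds the single-rule /api/legado/default import, so the reading app no longer needs hundreds of per-voice rules imported.
  • Settings redesign: left-side drawer navigation, fullscreen sub-pages, slide animations, and a consolidated help center.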
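
The SHA256 verification step mentioned for the model download manager can be sketched as follows. This is a generic illustration of checking a downloaded file against a published digest, not the app's code; the function names and the 1 MiB chunk size are assumptions.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model files are never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with the expected hex string, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

With resumable downloads, a check like this would normally run once the final byte has arrived, since a partial file cannot match the published digest.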

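Improvements

  • Global voice search (across groups)
  • One-tap copy of API routes in Settings; dynamic port assignment
  • Override import is isolated per group and does not affect other groups

Bug Fixes

  • Fixed rule append import not working (v0.5.3 regression)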

CloneTTS v0.5.5

20 Apr 10:50

What's New

  • Improved FAST model: Updated FAST engine model with better overall listening quality
  • Voice alias fixes: Fixed alias loss during import; append import now auto-deduplicates
  • Cache refresh on model switch: Automatically clears audition cache when switching between STANDARD and FAST modes

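What's New

  • FAST model update: updated the FAST mode model; audition quality is slightly better overall than the previous version
  • Voice alias fix: fixed aliases being lost when importing a backup; append import now deduplicates automatically
  • Cache clearing fix: the audition cache is now cleared automatically after switching between the Standard and FAST models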

CloneTTS v0.5.4

18 Apr 13:26

New Features

  • Dual-Engine Mode: Added a FAST engine that can be toggled with the Standard engine in the "Service & Engine" settings. Standard mode offers better quality, while FAST mode prioritizes speed.
  • Smart Audition Cache: Audition cache is automatically isolated per engine, preventing cache hits from previous engines. The Voice Library menu now includes a "Clear Cache" option for manual cache management.
  • First-Use Guide: A 2-step onboarding flow (Welcome → Performance Test) automatically displays on first install, helping users quickly understand device performance.
  • Preset Audition Texts: The audition panel now includes 5 preset text scenarios (Daily Greeting, News Broadcast, Novel Reading, Poetry & Prose, Short Phrases), with one-tap switching and custom input support.
  • Telegram Community: Added Telegram group link (t.me/CloneTTS) to the About page.
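
One common way to isolate a cache per engine, as described for the Smart Audition Cache, is to make the engine name part of the cache key. The sketch below assumes the key is derived from the engine, a voice identifier, and the audition text; the function name and key layout are hypothetical, and the app may organize its cache differently.

```python
import hashlib


def audition_cache_key(engine: str, voice_id: str, text: str) -> str:
    """Derive a cache key that changes whenever the engine changes.

    The unit separator (\\x1f) keeps the fields from running together,
    so ("a", "bc") and ("ab", "c") never produce the same key.
    """
    payload = "\x1f".join((engine, voice_id, text)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

Under this scheme, audio generated by the Standard engine is never returned for a FAST request, because the two requests hash to different keys even when the voice and text are identical.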

UI & Interaction Improvements

  • Engine Description Refinement: Updated descriptions for Standard and FAST modes to help users understand the differences.
  • Enhanced Logging: Engine model (Standard/FAST) now appears in logs for easier troubleshooting.
  • Audio Preprocessing Defaults: When creating new voices, silence trimming and noise removal are disabled by default to prevent unintended reference audio modifications.
  • Performance Test Layout: Results layout reorganized, with a larger font for the overall assessment; device info and engine mode are now shown in reports so performance can be compared across engines.

Accessibility

  • TalkBack Screen Reader Fix: Expand/collapse arrows now dynamically announce "expand" or "collapse" based on state, eliminating duplicate announcements.

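New Features

  • Dual-engine mode: adds a FAST engine; switch between Standard and FAST modes on the "Service & Engine" page. Standard mode gives better audio quality, FAST mode is quicker.
  • Audition cache management: each engine uses its own cache after switching, so results generated by the old engine are no longer reused. The voice library menu gains a "Clear Cache" option for clearing the audition cache manually.
  • First-use guide: a 2-step onboarding flow (Welcome → Performance Test) is shown automatically after install to help users quickly gauge device performance.
  • Preset audition texts: the audition panel adds 5 preset text scenarios (Daily Greeting, News Broadcast, Novel Reading, Poetry & Prose, Short Phrase Test) with one-tap switching, plus custom input.
  • Telegram community: the About page adds a Telegram group link (t.me/CloneTTS).

UI & Interaction Improvements

  • Engine description refinement: updated the description text for Standard and FAST modes to help users understand how they differ.
  • Enhanced logging: logs now show the engine model in use (Standard/FAST) for easier troubleshooting.
  • Audio preprocessing defaults: silence trimming and noise removal are off by default when creating a new voice, to avoid unintentionally modifying the reference audio.
  • Performance test improvements: the results layout is rearranged, the overall assessment is shown in a larger font, and the engine mode appears in the device info and in reports, making it easier to compare performance across engines.

Accessibility

  • TalkBack screen reader fix: expand/collapse arrows now announce "expand" or "collapse" according to their current state and are no longer read out twice.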

CloneTTS v0.5.3

14 Apr 10:42

This release includes all cumulative updates and fixes from the v0.5.3 beta series (beta1 – beta2).

New Features

  • Built-in Reference Audio: Automatically imports 3 preset TTS voices on first install — experience voice cloning instantly without recording.
  • Short Text Pronunciation Fix: Automatically appends a period to text fragments not ending with sentence-final punctuation, reducing abrupt cutoffs at the end of short phrases.
  • Multi-select ZIP Import: The file picker for voice import now supports multi-select mode, allowing batch import of multiple ZIP files at once.
  • Selective Overwrite Import: Overwrite import now displays a voice list for the user to choose from, instead of blindly overwriting everything.
  • Import File Validation: Format validation added when creating a new voice (audio formats only); reference audio capped at 30 seconds. ZIP imports are validated for content structure, with clear error messages for invalid files.
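
The short-text fix above (appending a period when a fragment lacks sentence-final punctuation) can be sketched like this. The punctuation sets, the handling of trailing quotes and brackets, and the choice between "。" and "." are all assumptions for illustration; the release notes only say that a period is appended.

```python
# Assumed set of characters that already end a sentence.
SENTENCE_FINAL = ".!?。!?…"
# Assumed closing quotes/brackets that may follow the final punctuation.
CLOSERS = "\"'”’」』))】]"


def ensure_terminal_punctuation(text: str) -> str:
    """Append a period if the fragment does not already end a sentence."""
    stripped = text.rstrip()
    # Look past trailing quotes and brackets to the last real character.
    core = stripped.rstrip(CLOSERS)
    if not core or core[-1] in SENTENCE_FINAL:
        return stripped
    # Use a full-width period after a CJK character, an ASCII one otherwise.
    period = "。" if "\u4e00" <= core[-1] <= "\u9fff" else "."
    return stripped + period
```

For example, "你好" becomes "你好。" and "Hello" becomes "Hello.", while "Done." and a quoted sentence that already ends in "。" are returned unchanged.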

UI & Interaction Improvements

  • Drag-to-Reorder Overhaul: Drag-and-drop sorting for voice lists and replacement rule lists has been fully rewritten, completely eliminating edge-case jitter.
  • Smoother Scrolling: Fixed an issue where the entire Voice Tab scrolled in unison.
  • Benchmark Button Optimization: Engine initialization moved to a background thread — first tap no longer causes a UI freeze.

Bug Fixes & Core Changes

  • System TTS Playback Artifact Fix: Removed redundant runtime RNNoise denoising, eliminating the audio glitch at the beginning of playback caused by GRU initialization artifacts.
  • Dual-Engine Consolidation: Voice preview and System TTS / Benchmark now share a single engine instance, saving ~150 MB of memory.
  • "Restore Original Audio" Button Fix: Added applied-state tracking so the restore button only appears after audio has actually been processed.
  • Overwrite Import Fix: Fixed an issue where importing multiple files via overwrite would overwrite each other.

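This release includes the cumulative updates and fixes from the v0.5.3 beta series (beta1 - beta2).

New Features

  • Built-in reference audio: 3 preset TTS voices are imported automatically on first install, so you can try voice cloning immediately without recording.
  • Short text pronunciation improvement: a period is appended automatically, before the text is sent to the model, to fragments that do not end with sentence-final punctuation, reducing clipped or rushed endings on short phrases.
  • Multi-select ZIP import: the voice import file picker supports multi-select, so several ZIP files can be chosen and imported in one batch.
  • Selective overwrite import: overwrite import now also shows a voice list for the user to choose from, rather than overwriting everything directly.
  • Import file format validation: creating a new voice now validates the format (audio formats only) and caps reference audio at 30 seconds; ZIP imports have their content structure validated, and invalid files are reported directly.

UI & Interaction Improvements

  • Drag-to-reorder rewrite: drag sorting for the voice list and the replacement rule list has been fully rewritten, eliminating the edge jitter.
  • Smoother scrolling: fixed the Voice Tab scrolling as a whole page.
  • Performance test button: engine initialization now runs on a background thread, so the first tap no longer stalls.

Bug & Core Fixes

  • System TTS start-of-playback noise fix: removed the RNNoise denoising that was being run redundantly at runtime, eliminating the noise at the start caused by GRU initialization artifacts.
  • Dual-engine consolidation: voice audition and System TTS / performance test share one engine instance, saving about 150 MB of memory.
  • "Restore Original Audio" button logic fix: added applied-state tracking so the restore button appears only after the audio has actually been processed.
  • Overwrite import fix: fixed multiple files overwriting each other during overwrite import.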