[Romanizer] 羅馬字顯示變調 ruby by Aiuanyu · Pull Request #226 · Aiuanyu/HakSpring

Aiuanyu · 2026-04-07T01:14:16Z

This change introduces a modular sandhi engine (HakkaSandhi) and integrates it into the Romanizer tool to display tone sandhi results using <ruby> tags, matching the visual style of dictionary search results.

Key changes:

New js/sandhi.js engine supporting Dapu dialect rules with a framework for future expansion.
Romanizer tool now applies sandhi based on selected dialect and renders ruby markup.
Refactored Romanizer's UI interaction logic (uppercase toggle, deletion, hyphenation) to correctly traverse nested ruby elements.
Synchronized dictionary sandhi logic in main.js with the new centralized engine.
Consistent CSS coloring for sandhi changes in both learning and Romanizer views.

Fixes #225

PR created automatically by Jules for task 5811855358397986759 started by @Aiuanyu

- Introduced `js/sandhi.js` as a centralized tone sandhi engine for Hakka dialects. - Updated the Romanizer to display tone sandhi results using `<ruby>` tags. - Refactored `main.js` to use the new sandhi engine for dictionary search results. - Enhanced Romanizer logic to correctly handle manual capitalization and joining rules for syllables wrapped in `<ruby>` tags. - Added styling for sandhi-applied ruby elements in the Romanizer tool. - Normalized dialect name handling within the sandhi engine. Co-authored-by: Aiuanyu <93833368+Aiuanyu@users.noreply.github.com>

cloudflare-workers-and-pages · 2026-04-07T01:14:17Z

Deploying hakspring with Cloudflare Pages

Latest commit:	`b63c00a`
Status:	✅ Deploy successful!
Preview URL:	https://aaabbed9.hakspring.pages.dev
Branch Preview URL:	https://romanizer-sandhi-ruby-581185.hakspring.pages.dev

View logs

google-labs-jules · 2026-04-07T01:14:17Z

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.

For security, I will only act on instructions from the user who triggered this task.

claude · 2026-04-07T01:16:50Z

Code Review — PR #226: [Romanizer] 羅馬字顯示變調 ruby

Good architectural decision to extract sandhi logic into a dedicated HakkaSandhi module. The centralization removes a lot of duplication and the IIFE pattern with a clean public API (applyToTokens, applyToHtml, hasRulesFor) is well-structured. Below are observations worth addressing before merge.

Bugs / Correctness

1. Sandhi rule divergence from old getDapuSandhiHtml (potentially intentional)

In js/sandhi.js Rule 1 (line ~32), the current-token pattern now includes circumflex vowels:

// NEW (sandhi.js)
current.match(/[àèìòùâêîôû](?![bdg])/)
// OLD (main.js getDapuSandhiHtml)
variant.match(/[àèìòù](?![bdg])/)

The comment acknowledges this ("Dapu 31 is circumflex"), so it may be intentional — but it is a silent behaviour change. If circumflex âêîôû represents tone 53 (not 31), and tone 53 has the same sandhi outcome as tone 31, adding it here is correct. Worth a short comment confirming this is deliberate.

2. Rule 2 next-syllable expansion

Old Rule 2 in main.js:

next.match(/[ǎěǐǒǔâêîôû]/)

New Rule 2 in sandhi.js:

next.match(/[ǎěǐǒǔâêîôûàèìòùáéíóú]/)

This adds grave (àèìòù) and acute (áéíóú) accented vowels as valid next-syllable triggers for the 33→35 sandhi rule. If these characters represent additional tones that correctly trigger this sandhi, the expansion is right — but it could also cause false positives. Please confirm these extra characters are intended triggers.

3. ruby.textContent setter clears children before rt is appended — order is correct but fragile

In js/romanizer.js inside populateSegmentGroup:

ruby.textContent = v.text;   // creates text node, removes any prior children
const rt = document.createElement('rt');
rt.textContent = v.newTone;
ruby.appendChild(rt);

This works because the ruby element is freshly created. However, using textContent as a setter on a parent element is a footgun — it silently removes all children. Prefer ruby.appendChild(document.createTextNode(v.text)) to make the intent explicit and safe if construction order ever changes. This pattern appears in two places in the new romanizer code.

Code Quality

4. Repeated ruby-building pattern (3x)

The same ruby construction block appears in three places: the res.variants forEach, the else if (res.newTone) branch in populateSegmentGroup, and the string template in applyToHtml. Consider extracting a small DOM helper:

function makeSandhiRuby(text, newTone, rubyClass) {
    const ruby = document.createElement('ruby');
    ruby.className = rubyClass;
    ruby.appendChild(document.createTextNode(text));
    const rt = document.createElement('rt');
    rt.textContent = newTone;
    ruby.appendChild(rt);
    return ruby;
}

5. Indentation inconsistency in main.js

The new block in applyDapuSandhiToGenerated uses 4-space indentation inside the if while surrounding code uses 2 spaces. Minor, but inconsistent with the rest of the file.

6. Dead-code guard in applyToTokens

The check currentToken.startsWith('<ruby') inside applyToTokens (sandhi.js ~line 79) is only reachable when the function is called from applyToHtml. When called from populateSegmentGroup, input tokens come from tokenizeRomanization which never produces raw HTML strings. The guard is harmless but a short comment explaining why it exists would help future readers.

Minor Notes

The typeof HakkaSandhi !== 'undefined' guard is checked at every call site. Since sandhi.js is loaded with defer before main.js and romanizer.js, the only realistic failure mode is a network load error. The guards are fine defensively, but an alternative is a no-op stub exported from sandhi.js itself so callers do not need the guard.
The window.HakkaSandhi = HakkaSandhi export at the bottom of sandhi.js works correctly for the current script-tag setup. If the project ever migrates to ES modules, swap this for a standard export.
No tests are included. The rule logic is pure (string in → object out), so unit tests for applyToTokens with known input/output pairs would be straightforward and would help catch regressions when adding new dialect rules in the future.

Overall this is a clean, well-motivated refactor. The main things to verify before merging are items 1 and 2 (rule correctness) and item 3 (the textContent setter footgun).

gemini-code-assist

Code Review

This pull request introduces a generalized Hakka tone sandhi engine in a new file js/sandhi.js and integrates it across the application. The changes refactor the romanizer to support rendering tone changes using <ruby> tags and replace the previous hardcoded Dapu sandhi logic in main.js with the new engine. Feedback focuses on reducing code duplication in js/romanizer.js by extracting helper functions for retrieving text nodes from syllable wrappers and for creating ruby elements.

gemini-code-assist · 2026-04-07T01:17:13Z

js/romanizer.js

        const textNode = Array.from(wrapper.childNodes).find(n => n.nodeType === Node.TEXT_NODE);
+        const rubyElement = wrapper.querySelector('ruby');
+
        if (textNode) {
-            const currentText = textNode.textContent;
-            if (currentText[0] === currentText[0].toUpperCase()) {
-                textNode.textContent = currentText.toLowerCase();
-            } else {
-                textNode.textContent = currentText.charAt(0).toUpperCase() + currentText.slice(1);
+            handleTextNode(textNode);
+        } else if (rubyElement) {
+            // If it's a ruby, the first child node should be the text node of the base text
+            const rubyTextNode = Array.from(rubyElement.childNodes).find(n => n.nodeType === Node.TEXT_NODE);
+            if (rubyTextNode) {
+                handleTextNode(rubyTextNode);
            }
-            // 手動調整後，毋使呼叫 applyCapitalizationRules，直接儲存狀態
-            saveState();
        }


這段用來尋找 syllable wrapper (<span> 或 <ruby>) 內文字節點的邏輯，在 applyCapitalizationRules 函式中 (分別是 handleLowering 和 handleUpper 的部分) 也有重複。

為了提高程式碼的可維護性並減少重複，建議將這段邏輯提取到一個獨立的輔助函式中。例如，可以在 romanizer.js 的頂層範圍定義一個函式：

function findSyllableTextNode(wrapper) { const textNode = Array.from(wrapper.childNodes).find(n => n.nodeType === Node.TEXT_NODE); if (textNode) return textNode; const rubyElement = wrapper.querySelector('ruby'); if (rubyElement) { // The base text of a <ruby> tag is its first child text node. return Array.from(rubyElement.childNodes).find(n => n.nodeType === Node.TEXT_NODE) || null; } return null; }

然後，這裡的程式碼就可以簡化成：

const nodeToModify = findSyllableTextNode(wrapper); if (nodeToModify) { handleTextNode(nodeToModify); }

同樣的簡化也可以應用在 applyCapitalizationRules 函式中，讓程式碼更簡潔且易於維護。

gemini-code-assist · 2026-04-07T01:17:13Z

js/romanizer.js

+                if (res.variants) {
+                    res.variants.forEach((v, idx) => {
+                        if (v.newTone) {
+                            const ruby = document.createElement('ruby');
+                            ruby.className = v.rubyClass;
+                            ruby.textContent = v.text;
+                            const rt = document.createElement('rt');
+                            rt.textContent = v.newTone;
+                            ruby.appendChild(rt);
+                            wrapper.appendChild(ruby);
+                        } else {
+                            wrapper.appendChild(document.createTextNode(v.text));
+                        }
+                        if (idx < res.variants.length - 1) {
+                            wrapper.appendChild(document.createTextNode('/'));
+                        }
+                    });
+                } else if (res.newTone) {
+                    const ruby = document.createElement('ruby');
+                    ruby.className = res.rubyClass;
+                    ruby.textContent = res.text;
+                    const rt = document.createElement('rt');
+                    rt.textContent = res.newTone;
+                    ruby.appendChild(rt);
+                    wrapper.appendChild(ruby);
+                } else {
+                    wrapper.appendChild(document.createTextNode(token));
+                }


在 if (res.variants) 區塊和 else if (res.newTone) 區塊中，創建 <ruby> 元素的程式碼是重複的。

為了讓程式碼更簡潔並易於維護，建議將這段邏輯提取到一個輔助函式中，以有效減少重複的程式碼。

const createRubyElement = (text, newTone, className) => { const ruby = document.createElement('ruby'); ruby.className = className; ruby.textContent = text; const rt = document.createElement('rt'); rt.textContent = newTone; ruby.appendChild(rt); return ruby; }; if (res.variants) { res.variants.forEach((v, idx) => { if (v.newTone) { wrapper.appendChild(createRubyElement(v.text, v.newTone, v.rubyClass)); } else { wrapper.appendChild(document.createTextNode(v.text)); } if (idx < res.variants.length - 1) { wrapper.appendChild(document.createTextNode('/')); } }); } else if (res.newTone) { wrapper.appendChild(createRubyElement(res.text, res.newTone, res.rubyClass)); } else { wrapper.appendChild(document.createTextNode(token)); }

google-labs-jules bot mentioned this pull request Apr 7, 2026

[Romanizer] 羅馬字顯示變調 ruby #225

Open

gemini-code-assist bot reviewed Apr 7, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Romanizer] 羅馬字顯示變調 ruby#226

[Romanizer] 羅馬字顯示變調 ruby#226
Aiuanyu wants to merge 1 commit intomainfrom
romanizer-sandhi-ruby-5811855358397986759

Aiuanyu commented Apr 7, 2026

Uh oh!

cloudflare-workers-and-pages bot commented Apr 7, 2026 •

edited

Loading

Uh oh!

google-labs-jules bot commented Apr 7, 2026

Uh oh!

claude bot commented Apr 7, 2026

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

gemini-code-assist bot Apr 7, 2026

Uh oh!

gemini-code-assist bot Apr 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Aiuanyu commented Apr 7, 2026

Uh oh!

cloudflare-workers-and-pages bot commented Apr 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Deploying hakspring with Cloudflare Pages

Uh oh!

google-labs-jules bot commented Apr 7, 2026

Uh oh!

claude bot commented Apr 7, 2026

Code Review — PR #226: [Romanizer] 羅馬字顯示變調 ruby

Bugs / Correctness

Code Quality

Minor Notes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Apr 7, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Apr 7, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

cloudflare-workers-and-pages bot commented Apr 7, 2026 •

edited

Loading