Conversation
- Introduced `js/sandhi.js` as a centralized tone sandhi engine for Hakka dialects. - Updated the Romanizer to display tone sandhi results using `<ruby>` tags. - Refactored `main.js` to use the new sandhi engine for dictionary search results. - Enhanced Romanizer logic to correctly handle manual capitalization and joining rules for syllables wrapped in `<ruby>` tags. - Added styling for sandhi-applied ruby elements in the Romanizer tool. - Normalized dialect name handling within the sandhi engine. Co-authored-by: Aiuanyu <93833368+Aiuanyu@users.noreply.github.com>
Deploying hakspring with
|
| Latest commit: |
b63c00a
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://aaabbed9.hakspring.pages.dev |
| Branch Preview URL: | https://romanizer-sandhi-ruby-581185.hakspring.pages.dev |
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
Code Review — PR #226: [Romanizer] 羅馬字顯示變調 rubyGood architectural decision to extract sandhi logic into a dedicated Bugs / Correctness1. Sandhi rule divergence from old In // NEW (sandhi.js)
current.match(/[àèìòùâêîôû](?![bdg])/)
// OLD (main.js getDapuSandhiHtml)
variant.match(/[àèìòù](?![bdg])/)The comment acknowledges this ("Dapu 31 is circumflex"), so it may be intentional — but it is a silent behaviour change. If circumflex 2. Rule 2 next-syllable expansion Old Rule 2 in next.match(/[ǎěǐǒǔâêîôû]/)New Rule 2 in next.match(/[ǎěǐǒǔâêîôûàèìòùáéíóú]/)This adds grave ( 3. In ruby.textContent = v.text; // creates text node, removes any prior children
const rt = document.createElement('rt');
rt.textContent = v.newTone;
ruby.appendChild(rt);This works because the ruby element is freshly created. However, using Code Quality4. Repeated ruby-building pattern (3x) The same ruby construction block appears in three places: the function makeSandhiRuby(text, newTone, rubyClass) {
const ruby = document.createElement('ruby');
ruby.className = rubyClass;
ruby.appendChild(document.createTextNode(text));
const rt = document.createElement('rt');
rt.textContent = newTone;
ruby.appendChild(rt);
return ruby;
}5. Indentation inconsistency in The new block in 6. Dead-code guard in The check Minor Notes
Overall this is a clean, well-motivated refactor. The main things to verify before merging are items 1 and 2 (rule correctness) and item 3 (the |
There was a problem hiding this comment.
Code Review
This pull request introduces a generalized Hakka tone sandhi engine in a new file js/sandhi.js and integrates it across the application. The changes refactor the romanizer to support rendering tone changes using <ruby> tags and replace the previous hardcoded Dapu sandhi logic in main.js with the new engine. Feedback focuses on reducing code duplication in js/romanizer.js by extracting helper functions for retrieving text nodes from syllable wrappers and for creating ruby elements.
| const textNode = Array.from(wrapper.childNodes).find(n => n.nodeType === Node.TEXT_NODE); | ||
| const rubyElement = wrapper.querySelector('ruby'); | ||
|
|
||
| if (textNode) { | ||
| const currentText = textNode.textContent; | ||
| if (currentText[0] === currentText[0].toUpperCase()) { | ||
| textNode.textContent = currentText.toLowerCase(); | ||
| } else { | ||
| textNode.textContent = currentText.charAt(0).toUpperCase() + currentText.slice(1); | ||
| handleTextNode(textNode); | ||
| } else if (rubyElement) { | ||
| // If it's a ruby, the first child node should be the text node of the base text | ||
| const rubyTextNode = Array.from(rubyElement.childNodes).find(n => n.nodeType === Node.TEXT_NODE); | ||
| if (rubyTextNode) { | ||
| handleTextNode(rubyTextNode); | ||
| } | ||
| // 手動調整後,毋使呼叫 applyCapitalizationRules,直接儲存狀態 | ||
| saveState(); | ||
| } |
There was a problem hiding this comment.
這段用來尋找 syllable wrapper (<span> 或 <ruby>) 內文字節點的邏輯,在 applyCapitalizationRules 函式中 (分別是 handleLowering 和 handleUpper 的部分) 也有重複。
為了提高程式碼的可維護性並減少重複,建議將這段邏輯提取到一個獨立的輔助函式中。例如,可以在 romanizer.js 的頂層範圍定義一個函式:
function findSyllableTextNode(wrapper) {
const textNode = Array.from(wrapper.childNodes).find(n => n.nodeType === Node.TEXT_NODE);
if (textNode) return textNode;
const rubyElement = wrapper.querySelector('ruby');
if (rubyElement) {
// The base text of a <ruby> tag is its first child text node.
return Array.from(rubyElement.childNodes).find(n => n.nodeType === Node.TEXT_NODE) || null;
}
return null;
}然後,這裡的程式碼就可以簡化成:
const nodeToModify = findSyllableTextNode(wrapper);
if (nodeToModify) {
handleTextNode(nodeToModify);
}同樣的簡化也可以應用在 applyCapitalizationRules 函式中,讓程式碼更簡潔且易於維護。
| if (res.variants) { | ||
| res.variants.forEach((v, idx) => { | ||
| if (v.newTone) { | ||
| const ruby = document.createElement('ruby'); | ||
| ruby.className = v.rubyClass; | ||
| ruby.textContent = v.text; | ||
| const rt = document.createElement('rt'); | ||
| rt.textContent = v.newTone; | ||
| ruby.appendChild(rt); | ||
| wrapper.appendChild(ruby); | ||
| } else { | ||
| wrapper.appendChild(document.createTextNode(v.text)); | ||
| } | ||
| if (idx < res.variants.length - 1) { | ||
| wrapper.appendChild(document.createTextNode('/')); | ||
| } | ||
| }); | ||
| } else if (res.newTone) { | ||
| const ruby = document.createElement('ruby'); | ||
| ruby.className = res.rubyClass; | ||
| ruby.textContent = res.text; | ||
| const rt = document.createElement('rt'); | ||
| rt.textContent = res.newTone; | ||
| ruby.appendChild(rt); | ||
| wrapper.appendChild(ruby); | ||
| } else { | ||
| wrapper.appendChild(document.createTextNode(token)); | ||
| } |
There was a problem hiding this comment.
在 if (res.variants) 區塊和 else if (res.newTone) 區塊中,創建 <ruby> 元素的程式碼是重複的。
為了讓程式碼更簡潔並易於維護,建議將這段邏輯提取到一個輔助函式中,以有效減少重複的程式碼。
const createRubyElement = (text, newTone, className) => {
const ruby = document.createElement('ruby');
ruby.className = className;
ruby.textContent = text;
const rt = document.createElement('rt');
rt.textContent = newTone;
ruby.appendChild(rt);
return ruby;
};
if (res.variants) {
res.variants.forEach((v, idx) => {
if (v.newTone) {
wrapper.appendChild(createRubyElement(v.text, v.newTone, v.rubyClass));
} else {
wrapper.appendChild(document.createTextNode(v.text));
}
if (idx < res.variants.length - 1) {
wrapper.appendChild(document.createTextNode('/'));
}
});
} else if (res.newTone) {
wrapper.appendChild(createRubyElement(res.text, res.newTone, res.rubyClass));
} else {
wrapper.appendChild(document.createTextNode(token));
}
This change introduces a modular sandhi engine (
HakkaSandhi) and integrates it into the Romanizer tool to display tone sandhi results using<ruby>tags, matching the visual style of dictionary search results.Key changes:
js/sandhi.jsengine supporting Dapu dialect rules with a framework for future expansion.main.jswith the new centralized engine.Fixes #225
PR created automatically by Jules for task 5811855358397986759 started by @Aiuanyu