The extended HTML markup does not add spans to tokens which contain certain characters, such as İ.
A comment in the code (here) explains that this is because such characters have a different length in lower case than in upper case.
So far, I have only found this to affect the character İ, but that character appears frequently in Turkish Wikipedia.
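
To illustrate the length mismatch, here is a minimal sketch using Python's built-in string methods. It is only an assumption that the project's case handling behaves like Python's `str.lower()`, and the helper name `changes_length_when_lowercased` is hypothetical, not something from the codebase.

```python
def changes_length_when_lowercased(token: str) -> bool:
    """Return True if lowercasing the token changes its length in code points.

    Hypothetical helper, for illustration only.
    """
    return len(token.lower()) != len(token)


capital_i_dot = "\u0130"  # İ, LATIN CAPITAL LETTER I WITH DOT ABOVE

# Python lowercases İ to "i" followed by U+0307 (combining dot above),
# so one code point becomes two.
lowered = capital_i_dot.lower()
print(len(capital_i_dot), len(lowered))

# Any character offsets computed on the lowercased text are therefore
# shifted relative to the original text after the first İ.
word = "\u0130stanbul"
print(word.index("s"), word.lower().index("s"))

print(changes_length_when_lowercased(word))
print(changes_length_when_lowercased("Istanbul"))
```

If offsets are computed on a case-folded copy of the text and then applied to the original, a shift like this would make the span positions unreliable for any token containing İ, which would be consistent with those tokens being skipped.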