Thai Text problems

I’m currently testing some text rendering functions and ran into two Thai-specific problems.

  1. The editor doesn’t seem to support Thai out of the box. Some kind of tofu (with ท inside) is displayed in the text field. Is there a way to change the UI font so that it can accommodate Thai text in the editor’s input fields?

  2. The text control in UMG doesn’t seem to support the shaping features required to render Thai text properly. The missing ones might be mark-to-mark and mark-to-base positioning (probably only one of them is missing; I’m not entirely sure). I’ve tested all 3 available values of the Text Shaping Method setting under Localization, and none of them really works. (The font I used is Noto Sans Thai UI.)

[Screenshot: 274082-2019-04-13-20-23-54-hudbp.png]

Notice the tone marks in the highlighted area; they are supposed to be placed a bit higher.

The question is: which shaping features are supported in UE, are the mkmk and mark features supported by the engine, and if they are missing, is there any plan to bring them in?

My guess is that the workaround would be to do manual glyph substitution instead (which is very common, and probably faster than a mark-to-mark/mark-to-base implementation), although I think it will bring a different flavor of headaches.
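To make the workaround idea concrete, here is a minimal sketch of manual substitution in Python. It assumes a custom font that stores pre-raised copies of the four tone marks at private-use code points; the `RAISED_TONE` mapping and its U+E000..U+E003 slots are hypothetical and not taken from any real font, and the function name is made up for illustration. It also ignores other cases a real implementation would need (for example sara am and thanthakhat).

```python
# Thai tone marks: mai ek, mai tho, mai tri, mai chattawa.
TONE_MARKS = {"\u0E48", "\u0E49", "\u0E4A", "\u0E4B"}

# Upper vowels that a tone mark may have to sit on top of:
# mai han-akat, sara i, sara ii, sara ue, sara uee.
UPPER_VOWELS = {"\u0E31", "\u0E34", "\u0E35", "\u0E36", "\u0E37"}

# HYPOTHETICAL mapping: private-use slots where a custom font would keep
# pre-raised copies of the tone marks. Real fonts may use other slots or none.
RAISED_TONE = {
    "\u0E48": "\uE000",
    "\u0E49": "\uE001",
    "\u0E4A": "\uE002",
    "\u0E4B": "\uE003",
}


def substitute_raised_tones(text: str) -> str:
    """Swap a tone mark for its raised variant when it follows an upper vowel."""
    out = []
    for ch in text:
        # out[-1] is always an original character here, because upper
        # vowels themselves are never substituted.
        if ch in TONE_MARKS and out and out[-1] in UPPER_VOWELS:
            out.append(RAISED_TONE[ch])
        else:
            out.append(ch)
    return "".join(out)
```

The text passed to the renderer would then no longer depend on mark positioning for these clusters, at the cost of maintaining the substitution table and a font that actually contains the extra glyphs.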

To answer my own question:

  1. I couldn’t find how to change the editor’s font, or rather the fallback font. However, the font file is DroidSansFallback.ttf in the engine’s content folder. Replacing it with another font that works with the engine will make it work.

  2. It seems the font file I used (NotoSansThai) does not work with Unreal Engine for some reason. I changed to DroidSansThai and the issue is gone. I guess the font I used before uses some feature that HarfBuzz does not recognize (it’s a rather old version of HarfBuzz after all…)
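For item 1, the file swap can be scripted so that the original fallback font is kept around. This is only a sketch: the function name `replace_fallback_font` is made up, and both paths are parameters because the exact location of DroidSansFallback.ttf inside the engine’s content folder depends on the installation.

```python
import shutil
from pathlib import Path


def replace_fallback_font(fallback_path: Path, replacement_path: Path) -> Path:
    """Back up the existing fallback font, then overwrite it with the replacement.

    Returns the path of the backup copy.
    """
    backup_path = fallback_path.with_name(fallback_path.name + ".bak")
    # Keep the first backup only, so repeated runs never lose the original file.
    if not backup_path.exists():
        shutil.copy2(fallback_path, backup_path)
    shutil.copy2(replacement_path, fallback_path)
    return backup_path
```

Restoring the original is then just a matter of copying the `.bak` file back over the fallback font.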

In the end I use DroidSansThai fonts in both places and it works pretty ok.

The case of ‘ที่’/‘ปี่’ is quite interesting. It only appears with this specific font, so I think this font works differently from the others.

First of all, based on the debugging data, the text shaping (including glyph substitution) turns out to be perfect. If the glyphs were rendered as is, there would be no problem. However, it looks like UE4 also performs additional processing on top of that. That is something I have to dig deeper into.

My guess is that the mark-to-mark and mark-to-base processing is performed, but somehow it attaches the mark “mai eak” ( ่ ) to the base character “tor taharn” (ท) while another mark, “sara ee”, is already attached to it. I expect the ‘mai eak’ mark to attach to the ‘sara ee’ mark instead of the base ‘tor taharn’.

I’m not an expert in this subject, so I guess I need some text rendering experts to help. I’ll let the font foundry (and Google) know about the issue first.
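The attachment I expect can be written down as a small toy model in Python. This is not how UE4 or HarfBuzz implement positioning; it only encodes the rule "a tone mark attaches to a preceding upper vowel if there is one, otherwise to the base", and the function name `expected_attachments` is made up for illustration.

```python
import unicodedata

# Thai tone marks: mai ek, mai tho, mai tri, mai chattawa.
TONE_MARKS = {"\u0E48", "\u0E49", "\u0E4A", "\u0E4B"}

# Upper vowels: mai han-akat, sara i, sara ii, sara ue, sara uee.
UPPER_VOWELS = {"\u0E31", "\u0E34", "\u0E35", "\u0E36", "\u0E37"}


def expected_attachments(cluster: str):
    """Return (mark name, target name, lookup kind) for each mark in the text."""
    base = None
    last_upper = None
    result = []
    for ch in cluster:
        if unicodedata.category(ch) != "Mn":
            # Any non-mark character starts a new cluster.
            base, last_upper = ch, None
            continue
        if base is None:
            continue  # stray mark with nothing to attach to
        if ch in TONE_MARKS and last_upper is not None:
            # The tone mark stacks on the upper vowel: mark-to-mark.
            target, kind = last_upper, "mark-to-mark"
        else:
            # Everything else sits directly on the base: mark-to-base.
            target, kind = base, "mark-to-base"
        result.append((unicodedata.name(ch), unicodedata.name(target), kind))
        if ch in UPPER_VOWELS:
            last_upper = ch
    return result


for row in expected_attachments("\u0E17\u0E35\u0E48"):  # the cluster ‘ที่’
    print(row)
```

For ‘ที่’ this lists the vowel as attached to the base and the tone mark as attached to the vowel, which is the stacking that appears to go wrong in the screenshot.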