What counts as a Unicode engine? #7

aminophen · 2019-11-13T13:44:45Z

From TeX.SX chat (after addition of \Uchar and \Ucharcat)

https://chat.stackexchange.com/transcript/message/52535726#52535726

I guess we might need to talk to them about what counts as a Unicode engine

https://chat.stackexchange.com/transcript/message/52535833#52535833

what does it leave as the Unicode status of upTeX (I'm even more confused by pTeX): as I understand it, upTeX isn't a Unicode-compliant engine

aminophen · 2019-11-13T13:46:16Z

FYI, \Uchar and \Ucharcat are described in 19f0847.

davidcarlisle · 2019-11-13T14:25:17Z

In the base latex2e code there is, in most places, a binary choice:

The engine can load OTF fonts ("fontspec would work") and defaults to TU (Unicode) encoding
The engine is restricted to 256-character tfm fonts and defaults to OT1 ("fontspec will not work")

During format creation, latex.ltx more or less arbitrarily tests \ifx\Umathchar or \ifx\Umathcode to distinguish these two cases. This is not because math handling is involved; they were just convenient markers that existed in both xetex and luatex. We could equally have picked \Uchar.
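As a rough illustration of that kind of marker test (a sketch only, not the actual latex.ltx code; \exampledefaultencoding is a made-up name, and \undefined is assumed to be undefined, as is conventional):

```latex
% Sketch: choose a default font encoding from the presence of one primitive.
% \exampledefaultencoding is hypothetical and exists only for this example.
\ifx\Umathcode\undefined
  % No \Umathcode: treat as an engine limited to 256-character tfm fonts.
  \def\exampledefaultencoding{OT1}%
\else
  % \Umathcode exists: treat as an engine that can load OTF fonts.
  \def\exampledefaultencoding{TU}%
\fi
```

The test says nothing about math. It only asks whether one primitive exists, so an engine that adds \Uchar but not \Umathcode would still take the first branch.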

Clearly the (u)ptex variants make the split between valid input character ranges and allowable font formats somewhat different from the binary split described above, which is OK, as long as we don't accidentally break it because we don't understand it:-)

In particular is it possible that uptex would need extended character codes in math and add something like \Umathcode? If it did, would the current tests in latex.ltx do the right thing or the wrong thing?

The expl3 expandable upper-lower case changing functions, for example, assume one of two things: either \Uchar is available to generate the character (in "Unicode engines"), or an expandable 256-character switch is enough (in non-Unicode engines, where \Uchar is not available but 256 characters is all you need).
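A minimal sketch of those two strategies, assuming e-TeX's \numexpr is available and that \undefined is undefined. This is not the expl3 implementation; \exampleupper is a hypothetical helper, and the switch is cut down to three letters:

```latex
% Hypothetical helper: expandably produce the upper-case form of a character code.
\ifx\Uchar\undefined
  % Non-Unicode engine: a finite expandable switch (only a, b, c shown here;
  % a full version would need at most 256 cases).
  \def\exampleupper#1{%
    \ifcase\numexpr#1-`a\relax A\or B\or C\else ?\fi}%
\else
  % Engine with \Uchar: expand to the character whose code is \uccode of #1.
  \def\exampleupper#1{\Uchar\uccode#1 }%
\fi
% Usage: both branches work inside \edef.
\edef\exampleresult{\exampleupper{`b}}
```

Which branch an engine takes depends only on whether \Uchar exists, so adding \Uchar to an engine changes the path this kind of code follows.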

It may be that most of the tests are OK and will remain OK as \Uchar is added. We could add some tests, but speaking for myself it is not clear what the right thing is for many of the cases.

aminophen · 2019-11-27T13:10:07Z

@davidcarlisle Sorry for my late response:

> In particular is it possible that uptex would need extended character codes in math and add something like \Umathcode?

I don't think \Umath... will be possible in the future; we already have \omath..., derived from Omega. Each math font can have only 256 characters (compared to 65536 in Omega) because math fonts are defined by TFM, not OFM. We have no plan to use JFM for math fonts, partly because a JFM can define only 256 different widths (cf. an Omega OFM, which can define 65536 different widths).

> The expl3 expandable upper-lower case changing functions

I don't think such functions should be enabled for Japanese character codes. You would not be happy dealing with the strange beasts lurking in Japanese encodings ;-) Supporting only 256 characters is good enough for us.

> It may be that most of the tests are OK and will remain OK as \Uchar is added.

It seems that no package in TeX Live uses \Uchar or \Ucharcat to detect Unicode-compliant engines; all such code uses \Umath... instead. Thus, I think it's safe enough.

davidcarlisle · 2019-11-27T14:28:07Z

@aminophen thanks for the feedback

aminophen added the help wanted label Nov 13, 2019
