Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

get_real in cff_dict.c fails on some locales #18

Open
simoncozens opened this issue Apr 16, 2019 · 5 comments
Open

get_real in cff_dict.c fails on some locales #18

simoncozens opened this issue Apr 16, 2019 · 5 comments

Comments

@simoncozens
Copy link

See sile-typesetter/libtexpdf#8 and sile-typesetter/sile#573.

cff_dict.c parses real numbers using the strtod library function. However, when the locale is set to one where LC_NUMERIC sets the decimal point to something other than ., get_real fails because it is expecting the decimal point to be the period character.

We are planning to fix this in libtexpdf by providing a locale-independent strtod, but you will probably also want to fix this here too.

@kberry
Copy link
Collaborator

kberry commented Mar 18, 2024

Hi Simon - thanks for this report from five years ago, which I'm afraid I didn't see until now. Are you still there? What did you end up doing in sile-typesetter?

I fear this also affects LuaTeX, where I think the non-thread-safe solution in the last answer on https://stackoverflow.com/questions/1994658 (one of the linked questions) would not be viable. Which is otherwise the approach that occurs to me. --thanks, karl.

@simoncozens
Copy link
Author

Hi Karl. I don't think we fixed it yet, unfortunately.

@kberry
Copy link
Collaborator

kberry commented Mar 19, 2024

In https://stackoverflow.com/questions/41794607 I see the suggestion to use the *_l functions which are said to be in BSD/GNU/Linux and with equivalents on Windows. Using them if available and falling back to the locale-aware strtod seems like a decent, if not perfect, workaround to me. Especially since the original bug is triggered (very) rarely.

I'm guessing none of us want to reimplement the actual parsing (I sure don't). Temporarily setting the locale to C has the thread problem already mentioned. I'm not seeing a better alternative. wdyt?

@ebraminio
Copy link

ebraminio commented Sep 6, 2024

In harfbuzz after lots of playing around those locale specific libc functions, we went with having the parser https://github.com/harfbuzz/harfbuzz/blob/main/src/hb-number-parser.rl specially as we actually had the infrastructure to ease that up, Ragel, to generate the actual C source, https://github.com/harfbuzz/harfbuzz/blob/main/src/hb-number-parser.hh maybe that changes in the future but I actually believe it's better to not rely on such a breakable thing for that low level thing like font parsing but I guess your approach on trying locale specific functions also can be good.

@kberry
Copy link
Collaborator

kberry commented Sep 6, 2024

Thanks for the note and code pointers. I agree it would be better in principle to have "native" parsing than relying on libc fns (I'm glad you implemented that for harfbuzz, which is a far more important case :). It might be more or less the same amount of work to reuse libc vs. implementing from scratch, depending on exactly what has to be parsed for xetex. Maybe someday I will have a chance to look into it, or someone else will come across this bug and be inspired to write a patch ... hoping ... --thanks again, karl.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants