-
Notifications
You must be signed in to change notification settings - Fork 182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix some small issues with rpm support #796
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. The doc_build
CI failure does not seem to be related to the PR. I am unsure if this PR is the cause for the examples_naming
errors.
pkg/make_rpm.py
Outdated
@@ -109,7 +109,7 @@ def CopyAndRewrite(input_file, output_file, replacements=None, template_replacem | |||
""" | |||
|
|||
with open(output_file, 'w') as output: | |||
for line in fileinput.input(input_file): | |||
for line in fileinput.input(input_file, encoding='utf-8'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a reasonable value for encoding. I had a quick look at the doc for fileinput to understand what it selected for encoding when it was unspecified. Unfortunately, it does not say.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I dived into 3 python modules source code and gave up, I have the feeling it goes down all the way to native code to decide, because I'm using the relocatable python toolchain and getting different results on CentOS7 than CentOS Stream 8 and 9 for the same source code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. This is probably not going to be as portable as it should be.
Can you describe the problems you are having which lead to this change.
And a test case really needs to be provided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Literally was getting ASCII being picked as the default encoder/decoder even with a py3.11 toolchain, I can try coming up with a test, I gave it a few minutes of research but I couldn't figure out how to influence it by passing different env variables into python
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mhh maybe the locale of my system is corrupted, I'll debug further tomorrow https://docs.python.org/3.11/glossary.html#term-locale-encoding sheds some light, I see rules_pkg set a few env variables to utf-8 but it's not getting the entire desired effect.
Also I had to bump rules_py through bzlmod, maybe I got in a state that the bug surfaces, what's weird is that it only happens on CO7 and not our COS8/9 container images
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aiuto so I did some research, I have kind of a repro setup, which basically has a py_binary and a genrule, all the binary does is print the locale.getlocale and compare it against UTF-8, if I run the binary with bazel run on CO7 and COS9 I get the expected UTF-8, if I go through the genrule then it works on CO7 but it fails in COS9. What I noticed is that when I print the env variables in COS9 I see LC_CLANG=C.UTF-8
as an environmental variable, if I set LC_CLANG=en_US.UTF-8 through action_env then I get the expected results.
As part of bazelbuild/bazel#7370 you added LC_CLANG=UTF-8 as a default env for make_rpm, I'll keep digging, but it looks like my image is badly configured, shall I revert this part of the change then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok more findings https://bugzilla.redhat.com/show_bug.cgi?id=902094 so COS8 added C.UTF-8 it started with CentOS Stream 8, I find other similar issues https://gitlab.haskell.org/ghc/ghc/-/issues/18607, and I see that bazel it self sets en_US.UTF-8 in the env variables for the javac calls https://github.com/bazelbuild/bazel/blob/c7a47846f12c8ad24b53cfc6dbcd99c701c293d9/src/main/java/com/google/devtools/build/lib/rules/java/JavaCompileActionBuilder.java#L57, does it make sense that I create a patch that sets the right env variable instead? Not sure how we would test this though, it requires a customized container image, otherwise I just drop the commit and set action_env in my .bazelrc, this only affects very old systems anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aiuto I changed the way to deal with this, I figured out that rules_pkg doesn't respect LC_CLANG from action_env, I made some changes to respect this value and only if not passed by the rule invocation to use the defaults.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the late reply.
The current change feels both better and worse.
- On the plus side, it respects default_shell_env
- On the minus side, default_shell_env is a mess of underdocumentation and potential churn with --incompatible_merge_fixed_and_default_shell_env
The most robust situation would be if we can do things in the python code that we can completely ignore the environment, and do the right thing no matter what OS or python version we have. Unfortunately, I don't have enough of a build farm to test all the combinations, so I don't know how to win.
Going back to your original change ( for line in fileinput.input(input_file, encoding='utf-8') ), what was wrong with that? It seems like the right thing. Chuck asked for a little more research, and you dug deeper and found the crawly things under the rock. Could we do the limited change of fixing make_rpm and get your builds to work?
pkg/make_rpm.py
Outdated
@@ -109,7 +109,7 @@ def CopyAndRewrite(input_file, output_file, replacements=None, template_replacem | |||
""" | |||
|
|||
with open(output_file, 'w') as output: | |||
for line in fileinput.input(input_file): | |||
for line in fileinput.input(input_file, encoding='utf-8'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. This is probably not going to be as portable as it should be.
Can you describe the problems you are having which lead to this change.
And a test case really needs to be provided.
When passing changelog as an argument the OutputGroupInfo expects File and not Target, so it needs to be retrieved from the file member of the context
If --action_env is being passed then the repo expects it to hit all the actions, by forcing LANG/LC_CTYPE to override we're removing that use case
Friendly ping. Sorry for the slow ping. I was essentially away all of January and February. |
I started trying to update rules_pkg to build some of our RPMs at booking, and noticed a few issues.
This are small changes, and even though they're not related, I didn't thought it was a good idea to
have them in separate MRs