Pkg_rpm fails to build packages targeted to platforms other than the execution host machine. #727

Open
ogalbxela opened this issue Aug 13, 2023 · 5 comments
Labels
P3 An issue that we are not working on but will review quarterly rpm

Comments

@ogalbxela

ogalbxela commented Aug 13, 2023

The current version of the "pkg_rpm" rule exposes an "architecture" argument, which is meant to indicate the package's intended platform. It is used to set the "BuildArch" tag in the generated RPM spec file.
However, per the RPM Packaging Guide (https://rpm-packaging-guide.github.io/):
"'BuildArch' should be used if the package is not architecture dependent. For example, if written entirely in an interpreted programming language, set this to BuildArch: noarch. If not set, the package automatically inherits the Architecture of the machine on which it is built".

It appears that BuildArch is intended to be either the architecture of the execution host machine or "noarch". Any attempt to set this field to a foreign ("alien") platform causes the "rpmbuild" utility to fail with the error message: "error: No compatible architectures found for build".

A possible solution is to pass the value of the "architecture" argument to the "rpmbuild" utility as the value of its "--target" option. However, this may break backward compatibility, since the "architecture" argument might already be in use by someone who relies on the current behavior. Introducing a brand new "target_architecture" argument is another option to consider.
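
To make the "--target" idea concrete, here is a minimal sketch of how a wrapper script might assemble the rpmbuild command line. It is only an illustration, not the code from rules_pkg or from #729: the function name `rpmbuild_args` and its `target_arch` parameter are hypothetical. The only rpmbuild features assumed are the `-bb` build mode and the `--target` option.

```python
from typing import List, Optional


def rpmbuild_args(spec_path: str, target_arch: Optional[str] = None) -> List[str]:
    """Assemble an rpmbuild command line for a binary-only build (-bb).

    target_arch is a hypothetical parameter for this sketch. When it is set,
    it is forwarded through rpmbuild's --target option rather than being
    written into the spec file as BuildArch.
    """
    args = ["rpmbuild", "-bb"]
    if target_arch:
        # --target tells rpmbuild which platform the package is built for.
        args += ["--target", target_arch]
    args.append(spec_path)
    return args
```

With no `target_arch` the command line is unchanged from a plain `rpmbuild -bb`, which is how a sketch like this would keep existing users unaffected.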

Please consider the possible fix in #729. Note that the proposed patch attempts to address the backward compatibility concern.

Below is a "synthetic" example that demonstrates the problem. Note: to keep the sample small, I omitted the cross-compilation part and simply provide a binary pre-compiled for aarch64.

Versions are: bazel: 6.0.0; rules_pkg: 0.9.1; host: ubuntu 20.04.2 x86_64; target: linux-aarch64

$ uname -a
Linux alexbl-dev-home 5.15.0-46-generic #49~20.04.1-Ubuntu SMP Thu Aug 4 19:15:44 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
$ 
$ 
$ tree
.
├── src
│   ├── binary-aarch64
│   └── BUILD
└── WORKSPACE
$ 
$ 
$ file src/binary-aarch64 
src/binary-aarch64: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, for GNU/Linux 3.7.0, not stripped
$ 
$ 
$ rpmbuild --version
RPM version 4.14.2.1
$ 
$ 
$ cat WORKSPACE
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
http_archive(
    name = "rules_pkg",
    sha256 = "8f9ee2dc10c1ae514ee599a8b42ed99fa262b757058f65ad3c384289ff70c4b8",
    urls = [
        "https://mirror.bazel.build/github.com/bazelbuild/rules_pkg/releases/download/0.9.1/rules_pkg-0.9.1.tar.gz",
        "https://github.com/bazelbuild/rules_pkg/releases/download/0.9.1/rules_pkg-0.9.1.tar.gz",
    ],
)
load("@rules_pkg//:deps.bzl", "rules_pkg_dependencies")
rules_pkg_dependencies()
load("@rules_pkg//toolchains/rpm:rpmbuild_configure.bzl", "find_system_rpmbuild")
find_system_rpmbuild(name = "rules_pkg_rpmbuild")
$ 
$ 
$ cat src/BUILD 
load("@rules_pkg//pkg:mappings.bzl", "pkg_attributes", "pkg_filegroup", "pkg_files")
load("@rules_pkg//pkg:rpm.bzl", "pkg_rpm")

pkg_files(
    name = "sample-files",
    srcs = [":binary-aarch64"],
)

pkg_filegroup(
    name = "sample-data",
    srcs = [":sample-files"],
)

pkg_rpm(
    name = "sample-rpm",
    srcs = [":sample-data"],
    architecture = "aarch64",
    description = "Sample description",
    license = "Apache 2.0",
    release = "1",
    summary = "Sample summary",
    version = "1.0.0",
)
$ 
$ 
$ bazel build //src:sample-rpm 
INFO: Analyzed target //src:sample-rpm (46 packages loaded, 319 targets configured).
INFO: Found 1 target...
INFO: From MakeRpm src/sample-rpm-1.0.0-1.aarch64.rpm:
Error calling rpmbuild:
error: No compatible architectures found for build

No RPM file created.
ERROR: /home/dev/Projects/bazel/rpm_pkg_cross_demo/src/BUILD:14:8: output 'src/sample-rpm-1.0.0-1.aarch64.rpm' was not created
ERROR: /home/dev/Projects/bazel/rpm_pkg_cross_demo/src/BUILD:14:8: MakeRpm src/sample-rpm-1.0.0-1.aarch64.rpm failed: not all outputs were created or valid
Target //src:sample-rpm failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.913s, Critical Path: 0.15s
INFO: 11 processes: 10 internal, 1 linux-sandbox.
FAILED: Build did NOT complete successfully

sample.zip

@nacl
Collaborator

nacl commented Aug 14, 2023

Like #661, but this has some more detail.

IIRC, the issue applies not to the host (where Bazel itself is running), but to the exec platform (where Bazel is executing commands). They are often the same, but can be different in the case of remote build execution.

@nacl nacl added P3 An issue that we are not working on but will review quarterly rpm labels Aug 14, 2023
@aiuto
Collaborator

aiuto commented Aug 14, 2023

ISTM that we need to think about this in a full remote build situation to reason about the right behavior.

  • from a Windows machine, I should be able to kick off an rpm build
  • which remotely executes on linux x86
  • and targets (via --platforms or --cpu) linux arm

As part of that, we need to get architecture out of the pkg_rpm rule. That should be picked up from constraints on the target build platform. To me, looking at the example above running on x86

bazel build //src:sample-rpm 

Should indeed fail (but maybe with a different error), because the target platform and the architecture field do not agree. It should be more like

bazel build --platforms=//my/platforms:linux_aarch64 //src:sample-rpm 

Or... an entirely different way to approach this is to allow pkg_rpm to transition to a new target platform. We might replace architecture with platform, and expect the chosen platform to add a CPU constraint which can be mapped back into one of the common architectures. That scales a little differently.
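
As a rough illustration of "a CPU constraint which can be mapped back into one of the common architectures", the lookup could be sketched as below. This is an assumption-laden sketch, not rules_pkg code: the table name, the function name, and the idea of passing constraints as label strings are all hypothetical, and the table covers only a few CPU constraints.

```python
from typing import Iterable

# Hypothetical table for this sketch: CPU constraint labels on the left,
# the RPM architecture name they would map to on the right.
CPU_CONSTRAINT_TO_RPM_ARCH = {
    "@platforms//cpu:x86_64": "x86_64",
    "@platforms//cpu:aarch64": "aarch64",
    "@platforms//cpu:ppc64le": "ppc64le",
    "@platforms//cpu:s390x": "s390x",
}


def rpm_arch_for_platform(constraints: Iterable[str]) -> str:
    """Return the RPM architecture implied by a platform's constraint labels.

    Raises ValueError unless exactly one known CPU constraint is present, so
    an unsupported platform fails with a message that names the problem.
    """
    matches = [
        CPU_CONSTRAINT_TO_RPM_ARCH[c]
        for c in constraints
        if c in CPU_CONSTRAINT_TO_RPM_ARCH
    ]
    if len(matches) != 1:
        raise ValueError(
            "expected exactly one supported CPU constraint, found %d" % len(matches)
        )
    return matches[0]
```

For example, a platform described by `["@platforms//os:linux", "@platforms//cpu:aarch64"]` would resolve to `aarch64`, while a platform with no recognized CPU constraint would be rejected up front instead of failing inside rpmbuild.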

There is a case for having both ways of doing it. It's very useful for some people to have distinct targets (and thus distinct target names) for x86 and arm packages, so they can do a bazel build //my/rpms/... and get both as outputs. Other people may have generic integration test suites and they want separate build/test/release runs where they specify a different --cpu on the command line for each.

@ogalbxela What does your company tend to do for release pushing?

@ogalbxela
Author

ogalbxela commented Aug 14, 2023

@aiuto Thank you very much for sharing your thoughts on this. I agree we need to think more about the right approach.

As I mentioned, I omitted the cross-compilation part in the sample to simplify it.
The sample is meant to demonstrate that setting the "architecture" option to something other than the architecture of the execution host breaks the build.

In fact, for our use case we compile it as:

bazel build --platforms=@custom_platforms//:linux-x86_64 //pkg:sample-rpm
bazel build --platforms=@custom_platforms//:linux-aarch64 //pkg:sample-rpm
bazel build --platforms=@custom_platforms//:linux-ppc64 //pkg:sample-rpm
bazel build --platforms=@custom_platforms//:linux-ppc64le //pkg:sample-rpm

And we intend to call pkg_rpm as:

pkg_rpm(
    name = "sample-rpm",
    srcs = [
        ":sample-data",
    ],
    description = "Description",
    license = "Apache 2.0",
    release = BUILD_NUMBER,
    summary = "Summary",
    architecture = select(
        {
            "@custom_platforms//:linux-aarch64": "aarch64",
            "@custom_platforms//:linux-ppc64": "ppc64",
            "@custom_platforms//:linux-ppc64le": "ppc64le",
            "@custom_platforms//:linux-x86_64": "x86_64",
            "//conditions:default": "UNSUPPORTED",
        },
    ),
    target_compatible_with = select({
        "@custom_platforms//:linux-all": [],
        "//conditions:default": ["@platforms//:incompatible"],
    }),
    version = VERSION,
)

which would make the issue show up.

I agree that we can simply refer to the target architecture as the "target".
In the proposal I submitted (#729), I tried to avoid changing the "architecture" option, as modifying it could break backward compatibility. This is why I introduced "target_architecture".
Are you concerned that this might cause issues with remote builds? My proposal basically propagates "target_architecture" to the "rpmbuild" utility through its "--target" option. I am not sure whether it affects remote builds; I can double-check.

If you have suggestions for a more effective approach to addressing these concerns, please don't hesitate to share. I'm open to considering alternative solutions.

As for the proposal I've submitted, it aligns well with the requirements of my project up to this point. However, if we come across a superior solution that could also benefit the wider community, I'm more than willing to incorporate it into my implementation.

@ogalbxela ogalbxela changed the title Pkg_rpm fails to build packages targeted to platforms other than the host machine. Pkg_rpm fails to build packages targeted to platforms other than the execution host machine. Aug 17, 2023
@ogalbxela
Author

> Like #661, but this has some more detail.
>
> IIRC, the issue applies not to the host (where Bazel itself is running), but to the exec platform (where Bazel is executing commands). They are often the same, but can be different in the case of remote build execution.

Yes, when I mentioned 'host,' I was referring to the machine that is used to run the 'rpmbuild'. Thank you for paying attention to this. I have made adjustments to the title and description to make them more accurate.

@aiuto
Collaborator

aiuto commented Sep 13, 2023

Let's presume your sample use case is good enough. I would like it to be easier to set the arch automatically from the platform constraints, but that can happen later.

I'm comparing it to the code in the PR and wondering about architecture vs target_architecture. You set architecture here but the code seems to be using target_arch for that and architecture for the build arch. I'm wondering if the PR is just lagging your thinking above or if I am misunderstanding.

But this is all confusing:
From https://rpm-packaging-guide.github.io/ we have

BuildArch | If the package is not architecture dependent, for example, if written entirely in an interpreted programming language, set this to BuildArch: noarch. If not set, the package automatically inherits the Architecture of the machine on which it is built, for example x86_64.

The first sentence is fine. Add the second, and no other words, and I read that to mean there is no notion of cross compilation in rpmbuild. So, my early thoughts about cross compilation and RBE reduce to,

  • target platform maps to BuildArch
  • you should be able to set BuildArch=noarch
  • should exec platform CPU default BuildArch or not? That is the only question.
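
One way to read the three points above is as a small decision function. The sketch below is purely illustrative: `resolve_build_arch` and all of its parameters are hypothetical, and the `default_to_exec` flag simply stands in for the open question rather than answering it.

```python
from typing import Optional


def resolve_build_arch(
    target_cpu: Optional[str],
    noarch: bool = False,
    exec_cpu: Optional[str] = None,
    default_to_exec: bool = False,
) -> Optional[str]:
    """Pick a value for BuildArch, or None to leave the tag out of the spec.

    All parameters are hypothetical names for this sketch:
      noarch          - the package is architecture independent
      target_cpu      - RPM arch derived from the target platform, if known
      exec_cpu        - RPM arch of the platform where rpmbuild runs
      default_to_exec - stand-in for the open question in the last bullet
    """
    if noarch:
        return "noarch"
    if target_cpu:
        # The target platform maps to BuildArch.
        return target_cpu
    if default_to_exec:
        return exec_cpu
    # Leave BuildArch unset; per the packaging guide, the package then
    # inherits the architecture of the machine on which it is built.
    return None
```

Returning `None` in the last case keeps the behavior the packaging guide describes for an unset BuildArch, so the flag only changes whether the exec platform's CPU is written into the spec explicitly.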

Let's continue in the PR.

3 participants