Improving the Go backend
There are currently several major issues with our gomod backend that need to be addressed (each is explained in more detail below). This milestone tracks the outstanding Go refactoring work needed to get us to the desired state:
- not allowing users to point us to a go.work file in case they make use of workspaces
- toolchain selection between pre-installed (legacy) 1.20 and 1.21.0 isn't currently easily extensible
- the backend is the only one we have which is actually allowed to dirty users' source repository
Not allowing users to point us to a go.work file in case they make use of workspaces
Even though we support Go's workspaces feature, we still require users to point us to their "main" module's go.mod file, because the original architectural design relies on the premise of "the main package" which we need to report in the resulting SBOM. This expectation crumbles with Go workspaces, where all application modules are tucked under a workspace envelope. That makes it hard to guess which module is the "main" one, since Go doesn't require a top-level go.mod file if a go.work is present. We need to keep up with ecosystem trends and let our users comfortably use the workspaces feature by accepting a path to their go.work file instead of go.mod, and by deriving the name of the "main application package" from that information to satisfy our SBOM generator.
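As a rough illustration of what consuming a go.work file involves, the sketch below lists the module directories named by its 'use' directives. It is a deliberately simplified parser and not the backend's actual code: the function name is hypothetical, and it ignores the other go.work directives and any path that itself contains '//'. A real implementation could instead rely on the JSON output of 'go work edit -json'.

```python
def parse_use_directives(go_work_text: str) -> list[str]:
    """Return the module paths listed in the 'use' directives of a go.work file.

    Simplified sketch: handles single-line 'use ./path' directives and
    'use ( ... )' blocks, and strips trailing '//' comments.
    """
    paths = []
    in_block = False
    for raw_line in go_work_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("//", 1)[0].strip()
        if not line:
            continue
        tokens = line.split()
        if in_block:
            if line == ")":
                in_block = False
            else:
                paths.append(line.strip('"'))
        elif tokens[0] == "use" and tokens[1:] == ["("]:
            in_block = True
        elif tokens[0] == "use" and len(tokens) == 2:
            paths.append(tokens[1].strip('"'))
    return paths
```

Each returned path points at a directory with its own go.mod, so the list only narrows down the candidates. How the "main application package" name is then chosen among them remains a design decision for this milestone.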
Toolchain selection between pre-installed (legacy) 1.20 and 1.21.0 isn't currently easily extensible
When Go 1.21 was released, it changed the game significantly because it started enforcing the 'go X.Y.Z' line in one's go.mod file as the minimum required version to process/build a project. That led to an undesirable side effect: newer Go toolchains dirtied, with upgrade-related cruft, the source repositories of users who still used old Go releases. The band-aid we applied at the time was to start supporting 2 Go toolchain streams, one for legacy projects and one for 1.21+ based projects. However, since we also tried to support local host execution (developer machines) of our tool, the logic behind selecting the correct toolchain version wasn't straightforward, because it had to check not only minor but also micro releases. We built it around the 'GOTOOLCHAIN=auto' variable, which lets us automatically download any toolchain the user requires on demand during prefetch. Therefore, with the 1.21+ toolchains we are always pinned to the 0th release of a particular language release, e.g. 1.21.0. The problem, however, is that the code is somewhat buggy and hard to extend, to the extent that a bump to e.g. 1.22.0 or 1.23.0 isn't simple (note that we do support both of those releases, it's just that we don't have those toolchain versions pre-installed in the shipped image).
It is time for us to get rid of the legacy toolchain (EOL upstream) and process input repositories with a single one only. Ideally, we'd also enable dependabot micro version updates of the installed toolchain so that our usage of and dependency on 'GOTOOLCHAIN' decreases. This can be comfortably achieved ONLY once the next item on the list is addressed!
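The two-stream selection described above can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the backend's real logic: the function names and the returned tuple are hypothetical, and the version strings simply mirror the pre-installed 1.20 and 1.21.0 toolchains mentioned in this section.

```python
import re

# Matches the 'go X.Y' or 'go X.Y.Z' directive at the start of a go.mod line.
GO_DIRECTIVE = re.compile(r"^go[ \t]+(\d+)\.(\d+)(?:\.(\d+))?", re.MULTILINE)


def go_directive_version(go_mod_text):
    """Return the go directive as a (major, minor, micro) tuple, or None."""
    match = GO_DIRECTIVE.search(go_mod_text)
    if match is None:
        return None
    major, minor, micro = match.groups()
    return (int(major), int(minor), int(micro or 0))


def select_toolchain(go_mod_text):
    """Pick a pre-installed toolchain for the given go.mod content.

    Returns (toolchain, needs_download). The second value says whether the
    project asks for something newer than the pre-installed 1.21.0, in which
    case GOTOOLCHAIN=auto would have to fetch it on demand.
    """
    required = go_directive_version(go_mod_text)
    if required is None or required < (1, 21, 0):
        return ("1.20", False)
    return ("1.21.0", required > (1, 21, 0))
```

Even in this reduced form, every new pre-installed release would add another branch, which is the extensibility problem this section describes. With a single toolchain stream the first branch disappears entirely.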
The backend is the only one we have which is actually allowed to dirty users' source repository
Historically, our Go backend supported this questionable functionality through the '--force-gomod-tidy' CLI flag, which allowed projects to rely on our tool to do package cleanup/hash updates for them instead of adopting good practices directly in the project repository, in their own CI. This led to the backend being the only one "allowed" to dirty users' input repositories in certain cases. In general, such behaviour (i.e. touching the sources directly) is deemed bad practice and should be avoided wherever possible (unfortunately, it can't be avoided 100% due to some technological shortcomings). In the case of our Go backend, we're able to revert this behaviour by deprecating the aforementioned flag and forcing the backend to process inputs in temporary working copies.
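The temporary working copy idea can be sketched in a few lines. This is an illustration only, assuming the whole input repository is copied before any Go command runs. The helper name and the 'gomod-' prefix are hypothetical, and a real implementation might prefer a cheaper mechanism than a full copy.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def temporary_working_copy(source_dir):
    """Yield a throwaway copy of source_dir and delete it on exit.

    Anything written to the copy (for example go.mod or go.sum changes made
    by a newer toolchain) never reaches the user's repository.
    """
    with tempfile.TemporaryDirectory(prefix="gomod-") as tmp:
        work_dir = Path(tmp) / "src"
        # Preserve symlinks as symlinks rather than following them.
        shutil.copytree(source_dir, work_dir, symlinks=True)
        yield work_dir
```

The backend would run all of its Go commands inside the yielded directory, and the copy is removed when the 'with' block exits, even if prefetch fails.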
NOTE THAT the order in which issues in this tracker are addressed matters, because a single refactor done right can address multiple issues at the same time, for example:
- it makes no sense to try addressing any toolchain related logging before toolchain selection is fixed, because that change alone will result in significant changes to the code base, which will most likely lead to the problem fixing itself automagically
- it makes no sense to try fixing toolchain selection without first isolating all the backend operations to a temporary working copy, because the problem of new Go dirtying repositories pinned to old Go releases won't disappear, but it'll be contained within the temporary working copy, which will be discarded at the end of prefetch, leaving users' repositories in pristine condition.